Building Life's Blocks from Scratch

The Quest to Create Novel Proteins

In the quest to design life's molecular machinery from the ground up, scientists are now engineering proteins unlike anything found in nature.

Imagine being able to design custom proteins—microscopic machines that could target diseases with precision, break down environmental toxins, or build new materials molecule by molecule. This isn't science fiction; it's the cutting edge of scientific research happening in laboratories today.

For decades, scientists studied proteins that already existed in nature. Now, they're learning to build entirely new ones from scratch, creating custom molecular machines with functions we design. This revolutionary approach could transform medicine, environmental science, and technology, offering solutions to some of humanity's most pressing challenges.

The Blueprint of Life: Understanding Proteins

Proteins are the workhorses of all living organisms, performing countless essential functions.

What Are Proteins and Why Do They Matter?

Proteins provide cellular structure, speed up chemical reactions as enzymes, transport materials throughout the body, and regulate biological processes. Each one is a chain of amino acids whose order is specified by a gene.

Cells build proteins from messenger RNA (mRNA) in a process called translation, which occurs in three phases: initiation, elongation, and termination [8]. During initiation, the ribosomal subunits assemble on the mRNA and locate the start codon. During elongation, amino acids are added one at a time, each selected by pairing between an mRNA codon and a tRNA anticodon. Termination occurs when the ribosome reaches a stop codon and releases the newly formed protein [8].
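The three phases can be sketched in a few lines of Python. This is a minimal illustration, not a model of the ribosome: the function name `translate` and the example mRNA are invented for this sketch, and the only biology it encodes is the standard genetic code, a scan for the first AUG, and a halt at the first stop codon.

```python
# Standard genetic code, built from the conventional UCAG base ordering.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(mrna: str) -> str:
    """Translate an mRNA string (A, C, G, U) into one-letter amino acid codes."""
    # Initiation: locate the first start codon (AUG).
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated

    protein = []
    # Elongation: read one codon (three bases) at a time.
    for pos in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[pos:pos + 3]]
        # Termination: a stop codon ("*") ends the chain.
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


print(translate("GGAUGGCUUGGUAAGG"))  # prints "MAW"
```

In the example, the two leading bases are skipped, AUG, GCU, and UGG are read as methionine, alanine, and tryptophan, and UAA stops the chain.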

The Protein Folding Problem

A protein's function depends largely on its three-dimensional shape, which is determined by its amino acid sequence. The chain folds into a specific structure that is described at four levels of organization: primary (the sequence itself), secondary (local elements such as helices and sheets), tertiary (the overall fold of a single chain), and quaternary (the assembly of several chains). For decades, scientists have been trying to crack the code of how a linear chain of amino acids folds into its functional shape, a challenge known as the "protein folding problem."

Recent advances in artificial intelligence, such as DeepMind's AlphaFold, have dramatically improved our ability to predict protein structures. But predicting the structure of a protein that already exists is one thing; designing entirely new sequences that fold into stable, functional shapes is a different challenge.

Breaking New Ground: A Workflow for Novel Protein Discovery

The Proteogenomics Approach

Scientists have developed sophisticated workflows to identify and validate proteins that conventional methods might miss. One breakthrough approach, published in BMC Bioinformatics, uses proteogenomics, a method that combines proteomics and genomics, to discover novel proteins [1].

This technique is particularly valuable for identifying small proteins (sProteins), which are often overlooked in traditional genome annotations because they fall below minimum length thresholds or do not resemble known proteins [1]. These small proteins, some as short as 37 amino acids, perform critical biological functions despite their size, such as regulating energy production or controlling glucose uptake [1].

From Spectra to Sequences: The Identification Process

The proteogenomic workflow maps peptide-spectrum matches (PSMs) directly to genomic locations, bypassing annotation biases [1]. Here's how it works:

Step 1: Mass Spectrometry Data Collection

Mass spectrometry data is collected from protein samples to analyze their composition and structure.

Step 2: Peptide-Spectrum Mapping

Peptide-spectrum matches are mapped to all possible genomic regions, not just known genes, expanding the discovery potential.

Step 3: Candidate Identification

Candidate proteins are identified based on PSM quality and genomic context, prioritizing those with high confidence.

Step 4: Validation

Validation occurs through multiple quality filters and complementary data to confirm the novel protein's existence and function.

This method successfully identified 37 non-annotated protein candidates in an artificial gut microbiome model, with half having functional homologs in other species and six representing likely bona fide novel proteins [1].
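The mapping step at the heart of this workflow can be illustrated with a toy example. The sketch below is not the published pipeline: it assumes the peptide sequences have already been identified from spectra, translates a DNA string in all six reading frames, and reports where each peptide lands. The names `Hit`, `map_peptide`, and `is_unannotated`, the tiny genome, and the annotation format are all hypothetical.

```python
from typing import List, NamedTuple, Tuple

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


class Hit(NamedTuple):
    peptide: str
    strand: str  # "+" or "-"
    start: int   # 0-based, inclusive, on the forward strand
    end: int     # exclusive


def translate(dna: str) -> str:
    """Translate DNA codon by codon; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def map_peptide(peptide: str, genome: str) -> List[Hit]:
    """Find every position where the peptide is encoded, in all six frames."""
    hits = []
    n = len(genome)
    reverse = genome.translate(COMPLEMENT)[::-1]
    for strand, seq in (("+", genome), ("-", reverse)):
        for frame in range(3):
            protein = translate(seq[frame:])
            pos = protein.find(peptide)
            while pos != -1:
                start = frame + 3 * pos
                end = start + 3 * len(peptide)
                if strand == "-":
                    # Convert back to forward-strand coordinates.
                    start, end = n - end, n - start
                hits.append(Hit(peptide, strand, start, end))
                pos = protein.find(peptide, pos + 1)
    return hits


def is_unannotated(hit: Hit, genes: List[Tuple[int, int, str]]) -> bool:
    """True if the hit overlaps no annotated gene on the same strand."""
    return not any(
        strand == hit.strand and hit.start < g_end and g_start < hit.end
        for g_start, g_end, strand in genes
    )


genome = "CCATGGCTTGGAAATAAGG"       # toy sequence encoding M-A-W-K
hits = map_peptide("MAWK", genome)
print(hits[0].strand, hits[0].start, hits[0].end)  # prints "+ 2 14"
print(is_unannotated(hits[0], genes=[]))            # prints "True"
```

Because the search covers every frame on both strands rather than a list of known genes, a peptide that falls outside the annotation is still found and can be flagged as a candidate for the later validation steps.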

The Toolkit for Protein Engineering: Essential Research Tools

Creating novel proteins requires specialized tools and techniques.

| Tool/Technique | Function | Application in Protein Research |
| --- | --- | --- |
| Mass Spectrometry | Identifies and quantifies proteins based on mass-to-charge ratios | Comprehensive characterization of proteins without needing specific targets upfront [9] |
| Cell-Free Protein Synthesis | Produces proteins without living cells | Rapid synthesis of membrane proteins in native-like lipid environments |
| Solid Supported Membrane-based Electrophysiology (SSME) | Measures electrical activity of transporter proteins | Direct, real-time functional characterization of membrane proteins |
| Flow-Based Synthesis | Automated protein chain assembly | Rapid production of protein chains over 100 amino acids long in hours [5] |
| Spatial Proteomics | Maps protein expression in intact tissues | Maintains spatial context of protein function within biological systems [9] |

Advanced Imaging

High-resolution techniques visualize protein structures at atomic level, enabling precise design.

Computational Design

AI and machine learning predict how amino acid sequences will fold into functional 3D structures.

High-Throughput Screening

Automated systems test thousands of protein variants to identify those with desired functions.
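To make the mass-to-charge idea behind mass spectrometry concrete, the sketch below computes the expected m/z of a peptide ion from its sequence. The residue masses are standard monoisotopic values; the function name `peptide_mz` is an assumption of this sketch, and it ignores chemical modifications, which real analyses must account for.

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide for the free N- and C-termini
PROTON = 1.007276  # mass of each charge-carrying proton


def peptide_mz(peptide: str, charge: int) -> float:
    """Return the m/z of an unmodified peptide carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    neutral_mass = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (neutral_mass + charge * PROTON) / charge


for z in (1, 2):
    print(z, round(peptide_mz("MAWK", z), 4))
```

The same peptide appears at a different m/z for each charge state, which is one reason matching spectra to sequences requires dedicated search software.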

Case Study: Engineering Better Transporters

A novel workflow for characterizing membrane proteins

A recent study from Irina Borodina's lab demonstrated an efficient workflow that combines continuous-exchange cell-free protein synthesis (CECF) with solid supported membrane-based electrophysiology (SSME). This approach addresses one of the biggest challenges in protein engineering: functionally characterizing membrane proteins, which are often difficult to work with using traditional methods.

Transporters are vital membrane proteins responsible for nutrient uptake, waste removal, and maintaining ion balance—making them important targets for drug development and biotechnology. The novel workflow enables their characterization in just five days, significantly faster than conventional approaches.

Step-by-Step Through the Experiment

The researchers followed a meticulous process to express and characterize five diverse transporters:

Step 1: Protein Production

Using CECF, researchers synthesized target proteins in the presence of nanodiscs, which provide a native-like lipid environment for proper membrane protein folding.

Step 2: Incorporation

The synthesized proteins were incorporated into proteoliposomes (lipid vesicles containing proteins).

Step 3: Functional Analysis

Using SSME, researchers measured the electrical activity of transporters, obtaining real-time data on their function.

Step 4: Kinetic Characterization

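The team evaluated key kinetic parameters for each transporter: K_M (the Michaelis constant, the substrate concentration at which the current reaches half its maximum; a lower value indicates higher apparent affinity), I_max (the maximum current), and pH dependence.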
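The kinetic parameters in Step 4 come from the Michaelis-Menten relationship, I = I_max · [S] / (K_M + [S]). The sketch below shows one simple way to estimate them from current measurements at several substrate concentrations, using the Hanes-Woolf linearization ([S]/I plotted against [S]). This is an illustration on synthetic numbers, not the fitting procedure used in the study, which the article does not describe; the function names are assumptions, and nonlinear regression is generally preferred for noisy data.

```python
from typing import Sequence, Tuple


def michaelis_menten(s: float, km: float, i_max: float) -> float:
    """Current predicted at substrate concentration s."""
    return i_max * s / (km + s)


def fit_hanes_woolf(
    substrate: Sequence[float], current: Sequence[float]
) -> Tuple[float, float]:
    """Estimate (K_M, I_max) from the line [S]/I = [S]/I_max + K_M/I_max."""
    xs = list(substrate)
    ys = [s / i for s, i in zip(substrate, current)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Ordinary least-squares slope and intercept.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    i_max = 1.0 / slope       # slope is 1 / I_max
    km = intercept * i_max    # intercept is K_M / I_max
    return km, i_max


# Synthetic, noise-free data generated with K_M = 2.0 and I_max = 5.0
# (arbitrary units chosen for illustration).
concentrations = [0.5, 1.0, 2.0, 4.0, 8.0]
currents = [michaelis_menten(s, km=2.0, i_max=5.0) for s in concentrations]

km, i_max = fit_hanes_woolf(concentrations, currents)
print(round(km, 3), round(i_max, 3))  # prints "2.0 5.0"
```

Repeating such a fit at several pH values would give a simple picture of pH dependence, since both parameters can shift with pH.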

Results and Implications

The study successfully characterized five diverse transporters, including drug/H+ antiporters and sugar transporters. The data revealed important functional insights:

| Transporter | Type | Key Substrate | Notable Characteristics |
| --- | --- | --- | --- |
| EmrE | Drug/H+ antiporter | Tetrapropylammonium | Shows clear pH-dependent transport kinetics |
| SugE | Drug/H+ antiporter | Various compounds | Maintains functionality even after precipitation and refolding |
| LacY | Lactose permease | Lactose, lactulose | Distinguishes between similar sugar substrates |
| NhaA | Na+/H+ antiporter | Sodium ions | Thermostabilized variant shows improved stability |
| AAC2 | ADP/ATP carrier | ATP/ADP | Shows both steady-state and pre-steady-state currents |

This workflow represents a significant advancement, providing a robust, efficient method for the direct functional assessment of transporter proteins that could accelerate drug discovery and basic research.

The Future of Protein Design

Emerging technologies and applications

The field of protein design is advancing rapidly, thanks to several technological breakthroughs:

Benchtop Protein Sequencers

Instruments like Quantum-Si's Platinum® Pro are making protein sequencing more accessible, allowing researchers to identify amino acids in peptides without specialized expertise [9].

Large-Scale Proteomics

Projects linking proteomics data with genetic information from hundreds of thousands of samples are uncovering associations between protein levels, genetics, and disease [9].

Automated Synthesis

MIT researchers have developed a tabletop automated flow synthesis machine that can build proteins over 100 amino acids long in hours rather than days [5]. This technology also allows scientists to incorporate amino acids that don't occur naturally, expanding functional possibilities [5].

Challenges and Opportunities

Despite these advances, significant challenges remain. Unlike DNA, proteins cannot be amplified, so they must be analyzed with highly sensitive methods [9]. Additionally, the vast structural and functional diversity of proteins makes comprehensive analysis difficult.

Nevertheless, the potential applications are staggering. From environmental cleanup proteins that break down pollutants to therapeutic proteins that target diseases with unprecedented precision, the ability to design novel proteins promises to transform our approach to many global challenges. As one researcher noted, this technology "paves the way for a new field of protein medicinal chemistry" and provides "new opportunities for rapid discovery of peptide- and protein-based biopharmaceuticals" [5].

Conclusion: The Age of Designed Proteins

The ability to design novel proteins from scratch represents a fundamental shift in our relationship with the molecular machinery of life. We're transitioning from being observers of nature's designs to becoming active architects of biological function.

As the tools and workflows for protein discovery and engineering become more sophisticated and accessible, we stand at the threshold of a new era in biotechnology. The proteins we build today may become the therapeutics, environmental solutions, and technologies of tomorrow—proof that sometimes, the smallest creations can have the largest impact.

References