The Quest to Create Novel Proteins
In the quest to design life's molecular machinery from the ground up, scientists are now engineering proteins unlike anything found in nature.
Imagine being able to design custom proteins: microscopic machines that could target diseases with precision, break down environmental toxins, or build new materials molecule by molecule. This isn't science fiction; it's the cutting edge of scientific research happening in laboratories today.
For decades, scientists studied proteins that already existed in nature. Now, they're learning to build entirely new ones from scratch, creating custom molecular machines with functions we design. This revolutionary approach could transform medicine, environmental science, and technology, offering solutions to some of humanity's most pressing challenges.
Proteins are the workhorses of all living organisms, performing countless essential functions.
These molecular machines provide cellular structure, speed up chemical reactions as enzymes, transport materials throughout the body, and regulate biological processes.
The process of protein synthesis, called translation, occurs in three phases: initiation, elongation, and termination [8]. During initiation, ribosomal subunits identify the start codon on the mRNA. Elongation involves the sequential addition of amino acids, guided by codon-anticodon pairing. Finally, termination occurs when the ribosome encounters a stop codon, releasing the newly formed protein [8].
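The three phases can be mimicked in a few lines of Python. This is only a toy sketch, not a model of the ribosome: it assumes the first AUG in the string is the start codon, reads one codon at a time using the standard genetic code, and stops at the first stop codon. The function name `translate` and the example sequence are made up for illustration.

```python
from itertools import product

# Standard genetic code, listed in U, C, A, G order for each codon position.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate an mRNA string into a one-letter amino acid sequence."""
    # Initiation (simplified): take the first AUG as the start codon.
    start = mrna.find("AUG")
    if start == -1:
        return ""

    protein = []
    # Elongation: read successive codons and append the matching amino acid.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        # Termination: a stop codon ("*") ends the chain.
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("GGAUGGCUAAAUUUUGACC"))  # MAKF
```

Real initiation depends on sequence context around the start codon and on initiation factors, none of which this sketch tries to capture.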
A protein's function depends largely on its three-dimensional shape, which is determined by its amino acid sequence. This sequence folds into a specific structure, often described as having primary, secondary, tertiary, and quaternary levels of organization. For decades, scientists have been trying to crack the code of how a linear chain of amino acids folds into its functional shape, a challenge known as the "protein folding problem."
Recent advances in artificial intelligence, like DeepMind's AlphaFold, have dramatically improved our ability to predict protein structures. But predicting the structure of a protein that already exists is one thing—designing entirely new sequences that fold into stable, functional shapes is an entirely different challenge.
Scientists have developed sophisticated workflows to identify and validate proteins that conventional methods might miss. One breakthrough approach, published in BMC Bioinformatics, uses proteogenomics, a method that combines proteomics and genomics, to discover novel proteins [1].
This technique is particularly valuable for identifying small proteins (sProteins), which are often overlooked in traditional genome annotations because they don't meet length requirements or resemble known proteins [1]. These small proteins, some as short as 37 amino acids, perform critical biological functions despite their size, such as regulating energy production or controlling glucose uptake [1].
The proteogenomic workflow maps peptide-spectrum matches (PSMs) directly to genomic locations, bypassing annotation biases [1]. Here's how it works:
1. Mass spectrometry data is collected from protein samples to analyze their composition and structure.
2. Peptide-spectrum matches are mapped to all possible genomic regions, not just known genes, expanding the discovery potential.
3. Candidate proteins are identified based on PSM quality and genomic context, prioritizing those with high confidence.
4. Validation occurs through multiple quality filters and complementary data to confirm the novel protein's existence and function.
This method successfully identified 37 non-annotated protein candidates in an artificial gut microbiome model, with half having functional homologs in other species and six representing likely bona fide novel proteins [1].
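The idea of mapping peptides to every possible genomic region, rather than only to annotated genes, can be sketched with a six-frame translation. The example below is a simplified illustration and not the published workflow: it assumes an exact string match between a peptide and the translated genome, whereas real pipelines score spectra and filter PSMs by quality. The helper names (`reverse_complement`, `translate_frame`, `map_peptide`) and the toy genome are hypothetical.

```python
from itertools import product

# Standard genetic code for DNA codons, in T, C, A, G order per position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse-complement strand of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate_frame(dna, offset):
    """Translate one reading frame; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(offset, len(dna) - 2, 3)
    )


def map_peptide(peptide, genome):
    """Find a peptide in all six reading frames, ignoring annotations.

    Returns (strand, frame, start, end) tuples. Coordinates are 0-based,
    end-exclusive, and always given on the forward strand.
    """
    hits = []
    n = len(genome)
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            protein = translate_frame(seq, frame)
            pos = protein.find(peptide)
            while pos != -1:
                start = frame + 3 * pos
                end = start + 3 * len(peptide)
                if strand == "-":
                    # Convert reverse-strand coordinates to forward-strand ones.
                    start, end = n - end, n - start
                hits.append((strand, frame, start, end))
                pos = protein.find(peptide, pos + 1)
    return hits


toy_genome = "CCATGGCTAAATTTGG"  # made-up sequence for illustration
print(map_peptide("MAKF", toy_genome))  # [('+', 2, 2, 14)]
```

Because every frame on both strands is searched, a peptide from an unannotated small protein is found just as readily as one from a known gene, which is the point of bypassing the annotation.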
Creating novel proteins requires specialized tools and techniques.
| Tool/Technique | Function | Application in Protein Research |
|---|---|---|
| Mass Spectrometry | Identifies and quantifies proteins based on mass-to-charge ratios | Comprehensive characterization of proteins without needing specific targets upfront [9] |
| Cell-Free Protein Synthesis | Produces proteins without living cells | Rapid synthesis of membrane proteins in native-like lipid environments |
| Solid Supported Membrane-based Electrophysiology (SSME) | Measures electrical activity of transporter proteins | Direct, real-time functional characterization of membrane proteins |
| Flow-Based Synthesis | Automated protein chain assembly | Rapid production of protein chains over 100 amino acids long in hours [5] |
| Spatial Proteomics | Maps protein expression in intact tissues | Maintains spatial context of protein function within biological systems [9] |
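To make the mass-to-charge ratio in the first row concrete, here is a minimal sketch of how the expected m/z of a peptide ion can be calculated from its sequence. It assumes an unmodified peptide, standard monoisotopic residue masses, and protonation as the only source of charge; the function name `peptide_mz` is hypothetical.

```python
# Monoisotopic masses (in daltons) of the 20 standard amino acid residues.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.01056   # added once for the free N- and C-termini
PROTON_MASS = 1.007276  # each positive charge comes from one extra proton


def peptide_mz(sequence, charge=1):
    """Return the monoisotopic m/z of an unmodified peptide ion."""
    neutral_mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS
    return (neutral_mass + charge * PROTON_MASS) / charge


for z in (1, 2):
    print(z, round(peptide_mz("PEPTIDE", charge=z), 3))
```

The same peptide appears at a different m/z for each charge state, which is one reason spectra have to be interpreted by software rather than read off directly.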
High-resolution techniques visualize protein structures at the atomic level, enabling precise design.
AI and machine learning predict how amino acid sequences will fold into functional 3D structures.
Automated systems test thousands of protein variants to identify those with desired functions.
A novel workflow for characterizing membrane proteins
A recent study from Irina Borodina's lab demonstrated an efficient workflow that combines continuous-exchange cell-free protein synthesis (CECF) with solid supported membrane-based electrophysiology (SSME). This approach addresses one of the biggest challenges in protein engineering: functionally characterizing membrane proteins, which are often difficult to work with using traditional methods.
Transporters are vital membrane proteins responsible for nutrient uptake, waste removal, and maintaining ion balance—making them important targets for drug development and biotechnology. The novel workflow enables their characterization in just five days, significantly faster than conventional approaches.
The researchers followed a meticulous process to express and characterize five diverse transporters:
1. Using CECF, researchers synthesized target proteins in the presence of nanodiscs, which provide a native-like lipid environment for proper membrane protein folding.
2. The synthesized proteins were incorporated into proteoliposomes (lipid vesicles containing proteins).
3. Using SSME, researchers measured the electrical activity of transporters, obtaining real-time data on their function.
4. The team evaluated key parameters including K_M (Michaelis constant, indicating substrate affinity), I_max (maximum current), and pH dependency for each transporter.
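How the Michaelis constant and maximum current might be extracted from such measurements can be sketched as follows. This is an illustrative calculation, not the analysis used in the study: it assumes peak currents follow the Michaelis-Menten form I = I_max · S / (K_M + S) and estimates the two parameters with a Hanes-Woolf linearization, in which S/I plotted against S gives a straight line with slope 1/I_max and intercept K_M/I_max. The concentrations and currents below are synthetic, and the units (mM, nA) are assumptions.

```python
def fit_michaelis_menten(concentrations, currents):
    """Estimate (K_M, I_max) from currents using a Hanes-Woolf linear fit."""
    xs = list(concentrations)
    ys = [s / i for s, i in zip(concentrations, currents)]

    # Ordinary least-squares line through the points (S, S / I).
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    i_max = 1.0 / slope      # slope = 1 / I_max
    k_m = intercept * i_max  # intercept = K_M / I_max
    return k_m, i_max


# Synthetic, noise-free data generated with K_M = 2.0 mM and I_max = 5.0 nA.
substrate_mM = [0.5, 1.0, 2.0, 5.0, 10.0]
current_nA = [5.0 * s / (2.0 + s) for s in substrate_mM]

k_m, i_max = fit_michaelis_menten(substrate_mM, current_nA)
print(round(k_m, 3), round(i_max, 3))  # 2.0 5.0
```

With real, noisy recordings a nonlinear least-squares fit is usually preferred, since linearized plots weight measurement errors unevenly.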
The study successfully characterized five diverse transporters, including drug/H+ antiporters and sugar transporters. The data revealed important functional insights:
| Transporter | Type | Key Substrate | Notable Characteristics |
|---|---|---|---|
| EmrE | Drug/H+ antiporter | Tetrapropylammonium | Shows clear pH-dependent transport kinetics |
| SugE | Drug/H+ antiporter | Various compounds | Maintains functionality even after precipitation and refolding |
| LacY | Lactose permease | Lactose, lactulose | Distinguishes between similar sugar substrates |
| NhaA | Na+/H+ antiporter | Sodium ions | Thermostabilized variant shows improved stability |
| AAC2 | ADP/ATP carrier | ATP/ADP | Shows both steady-state and pre-steady-state currents |
This workflow represents a significant advancement, providing a robust, efficient method for the direct functional assessment of transporter proteins that could accelerate drug discovery and basic research.
Emerging technologies and applications
The field of protein design is advancing rapidly, thanks to several technological breakthroughs:
Instruments like Quantum-Si's Platinum® Pro are making protein sequencing more accessible, allowing researchers to identify amino acids in peptides without specialized expertise [9].
Projects linking proteomics data with genetic information from hundreds of thousands of samples are uncovering associations between protein levels, genetics, and disease [9].
MIT researchers have developed a tabletop automated flow synthesis machine that can build proteins over 100 amino acids long in hours rather than days [5]. This technology also allows scientists to incorporate amino acids that don't occur naturally, expanding functional possibilities [5].
Despite these advances, significant challenges remain. Unlike DNA, proteins cannot be amplified, so analysis methods must be sensitive enough to work with the small amounts present in a sample [9]. Additionally, the vast structural and functional diversity of proteins makes comprehensive analysis difficult.
Nevertheless, the potential applications are staggering. From environmental cleanup proteins that break down pollutants to therapeutic proteins that target diseases with unprecedented precision, the ability to design novel proteins promises to transform our approach to many global challenges. As one researcher noted, this technology "paves the way for a new field of protein medicinal chemistry" and provides "new opportunities for rapid discovery of peptide- and protein-based biopharmaceuticals" [5].
The ability to design novel proteins from scratch represents a fundamental shift in our relationship with the molecular machinery of life. We're transitioning from being observers of nature's designs to becoming active architects of biological function.
As the tools and workflows for protein discovery and engineering become more sophisticated and accessible, we stand at the threshold of a new era in biotechnology. The proteins we build today may become the therapeutics, environmental solutions, and technologies of tomorrow—proof that sometimes, the smallest creations can have the largest impact.