AI In Protein Design

Ontario Youth Medical Society
Nov 19, 2023

Image by Author

Artificial Intelligence, commonly referred to as AI, has come a long way from its familiar roles in phone apps, smartphones and the Internet. It has now become an integral part of scientific research, and it will become an even more crucial component of advances in medicine and biology. Whether it is areas of the body that are hard to reach through in vivo experiments, or particles and molecules at the cellular level that cannot be seen by the human eye, AI has made it possible to study areas that were once considered largely inaccessible. One area of AI development in biology that I am personally fascinated with is protein dynamics and protein structure. AI has advanced our ability to understand and predict the structure that a protein will take, starting from nothing more than its amino acid sequence. However, before going into this method in more detail, I will provide a short background on what protein structure and synthesis mean from a biological lens!

Proteins, also known as polypeptides, are made by stringing together amino acids, which are known as the building blocks of proteins. They are, quite literally, building blocks: just like a large Lego tower, a protein is assembled from smaller individual pieces, each with unique properties, that fit together to form a larger structure. Amino acids in the cell are carried to structures called ribosomes, which string them together in the order specified by a genetically determined transcript that encodes a specific protein. The resulting amino acid sequence is the simplest level of protein structure, known as the primary structure. What AI helps to predict is not the primary structure, but rather the tertiary and quaternary structures. The secondary structure, the local folding of the chain into repeating patterns such as alpha helices and beta sheets, also depends heavily on the amino acid sequence, but it is relatively well understood from trends in the amino acids within the primary sequence. The tertiary structure is the overall three-dimensional shape of a single folded chain. Each protein, and specifically each set of amino acids, folds in its own way depending on the properties of its amino acids, such as their size and electric charge, and cellular machinery helps to ensure that folding occurs properly. A protein that fails to fold into the correct shape will usually malfunction, which can be catastrophic for the cell. Finally, the quaternary structure involves the joining of multiple smaller proteins, each of which has already folded to the tertiary level, into a larger protein made of individual subunits. The image below provides a visual representation of the differences between the tertiary and quaternary levels.

Image from: https://www.geeksforgeeks.org/protein-structure-primary-secondary-tertiary-quaternary/
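To make the idea of a primary structure a little more concrete, here is a minimal Python sketch that treats a protein as a string of one-letter amino acid codes and looks at two of the properties mentioned above: which amino acids appear, and roughly what electric charge the chain carries. The peptide is made up for illustration, and the charge table is a deliberate simplification (it only counts lysine, arginine, aspartate and glutamate, and ignores histidine and the ends of the chain), so treat the result as a rough estimate rather than a real measurement.

```python
from collections import Counter

# Simplified side-chain charges near neutral pH.
# Assumption for illustration: histidine and the chain termini are ignored.
SIDE_CHAIN_CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def composition(sequence: str) -> Counter:
    """Count how often each amino acid (one-letter code) appears."""
    return Counter(sequence.upper())


def approximate_net_charge(sequence: str) -> int:
    """Sum the simplified side-chain charges over the whole sequence."""
    return sum(SIDE_CHAIN_CHARGE.get(residue, 0) for residue in sequence.upper())


# A made-up peptide, not a real protein.
peptide = "MKRDELLK"

print(composition(peptide)["K"])        # number of lysines in the chain
print(approximate_net_charge(peptide))  # rough overall charge of the chain
```

Even this tiny example shows why folding is hard to predict by hand: the sequence is just a line of letters, but every letter brings its own size and charge, and all of them influence the final shape at once.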

As you can see, the folding of a protein after it is synthesized from a genetic transcript is very complex, so you might be wondering how exactly AI can be used to predict protein structure, specifically the tertiary and quaternary structures. David Baker and a team of biochemists at the University of Washington have developed a method for designing functional proteins that they could synthesize in liver cells. Baker states that AI has the power to develop, within a couple of years of research, something that more than 3 billion years of evolution has not produced. The tool used is called RFdiffusion. It is a neural network that can create custom proteins for a wide range of applications, including biomaterials, vaccines and pharmaceutical treatments. One extremely beneficial feature of this software is that it can draw up new proteins that bind tightly to other biomolecules. An example of an RFdiffusion-produced protein that binds to the parathyroid hormone (in pink) is shown below.

Image from: https://www.nature.com/articles/d41586-023-02227-y

Looking at the structure, you can see that the amino acid sequence the software produces folds into a specific shape that is designed to bind to another molecule. Binding sites typically have grooves and folds that must match those of the binding partner in order to fit; like a puzzle, without the right corners and edges the two pieces cannot connect. This AI-based tool can therefore propose an amino acid sequence that will fold into a specific shape, with binding sites that match the exact sites of connection on the biomolecule of interest.
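The puzzle analogy can be sketched in a few lines of Python. This is only a toy, and it is not how RFdiffusion works: real design tools reason about full three-dimensional shapes, while this sketch lines up two short, made-up sequences position by position and rewards opposite charges facing each other. The function name, the charge table and the example sequences are all hypothetical and chosen purely for illustration.

```python
# Simplified side-chain charges (same simplification as a textbook rule of thumb):
# lysine (K) and arginine (R) are positive, aspartate (D) and glutamate (E) negative.
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def complementarity(binder: str, target: str) -> int:
    """Toy score: +1 where opposite charges face each other, -1 where like charges do."""
    if len(binder) != len(target):
        raise ValueError("binder and target must have the same length")
    score = 0
    for b, t in zip(binder, target):
        # Opposite charges multiply to -1, so subtracting rewards them.
        score -= CHARGE.get(b, 0) * CHARGE.get(t, 0)
    return score


# Made-up binding site and candidate binders, not real proteins.
target_site = "KDAE"
candidates = ["EKAK", "KDAE", "AAAA"]

best = max(candidates, key=lambda c: complementarity(c, target_site))
print(best)
```

Here the candidate whose charges mirror the target scores highest, while a copy of the target itself scores lowest because like charges repel. A real tool has to solve a far harder version of this matching problem, in three dimensions and across many possible sequences.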

It is fascinating to see how promising the future looks for the advancement of medicine. Research into rare diseases no longer has to be held back by the lack of existing molecules to study and compare against, because AI can now be used to design custom proteins for them! I hope this blog provides some insight into the ways that AI is serving as an integral tool in biological and biomedical research, making these processes more efficient for future treatments!

About the Author:

Wynter Sutchy is a third-year undergraduate student at McMaster University studying Biology (Physiology) and is from King City, ON. She is very passionate about the healthcare field and enjoys sharing her volunteering experiences through writing. She plans to pursue a career in medicine in the future and plans to explore the field of healthcare through research and volunteering throughout her undergraduate career. In her free time, you can expect her to be watching her favourite show, Grey’s Anatomy, teaching children how to swim, or baking some delicious desserts!

Ontario Youth Medical Society is a student-led, non-profit organization focused on educating youth and making a difference in medicine.