A Primer to Molecular Genetics - The Unbelievable Uncertainty

  

What makes you, you 🤔? 


We can see two girls in the picture. It immediately becomes apparent that they do not possess the same characteristics. One has black hair and the other is blonde. One is slimmer than the other. Everyone has their own set of physical features that defines them. But have you ever stopped to wonder what determines these features, like the colour of your eyes, the curl of your hair, or your risk of developing certain diseases? And what is it that makes one person slightly different from their identical twin?

The answer lies in the incredible world of molecular genetics.

This field attempts to explain how genes shape the features of an organism, by analysing their structure, their function and how they are controlled. It also explains the causes of genetic mutations (favourable or unfavourable changes in the instructions which dictate the working of a living organism), helps diagnose rare disorders, and more.

It has a lot of abstract concepts, but that’s what makes it even more beautiful; its Uncertainty is Unbelievable.

Three building blocks (DNA, RNA, and genes) play a significant role in inheritance. DNA serves as a blueprint that stores genetic information, and RNA carries this information to the place where the required proteins are made. Genes are segments of DNA which carry the instructions for building proteins.

Read my blogs about DNA and RNA at your leisure. 


THE GENETIC SYMPHONY :: HOW DNA ORCHESTRATES OUR UNIQUE TRAITS

DNA contains stretches called genes, which provide the code for creating the proteins that do the work in our cells.

As we know, a computer game is built from many code snippets (sub-functions), each doing a different job: defining the graphics at each level, the personality of the characters, the physics of the game, the control keys and so on.

Similarly, DNA is that code, with each gene being a particular sub-function which defines the assets of the game (the features of every living and evolving organism).

Within a gene, the bases are read in groups of three, in a distinct, clear order, and each group specifies one amino acid (an organic compound containing a carboxylic acid group (-COOH) and an amine group (-NH2)). The sequence of these groups defines the protein to be made. Such a group is known as a 'triplet codon', an idea proposed by George Gamow in 1954.

The triplet idea arose because proteins are built from 20 standard amino acids, while the code has only four nitrogenous bases to spell them with.


Why only 20 amino acids?

Evolution over billions of years converged to 20 amino acids to enable the formation of soluble structures with close-packed cores, allowing the presence of ordered binding pockets. 


The triplet codon system strikes a balance between complexity and efficiency. It is complex enough to code for a diverse array of amino acids while being simple enough to be efficiently read and translated by the ribosome during protein synthesis.

It was found that if each amino acid were specified by a single base, only 4 amino acids could be coded for. With two bases, 4*4 = 16. With three bases we get 4*4*4 = 64 possible combinations, which makes three the smallest codon length that can represent all 20 amino acids. Using three bases also allows for redundancy in the genetic code (the concept of degeneracy): several different codons can encode the same amino acid. This redundancy helps minimise the effects of mutations, leaving room for mistakes made by our cellular machinery.
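The counting argument above can be checked with a few lines of Python. This is only a sketch: the function name count_codons is made up for illustration, and the six leucine codons and three stop codons are taken from the standard genetic code.

```python
from itertools import product

BASES = "ACGU"  # the four RNA bases

def count_codons(length):
    """Count every possible 'word' of the given length built from four bases."""
    return len(list(product(BASES, repeat=length)))

# Codon lengths 1, 2 and 3 give 4, 16 and 64 combinations.
for length in (1, 2, 3):
    print(length, "base(s):", count_codons(length), "combinations")

# Degeneracy: in the standard genetic code, leucine alone has six codons.
LEUCINE_CODONS = {"UUA", "UUG", "CUU", "CUC", "CUA", "CUG"}
STOP_CODONS = {"UAA", "UAG", "UGA"}

# 64 codons minus 3 stop signals leaves 61 codons shared among 20 amino acids.
sense_codons = count_codons(3) - len(STOP_CODONS)
print("Codons that code for an amino acid:", sense_codons)
```

Only the three-base code has more combinations than there are amino acids, and the surplus is what makes the redundancy possible.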

                                  
The set of amino acids and their codons ('U', for uracil, is replaced with 'T', for thymine, in DNA)


Now let's briefly look at how proteins are synthesized. I will discuss only transcription and translation below.


MAKING OF THE PROTEINS

But genes don't just sit passively in our cells. They are constantly being read and used by the cellular machinery to produce the proteins the body needs. 

Compared with organisms such as bacteria, humans are far more complex; even now we do not have a complete understanding of the body's metabolism or of our own genome (even though the human genome has been sequenced).

For the proteins to be made, a molecule known as "mRNA" (messenger RNA) is required. It is formed through a process called transcription. Particular sequences near the start of a gene (the promoter), such as a stretch rich in cytosine and guanine or a "TATA" box (a region rich in thymine and adenine), act as an anchor point for an enzyme called DNA-dependent RNA polymerase.
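To get a feel for what "looking for an anchor sequence" means, here is a minimal sketch that scans a DNA string for the letters TATA. The sequence and the function name find_tata_boxes are invented for illustration. Real promoter recognition relies on proteins binding the DNA and is far less literal than a text search.

```python
def find_tata_boxes(dna, motif="TATA"):
    """Return every position (0-based) where the motif starts in the DNA string."""
    positions = []
    start = dna.find(motif)
    while start != -1:
        positions.append(start)
        # Continue one base further along so overlapping hits are also found.
        start = dna.find(motif, start + 1)
    return positions

# A made-up stretch of DNA with a GC-rich start followed by a TATA box.
promoter_region = "GCGCGCTATAAAGGCTCCATGGCC"
print(find_tata_boxes(promoter_region))
```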


An illustration of the transcriptional unit (note the "TATA" box at the '-10' region)

Initially, the two strands of the double helix are separated locally, so that the bases on one strand are exposed and can be read. In bacteria the RNA polymerase opens the helix itself, while in eukaryotes it is assisted by helper proteins with helicase activity. The stretch of DNA that is copied into RNA, together with the promoter in front of it and the terminator behind it, is referred to as a 'transcription unit'.

The enzyme reads one DNA strand, called the "template strand", in the 3' to 5' direction and builds the mRNA in the 5' to 3' direction, pairing each base with its complement. The mRNA therefore carries the same sequence as the other DNA strand, the "coding strand" (which runs 5' to 3'), except that thymine is replaced with a different base, called uracil (U).

Transcription ends at a termination sequence. Depending on that sequence, the process may or may not need termination factors (proteins which allow the mRNA and the RNA polymerase to detach from the DNA), such as the 'ρ' (rho) factor in prokaryotes, or several proteins in most multicellular organisms. For example, if the end of the transcript is rich in 'U' bases, the weak pairing between these and the 'A' bases on the DNA template makes the RNA-DNA hybrid unstable, and the process terminates on its own.
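The base-pairing rule behind transcription is easy to sketch in Python. The mRNA is complementary to the template strand, so it ends up matching the coding strand with U in place of T. The short sequences and both function names below are made up for illustration.

```python
# Pairing rule used when building RNA against a DNA template.
DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe_from_template(template_3_to_5):
    """Build the mRNA (written 5'->3') by pairing each template base, read 3'->5'."""
    return "".join(DNA_TO_RNA_COMPLEMENT[base] for base in template_3_to_5)

def transcribe_from_coding(coding_5_to_3):
    """Shortcut: the mRNA matches the coding strand with T swapped for U."""
    return coding_5_to_3.replace("T", "U")

coding_strand = "ATGGCCTAA"    # written 5'->3' (invented example)
template_strand = "TACCGGATT"  # its complement, written 3'->5'

print(transcribe_from_template(template_strand))
print(transcribe_from_coding(coding_strand))
```

Both calls print the same mRNA, which is why people often write out the coding strand when they want to read off a gene's message.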


The Process of Transcription


Then, this mRNA molecule is 'read' by ribosomes, which are cellular structures required for protein synthesis, through a process called 'translation'. Each ribosome is composed of two subunits:- one large and one small. 

There are specific sites in the ribosome, which become complete when the two subunits combine. These are the "A" site (aminoacyl or acceptor site), the "P" site (peptidyl site) and the "E" site (exit site). The "E" site is usually described as lying mainly on the larger subunit, while the other sites are shared by the larger and smaller subunits. To actually build a polypeptide, another molecule is required: 'tRNA' (transfer RNA), which carries amino acids towards a specific region in the ribosome. This 'region' is found in the larger subunit. The subunits and the mRNA form a kind of "protein sandwich", i.e. the mRNA is sandwiched between the two ribosomal subunits, at a binding site on the small subunit.


Formation of the mRNA-tRNA-ribosome complex; notice the distribution of the subunits (from left to right:- "E" site; "P" site; "A" site)


The tRNA carries the amino acid to the ribosome, like an adapter carrying charge to your mobile phone. It attaches to the "A" site through its anticodon loop (complementary to the codon on the mRNA). The growing chain, held by the tRNA sitting in the "P" site, is then transferred onto the newly arrived amino acid by forming a peptide bond (-CO-NH-); this transfer of the peptide chain is what gives the "P" site its name. The ribosome then "scrolls" along the mRNA by one codon, which moves the tRNA now holding the chain from the "A" site into the "P" site, while the emptied tRNA leaves through the "E" site. For every codon, one amino acid is added to the string of existing amino acids. When the ribosome reaches the codon which signals the end, the 'string' detaches from the adapter tRNA and becomes a free polypeptide. Meanwhile, the ribosomal subunits detach from each other, ready for the next round of protein synthesis.

 A Brief Figure explaining the process of Transcription and Translation



PROTEINS : THE SUPREME POWER BEHIND US

Polypeptides are crucial in the expression of several physical and mental traits. For example, the growth hormone, a protein-based hormone produced by our anterior pituitary gland, is made through transcription and translation of its gene. This hormone, as the name suggests, affects our height and overall physical growth.

The best part is that this process can be regulated in accordance with the bodily requirements of any being, which causes variations in the expression of a gene. This dynamic process of gene expression is what allows our bodies to adapt and respond to changing needs, and it is a key part of how our bodies maintain health and homeostasis.

For example, when your skin is wounded, your skin cells will ramp up the expression of genes that code for proteins involved in the healing process, like collagen to close the wound and immune proteins to prevent infection. Can you think of other examples where changes in gene expression might be important?

One of the most relatable examples in recent times is the COVID-19 pandemic, especially in the years 2020-21, when the first two waves wreaked havoc and triggered a slew of lockdowns across the world. We all remember how everyone kept talking about RT-PCR as the way to find out whether we were infected or not. But have you ever stopped to wonder what it really is?

What is RT-PCR?

Of course all the curious people or the 'geeks' will have, but for those who have not, it stands for "Reverse Transcription - Polymerase Chain Reaction". It starts with a swab sample taken from the nose, where the virus potentially builds up. An enzyme called 'reverse transcriptase' then converts any viral RNA present into DNA, the opposite of the process of transcription explained earlier. This step uses 'dNTPs' (deoxynucleoside triphosphates), each of which is a nitrogenous base attached to a sugar lacking the 2'-OH group, which in turn carries three phosphate groups. Next, small synthetic complementary DNA fragments (primers) attach to the viral sequence, and the DNA between them is copied again and again through repeated cycles of heating and cooling, effectively magnifying any viral DNA present. The copies are tagged with a fluorescent dye, and the test checks whether the fluorescence exceeds a given threshold. It is much like testing the level and quality of oil using a dipstick. The steps involved in RT-PCR are illustrated below.
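The two halves of RT-PCR can be sketched in a few lines of Python: first the RNA is turned into complementary DNA, then the DNA is doubled cycle after cycle until it crosses a detection threshold. The numbers, the 40-cycle limit and the function names are assumptions made for illustration; a real test measures fluorescence rather than counting copies, and the doubling is never perfectly efficient.

```python
# Pairing rule used when copying RNA into DNA.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}

def reverse_transcribe(rna_5_to_3):
    """Return the complementary DNA (cDNA), written 5'->3'."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna_5_to_3))

def cycles_to_threshold(start_copies, threshold_copies, max_cycles=40):
    """Count the cycles needed to reach the threshold, assuming ideal doubling."""
    copies = start_copies
    for cycle in range(1, max_cycles + 1):
        copies *= 2  # each cycle at best doubles the DNA
        if copies >= threshold_copies:
            return cycle
    return None  # threshold never reached: reported as negative

print(reverse_transcribe("AUGC"))
print(cycles_to_threshold(start_copies=1000, threshold_copies=10_000))
print(cycles_to_threshold(start_copies=10, threshold_copies=10_000))
print(cycles_to_threshold(start_copies=0, threshold_copies=10_000))
```

A sample that starts with more viral material crosses the threshold in fewer cycles, which is why the cycle count is used as a rough indicator of how much virus was in the sample.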



A deoxynucleoside triphosphate (notice the missing 2' '-OH' group)


THE HUMAN GENOME PROJECT IN SHORT

The human genome - the full set of DNA in our cells - is incredibly complex, containing billions of DNA building blocks (called base pairs; about 3 billion of these are present). Unravelling this complexity was a major focus of research in molecular genetics. Back in 1990, some of the most prominent scientists from across the world joined forces to try and decode this complex molecule that forms the fundamentals of what we are, in what was called "The Human Genome Project". In the end, after 13 long years of research, they did manage to decode most of the genome, except for a few regions which were considered very difficult to read.

Even those regions were deciphered in 2022 by the "Telomere-to-Telomere" (T2T) consortium, a community-based effort which had the same aim as the Human Genome Project: to decipher the whole human genome. These projects revealed new gene sequences that allowed us to understand more about their roles in immunity and physical expression, while also revealing new details about genes such as GPRIN2, which is involved in cell signalling (signal transduction) and the growth of neurons, the cells which form our nerves. Thanks to these efforts, we gained an improved understanding of the human genome and the roughly 20,000 to 25,000 genes that code for proteins.

Currently, an expansion of the same work, called the 'human pangenome project', is underway, with a first draft released in 2023. It aims to understand the common sequences shared by humans of different races, ethnicities and ancestries, and to mark out the differences in the genome sequence.


CONCLUSION

This knowledge is already leading to exciting breakthroughs, like new treatments for genetic disorders and personalised cancer therapies tailored to a patient's unique genetic profile. While the details of DNA and protein synthesis may seem daunting at first, the core principles of molecular genetics are well within reach for anyone curious to learn more. By further demystifying these fundamental building blocks of life, we can all appreciate the elegant complexity of the world within each of our cells.

Keep imagining the potential applications of molecular genetics.

Till then, goodbye!


