DNA is the Blueprint — Here’s How It Actually Works
Nearly every cell in your body carries the same DNA. Your skin cell and your brain cell have identical genetic information, yet they behave completely differently. The reason? Gene expression: which genes get read, and when. Molecular Basis of Inheritance is the chapter that explains the entire machinery behind this.
This chapter covers about 8-10 marks in NEET and is consistently tested in CBSE boards. The good news: once you understand the logic of the central dogma, everything else falls into place. The questions are predictable.
We will cover DNA structure, replication, transcription, translation, the lac operon, and the Human Genome Project. Each of these has appeared in NEET PYQs multiple times. NEET 2022, 2023, and 2024 all had direct questions from this chapter.
The key mental shift: stop memorising isolated facts. Instead, understand the flow of information — DNA → RNA → Protein. Every concept in this chapter is a step in that flow, or a mechanism that regulates it.
Key Terms and Definitions
Nucleotide — The monomer unit of DNA/RNA. Made of three parts: a pentose sugar (deoxyribose in DNA, ribose in RNA), a phosphate group, and a nitrogenous base.
Purine vs Pyrimidine — Purines (Adenine, Guanine) have double rings. Pyrimidines (Cytosine, Thymine, Uracil) have single rings. Easy memory trick: pure As Gold = Purines are A and G.
Chargaff’s Rule — In any double-stranded DNA, A = T and G = C. So if A is 30%, T is 30%, and G+C together make 40%. This appears as a calculation question very often.
Antiparallel strands — The two DNA strands run in opposite directions. One runs 5’→3’, the other 3’→5’. This matters enormously for replication and transcription.
Template strand — The strand RNA polymerase actually reads during transcription (in the 3’→5’ direction). Also called the antisense strand or non-coding strand.
Coding strand — The strand that has the same sequence as the mRNA (except T instead of U). Also called the sense strand.
Codon — A triplet of nucleotides on mRNA that codes for one amino acid. There are 64 codons total (4³).
Anticodon — The complementary triplet on tRNA that pairs with the codon.
Okazaki fragments — Short DNA fragments synthesized on the lagging strand during replication. Named after Reiji Okazaki.
Telomere — Repetitive DNA sequences (TTAGGG in humans) at chromosome ends. They protect chromosomes but shorten with each division — the “biological clock.”
Core Concepts — The Central Dogma
DNA Structure (Watson-Crick Model, 1953)
DNA is a double helix — two antiparallel polynucleotide chains wound around each other. The backbone is sugar-phosphate. The bases point inward and pair via hydrogen bonds:
A — T (2 hydrogen bonds)
G ≡ C (3 hydrogen bonds)
G-C pairs are stronger. More G-C content = higher melting temperature of DNA.
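The bond counts above can be turned into a quick check of why G-C rich DNA is harder to melt. This is a minimal sketch: the function names and the two 10-bp sequences are made up for illustration, and it only counts hydrogen bonds; it does not predict an actual melting temperature.

```python
def gc_content(seq):
    """Return the fraction of G and C bases in one strand (0.0 to 1.0)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def hydrogen_bonds(seq):
    """Count the hydrogen bonds in a duplex, given the sequence of one strand.

    Every A or T on this strand sits in an A-T pair (2 bonds);
    every G or C sits in a G-C pair (3 bonds).
    """
    seq = seq.upper()
    at_pairs = seq.count("A") + seq.count("T")
    gc_pairs = seq.count("G") + seq.count("C")
    return 2 * at_pairs + 3 * gc_pairs


at_rich = "ATATATATAT"  # hypothetical duplex with 0% G-C
gc_rich = "GCGCGCGCGC"  # hypothetical duplex with 100% G-C

print(gc_content(at_rich), hydrogen_bonds(at_rich))  # 0.0 20
print(gc_content(gc_rich), hydrogen_bonds(gc_rich))  # 1.0 30
```

Same length, but the G-C rich duplex is held by 30 bonds instead of 20, which is the intuition behind its higher melting temperature.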
The helix has a major groove and minor groove — this matters for protein-DNA interactions (tested in JEE Advanced occasionally).
Key measurements: one full turn = 10 base pairs, pitch = 3.4 nm, distance between adjacent base pairs = 0.34 nm.
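These three numbers are linked: 10 base pairs per turn times 0.34 nm per base pair gives the 3.4 nm pitch. A small sketch for length questions, using only the values quoted above (the function names are illustrative):

```python
RISE_PER_BP_NM = 0.34  # distance between adjacent base pairs, in nm
BP_PER_TURN = 10       # base pairs in one full turn of the helix


def helix_length_nm(base_pairs):
    """Length of a DNA double helix in nanometres."""
    return base_pairs * RISE_PER_BP_NM


def helix_turns(base_pairs):
    """Number of full helical turns in the molecule."""
    return base_pairs / BP_PER_TURN


pitch_nm = BP_PER_TURN * RISE_PER_BP_NM
print(f"pitch = {pitch_nm:.1f} nm")                         # pitch = 3.4 nm
print(f"600 bp = {helix_length_nm(600):.0f} nm")            # 600 bp = 204 nm
print(f"600 bp = {helix_turns(600):.0f} turns")             # 600 bp = 60 turns
```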
DNA Replication
Replication is semiconservative — proved by Meselson and Stahl’s experiment (1958) using ¹⁵N isotope labelling. Each daughter DNA has one original strand and one new strand.
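The band pattern in the Meselson-Stahl experiment follows directly from the semiconservative rule. The sketch below is a toy model, not the actual experiment: it assumes one starting duplex with two heavy (¹⁵N) strands, perfect replication in light (¹⁴N) medium, and it labels strands simply as "H" or "L".

```python
from collections import Counter


def replicate(duplexes):
    """Semiconservative replication: each old strand pairs with a new light strand."""
    daughters = []
    for strand_a, strand_b in duplexes:
        daughters.append((strand_a, "L"))
        daughters.append((strand_b, "L"))
    return daughters


def band_counts(duplexes):
    """Classify each duplex by how many heavy strands it contains."""
    names = {0: "light", 1: "hybrid", 2: "heavy"}
    return Counter(names[duplex.count("H")] for duplex in duplexes)


population = [("H", "H")]  # generation 0: fully heavy DNA
for generation in range(1, 4):
    population = replicate(population)
    print(generation, dict(band_counts(population)))
```

Generation 1 is entirely hybrid, generation 2 is half hybrid and half light, and from then on the two hybrid duplexes persist while the light band keeps growing. That is the centrifuge pattern to draw in the board answer.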
Key enzymes in replication:
| Enzyme | Function |
|---|---|
| Helicase | Unwinds the double helix at the replication fork |
| DNA Polymerase III | Main enzyme that adds nucleotides (5’→3’ direction only) |
| Primase | Synthesizes the RNA primer to give DNA pol a starting point |
| DNA Ligase | Joins Okazaki fragments on the lagging strand |
| Topoisomerase | Relieves tension ahead of the replication fork |
Students write “DNA polymerase unwinds DNA.” Wrong — that’s helicase. DNA polymerase only adds nucleotides and needs a primer to start. This distinction comes up in both NEET and CBSE 2-mark questions.
Leading strand vs Lagging strand:
- Leading strand: synthesized continuously in the 5’→3’ direction toward the replication fork.
- Lagging strand: synthesized in short Okazaki fragments (5’→3’) moving away from the fork — hence it appears to go “backward.”
Transcription
Transcription is the synthesis of RNA from a DNA template. In prokaryotes, it happens in the cytoplasm. In eukaryotes, it happens in the nucleus, and the RNA is processed before leaving.
Initiation: RNA polymerase binds to the promoter region (TATA box in eukaryotes).
Elongation: RNA pol moves 3’→5’ along the template strand, synthesizing RNA 5’→3’. It uses ribonucleotides, not deoxyribonucleotides.
Termination: In prokaryotes, a terminator sequence causes RNA pol to fall off. In eukaryotes, a poly-A signal triggers cleavage.
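The direction rule in elongation is easy to verify on paper or in code. A minimal sketch, assuming the template strand is given written 5’→3’ (the usual convention); the function name and the example sequence are illustrative only.

```python
# Base on the template strand -> ribonucleotide added to the growing RNA
RNA_PARTNER = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_template(template_5_to_3):
    """Return the RNA, written 5'->3', made from a template written 5'->3'."""
    # RNA polymerase reads the template 3'->5', so walk the string backwards.
    read_order = reversed(template_5_to_3.upper())
    return "".join(RNA_PARTNER[base] for base in read_order)


template = "TTCAAGCAT"  # hypothetical template strand, written 5'->3'
print(transcribe_template(template))  # AUGCUUGAA
```

The result is identical to the coding strand (5’-ATGCTTGAA-3’) with U in place of T, which is why the coding strand is the quick route to the mRNA sequence in exam questions.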
Post-transcriptional processing in eukaryotes (important for NEET):
- 5’ capping — methylated guanosine added to protect mRNA and aid ribosome binding.
- 3’ polyadenylation — poly-A tail (~200 adenine nucleotides) added for stability.
- Splicing — introns (non-coding) removed, exons (coding) joined by spliceosomes.
The unprocessed primary transcript is called hnRNA (heterogeneous nuclear RNA). Once it has been capped, tailed, and spliced, it is the mature mRNA that leaves the nucleus.
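Splicing and tailing can be pictured as simple string operations. This is only a sketch under stated assumptions: the hnRNA sequence and exon coordinates are invented, introns are written in lower case purely for readability, and the 5’ cap is left out because it is a modified nucleotide rather than part of the base sequence.

```python
def splice(hnrna, exon_coords):
    """Join the exon segments, given as (start, end) slices, in order."""
    return "".join(hnrna[start:end] for start, end in exon_coords)


def add_poly_a_tail(mrna, length=200):
    """Append a poly-A tail; 200 is the approximate length quoted above."""
    return mrna + "A" * length


# Hypothetical primary transcript: exon 1 + intron + exon 2
hnrna = "AUGCUU" + "guaaguuuucag" + "GAAUAA"
exon_coords = [(0, 6), (18, 24)]  # positions of the two exons in hnrna

spliced = splice(hnrna, exon_coords)
mature = add_poly_a_tail(spliced)

print(spliced)      # AUGCUUGAAUAA
print(len(mature))  # 212
```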
NEET 2023 asked about which RNA carries the genetic code (mRNA), which carries amino acids (tRNA), and which is a structural component of ribosomes (rRNA). Know the three RNA types and their functions cold.
Translation
Translation converts the mRNA code into a protein. It happens on ribosomes (80S in eukaryotes, 70S in prokaryotes — the S values are not additive because they measure sedimentation rate, not size).
Genetic Code properties — all are testable:
| Property | Meaning |
|---|---|
| Triplet | 3 nucleotides = 1 amino acid |
| Degenerate/Redundant | Multiple codons for same amino acid |
| Non-overlapping | Each nucleotide belongs to only one codon |
| Commaless | No gaps between codons |
| Universal | Same in almost all organisms |
| Non-ambiguous | One codon = only one amino acid |
Start codon: AUG (codes for methionine, the amino acid that begins every newly made polypeptide; bacteria use a formylated methionine).
Stop codons: UAA, UAG, UGA (called nonsense codons; they don’t code for any amino acid).
Remember stop codons as UAA (U Are Awesome), UAG (U Are Gone), UGA (U Go Away). Silly, but it works in an exam hall.
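Several of these properties can be checked by building the standard codon table and counting. The sketch below uses the conventional compact layout of the standard genetic code (bases in the order U, C, A, G; one-letter amino acid codes; `*` for stop). The variable names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# A dict maps each codon to exactly one amino acid: the code is non-ambiguous.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
sense_codons = [c for c, aa in CODON_TABLE.items() if aa != "*"]
codons_per_amino_acid = Counter(CODON_TABLE[c] for c in sense_codons)

print(len(CODON_TABLE))            # 64 (4 x 4 x 4)
print(stop_codons)                 # ['UAA', 'UAG', 'UGA']
print(len(sense_codons))           # 61
print(codons_per_amino_acid["L"])  # 6 -> leucine shows degeneracy
print(codons_per_amino_acid["M"])  # 1 -> methionine has only AUG
```

Degeneracy shows up as counts greater than 1; only methionine and tryptophan have a single codon each.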
Ribosome structure: Each ribosome has three sites — A site (aminoacyl — incoming tRNA), P site (peptidyl — growing chain), E site (exit — empty tRNA leaves here).
The lac Operon (Prokaryotic Gene Regulation)
The lac operon explains how E. coli controls lactose metabolism. It is a classic model of negative regulation — the default is “off,” and the repressor is removed to turn genes “on.”
Components:
- Structural genes: lacZ (β-galactosidase), lacY (permease), lacA (transacetylase)
- Operator: where the repressor binds to block transcription
- Promoter: where RNA polymerase binds
- Regulator gene (lacI): produces the repressor protein
When lactose is absent: Repressor binds to operator → RNA pol blocked → genes not expressed.
When lactose is present: Lactose (via allolactose) binds to repressor → repressor changes shape → falls off operator → RNA pol transcribes structural genes → enzymes produced to digest lactose.
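The two cases above reduce to a short chain of yes/no steps. A minimal sketch of that logic, ignoring glucose entirely; the function and variable names are made up for illustration.

```python
def lac_operon_transcribed(lactose_present):
    """Return True if the structural genes (lacZ, lacY, lacA) are transcribed."""
    # Allolactose, the inducer, is only available when lactose is present.
    inducer_bound_to_repressor = lactose_present
    # A repressor with inducer bound changes shape and leaves the operator.
    repressor_on_operator = not inducer_bound_to_repressor
    # RNA polymerase can proceed only when the operator is free.
    return not repressor_on_operator


print(lac_operon_transcribed(lactose_present=False))  # False
print(lac_operon_transcribed(lactose_present=True))   # True
```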
NEET has asked about the lac operon in 2019, 2021, and 2024. The most common question: what is the role of the operator? Answer: it is the binding site for the repressor protein. Do not confuse operator with promoter.
Human Genome Project (HGP)
HGP ran from 1990 to 2003. Goals: sequence all 3×10⁹ base pairs of human DNA and identify all ~25,000 genes.
Key outcomes:
- ~25,000 protein-coding genes (far fewer than expected — humans have about the same number as a roundworm has)
- ~99.9% of nucleotide sequence is identical across all humans
- Less than 2% of genome codes for proteins — the rest was called “junk DNA” (now understood to have regulatory roles)
- Average gene size: 3000 bases, largest gene is dystrophin at 2.4 million bases
DNA fingerprinting uses VNTRs (Variable Number Tandem Repeats) — repetitive non-coding sequences that differ between individuals. Used in forensics, paternity testing, identifying criminals.
Solved Examples
Example 1 — Easy (CBSE Level)
Q: If the sequence of the coding strand of DNA is 5’-ATGCTTGAA-3’, what is the sequence of the mRNA?
Solution: The coding strand has the same sequence as mRNA (except T → U).
mRNA: 5’-AUGCUUGAA-3’
This codes for Met-Leu-Glu: AUG = Met, CUU = Leu, GAA = Glu. The fragment contains no stop codon, so all three codons are translated.
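The same steps can be checked mechanically. This sketch builds the standard codon table in its usual compact form (bases in U, C, A, G order, one-letter amino acid codes, `*` for stop); the function names are illustrative, and the output uses one-letter codes (M = Met, L = Leu, E = Glu).

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def coding_strand_to_mrna(coding_strand):
    """mRNA has the coding strand's sequence with U in place of T."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Read codons from the first base until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: release the chain
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = coding_strand_to_mrna("ATGCTTGAA")
print(mrna)             # AUGCUUGAA
print(translate(mrna))  # MLE
```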
Example 2 — Medium (NEET Level)
Q: In a double-stranded DNA, 20% of bases are Adenine. What is the percentage of Guanine?
Solution: By Chargaff’s rule: A = T = 20%
So A + T = 40%
Remaining = G + C = 60%
Since G = C, Guanine = 30%
The most common error: students write G = 60%. They forget that G = C, so each is 30%. NEET has caught thousands of students with this exact calculation.
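For practice, the Chargaff calculation can be wrapped in a few lines. A minimal sketch that assumes double-stranded DNA (the rule does not hold for single strands); the function name is illustrative.

```python
def base_percentages(adenine_percent):
    """Apply Chargaff's rule to double-stranded DNA, given the percentage of A."""
    if not 0 <= adenine_percent <= 50:
        raise ValueError("In dsDNA, A cannot exceed 50% because A = T.")
    gc_total = 100 - 2 * adenine_percent  # what is left after A + T
    guanine_percent = gc_total / 2        # G = C, so split it equally
    return {
        "A": adenine_percent,
        "T": adenine_percent,
        "G": guanine_percent,
        "C": guanine_percent,
    }


print(base_percentages(20))  # {'A': 20, 'T': 20, 'G': 30.0, 'C': 30.0}
```

The halving step is exactly where the G = 60% mistake happens.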
Example 3 — Hard (NEET/JEE Advanced Level)
Q: A gene has 600 base pairs. How many amino acids does the protein have? How many codons are involved in coding?
Solution: Assume the whole 600 bp is coding sequence, from the start codon through the stop codon, with no introns or untranslated regions.
600 base pairs in DNA → 600 nucleotides in mRNA
Number of codons = 600 ÷ 3 = 200 codons
Of these, the start codon AUG codes for Met, so it counts as an amino acid. The stop codon codes for nothing.
So amino acids = 200 - 1 = 199 amino acids
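The arithmetic runs both ways: from gene length to protein length, and from protein length back to the minimum coding length. A small sketch under the same assumption as the solution, namely that the whole length is coding sequence from start codon through stop codon. The function names are illustrative.

```python
def amino_acids_encoded(coding_nucleotides):
    """Amino acids in the protein, for a coding region that includes the stop codon."""
    if coding_nucleotides % 3 != 0:
        raise ValueError("A coding region must be a whole number of codons.")
    codons = coding_nucleotides // 3
    return codons - 1  # the stop codon adds no amino acid


def min_coding_nucleotides(amino_acids):
    """Minimum mRNA coding length: one codon per amino acid plus one stop codon."""
    return (amino_acids + 1) * 3


print(amino_acids_encoded(600))     # 199
print(min_coding_nucleotides(199))  # 600
```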
Exam-Specific Tips
CBSE / ICSE Board Exam
CBSE asks 2-mark and 3-mark questions from this chapter. Common patterns:
- “Describe the experiment that proved semiconservative replication.” (Draw the Meselson-Stahl experiment with centrifuge bands.)
- “Differentiate between template strand and coding strand.”
- “What are introns and exons?”
For 3-mark answers, always structure as: Define → Explain → Example. Diagrams fetch full marks — draw the lac operon or the replication fork with labels.
NEET Strategy
In NEET 2023, 4 questions came from this single chapter. In NEET 2024, 3 questions. It is consistently a high-weightage chapter — 3-5 questions every year. Prioritise it above Reproduction chapters.
For NEET, focus on:
- Enzyme names and functions (helicase, ligase, primase, DNA pol III)
- Post-transcriptional modifications — 5’ cap, poly-A tail, splicing
- Genetic code properties — especially the terms “degenerate” and “universal”
- lac operon — what happens with/without lactose, what repressor does
- HGP numbers — 3×10⁹ bp, ~25,000 genes, 2% coding
Common Mistakes to Avoid
Mistake 1: Confusing template strand direction
DNA polymerase reads the template strand 3’→5’ but synthesizes the new strand 5’→3’. Students mix these up. The new strand always grows from its 5’ end toward its 3’ end; remember this and everything follows.
Mistake 2: Writing that RNA polymerase needs a primer
Only DNA polymerase needs a primer. RNA polymerase can start a new chain on its own. This is why transcription doesn’t need primase.
Mistake 3: Adding up ribosome S values
The eukaryotic 80S ribosome is made of a 60S and a 40S subunit, yet 60 + 40 ≠ 80. S values (Svedberg units) measure sedimentation rate, not mass, so they are not additive. In prokaryotes, the 70S ribosome is made of 50S and 30S subunits.
Mistake 4: Treating all 64 codons as coding for amino acids
61 codons code for amino acids. 3 are stop codons (UAA, UAG, UGA) that don’t code for anything. This matters in numerical problems about protein length.
Mistake 5: Saying the lac operon is “switched on by lactose directly”
Lactose doesn’t directly inactivate the repressor. First, lactose is converted to allolactose (by a few β-galactosidase molecules present at basal level), and allolactose is the actual inducer that binds the repressor. NEET 2021 tested this exact distinction.
Practice Questions
Q1 (CBSE, 1 mark): Name the enzyme that joins Okazaki fragments during DNA replication.
DNA Ligase. It forms phosphodiester bonds between the 3’-OH of one Okazaki fragment and the 5’-phosphate of the next, after the RNA primers are removed and replaced with DNA.
Q2 (CBSE, 2 marks): Distinguish between transcription in prokaryotes and eukaryotes.
| Feature | Prokaryotes | Eukaryotes |
|---|---|---|
| Location | Cytoplasm | Nucleus |
| mRNA processing | Not required | Required (capping, poly-A tail, splicing) |
| Coupling | Transcription + translation simultaneous | Sequential (transcription in nucleus, translation in cytoplasm) |
| Introns | Absent | Present |
Q3 (NEET level): In a DNA molecule, G+C content is 60%. If total number of nucleotides is 4000, how many adenine nucleotides are present?
G + C = 60%, so A + T = 40%
Since A = T, each = 20% of total nucleotides.
Adenine = 20% of 4000 = 800 nucleotides
Q4 (NEET level): Which property of the genetic code ensures that a single codon codes for only one amino acid?
Non-ambiguity. While the code is degenerate (one amino acid can have multiple codons), it is non-ambiguous — a given codon always codes for the same amino acid, never two different ones.
Q5 (CBSE, 3 marks): Explain the role of the following in DNA replication: (a) Helicase (b) Primase (c) DNA Polymerase III
(a) Helicase: Unwinds the double helix by breaking hydrogen bonds between base pairs at the replication fork. Creates the single-stranded template for DNA polymerase to work on.
(b) Primase: An RNA polymerase that synthesizes a short RNA primer (5-10 nucleotides). This primer provides the free 3’-OH group that DNA polymerase III needs to begin adding deoxyribonucleotides.
(c) DNA Polymerase III: The main replicating enzyme in prokaryotes. Reads the template 3’→5’ and synthesizes the new strand 5’→3’. It has proofreading ability (3’→5’ exonuclease activity) to remove mismatched bases.
Q6 (NEET level): In E. coli, when both glucose and lactose are present, what happens at the lac operon?
When glucose is present, cAMP levels are low (because glucose suppresses adenylyl cyclase). Low cAMP means CAP (Catabolite Activator Protein) cannot bind its site next to the promoter. Without CAP activation, RNA polymerase binds the promoter poorly even if the repressor is removed.
Result: lac operon is only weakly expressed despite lactose being present. E. coli preferentially uses glucose. This is called catabolite repression or the glucose effect.
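Combining the repressor with CAP gives a simple decision table for all four sugar combinations. This is a sketch only: the labels "off", "low", and "high" are qualitative shorthand chosen for illustration, not measured expression levels.

```python
def lac_expression(lactose_present, glucose_present):
    """Qualitative lac operon output: 'off', 'low', or 'high'."""
    # Negative control: without lactose, the repressor stays on the operator.
    repressor_on_operator = not lactose_present
    # Positive control: glucose keeps cAMP low, so CAP cannot help RNA polymerase.
    cap_active = not glucose_present

    if repressor_on_operator:
        return "off"
    return "high" if cap_active else "low"


for lactose in (False, True):
    for glucose in (False, True):
        print(f"lactose={lactose}, glucose={glucose}:", lac_expression(lactose, glucose))
```

Only the lactose-present, glucose-absent case gives strong expression, which matches the answer above.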
Q7 (CBSE/NEET): What is the significance of telomeres?
Telomeres are repetitive DNA sequences (TTAGGG in humans) at chromosomal ends. They:
- Protect chromosomes from degradation and end-to-end fusion
- Compensate for the inability of DNA pol to replicate the very end of linear chromosomes
- Shorten with each cell division (related to cellular aging)
Telomerase (an enzyme with RNA template) extends telomeres in germ cells and cancer cells — maintaining their “immortality.” Somatic cells lack adequate telomerase, so their telomeres shorten over time.
Q8 (NEET Hard): A protein has 150 amino acids. Calculate the minimum number of nucleotides required in the mRNA to code for this protein. Include the start and stop codons.
Total codons needed = 150 (for amino acids) + 1 (stop codon) = 151 codons
Note: The start codon AUG codes for the first methionine, which is already counted in the 150 amino acids.
Minimum nucleotides = 151 × 3 = 453 nucleotides
(This is the minimum coding sequence. In reality, mRNA is longer due to 5’ UTR, 3’ UTR, 5’ cap, and poly-A tail — but the question asks for minimum to code the protein.)
Frequently Asked Questions
What is the difference between replication, transcription, and translation?
Replication copies DNA → DNA (before cell division). Transcription copies DNA → RNA (makes mRNA from a gene). Translation reads mRNA → Protein (at the ribosome). These are the three steps of the central dogma.
Why is DNA replication called semiconservative?
Because each daughter molecule keeps (“conserves”) one original strand and makes one new strand. “Semi” because only half is conserved. Meselson and Stahl proved this in 1958 by tracking heavy nitrogen (¹⁵N) over generations — the classic experiment CBSE loves to ask about.
What is the central dogma of molecular biology?
Proposed by Francis Crick: DNA → RNA → Protein. Information flows from DNA to RNA (transcription) and from RNA to protein (translation). Reverse transcription (RNA → DNA, seen in retroviruses) is an exception. Protein information never flows back to nucleic acids.
Why do we need RNA primers for DNA replication but not for transcription?
DNA polymerase can only extend an existing strand — it cannot start a new chain from scratch. It needs a free 3’-OH group to add to. RNA polymerase can start new chains de novo (from scratch). So transcription needs no primer, but DNA replication does.
What are introns and exons, and why does eukaryotic DNA have them?
Exons are coding sequences that appear in mature mRNA. Introns are intervening non-coding sequences that are transcribed but then spliced out. The evolutionary benefit is debated — introns allow alternative splicing (one gene → multiple proteins) and may have served as “mobile genetic elements.” Prokaryotes generally lack introns.
How many chromosomes does a human cell have, and how much DNA?
46 chromosomes (23 pairs) in a somatic cell. Total DNA: ~3×10⁹ base pairs per haploid genome (6×10⁹ per diploid cell). If stretched end to end, DNA from one cell would be about 2 meters long — packed into a nucleus ~6 micrometers in diameter. This packaging is achieved through nucleosomes (DNA + histones).
What makes VNTR useful for DNA fingerprinting?
VNTRs (Variable Number Tandem Repeats) are non-coding sequences that repeat in tandem. The number of repeats varies from person to person and is inherited. When DNA is cut with restriction enzymes and separated by electrophoresis, each person shows a unique pattern of bands — a “fingerprint.” The probability of two unrelated people having the same VNTR pattern is less than 1 in 30 billion.
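The idea that people differ in repeat number can be sketched by counting tandem copies of a repeat unit at one locus. Everything specific here is hypothetical: the repeat unit, the flanking sequences, and the two "individuals". Real fingerprinting compares band patterns from many loci rather than reading sequences directly.

```python
def count_tandem_repeats(sequence, unit):
    """Return the longest run of back-to-back copies of `unit` in `sequence`."""
    longest = 0
    for start in range(len(sequence)):
        run = 0
        position = start
        while sequence.startswith(unit, position):
            run += 1
            position += len(unit)
        longest = max(longest, run)
    return longest


REPEAT_UNIT = "GATA"  # hypothetical repeat unit

# Same flanking DNA, different number of repeats at the locus
person_a = "CCT" + REPEAT_UNIT * 5 + "TTG"
person_b = "CCT" + REPEAT_UNIT * 8 + "TTG"

print(count_tandem_repeats(person_a, REPEAT_UNIT))  # 5
print(count_tandem_repeats(person_b, REPEAT_UNIT))  # 8
print(len(person_a), len(person_b))                 # 26 38
```

The difference in fragment length is what shows up as bands at different positions on the gel.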
Is the genetic code truly universal?
Almost. The genetic code is the same in bacteria, plants, animals, and fungi — evidence of common ancestry. But there are minor exceptions: in human mitochondria, UGA codes for tryptophan instead of acting as a stop codon, and AUA codes for methionine instead of isoleucine. These exceptions are rare and typically not tested at NEET level unless specifically asked.