
24.6 Genome Size and Gene Number
Learning Outcome
1. Explain why genome size and gene number do not correlate.
Genome size was a major factor in selecting which genomes
would be sequenced first. Practical considerations led to the
choice of organisms with relatively small genomes. Based on
genome size, the human genome was originally estimated to
contain 100,000 genes.
As sequence data were analyzed, the predicted number
of genes started to decrease. A very different picture emerged.
Our genome has only 25% of the 100,000 anticipated genes,
approximately the same number of genes as the tiny Arabidopsis
plant. Humans have nine times the amount of DNA found in
the 3.65 × 10⁸ bp pufferfish genome, but about the same number
of genes. Keep in mind that the number of genes may not
correspond to the number of proteins. For example, alternative
splicing (see chapter 16) can produce multiple, distinct tran-
scripts from a single gene.
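The comparison above can be made concrete with a little arithmetic. The following is a minimal sketch that uses only the figures quoted in the text (a 3.65 × 10⁸ bp pufferfish genome, a ninefold larger human genome, and 25% of the 100,000 anticipated genes). The variable and function names are illustrative, and treating the pufferfish gene count as equal to the human count is an assumption based on the phrase "about the same number."

```python
# Figures quoted in the text; all names here are illustrative only.
PUFFERFISH_BP = 3.65e8        # pufferfish genome size in base pairs
HUMAN_FOLD = 9                # humans have about nine times as much DNA
ANTICIPATED_GENES = 100_000   # original estimate for the human genome
FRACTION_FOUND = 0.25         # only about 25% of that estimate

human_bp = HUMAN_FOLD * PUFFERFISH_BP
human_genes = ANTICIPATED_GENES * FRACTION_FOUND

# Assumption: "about the same number of genes" is treated as exactly equal.
pufferfish_genes = human_genes


def genes_per_megabase(genes, genome_bp):
    """Return the average number of genes per million base pairs."""
    return genes / (genome_bp / 1e6)


human_density = genes_per_megabase(human_genes, human_bp)
pufferfish_density = genes_per_megabase(pufferfish_genes, PUFFERFISH_BP)

print(f"Human:      {human_bp:.3e} bp, {human_density:.1f} genes/Mb")
print(f"Pufferfish: {PUFFERFISH_BP:.3e} bp, {pufferfish_density:.1f} genes/Mb")
print(f"Density ratio: {pufferfish_density / human_density:.1f}")
```

Under these assumptions the human genome comes to roughly 3.3 × 10⁹ bp, and the pufferfish packs about nine times as many genes into each megabase. The gene count stays the same while the amount of DNA changes.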
Noncoding DNA inflates genome size
Why do humans have so much extra DNA? Much of it appears
to be in the form of introns, noncoding segments within a gene’s
sequence, that are substantially bigger than those in pufferfish.
The Fugu genome has only a handful of “giant” genes contain-
ing long introns; studying them should provide insight into the
evolutionary forces that have driven the change in genome size
during vertebrate evolution.
As described earlier, large expanses of retrotransposon
DNA contribute to the differences in genome size from one spe-
cies to another. Although part of the genome, ncDNA does not
contain genes in the usual sense. As another example, Drosophila
exhibits less ncDNA than Anopheles, although the evolutionary
force driving this reduction in noncoding regions is unclear. The
number of genes is not correlated with genome size.
Plants have widely varying genome size
Plants have an even greater range of genome sizes. As much as
a 200-fold difference has been found, yet all these plants weigh
in with about 30,000 to 59,000 genes. Tulips, for example, have
170 times more DNA than Arabidopsis.
Both rice and Arabidopsis have higher copy numbers for
gene families (multiple slightly divergent copies of a gene) than
are seen in animals or fungi, suggesting that these plants have
undergone numerous episodes of polyploidy, segmental dupli-
cation, or both during the 150 to 200 million years since rice
and Arabidopsis diverged from a common ancestor.
Whole-genome duplication is insufficient to explain the
size of some genomes. Wheat and rice are very closely related
and have similar gene content, and yet the wheat genome is
40 times larger than the rice genome. This difference cannot be
explained solely by the fact that bread wheat is a hexaploid (6n)
and rice is a diploid (2n).
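A rough calculation shows why ploidy alone falls short. This sketch uses only the figures given above (a 40-fold size difference, 6n versus 2n) and makes the simplifying assumption that each chromosome set contributes an equal share of DNA. The names are illustrative.

```python
# Figures quoted in the text; all names here are illustrative only.
WHEAT_TO_RICE_SIZE = 40   # wheat genome is about 40 times the rice genome
WHEAT_PLOIDY = 6          # bread wheat is hexaploid (6n)
RICE_PLOIDY = 2           # rice is diploid (2n)

# Assumption: every chromosome set carries the same amount of DNA, so
# ploidy alone predicts a size ratio equal to the ratio of set counts.
ploidy_fold = WHEAT_PLOIDY / RICE_PLOIDY

# Fold difference left over after accounting for ploidy.
unexplained_fold = WHEAT_TO_RICE_SIZE / ploidy_fold

print(f"Explained by ploidy: {ploidy_fold:.0f}-fold")
print(f"Left unexplained:    {unexplained_fold:.1f}-fold")
```

On this simple model, ploidy accounts for only a threefold difference. The remaining difference of roughly 13-fold must come from something else.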
Now that the rice genome is fully sequenced, attention has
shifted to sequencing the other cereal grains, especially maize
and wheat, both of which apparently contain lots of repetitive
DNA, which has increased their DNA content, but not neces-
sarily their gene content. Comparisons between the rice, maize,
and wheat genomes should provide clues about the genome of
their common ancestor and the dynamic evolutionary balance
between opposing forces that increase genome size (polyploidy,
transposable element proliferation, and gene duplication) and
those that decrease genome size (mutational loss).
Learning Outcome Review 24.6
Increases or decreases in genome size do not correlate with the number
of genes. Evidently DNA content is not the same as gene content.
Polyploidy in plants does not by itself explain differences in genome size.
Often a greater amount of DNA is explained by the presence of introns
and nonprotein-coding sequences than by gene duplicates.
■ How might a genome with a small number of genes and
a small number of total base pairs evolve into a genome
with the same small number of genes and a thousand-fold
larger genome?
have some function. The point is that the differences in the
“junk” DNA between mouse and human are too small to
have resulted from genetic drift.
The possibility that this DNA is rich in regulatory RNA
sequences is being actively investigated. RNAs that are not
translated can play several roles, including silencing other genes.
Small RNAs can form double-stranded RNA with complemen-
tary mRNA sequences, blocking translation. They can also par-
ticipate in the targeted degradation of RNAs. For details on
other possible functions of ncDNA, refer to chapter 16.
In one study, researchers collected almost all of the RNA
transcripts made by mouse cells taken from every tissue. Al-
though most of the transcripts coded for mouse proteins, as
many as 4280 could not be matched to any known mouse pro-
tein. This finding suggests that a large part of the transcribed
genome consists of genes that do not code for proteins—that
is, transcripts that function as RNA. Perhaps this function can
explain why a single retrotransposon can cause heritable differ-
ences in coat color in mice.
Learning Outcome Review 24.5
DNA that does not code for protein may regulate gene expression, often
through its RNA transcript. Nonprotein-coding sequences can be found in
retrotransposon-rich regions of the genome.
■ How would you determine whether RNA produced by a
nonprotein-coding gene has a regulatory function?