Which genomes have almost no introns
This question relates to a curious feature of how genetic information is organized in the DNA of many organisms. The sequence of bases that makes up DNA encodes a corresponding sequence of amino acids, which make up proteins.
Molecular biologists had at first assumed that in a gene, all the DNA coding for a protein would be continuous, and that is what they found when they first looked at the genes of prokaryotes (bacteria and other simple cells).
When researchers looked at more complex eukaryotic cells, however, they found that the encoding DNA is typically discontinuous: stretches of encoding DNA called exons are interspersed with long stretches of non-encoding DNA called introns. Although introns have sometimes been loosely called "junk DNA," the fact that they are so common and have been preserved during evolution leads many researchers to believe that they serve some function.
How big is it? That is usually the first question asked about an organism's genome. Over the past 60 years, scientists have estimated the genome sizes of more than 10, plants, animals, and fungi. However, while an organism's genome size might seem like a good starting point for understanding its genetic content, or "complexity," this measure often belies the tremendous complexity of the eukaryotic genome. As Van Straalen and Roelofs explain, "There is a remarkable lack of correspondence between genome size and organism complexity, especially among eukaryotes."
For example, the marbled lungfish, Protopterus aethiopicus, has more than 40 times as much DNA per cell as humans! Indeed, the marbled lungfish has among the largest recorded genomes of any eukaryote.
One haploid copy of this fish's genome is composed of well over 100 billion base pairs. (Genome size is usually measured in picograms [pg] and then converted to nucleotide number; one pg is equivalent to approximately 1 billion base pairs.) Genome size is therefore clearly not an indicator of the genomic or biological complexity of an organism. Otherwise, humans would have at least as much DNA as the marbled lungfish, and probably much more.
As further clarification, when scientists talk about the eukaryotic genome, they are usually referring to the haploid genome: the complete set of DNA in a single haploid nucleus, such as that of a sperm or egg.
So, saying that the human genome is approximately 3 billion base pairs (bp) long is the same as saying that each set of chromosomes is 3 billion bp long. In fact, each of our diploid cells contains twice that number of base pairs. Moreover, scientists are usually referring only to the DNA in a cell's nucleus, unless they state otherwise. All eukaryotic cells, however, also have mitochondrial genomes, and many additionally contain chloroplast genomes.
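The unit conversion and the haploid/diploid distinction described above can be sketched in a few lines of Python. This is only an illustration: the conversion factor is the article's rounded figure of about 1 billion base pairs per picogram, and the function names are made up for this example.

```python
# Rounded conversion quoted in the text: 1 picogram (pg) of DNA is
# approximately 1 billion base pairs. This is an approximation, not an exact constant.
BP_PER_PG = 1_000_000_000


def haploid_bp_from_pg(picograms: float) -> float:
    """Convert a haploid genome size in picograms to an approximate base-pair count."""
    return picograms * BP_PER_PG


def diploid_bp(haploid_bp: float) -> float:
    """A diploid cell carries two copies of the haploid genome."""
    return 2 * haploid_bp


# The human haploid genome is about 3 billion bp (figure from the text),
# so a diploid human cell holds about 6 billion bp of nuclear DNA.
human_haploid_bp = 3_000_000_000
print(diploid_bp(human_haploid_bp))  # 6000000000
```

Note that the mitochondrial and chloroplast genomes mentioned above are not included in such a calculation; it covers nuclear DNA only.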
In humans, the mitochondrial genome has only about 16, nucleotide base pairs, a mere fraction of the length of the 3 billion bp nuclear genome (Anderson et al.). Interestingly, the same "remarkable lack of correspondence" can be noted when discussing the relationship between the number of protein-coding genes and organism complexity.
Scientists estimate that the human genome, for example, has about 20, to 25, protein-coding genes. Before completion of the draft sequence of the Human Genome Project in , scientists made bets as to how many genes were in the human genome.
Most predictions were between about 30, and , Nobody expected a figure as low as 20,, especially when compared to the number of protein-coding genes in an organism like Trichomonas vaginalis. This tiny organism features the largest number of protein-coding genes of any eukaryotic genome sequenced to date: approximately 60, In fact, compared to almost any other organism, humans' 25, protein-coding genes do not seem like many. The fruit fly Drosophila melanogaster , for example, has an estimated 13, protein-coding genes.
Or consider the mustard plant Arabidopsis thaliana, the "fruit fly" of the plant world, which scientists use as a model organism for studying plant genetics. It would seem obvious that humans would have more protein-coding genes than plants, but that is not the case. These observations suggest that there is more to the genome than protein-coding genes alone. The number of protein-coding genes usually caps off at around 25, or so, even as genome size increases.
While the majority of emphasis has been placed on protein-coding genes in particular, scientists have continued to refine their definition of what exactly a gene is, partly in response to the realization that DNA encodes more than just proteins. Within this article, however, the discussion focuses on protein-coding genes, unless otherwise stated. While scientists have been measuring genome size for decades, they have only recently had the technological capacity and know-how to count genes.
To estimate the number of protein-coding genes in a genome, scientists often start by using what are known as gene-prediction programs: computational programs that align the sequence of interest with one or more known genome sequences. Other computer programs can predict gene location by looking for sequence characteristics of genes, such as open reading frames within exons and CpG islands within promoter regions.
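To make the two sequence signals mentioned above concrete, here is a minimal Python sketch. It is a toy illustration, not how any particular gene-prediction program works: `find_orfs` scans only the forward strand for an ATG followed in frame by a stop codon, and `cpg_obs_exp` computes the commonly used observed/expected CpG ratio, a heuristic that the article itself does not spell out. Both function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 3) -> list[tuple[int, int]]:
    """Return (start, end) indices of forward-strand open reading frames.

    An ORF here is an ATG followed, in the same reading frame, by the first
    stop codon. `end` is exclusive and includes the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    seq = seq.upper()
    c_count, g_count = seq.count("C"), seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c_count * g_count)


print(find_orfs("CCATGAAATTTTAGGG"))  # [(2, 14)]
print(cpg_obs_exp("CGCGCGCG"))        # 2.0
```

Real gene finders combine many such signals statistically and also handle the reverse strand, introns, and splice sites; this sketch only shows what "looking for sequence characteristics" can mean in practice.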
However, all of these computer programs only predict the presence of genes. Each prediction must then be experimentally validated, for example by using microarray hybridization to confirm that the predicted genes are represented in RNA (Yandell et al.). As Michael Brent, a professor of computer engineering at Washington University, explained in Nature Biotechnology, gene prediction has become much more accurate over the past several years (Brent). Its improved precision accounts for why estimates of the number of genes in the human genome have decreased from 45, about 10 years ago, to Venter et al.
In short, the older computational methods generated a lot of false positives, meaning that they predicted the presence of protein-coding genes that weren't actually there. As with genome size, having more protein-coding genes does not necessarily translate into greater complexity. This is because the eukaryotic genome has evolved other ways to generate biological complexity.
Much of this complexity derives from how the genome "behaves," or more precisely, how various genes are expressed. Alternative splicing was the first phenomenon scientists discovered that made them realize that genomic complexity cannot be judged by the number of protein-coding genes.
During alternative splicing, which occurs after transcription and before translation, introns are removed and exons are spliced together to make an mRNA molecule. However, the exons are not necessarily all spliced back together in the same way.
Thus, a single gene, or transcription unit, can code for multiple proteins or other gene products, depending on how the exons are spliced back together. In fact, scientists have estimated that there may be as many as , or more different human proteins, all coded by a mere 20, protein-coding genes.
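A small Python sketch can show how one transcription unit yields several mRNAs. It assumes a deliberately simplified model that is not taken from the article: the first and last exons are always kept, and every internal exon may be independently included or skipped. The function name and the example exon sequences are invented for illustration.

```python
from itertools import combinations


def exon_skipping_isoforms(exons: list[str]) -> list[str]:
    """Enumerate mRNAs produced by keeping or skipping each internal exon.

    The first and last exons are treated as always present; exon order
    is preserved in every isoform.
    """
    if len(exons) < 2:
        return ["".join(exons)]
    first, *internal, last = exons
    isoforms = []
    # Go from "keep every internal exon" down to "skip every internal exon".
    for kept_count in range(len(internal), -1, -1):
        for kept in combinations(internal, kept_count):
            isoforms.append(first + "".join(kept) + last)
    return isoforms


# Four hypothetical exons: two internal exons give 2**2 = 4 possible mRNAs.
for mrna in exon_skipping_isoforms(["AUG", "GCU", "CCA", "UAA"]):
    print(mrna)
# AUGGCUCCAUAA
# AUGGCUUAA
# AUGCCAUAA
# AUGUAA
```

Under this model the number of possible isoforms doubles with each optional exon, which gives a sense of how a modest number of genes can encode a much larger set of proteins. Real cells use only a subset of such combinations, and other splicing patterns exist beyond exon skipping.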
Scientists have since come across several other mechanisms that contribute to the eukaryotic genome's capacity to generate phenotypic complexity. These include RNA editing, trans-splicing, and tandem chimerism. RNA editing is the alteration of an mRNA molecule after transcription, for example the modification of a cytosine to a uracil before the mRNA molecule is translated into a protein.
The phenotypic consequences of RNA editing vary among genes and species. Trans-splicing is the splicing together of separate transcripts to form an mRNA molecule, as opposed to alternative splicing, which is the splicing together of exons from the same transcript. Tandem chimerism occurs when adjacent transcription units are transcribed together to form a single "chimeric" mRNA molecule (Parra et al.).
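The difference between RNA editing and trans-splicing can be sketched with two tiny string operations in Python. This is a toy model under stated assumptions: mRNAs are plain strings, the function names are hypothetical, and the sequences are invented rather than taken from any real gene.

```python
def edit_c_to_u(mrna: str, position: int) -> str:
    """Model C-to-U RNA editing at a single 0-based position of an mRNA."""
    if mrna[position] != "C":
        raise ValueError(f"expected C at position {position}, found {mrna[position]}")
    return mrna[:position] + "U" + mrna[position + 1:]


def trans_splice(exon_from_transcript_a: str, exon_from_transcript_b: str) -> str:
    """Model trans-splicing: join exons that come from two separate transcripts."""
    return exon_from_transcript_a + exon_from_transcript_b


# Editing the C in the codon CAA (glutamine) produces UAA, a stop codon,
# so a single-base edit can shorten the encoded protein.
print(edit_c_to_u("AUGCAAGGC", 3))   # AUGUAAGGC

# Two made-up exons from different transcripts joined into one mRNA.
print(trans_splice("AUGGCU", "CCAUAA"))  # AUGGCUCCAUAA
```

The point of the sketch is only that editing changes bases within one transcript, while trans-splicing combines material from two; neither mechanism changes the underlying DNA.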
Consider again those 60, protein-coding genes in Trichomonas vaginalis.