"From the genome to the tree of life"
Section 3: Background on Green Plant Genomics
The long-term goal of plant genomics is to identify, isolate, and determine the function of plant genes that are associated with both vegetative and reproductive phenotypes. Most phenotypes require the coordinated activity and regulatory control of suites of genes over time and in precise positions within the plant. Until recently, the idea of establishing a comprehensive approach to isolate and characterize all the genes involved in any complex phenotype was a daunting one. However, advances in genomics, informatics, and phylogenetics have brought such a prospect within reach. The nucleotide sequence of the Arabidopsis genome is nearing completion, the sequencing of rice has begun, and large amounts of expressed sequence tag (EST) information are being obtained for many other plants. There are many new opportunities to use this wealth of information to accelerate progress toward an understanding of the genetic mechanisms that control plant growth and development and responses to the biotic and abiotic environment.
Progress in green plant genomics.
One of the first eukaryotic genomes to be completely sequenced will be that of the small mustard species Arabidopsis thaliana. During the past decade, Arabidopsis has emerged as one of the most widely used model organisms for studying the biology of higher plants. Its genome was chosen for sequencing because it is highly compact, about 130 Mb, with little interspersed repetitive DNA. However, since Arabidopsis is rather distantly related to the cereal crops that provide the bulk of the world food supply, the genome of rice will also be sequenced during the next decade. Rice was chosen because, in addition to its importance as a food source for about one quarter of the human population, it has one of the most compact genomes among the cereals. It contains about 3.5 times as much DNA as Arabidopsis but only about 20% as much DNA as maize and about 3% as much DNA as wheat (Bennett and Smith, 1991). However, the genome organization of the cereals appears to be very highly conserved; rice, wheat, maize, sorghum, millet and other cereals exhibit a high degree of synteny (Gale and Devos, 1998). The differences in genome size primarily reflect the amplification of interspersed repetitive sequences (Bennetzen et al., 1998); there is no evidence that angiosperms with large amounts of DNA per cell have substantially greater numbers of functional genes than angiosperms with relatively small DNA contents. Because of extensive synteny among the cereal genomes, knowledge of gene order and organization in rice may be used to isolate and characterize the corresponding genes in the other cereals (McCouch, 1998). Thus, for instance, if a genetic locus encoding a useful trait is mapped between a pair of closely linked molecular markers in wheat, it may be possible to identify candidate genes for the rice ortholog by analyzing the rice genome sequence located between the rice orthologs of the molecular markers.
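The synteny-based lookup described above can be sketched in a few lines. This is only a minimal illustration, assuming the rice orthologs of the two wheat markers have already been placed on the rice sequence; the gene names, coordinates, marker positions, and the `genes_between_markers` helper are all hypothetical and are not taken from any real annotation.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int
    end: int


def genes_between_markers(genes: List[Gene], chrom: str,
                          marker_a_pos: int, marker_b_pos: int) -> List[str]:
    """Return names of genes lying entirely between two marker positions
    on one chromosome, ordered by start coordinate."""
    # Accept the markers in either order.
    left, right = sorted((marker_a_pos, marker_b_pos))
    hits = [g for g in genes
            if g.chrom == chrom and g.start >= left and g.end <= right]
    return [g.name for g in sorted(hits, key=lambda g: g.start)]


# Hypothetical toy annotation of a rice region (illustrative values only).
RICE_GENES = [
    Gene("geneA", "chr1", 1_000, 2_500),
    Gene("geneB", "chr1", 12_000, 14_000),
    Gene("geneC", "chr1", 15_500, 18_000),
    Gene("geneD", "chr1", 40_000, 42_000),
    Gene("geneE", "chr2", 13_000, 15_000),
]

# Assumed positions of the rice orthologs of the two flanking wheat markers.
candidates = genes_between_markers(RICE_GENES, "chr1", 30_000, 10_000)
print(candidates)
```

In this toy case only the two genes inside the marker interval on the same chromosome are returned as candidates. A real analysis would also have to allow for local rearrangements that interrupt synteny.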
Assigning function to genes.
One of the major efficiencies to emerge from plant genome research to date is that about 54% of Arabidopsis genes can be assigned some degree of function by comparison to the sequences of genes of known function (EU Arabidopsis Genome Project, 1998). In effect, a universal biology has coalesced from the common language of gene and protein sequences. Unfortunately, knowing the general function of a gene frequently does not provide insight into its specific role in the organism. For instance, on the basis of sequence analysis, about 13% of Arabidopsis genes are inferred to be involved in transcription or signal transduction. However, knowing that a gene encodes a kinase or transcription factor does not indicate which processes that gene controls. Thus, the completion of the genome sequences of Arabidopsis and rice will be followed by a second phase of large-scale functional genomics in which all of the approximately 20,000 to 25,000 genes that make up the basic angiosperm genome will be assigned function on the basis of experimental evidence. Considering that the combined efforts of the plant biology community have resulted in the direct functional analysis of only about 1000 genes to date (Rounsley, 1996), this may seem like a tall order. However, it seems likely that the efficiency gained by reverse genetics will fundamentally change this equation. Large collections of insertion mutants are available for Arabidopsis, maize, petunia, and snapdragon, and collections of insertion mutants will probably be created in several other species, including rice. These collections can be screened for an insertional inactivation of any gene by using the polymerase chain reaction (PCR) primed with oligonucleotides based on the sequences of the target gene and the insertional mutagen (Martienssen, 1998a). The presence of an insertion in the target gene is indicated by the presence of a PCR product.
By multiplexing DNA samples, hundreds of thousands of lines can be screened and the corresponding mutant plants identified with relatively small effort. In addition, several groups are embarking on the sequencing of the genomic DNA flanking large numbers of insertions so that an insertion in virtually any gene can be identified by a computer search (Bouchez and Höfte, 1998). Analysis of the phenotype and other properties of the corresponding mutant will frequently provide insight into the function of the gene.
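To illustrate why multiplexing reduces the screening effort, the sketch below simulates one possible pooling scheme. The text does not specify a pooling design, so the row-and-column grid used here is an assumption, as are the function names and the line numbering. A PCR reaction on a pool is modeled simply as "positive if any line in the pool carries an insertion in the target gene".

```python
from typing import Dict, List, Set, Tuple


def build_pools(n_rows: int, n_cols: int) -> Tuple[Dict[int, List[int]], Dict[int, List[int]]]:
    """Arrange lines 0..n_rows*n_cols-1 on a grid and pool DNA by row and by column."""
    rows = {r: [r * n_cols + c for c in range(n_cols)] for r in range(n_rows)}
    cols = {c: [r * n_cols + c for r in range(n_rows)] for c in range(n_cols)}
    return rows, cols


def pcr_positive(pool: List[int], lines_with_insertion: Set[int]) -> bool:
    """Model of a PCR on pooled DNA: a product appears if any pooled line has the insertion."""
    return any(line in lines_with_insertion for line in pool)


def screen(n_rows: int, n_cols: int, lines_with_insertion: Set[int]) -> Tuple[List[int], int]:
    """Return candidate lines (intersections of positive pools) and the number of reactions run."""
    rows, cols = build_pools(n_rows, n_cols)
    pos_rows = [r for r, pool in rows.items() if pcr_positive(pool, lines_with_insertion)]
    pos_cols = [c for c, pool in cols.items() if pcr_positive(pool, lines_with_insertion)]
    candidates = [r * n_cols + c for r in pos_rows for c in pos_cols]
    return candidates, n_rows + n_cols


# Hypothetical collection of 10,000 lines with one insertion in the target gene.
candidates, reactions = screen(100, 100, {4217})
print(candidates, reactions)
```

Under these assumptions, 200 pooled reactions stand in for 10,000 individual ones. When more than one line carries an insertion, the intersections include false candidates, so each candidate would still need to be confirmed individually.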
Impact of gene chips and microarrays.
One of the most important experimental approaches for discovering the function of genes promises to be gene chips and microarrays. In principle, DNA sequences representing all of the genes in an organism can be placed on miniature solid supports and used as hybridization substrates to quantitate the expression of all of the genes represented in a complex mRNA sample (Schena et al., 1996). Thus, we may expect to have extensive databases of quantitative information about the degree to which each gene responds to pathogens, pests, drought, cold, salt, photoperiod, and other environmental variation. Similarly, we will have extensive information about which genes respond to changes in developmental processes such as germination and flowering, or to phytohormones, growth regulators, safeners, herbicides, and related agrichemicals. Knowledge of which genes exhibit changes in expression in any mutant of interest will be useful for formulating hypotheses about the roles of the gene affected by the mutation (Holstege et al., 1998).
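As a rough illustration of how such expression data might be queried, the sketch below compares signal intensities from a mutant and a wild-type sample and reports the genes whose expression changes. The gene names and intensity values are hypothetical, and the twofold cutoff (an absolute log2 ratio of at least 1) is an assumed convention rather than anything prescribed by the text.

```python
import math
from typing import Dict


def log2_ratios(mutant: Dict[str, float], wild_type: Dict[str, float]) -> Dict[str, float]:
    """log2(mutant / wild type) for every gene measured in both samples with a positive signal."""
    return {
        gene: math.log2(mutant[gene] / wild_type[gene])
        for gene in mutant
        if gene in wild_type and mutant[gene] > 0 and wild_type[gene] > 0
    }


def changed_genes(mutant: Dict[str, float], wild_type: Dict[str, float],
                  min_abs_log2: float = 1.0) -> Dict[str, float]:
    """Genes whose expression differs by at least the given log2 threshold (assumed cutoff)."""
    ratios = log2_ratios(mutant, wild_type)
    return {gene: r for gene, r in ratios.items() if abs(r) >= min_abs_log2}


# Hypothetical hybridization intensities (arbitrary units).
wild_type = {"gene1": 100.0, "gene2": 200.0, "gene3": 50.0, "gene4": 80.0}
mutant = {"gene1": 400.0, "gene2": 210.0, "gene3": 12.5, "gene4": 80.0}

changed = changed_genes(mutant, wild_type)
for gene, ratio in sorted(changed.items()):
    print(gene, round(ratio, 2))
```

Genes flagged this way are only starting points for hypotheses. Real array data would also need normalization and replication before any ratio is trusted.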