Original NSF Proposal
"From the genome to the tree of life"

Section 2: Background on Green Plant Phylogenetics and Evolution

Available phylogenetic data.

Considerable morphological data bearing on the phylogenetic relationships of the green plants have accumulated over the last three decades (e.g., Stewart and Mattox, 1975; Hébant, 1977; Pickett-Heaps, 1979; Crandall-Stotler, 1980, 1981; Brown and Lemmon, 1988; Carothers and Rushing, 1988; Duckett and Renzaglia, 1988; Ligrone and Gambardella, 1988; Garbary and Renzaglia, 1998). Attempts have been made to synthesize this growing database cladistically (Mishler and Churchill, 1984, 1985; Sluiman, 1985; Theriot, 1988; Graham et al., 1991; Garbary et al., 1993; Kenrick and Crane, 1997).


Over the last decade, comparative molecular data have become available as well (e.g., Kantz et al., 1990; Zechman et al., 1990; Lewis et al., 1992; Mishler et al., 1992; Waters et al., 1992; Wilcox et al., 1992; Hedderson et al., 1996, 1998; Chapman et al., 1998; Duff and Nickrent, 1999; Qiu et al., 1998; Källersjö et al., 1998; P. Soltis et al., 1999a; Malek et al., 1996; Chaw et al., 2000). Molecular sequence data have shown considerable promise for phylogenetic analysis, but they provide no panacea (despite overly optimistic claims in the literature, e.g., Graur, 1993). In fact, theoretical considerations predict that DNA sequence characters (given their quasi-clocklike evolution and limited number of character states) could be especially problematic in "deep" phylogenetic reconstructions, where considerable asymmetry in branch lengths exists (Felsenstein, 1978; Mishler et al., 1988; Albert et al., 1992; Donoghue and Sanderson, 1992; Albert et al., 1993; Mishler, 1994). Thus, great caution is required for phylogenetic inference at these deep levels. Long branch attraction (Felsenstein, 1978) appears to have had a dramatic effect on some topologies obtained among and within the major green plant lineages, for both rbcL and 18S rDNA data (Manhart, 1994; Bopp and Capesius, 1995; Kranz et al., 1995). Long branch attraction places lineages that are not closely related next to one another in a topology because parallel changes have accumulated along their long branches (Felsenstein, 1978). Factors leading to branch length heterogeneity include evolutionary rate heterogeneity among lineages, differences in lineage age, and inadequate taxon sampling (whether of extant taxa or because of extinctions), such that nodes separating major lineages have a large number of substitutions supporting them.
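The long branch attraction effect described above can be illustrated with a minimal sketch, which is not part of the original proposal. It assumes the simplest setting in the spirit of Felsenstein (1978): four taxa, two-state characters, a symmetric change probability on each branch, and a true tree ((A,B),(C,D)) in which the branches leading to A and C are long. The function names, taxon labels, and numerical values are hypothetical and chosen only for illustration. The code enumerates all state assignments and sums the exact probability of each parsimony-informative site pattern; with enough data, parsimony favors the split whose pattern is most probable.

```python
from itertools import product


def pattern_probabilities(p_long, q_short):
    """Exact probabilities of the three informative site patterns for four taxa.

    Assumed true tree: ((A,B),(C,D)), with internal nodes X (joining A and B)
    and Y (joining C and D). Branches to A and C are long (change probability
    p_long); branches to B and D and the internal branch X-Y are short
    (change probability q_short). Characters have two states, 0 and 1.
    """
    def step(parent, child, change_prob):
        # Probability of the child state given the parent state on one branch.
        return change_prob if parent != child else 1.0 - change_prob

    probs = {"AB|CD": 0.0, "AC|BD": 0.0, "AD|BC": 0.0}
    for a, b, c, d, x, y in product((0, 1), repeat=6):
        pr = (0.5  # equal prior on the state at internal node X
              * step(x, a, p_long) * step(x, b, q_short)
              * step(x, y, q_short)
              * step(y, c, p_long) * step(y, d, q_short))
        if a == b and c == d and a != c:
            probs["AB|CD"] += pr   # pattern supporting the true split
        elif a == c and b == d and a != b:
            probs["AC|BD"] += pr   # pattern uniting the two long branches
        elif a == d and b == c and a != b:
            probs["AD|BC"] += pr
    return probs


def favored_split(p_long, q_short):
    """Return the split whose informative pattern is most probable."""
    probs = pattern_probabilities(p_long, q_short)
    return max(probs, key=probs.get)
```

Under these assumptions, favored_split(0.1, 0.05) returns the true split "AB|CD", whereas favored_split(0.4, 0.05) returns "AC|BD": the two long branches are grouped together even though A and C are not sister taxa on the assumed tree.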


These problems will be moderated by careful choice of appropriate characters for use at different levels, adequate taxon sampling, and application of proper methods of analysis. Careful evaluation of all potential characters is required; it is necessary to apply to molecular data basic principles of character analysis (for deriving strong, independent hypotheses of character homology) and cladistic analysis (for evaluating the phylogenetic "signal," if any, present in the resulting data set). Theoretical issues that must be faced in large-scale, synthetic analyses include further development of methods for: (1) combining/comparing data sets of fundamentally different natures (including issues of character and character-state weighting; Miyamoto, 1985; Kluge, 1989; Albert and Mishler, 1992; Albert et al., 1992; Donoghue and Sanderson, 1992; Albert et al., 1993); (2) assessing support for clades (e.g., bootstrap vs. the decay index; Mishler et al., 1991; Källersjö et al., 1992); and (3) representing diverse, yet clearly monophyletic, clades (e.g., the exemplar method vs. "compartmentalization" -- an approach involving substituting an inferred "archetype" or hypothetical ancestor for a clade accepted as monophyletic a priori in an inclusive analysis: Mishler, 1994; Mishler et al., 1998). One optimistic note is that it appears from our empirical experience that large data sets can be analyzed more easily than suspected (Soltis et al., 1998), which agrees with the simulation studies of Hillis (1996) and Graybeal (1998).

Current understanding of relationships.

Cladistic studies to date suggest that the green plants are composed of two major lineages and a residuum of unicellular micromonadophytes (Fig. 1). One of these major lineages contains the bulk of the classical green algae (Chlorophyceae, Pleurastrophyceae, and Ulvophyceae sensu Mattox and Stewart, 1984). Morphological and ultrastructural data indicate that the ulvophytes are basal to the chlorophytes plus pleurastrophytes (Stewart and Mattox, 1975; Mattox and Stewart, 1984; O'Kelly and Floyd, 1984; Sluiman, 1985; Zechman et al., 1990); these analyses also found the chlorophytes to be non-monophyletic. The pleurastrophytes have been treated in three ways: (1) as a separate sister class to the Chlorophyceae (Mattox and Stewart, 1984); (2) as part of the Chlorophyceae (Melkonian, 1990); or (3) as part of the Ulvophyceae (Sluiman, 1989). The other major lineage of green plants includes the charophycean green algae plus the land plants (i.e., bryophytes plus tracheophytes). In the morphological analysis of Graham et al. (1991), the genus Coleochaete (or even some part of it alone) appeared to be the closest extant sister group of land plants, although there are conflicting molecular results. Although various studies worldwide involving molecular or combined molecular and organismal data confirm that the sister group to the land plants is among the charophycean green algae, the specific sister group has still not been robustly determined (An et al., 1999; Bhattacharya and Medlin, 1998; Bhattacharya et al., 1994, 1996a, 1996b, 1998; Friedl, 1997; Graham, 1996; Huss and Kranz, 1997; Kranz et al., 1995; McCourt et al., 1995; McCourt et al., 1996a, 1996b; Melkonian and Surek, 1995).


Within the land plants, the bryophytes are composed of three distinctive lineages (i.e., liverworts, hornworts, mosses), whose relationships to the tracheophytes are controversial. Morphological data provide evidence that the bryophytes are paraphyletic (Crandall-Stotler, 1980; Mishler and Churchill, 1984, 1985; Sluiman, 1985; Kenrick and Crane, 1991; Mishler et al., 1994; but see Garbary et al., 1993), and support the liverworts as the most basal lineage of the three (Mishler and Churchill, 1984, 1985; Graham et al., 1992; Mishler et al., 1994). In addition to the morphological evidence, the oldest bryophyte fossils are liverworts from Devonian-aged sediments (Schuster, 1984; Stewart and Rothwell, 1993). Molecular data have been equivocal, supporting a number of conflicting branching orders, in part due to poor taxon sampling or limited sequence lengths (Mishler et al., 1992; Waters et al., 1992; Manhart, 1994; Mishler et al., 1994; Bopp and Capesius, 1995; Kranz et al., 1995; Renzaglia et al., in press; Lewis et al., 1997).


Within the tracheophytes, the lycophytes are sister to all other tracheophytes (e.g., Raubeson and Jansen, 1992; Kenrick and Crane, 1997), a result that has been supported by analyses of both morphological and DNA sequence data. A comprehensive analysis of morphological and molecular characters in basal tracheophytes (Pryer et al., unpubl.) produced the topology: [lycophytes [[[Psilotum + eusporangiate ferns] + [Equisetum + ferns]] + seed plants]], supporting the sister relationship of Psilotum and eusporangiate ferns reported previously (e.g., Manhart, 1994; Hasebe et al., 1995; Wolf, 1997; Wolf et al., 1998). Within the seed plants, morphological and molecular data generally provide conflicting topologies for the five lineages of extant seed plants (cycads, Ginkgo, conifers, Gnetales, and angiosperms). Most morphological analyses support the "anthophyte hypothesis," with Gnetales, angiosperms, and two groups of extinct seed plants, all with flowers or flower-like structures, sharing a common ancestor not shared with other seed plants (e.g., Crane, 1985; Doyle and Donoghue, 1986; Rothwell and Serbet, 1994; Nixon et al., 1994; reviewed in Doyle, 1996, 1998a, 1998b). Molecular data, in contrast, show a diversity of topologies. Analyses of rbcL (Manhart, 1994; Chase et al., 1993; but not Hasebe et al., 1992 or Källersjö et al., 1998) and nuclear LSU sequences (Stefanovic et al., 1998; Ross et al., 1999) support the anthophyte hypothesis, but a large number of studies support a monophyletic gymnosperm clade and a sister relationship between Gnetales and conifers (e.g., Chaw et al., 1997; Hedderson et al., 1996, 1998; Goremykin et al., 1996; Winter et al., 1999; Hansen et al., 1999; P. Soltis et al., 1999a) or even a sister relationship of Gnetales and Pinaceae (Chaw et al., 2000; Bowe et al., submitted)! Other studies have also reported gymnosperm monophyly (e.g., analyses of mitochondrial 19S rDNA by Duff and Nickrent, 1999; analyses of the chloroplast genes psaA and psbB by Sanderson et al., 2000), but relationships among lineages are not clear. Further work is clearly needed to resolve relationships among lineages of seed plants and to understand the basis of the conflict between morphological and molecular data.
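The bracket notation used above for the basal tracheophyte topology can be read as a nested data structure. The following is a minimal sketch, not part of the original proposal, that encodes that topology as nested Python tuples and recovers the clades and sister-group relationships it implies. The function names are hypothetical, and the taxon labels are copied from the bracket notation as written.

```python
# Nested tuples mirror the bracket notation:
# [lycophytes [[[Psilotum + eusporangiate ferns] + [Equisetum + ferns]] + seed plants]]
TOPOLOGY = (
    "lycophytes",
    (
        (("Psilotum", "eusporangiate ferns"), ("Equisetum", "ferns")),
        "seed plants",
    ),
)


def leaves(node):
    """Return the set of terminal taxa below a node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))


def clades(node):
    """Yield the leaf set of every internal node, starting at the root."""
    if isinstance(node, str):
        return
    yield leaves(node)
    for child in node:
        yield from clades(child)


def sister_of(node, target):
    """Return the leaf set of the sister group of `target`, or None.

    `target` is a frozenset of taxon names that must match one branch
    of a two-way split somewhere in the tree.
    """
    if isinstance(node, str):
        return None
    child_sets = [leaves(child) for child in node]
    if len(node) == 2 and target in child_sets:
        return child_sets[1 - child_sets.index(target)]
    for child in node:
        found = sister_of(child, target)
        if found is not None:
            return found
    return None
```

For example, sister_of(TOPOLOGY, frozenset({"Psilotum"})) returns the set containing only "eusporangiate ferns", and passing frozenset({"lycophytes"}) returns the five remaining taxa, matching the statement that lycophytes are sister to all other tracheophytes.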


Although the sister group of the angiosperms remains uncertain, the root and major phylogenetic structure within the angiosperms now seems clear. The first three branches of extant angiosperms are Amborella, Nymphaeales (water lilies), and a clade of Austrobaileya, Trimenia, and Illiciales (P. Soltis et al., 1999a; Qiu et al., 1999; Mathews and Donoghue, 1999; Parkinson et al., 1999; D. Soltis et al., 2000). A radiation of several clades of ancient angiosperms, including the monocots, follows these basal branches, and approximately 75% of all angiosperms fall into a single clade, the eudicots, which itself comprises three major clades and several smaller ones (P. Soltis et al., 1999b; D. Soltis et al., 2000). Despite some remaining areas of uncertainty, relationships within the angiosperms are now sufficiently well understood to permit interpretations of evolutionary history for certain genes, structures, and processes.

Fig. 1. A summation of the currently hypothesized cladistic relationships of green plants, along with an indication of places in the phylogeny where uncertainty is greatest (stippled).

Tentative evolutionary inferences.

The success of the Deep Green effort to date has in turn generated exciting new opportunities for both applied and basic research. The more robust parts of the current cladogram, though clearly in need of support from future studies sampling more species and more character systems (morphological as well as molecular), can serve as a framework for evolutionary interpretations. It appears reasonably well supported, for example, that multicellularity arose at least twice in the green plants. The diversification of life-history strategies is becoming clearer: from a primitively haplontic life cycle, alternation of generations and diploid-dominant life cycles each arose at least twice. The habitat transition in the movement of plants to land was from fresh water, not from salt water. Within the land plants, several morphological transformations can be reasonably postulated at present, such as the origin of branched, multisporangiate plants from unbranched, unisporangiate ones, and the radiation of types of conducting cells (Kenrick and Crane, 1991).

