NSF Proposal - 1. Project Summary

The tree of life is inherently fractal. Look closely at one lineage of a phylogeny and it
dissolves into many separate lineages, and so on down to a very fine scale. There is now a great
body of phylogenetic research that has provided numerous tools applicable at particular, usually
fairly constrained, scales. These tools have left many phylogenetic questions unanswered. We
think these questions will remain unanswered until it is possible to do analyses across multiple
scales.

We believe that the green plant lineage represents the most suitable system for such
research. It is one of the oldest and most diverse branches of the tree of life, and it contains good
examples of the known phylogenetic problems. Investigations of it can draw on a tradition of
interdisciplinary collaborative research, facilitated by the Green Plant Phylogeny Research
Coordination Group (GPPRCG or "Deep Green").

Once a better-resolved phylogeny is available, many interesting questions about the green
plants can be addressed, such as: How many times was land colonized from the water by
"green algae"? Where did the key adaptive features for life on land come from? How many times
has multicellularity arisen in the green plants? Did multicellularity ever reverse? How many times
did alternation of generations and diploid-dominant life cycles arise? How have the tempo and mode
of macroevolution changed during diversification?

One could take two different approaches to broad phylogenetic studies such as this,
either developing data sets with relatively few exemplars, but a very large number of comparable
characters, or data sets with many exemplars but a smaller set of comparable characters. Both
approaches have advantages, and both have their advocates. The two are not mutually
exclusive: the compartmentalization approach taken here uniquely allows both to be
followed. A backbone phylogeny will be developed with a global data set, and local
phylogenies with many more operational taxonomic units (OTUs), but fewer and different
characters, will then be connected to it.
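
The connection step can be illustrated with a minimal sketch. It assumes trees are stored as nested tuples with string leaves, and that each backbone leaf standing for a compartment is replaced by the local phylogeny of that compartment. The function names and all taxon labels (CladeA, A1, and so on) are hypothetical placeholders chosen for illustration; they are not taken from the proposal and do not represent its actual methods or data.

```python
# Illustrative sketch only: trees are nested tuples, leaves are strings.
# All names and labels below are hypothetical placeholders.

def graft(backbone, local_trees):
    """Return a copy of the backbone in which every leaf that names a
    compartment is replaced by that compartment's local phylogeny."""
    if isinstance(backbone, str):
        # A leaf: substitute the local tree if one exists, else keep the leaf.
        return local_trees.get(backbone, backbone)
    return tuple(graft(child, local_trees) for child in backbone)


def to_newick(tree):
    """Serialize a nested-tuple tree as a Newick string (topology only)."""
    if isinstance(tree, str):
        return tree
    return "(" + ",".join(to_newick(child) for child in tree) + ")"


# Backbone built from few exemplars, one leaf per compartment.
backbone = (("CladeA", "CladeB"), "CladeC")

# Local phylogenies with more OTUs, inferred separately within compartments.
local_trees = {
    "CladeA": (("A1", "A2"), "A3"),
    "CladeB": ("B1", "B2"),
}

combined = graft(backbone, local_trees)
print(to_newick(combined) + ";")
```

In this sketch, a compartment without a local phylogeny (CladeC here) simply remains a single leaf, so the backbone can be filled in incrementally as local studies become available.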

Our overall objective for the work proposed here is to resolve the primary pattern of
evolutionary diversification among green plants and establish a model for doing so that will be
applicable to other groups of organisms with long evolutionary histories. A solid backbone based
on genomic and ultrastructural data for relatively few taxa will enable the integration of previous
and ongoing studies of many more taxa into a comprehensive picture of green plant phylogeny.

To achieve this objective, we will:

* complete a matrix of whole genome sequences for chloroplasts and mitochondria and develop
Bacterial Artificial Chromosome (BAC) nuclear genome libraries (where feasible given genome
size) for ca. 50 representatives of the critical deep-branching lineages of green plants;

* produce a comprehensive set of comparable morphological and ultrastructural data for these
same taxa; and

* incorporate inferences from across the phylogenetic hierarchy in green plants using methods
designed to permit scaling across studies.

We shall indicate how this work will link to other research being conducted on green
plants at various scales, especially through the concatenation of our data sets with those of
other research groups. We shall
propose training, education, and outreach strategies by which the activities of our group, and the
progress and results of our research, will be distributed to the scientific community and beyond.