Sequences of variable length are frequently used in phylogenetic analyses, and they often contain problematic regions of high variability. Several methods are available for aligning sequences, and these differ greatly in methodology and underlying philosophy. We will explore whether this fundamental step in phylogeny building is being adequately handled and reported in the literature, and discuss the available options and the limitations of the methods. The evening's discussion topic, “Alignment of Sequence Data,” will be addressed by a panel of individuals from UCB, UCD, CAS and CDFA, moderated by Kip Will (UCB).

Sections of the panel will introduce and lead discussion on the following points:

1. Summary data on what methods are actually being reported in the literature.

2. Does base homology actually matter or is it really just phylogeny we are after?

3. What are the limitations, assumptions and implications of commonly used alignment procedures, specifically the various “automated” methods, e.g. Clustal?

4. What are the limitations, assumptions and implications of by-eye and secondary-structure-guided methods?

5. What are the limitations, assumptions and implications of non-alignment or implied-alignment methods, i.e. POY, and specifically of likelihood-based implied-alignment procedures?

6. How can/should we deal with difficult data, e.g. should we throw out "un-alignable" data, keep all data, or do something in between? What methods/criteria do we use to decide what stays?

7. Is there a problem? Are our alignment methods sufficient, and/or do they have no real impact on the resulting phylogenetic hypotheses?

 

Here is a list of useful background readings:

**Multiple sequence alignment methods/programs:

1. Notredame C. 2002. Recent progress in multiple sequence alignment: a survey. Pharmacogenomics, 3:131-144.

2. Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucl. Acids Res. 22: 4673-4680.

**Secondary structure:

3. Kjer, K. 2004. Aligned 18S and insect phylogeny. Systematic Biology 53(3).

**Data exclusion:

4. Gatesy, J., R. DeSalle, and W. Wheeler. 1993. Alignment-ambiguous nucleotide sites and the exclusion of systematic data. Molecular Phylogenetics and Evolution 6:649.

**Implied-Alignment (aka Dynamic-homology):

5. Wheeler, W.C. 2003. Implied alignment: a synapomorphy-based multiple-sequence alignment method and its use in cladogram search. Cladistics 19:261-268.

**Likelihood-based alignment:

6. Thorne, J. L., H. Kishino, and J. Felsenstein. 1991. An evolutionary model for maximum likelihood alignment of DNA sequences. Journal of Molecular Evolution 33(2): 114-124.

**Homology:

7. Titus, T. and D. Frost. 1996. Molecular homology assessment and phylogeny in the lizard family Opluridae (Squamata: Iguania). Molecular Phylogenetics and Evolution 6(1): 49-62.

**Sensitivity analysis:

8. Wheeler, W. C. 1995. Sequence alignment, parameter sensitivity, and the phylogenetic analysis of molecular data. Systematic Biology 44: 321-331.

9. Schulmeister, S., W. Wheeler, and J. Carpenter. 2002. Simultaneous analysis of the basal lineages of Hymenoptera (Insecta) using sensitivity analysis. Cladistics 18: 455-484.