version 3.6
DNAPARS -- DNA Parsimony Program
(C) Copyright 1986-2002 by The University of Washington. Written by
Joseph Felsenstein. Permission is granted to copy this document
provided that no fee is charged for it and that this copyright notice
is not removed.
This program carries out unrooted parsimony (analogous to Wagner
trees) (Eck and Dayhoff, 1966; Kluge and Farris, 1969) on DNA
sequences. The method of Fitch (1971) is used to count the number of
changes of base needed on a given tree. The assumptions of this method
are analogous to those of MIX:
1. Each site evolves independently.
2. Different lineages evolve independently.
3. The probability of a base substitution at a given site is small
over the lengths of time involved in a branch of the phylogeny.
4. The expected amounts of change in different branches of the
phylogeny do not vary by so much that two changes in a high-rate
branch are more probable than one change in a low-rate branch.
5. The expected amounts of change do not vary enough among sites that
two changes in one site are more probable than one change in
another.
That these are the assumptions of parsimony methods has been
documented in a series of papers of mine (1973a, 1978b, 1979, 1981b,
1983b, 1988b). For an opposing view, arguing that parsimony methods
make no substantive assumptions such as these, see the papers by
Farris (1983) and Sober (1983a, 1983b, 1988), but also read the
exchange between Felsenstein and Sober (1986).
Change from an occupied site to a deletion is counted as one change.
Reversion from a deletion to an occupied site is allowed and is also
counted as one change. Note that this in effect assumes that a
deletion N bases long is N separate events.
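The step counting described above can be sketched in a few lines. This
is only an illustration of Fitch (1971) counting on one fixed tree, not
the Dnapars source. It assumes that a deletion ("-") can be modeled as
a fifth state, so that a change to or from a deletion costs one step.
The function names are hypothetical. The tree is the one shown in the
sample output of this document, rooted on the Alpha branch purely for
convenience (the Fitch count does not depend on where the root is put).

```python
# Sketch of Fitch (1971) step counting on a fixed tree (not Dnapars code).
# Assumption: "-" (deletion) is simply a fifth state alongside A, C, G, U.

def fitch_site(node, states):
    """Return (state_set, steps) for one site on a rooted binary tree.

    node   -- a species name (tip) or a 2-tuple of subtrees (fork)
    states -- dict mapping species name to its state at this site
    """
    if isinstance(node, str):
        return {states[node]}, 0
    left_set, left_steps = fitch_site(node[0], states)
    right_set, right_steps = fitch_site(node[1], states)
    common = left_set & right_set
    if common:
        # The two subtrees can agree on a state: no extra step here.
        return common, left_steps + right_steps
    # No shared state: one change is needed at this fork.
    return left_set | right_set, left_steps + right_steps + 1


def fitch_steps(tree, seqs):
    """Return the list of step counts, one entry per site."""
    nsites = len(next(iter(seqs.values())))
    return [fitch_site(tree, {name: s[i] for name, s in seqs.items()})[1]
            for i in range(nsites)]


seqs = {
    "Alpha":   "AACGUGGCCAAAU",
    "Beta":    "AAGGUCGCCAAAC",
    "Gamma":   "CAUUUCGUCACAA",
    "Delta":   "GGUAUUUCGGCCU",
    "Epsilon": "GGGAUCUCGGCCC",
}
tree = ("Alpha", ("Beta", ("Gamma", ("Delta", "Epsilon"))))
steps = fitch_steps(tree, seqs)
print(steps)       # [2, 1, 3, 2, 0, 2, 1, 1, 1, 1, 1, 1, 3]
print(sum(steps))  # 19
```

The per-site counts and the total of 19 agree with the "steps in each
site" table and the "requires a total of 19.000" line in the sample
output.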
Dnapars can handle both bifurcating and multifurcating trees. In its
search for most parsimonious trees, it adds species not only by
creating new forks in the middle of existing branches but also by
putting them at the end of new branches added to existing forks. Thus
it searches among both bifurcating and multifurcating trees. If a tree
contains a branch along which no character could change in a most
parsimonious reconstruction, Dnapars does not save that tree. Thus in
any tree that results, a branch exists only if some character has a
most parsimonious reconstruction that would involve change in that
branch.
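The two kinds of placement described above can be enumerated with a
short sketch. It is an illustration under simplifying assumptions, not
the Dnapars search itself: trees are written as rooted nested tuples of
species names, whereas Dnapars works with unrooted trees, so some of
the trees listed here would be equivalent in the unrooted sense. The
function name placements is hypothetical.

```python
def placements(tree, new):
    """Yield every tree obtained by adding the tip `new` to `tree`.

    Trees are nested tuples of species names (a rooted simplification).
    """
    # (a) New fork in the middle of the branch leading to this subtree.
    yield (tree, new)
    if isinstance(tree, tuple):
        # (b) New branch attached to this existing fork, which makes
        #     the fork multifurcating.
        yield tree + (new,)
        # Repeat both moves inside each child and rebuild around it.
        for i, child in enumerate(tree):
            for sub in placements(child, new):
                yield tree[:i] + (sub,) + tree[i + 1:]


for candidate in placements(("A", "B"), "C"):
    print(candidate)
```

For the two-species tree ("A", "B") this prints four candidates: three
that create a new fork and one, ("A", "B", "C"), that adds a branch to
the existing fork.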
Dnapars also saves a number of trees tied for best (you can alter the
number it saves using the V option in the menu). When rearranging
trees, it tries rearrangements of all of the saved trees. This makes
the algorithm slower than in earlier versions of Dnapars.
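The bookkeeping for tied trees might look roughly like the sketch
below. It is an assumption about the general idea only, not a
description of how Dnapars stores its trees. The function name
save_tied_best and the (tree, steps) input format are hypothetical.
The default limit of 100 is taken from the V setting in the menu.

```python
def save_tied_best(scored_trees, max_saved=100):
    """Collect trees tied for the fewest steps, keeping at most max_saved.

    scored_trees -- iterable of (tree, steps) pairs
    """
    best_steps = None
    saved = []
    for tree, steps in scored_trees:
        if best_steps is None or steps < best_steps:
            # Strictly better tree: discard the old ties and start over.
            best_steps, saved = steps, [tree]
        elif steps == best_steps and len(saved) < max_saved and tree not in saved:
            # Tied with the best so far: keep it if there is room.
            saved.append(tree)
    return best_steps, saved


candidates = [("t1", 21), ("t2", 19), ("t3", 20), ("t4", 19)]
print(save_tied_best(candidates))               # (19, ['t2', 't4'])
print(save_tied_best(candidates, max_saved=1))  # (19, ['t2'])
```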
The input data format is standard. The first line of the input file
contains the number of species and the number of sites.
Next come the species data. Each sequence starts on a new line and
begins with a ten-character species name, which must be blank-filled
to that length, followed immediately by the species data in the
one-letter code. The sequences must be in either the "interleaved" or
the "sequential" format described in the Molecular Sequence Programs
document. The I option selects between them. The sequences can
contain internal blanks, but there must be no extra blanks at the end
of a line. Note that a blank is not a valid symbol for a deletion.
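A minimal reader for this layout is sketched below. It handles only
the simple case in which each species' data follow its name
(sequential format), possibly continued on later lines, and it is not
the parser used by Dnapars. The function name read_sequential is
hypothetical. Names are padded with ljust(10) when building the sample
so that the ten-character rule is visibly satisfied.

```python
def read_sequential(text):
    """Parse sequential-format input into (names, sequences)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    nspecies, nsites = (int(tok) for tok in lines[0].split()[:2])
    names, seqs = [], []
    pos = 1
    for _ in range(nspecies):
        line = lines[pos]
        pos += 1
        names.append(line[:10].rstrip())     # fixed-width, blank-filled name
        seq = line[10:].replace(" ", "")     # internal blanks are dropped
        while len(seq) < nsites:             # data continue on later lines
            seq += lines[pos].replace(" ", "")
            pos += 1
        if len(seq) != nsites:
            raise ValueError(f"{names[-1]}: expected {nsites} sites, got {len(seq)}")
        seqs.append(seq)
    return names, seqs


rows = [
    ("Alpha", "AACGUGGCCA AAU"),   # an internal blank is allowed
    ("Beta", "AAGGUCGCCAAAC"),
    ("Gamma", "CAUUUCGUCACAA"),
    ("Delta", "GGUAUUUCGGCCU"),
    ("Epsilon", "GGGAUCUCGGCCC"),
]
sample = "    5   13\n" + "\n".join(n.ljust(10) + s for n, s in rows) + "\n"
names, seqs = read_sequential(sample)
print(names)
print(seqs[0])
```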
The options are selected using an interactive menu. The menu looks
like this:
DNA parsimony algorithm, version 3.6a3
Setting for this run:
  U                 Search for best tree?  Yes
  S                        Search option?  More thorough search
  V              Number of trees to save?  100
  J   Randomize input order of sequences?  No. Use input order
  O                        Outgroup root?  No, use as outgroup species 1
  T              Use Threshold parsimony?  No, use ordinary parsimony
  N           Use Transversion parsimony?  No, count all steps
  W                       Sites weighted?  No
  M           Analyze multiple data sets?  No
  I          Input sequences interleaved?  Yes
  0   Terminal type (IBM PC, ANSI, none)?  (none)
  1    Print out the data at start of run  No
  2  Print indications of progress of run  Yes
  3                        Print out tree  Yes
  4          Print out steps in each site  No
  5  Print sequences at all nodes of tree  No
  6       Write out trees onto tree file?  Yes

  Y to accept these or type the letter for one to change
-
TEST DATA SET
    5   13
Alpha     AACGUGGCCAAAU
Beta      AAGGUCGCCAAAC
Gamma     CAUUUCGUCACAA
Delta     GGUAUUUCGGCCU
Epsilon   GGGAUCUCGGCCC
_________________________________________________________________
CONTENTS OF OUTPUT FILE (if all numerical options are on)
DNA parsimony algorithm, version 3.6a3
5 species, 13 sites
Name         Sequences
---          ---------
Alpha        AACGUGGCCA AAU
Beta         ..G..C.... ..C
Gamma        C.UU.C.U.. C.A
Delta        GGUA.UU.GG CC.
Epsilon      GGGA.CU.GG CCC
One most parsimonious tree found:
                                            +-----Epsilon
               +----------------------------3
  +------------2                            +-------Delta
  |            |
  |            +----------------Gamma
  |
  1----Beta
  |
  +---------Alpha
requires a total of 19.000
between      and       length
------       ---       ------
   1            2      0.217949
   2            3      0.487179
   3       Epsilon     0.096154
   3       Delta       0.134615
   2       Gamma       0.275641
   1       Beta        0.076923
   1       Alpha       0.173077
steps in each site:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0|       2   1   3   2   0   2   1   1   1
   10|   1   1   1   3
From    To        Any Steps?    State at upper node
                                ( . means same as in the node below it on tree)
        1                       AABGTCGCCA AAY
  1     2            yes        V.KD...... C..
  2     3            yes        GG.A..T.GG .C.
  3     Epsilon      maybe      ..G....... ..C
  3     Delta        yes        ..T..T.... ..T
  2     Gamma        yes        C.TT...T.. ..A
  1     Beta         maybe      ..G....... ..C
  1     Alpha        yes        ..C..G.... ..T
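The dot notation in the table above can be expanded mechanically, as
in the following sketch. It assumes that the letters other than A, C,
G and T in the node sequences are the standard IUPAC ambiguity codes
(for example Y for C or T). The function name expand_dots and the
IUPAC dictionary are illustrative helpers, not part of Dnapars.

```python
# Standard IUPAC nucleotide ambiguity codes (assumed to be what the
# letters such as B, Y, V, K and D in the table stand for).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_dots(below, upper):
    """Replace each '.' in `upper` with the symbol at the node below it."""
    below = below.replace(" ", "")
    upper = upper.replace(" ", "")
    return "".join(b if u == "." else u for b, u in zip(below, upper))


node1 = "AABGTCGCCA AAY"
node2 = expand_dots(node1, "V.KD...... C..")
print(node2)                          # VAKDTCGCCACAY
print([IUPAC[c] for c in node2[:4]])  # ['ACG', 'A', 'GT', 'AGT']
```

Applied to the first two rows of the table, this gives the full state
at node 2 and shows, for instance, that K at its third site stands
for G or T.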