What are the best parameters and algorithms to perform a search for datasets with 10, 50 and 100 species?

From TNT

For problems that require a non-exact (heuristic) search there is no such thing as the "best" parameters. However, some general considerations apply:

Any matrix with fewer than approximately 24 terminals can be analyzed with an exact search (Analyze/Implicit enumeration). In this case the parameters do not matter, because the search finds an exact solution. With more than 24 terminals you must use a heuristic search, and it is important to remember that you can never be certain you have found the most parsimonious trees: the result is only an (apparently) optimal set of trees.
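To give a feel for why exact searches stop being practical so quickly, the sketch below counts the distinct unrooted, fully bifurcating trees for a given number of terminals, using the standard formula (2n-5)!!. The function name is hypothetical and this is not part of TNT. Note also that implicit enumeration does not evaluate every tree, so the count describes the size of the search space, not the work TNT actually does.

```python
def unrooted_binary_trees(n_taxa: int) -> int:
    """Count unrooted, fully bifurcating trees for n_taxa terminals: (2n-5)!!"""
    if n_taxa < 3:
        return 1  # with fewer than 3 terminals there is only one topology
    count = 1
    # Multiply the odd factors 3, 5, ..., 2n-5.
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


if __name__ == "__main__":
    for n in (10, 24, 50, 100):
        print(f"{n:>3} terminals: {unrooted_binary_trees(n):.3e} possible trees")
```

The number grows faster than exponentially with the number of terminals, which is why the datasets with 50 and 100 species in the question can only be approached heuristically.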

For larger matrices of up to 50-70 taxa, optimal trees are generally found by using several starting points (Wagner trees built with random addition sequences) followed by branch swapping (TBR and SPR).

A possible search starts with 100 random addition sequences (RAS), each followed by TBR, keeping 10 trees per replication. Before starting the search, make sure there is space in memory for 1000 trees (100 RAS times a maximum of 10 trees per replication). This is changed in Settings/Memory or with the command hold. The commands to perform this search are:

  hold 1000 ;
  mult 100 =tbr ;
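The buffer-size arithmetic above can be sketched as a small helper. The function names are hypothetical, and the generated lines simply reproduce the command forms shown above with different numbers; no other TNT options are assumed.

```python
def tree_buffer_size(replicates: int, trees_per_replicate: int) -> int:
    """Worst case: every replicate keeps its maximum number of trees."""
    return replicates * trees_per_replicate


def search_commands(replicates: int, trees_per_replicate: int) -> list[str]:
    """Build the two command lines in the same form as the example above."""
    hold = tree_buffer_size(replicates, trees_per_replicate)
    return [f"hold {hold} ;", f"mult {replicates} =tbr ;"]


if __name__ == "__main__":
    for line in search_commands(100, 10):
        print(line)
```

Sizing the buffer for the worst case means the search cannot run out of tree space partway through, at the cost of reserving memory that may not be used.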

Searching complex matrices of this size or larger (up to 100 taxa) is done more efficiently with additional heuristics, for example Ratchet or Tree drifting. These algorithms accept suboptimal trees during the search, which may eventually lead to better trees; this addresses the problem of islands of trees (sensu Maddison 1991). To activate Ratchet or Tree drifting, go to Analyze/New Technology search.

A possible search might be:

  hold 1000 ;
  mult 30 =tbr drift ;

In this example TNT starts from 30 RAS, each followed by TBR and then Tree drifting. Tree drifting options can be changed with the command drift.
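The idea of accepting suboptimal solutions to escape an island can be illustrated with a toy example. This is only a sketch and not TNT's implementation: the "landscape" is a made-up list of scores (lower is better, like tree length), neighbours are adjacent positions rather than rearranged trees, and worse moves are accepted with a fixed probability, which is a simplification assumed here for brevity.

```python
import random

# Hypothetical scores: a local optimum at index 2 and a better one at index 7.
SCORES = [9, 7, 5, 6, 8, 6, 4, 2, 3, 5]


def neighbours(i: int) -> list[int]:
    """Adjacent positions, standing in for trees one rearrangement away."""
    return [j for j in (i - 1, i + 1) if 0 <= j < len(SCORES)]


def strict_descent(start: int) -> int:
    """Move only to strictly better neighbours; stops at the first optimum."""
    current = start
    while True:
        best = min(neighbours(current), key=SCORES.__getitem__)
        if SCORES[best] >= SCORES[current]:
            return current
        current = best


def drifting_search(start: int, steps: int, accept_worse: float,
                    rng: random.Random) -> int:
    """Sometimes accept a worse neighbour; return the best position visited."""
    current = best = start
    for _ in range(steps):
        candidate = rng.choice(neighbours(current))
        if SCORES[candidate] <= SCORES[current] or rng.random() < accept_worse:
            current = candidate
        if SCORES[current] < SCORES[best]:
            best = current
    return best


if __name__ == "__main__":
    local = strict_descent(0)
    drifted = drifting_search(local, 200, 0.3, random.Random(1))
    print("strict descent score:", SCORES[local])
    print("after drifting score:", SCORES[drifted])
```

Strict descent from position 0 stops at the local optimum with score 5, because both of its neighbours are worse. The drifting variant starts from that result, so it can never report a worse score, and it can reach the better optimum by passing through worse positions.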
