Large-analyses
The definition of "large" in phylogenetic analysis is constantly changing. In 1993, Zilla (with 500 terminals and 800 characters) was considered a huge analysis; now it is possible to run analyses orders of magnitude larger (73,000 terminals and 10,000 characters; Goloboff et al., 2009). The basic algorithms used for such matrices are, with some modifications, almost the same as those needed for matrices of 200 or 300 taxa. This section therefore introduces the algorithms and methods for dealing with large matrices, from about 200 to 100,000 taxa.
The main characteristic of a large analysis is the presence of many islands of most parsimonious trees. In these cases the effort should be directed at reaching many of these islands rather than at collecting many trees from one or a few islands. Because trees from different islands are generally more dissimilar than trees from the same island, the consensus of a few trees from different islands will be closer to the correct consensus (the one that would be obtained if all most parsimonious trees were found) than the consensus of many trees from one or two islands. Advice for TNT users: use a high number of starting points while keeping only a few trees per replicate (3-4, or even fewer for very large matrices).
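The effect of sampling islands on the consensus can be illustrated with a small Python sketch. It is a toy example, not part of TNT: each tree is represented only as a set of clades, the six taxa and the two "islands" are hypothetical, and the strict consensus is taken simply as the clades shared by every tree.

```python
def clade(*taxa):
    """A clade is an unordered group of terminal names."""
    return frozenset(taxa)

def strict_consensus(trees):
    """Return the clades that are present in every tree of the list."""
    return set.intersection(*map(set, trees))

# Hypothetical most parsimonious trees for taxa A-F, grouped by island.
island_1 = [
    {clade("A", "B"), clade("A", "B", "C"), clade("D", "E")},
    {clade("A", "B"), clade("A", "B", "C"), clade("E", "F")},
]
island_2 = [
    {clade("B", "C"), clade("A", "B", "C"), clade("E", "F")},
]

# Consensus of all optimal trees (the "correct" consensus).
true_consensus = strict_consensus(island_1 + island_2)

# Many trees from a single island versus one tree from each island.
many_from_one = strict_consensus(island_1)
few_from_each = strict_consensus([island_1[0], island_2[0]])

# Clades that each sample supports but the full set of trees does not.
spurious_one_island = many_from_one - true_consensus
spurious_two_islands = few_from_each - true_consensus
```

In this made-up case the sample drawn from one island keeps the group A+B, which the full set of trees does not support, while one tree per island already recovers the correct consensus.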
The presence of islands also makes it necessary to use algorithms that can move a search out of the island it is currently in. For this reason Tree Drifting and/or the Ratchet should be used.
Another issue to consider is what Goloboff called "composite optima". For trees with hundreds or thousands of terminals it is very likely that, during swapping, one part of the tree is in an optimal configuration while the rest is not. An alternative tree may have a different sector in an optimal configuration. This is why Sectorial Searches and Tree Fusing are extremely important in the analysis of such large matrices.
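The idea behind combining trees with different optimal sectors can be sketched in Python. This is a deliberately simplified toy, not TNT's algorithm: it assumes that both trees are divided into the same sectors (groups with identical terminals) and that the number of steps in each sector is independent of the rest of the tree, which real data only approximate. The sector names, configuration labels and step counts are all hypothetical.

```python
def fuse(tree_a, tree_b):
    """For every shared sector keep the configuration with fewer steps."""
    if tree_a.keys() != tree_b.keys():
        raise ValueError("both trees must be split into the same sectors")
    return {
        sector: min(tree_a[sector], tree_b[sector], key=lambda cfg: cfg[1])
        for sector in tree_a
    }

def length(tree):
    """Total tree length under the toy assumption of additive sectors."""
    return sum(steps for _label, steps in tree.values())

# Each sector maps to (configuration label, steps); values are made up.
tree_a = {"sector_1": ("a1", 120), "sector_2": ("a2", 310)}  # optimal in sector 1
tree_b = {"sector_1": ("b1", 125), "sector_2": ("b2", 300)}  # optimal in sector 2

fused = fuse(tree_a, tree_b)
lengths = (length(tree_a), length(tree_b), length(fused))
```

Here neither input tree is optimal everywhere, but the fused tree takes the better sector from each and ends up shorter than both. In a real search the length of every candidate exchange has to be evaluated on the whole tree rather than summed from precomputed sector scores.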

