Implied weighting
Command: piwe
Goloboff's function
TNT implements the implied weighting method of Goloboff (1993), using floating-point (=exact) fit calculations. The fit for characters is (by default) calculated as
f = k / ( e + k )
(where e = extra steps, and k = constant of concavity).
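As a quick illustration, the default function can be written out in a few lines of Python. This is only a sketch of the formula above, not TNT's own code; the name goloboff_fit and the input checks are assumptions made for the example.

```python
def goloboff_fit(extra_steps: float, k: float = 3.0) -> float:
    """Fit f = k / (e + k) for a character with e extra steps."""
    # Assumption for this sketch: k is positive and e is not negative.
    if k <= 0:
        raise ValueError("the concavity constant k must be positive")
    if extra_steps < 0:
        raise ValueError("the number of extra steps cannot be negative")
    return k / (extra_steps + k)


# Fit of a character with 0 to 3 extra steps under k = 3.
for e in range(4):
    print(e, round(goloboff_fit(e, k=3), 2))
```

Larger values of k make the function flatter, so characters with homoplasy are penalized less strongly.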
Example: setting implied weights with k = 3.
Path: Settings/Implied weighting
Make sure that the "using implied weights" option is checked.
Example: Asking ON/OFF status of implied weights.
Path: Settings
User defined functions
It is possible to define a user weighting function by specifying the values of the weights for different numbers of extra steps, relative to no extra steps. For this, the data must be read with implied weighting ON. The only requirement for user-defined weighting functions is that no (relative) weight is negative. Weighting functions where the reliability increases with the first extra steps, and then decreases, are (from the numerical point of view) perfectly valid.
To define a weighting function, type (or include in a file to be parsed) "piwe[" followed by as many numbers as extra steps you can have in your data set. The numbers are read as the relative costs of transforming from 0 to 1, 1 to 2, etc. If fewer relative weights are defined than the maximum possible number of extra steps, the undefined transformations are given a weight equal to that of the last transformation defined. If more transformations are defined than the maximum possible number of extra steps, the surplus values are simply ignored.
User-defined weighting functions are applicable only to discrete, non-Sankoff characters. When user-defined weights are in effect, the fit cannot be calculated (i.e. only the score as an increasing function of the homoplasy is reported). There is no menu option for this feature.
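The padding and truncation rules described here can be sketched in Python. The function names expand_user_costs and user_score are hypothetical, and treating the score of a character as the sum of the costs of its extra steps is an interpretation of the text, not a description of TNT's internals.

```python
def expand_user_costs(costs, max_extra_steps):
    """Return one relative cost per possible extra step (0->1, 1->2, ...)."""
    if not costs:
        raise ValueError("at least one relative cost is needed")
    if any(c < 0 for c in costs):
        raise ValueError("relative weights must not be negative")
    # Values beyond the maximum possible extra steps are ignored.
    used = list(costs[:max_extra_steps])
    # Undefined transformations repeat the last value that was defined.
    used += [costs[-1]] * (max_extra_steps - len(used))
    return used


def user_score(costs, extra_steps):
    """Assumed score of a character: sum of the costs of its extra steps."""
    return sum(expand_user_costs(costs, extra_steps))


# Three values defined, five extra steps possible: the last value is reused.
print(expand_user_costs([1.0, 0.6, 0.4], 5))
# Three values defined, two extra steps possible: the third value is ignored.
print(expand_user_costs([1.0, 0.6, 0.4], 2))
```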
Example: Setting implied weights for clique analysis.
Here is the manual calculation of the function equivalent to k = 3. The first four values of fit are:
e = 0; Fit = 3 / 3 = 1.00
e = 1; Fit = 3 / 4 = 0.75
e = 2; Fit = 3 / 5 = 0.60
e = 3; Fit = 3 / 6 = 0.50
The difference in fit produced by each extra step is:
dif. 0-1: 1.00 – 0.75 = 0.25
dif. 1-2: 0.75 – 0.60 = 0.15
dif. 2-3: 0.60 – 0.50 = 0.10
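The same arithmetic can be reproduced with a short Python sketch. The names fit, step_costs and relative are hypothetical, and dividing every difference by the first one is only an assumption about how the relative values are scaled.

```python
K = 3  # concavity constant used in the worked example


def fit(e, k=K):
    """Goloboff's fit k / (e + k) for e extra steps."""
    return k / (e + k)


def step_costs(max_steps, k=K):
    """Drop in fit caused by each successive extra step (0->1, 1->2, ...)."""
    return [fit(e, k) - fit(e + 1, k) for e in range(max_steps)]


costs = step_costs(3)
# Assumed scaling: each cost divided by the cost of the first extra step.
relative = [c / costs[0] for c in costs]

for e, (c, r) in enumerate(zip(costs, relative)):
    print(f"dif. {e}-{e + 1}: {c:.2f}  (relative: {r:.2f})")
```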
Example: Setting user function equivalent to k = 3.
Example: Asking the function values for each extra step (relative values).
Here is the visual result of these examples:
Note that, in the user-defined function, the value for 4 extra steps is identical to the value for 3.
Fit
During the searches, the program reports the score as an increasing function of the homoplasy (or "distortion" to be minimized), which is
1 – f = e / ( e + k ).
Using the command fit, TNT shows the distortion of the tree. Use fit* to see the values as they are calculated by Goloboff's function (i.e. as reported by Pee-Wee or PAUP*). fit* is not valid for user-defined functions! Here are the results for a hypothetical matrix, using Goloboff's function and the user function from the examples.
Example: Fit values under Goloboff's function.
Example: Fit values under user's function.
You may ask why the user-defined function produces a different value from k = 3 if the two functions are equivalent. This is because TNT uses the actual values of Goloboff's function, whereas for the user-defined function it uses scaled values. If you divide the fit value under k = 3 (2.40) by the value of a single extra step (0.25), the two results become equal.
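The rescaling can be checked with a small Python sketch. It assumes that the scaled user function takes the cost of the first extra step as its unit; the names distortion and scaled_score are hypothetical.

```python
def distortion(e, k=3):
    """Distortion 1 - f = e / (e + k) for one character."""
    return e / (e + k)


def scaled_score(e, k=3):
    """Distortion expressed in units of the first extra step (assumed scaling)."""
    return distortion(e, k) / distortion(1, k)


reported_k3 = 2.40                  # tree score quoted in the text for k = 3
first_step = distortion(1, k=3)     # 0.25, the value of a single extra step
print(round(reported_k3 / first_step, 2))   # 9.6
```

Because the scaling is a division by a constant, it can be applied to the total score of the tree as well as to each character separately.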
The fit for discrete additive characters (including those with character-state-trees) is calculated by decomposing the character into binary variables. Note that this may produce fits different from those of Pee-Wee or PAUP* (which do not do this conversion); the fits will always be measured as higher in TNT. It has been suggested (De Laet, 1997) that the recoding produces a more meaningful evaluation of the relative weights. This affects only data sets with discrete additive characters. If the same characters are defined as continuous, the fit is calculated without decomposing into binary variables (thus producing the same fit values that Pee-Wee or PAUP* produce for the additive coding); treating the additive characters as continuous obviously slows calculations.
Since floating-point calculations are used, it is very important that you evaluate group support when doing implied weighting. Since exact ties are very unlikely, this criterion may produce overresolved trees if poorly supported groups are not collapsed.
In the case of extra internal steps defined by the user, the same number of extra steps will be added for each of the variables into which the character is decomposed. If you want some of the variables to have different numbers of steps, you have to recode the character as binary yourself.
The method is applied to Sankoff characters as well, in which case the fit for a character with e extra steps and a minimum cost among states of c is calculated as
k / ( ( e/c ) + k )
When the costs are defined as all unity, the formula produces results identical to those of the standard formula used by Goloboff (1993). If the user has defined some characters as having extra internal steps, these are added to e.
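A minimal Python sketch of the Sankoff variant follows. The function name sankoff_fit and its parameter names are hypothetical; the code only restates the formula above, with any user-defined extra internal steps added to e.

```python
def sankoff_fit(extra_steps, min_cost, k=3.0, extra_internal_steps=0.0):
    """Fit k / ((e / c) + k) for a Sankoff character.

    e is the number of extra steps (plus any user-defined extra internal
    steps) and c is the minimum cost among states.
    """
    # Assumption for this sketch: the minimum cost is strictly positive.
    if min_cost <= 0:
        raise ValueError("the minimum cost c must be positive")
    e = extra_steps + extra_internal_steps
    return k / (e / min_cost + k)


# With all costs equal to 1 the result matches the standard formula.
print(sankoff_fit(2, min_cost=1, k=3))    # same as 3 / (2 + 3)
# Doubling all costs doubles e and c, so the fit is unchanged.
print(sankoff_fit(4, min_cost=2, k=3))
```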
Exploring the data
This is a simple script to explore the results under several k values. It assumes that the settings for the traditional searches (mult and bb commands) have already been provided by the user. It then loops over the k values (the increment is supplied by the user and must be an integer), searches for trees under each value, and stores the trees and their strict consensus. It also runs a search under equal weights.
look results from different k values
Author: Salva.
Version: 1.0.
For: not provided.
Arguments: not provided.
Example: not provided.
Script
Scripts may not display correctly here at present; use the script from the file provided in the archive.
macro = ;
var: minK maxK tempK kStep ;
/* optional arguments: minK maxK kStep (defaults: 1, 50, 1) */
if (argnumber < 3) set kStep 1 ; else set kStep %3 ; end
if (argnumber < 2) set maxK 50 ; else set maxK %2 ; end
if (argnumber < 1) set minK 1 ; else set minK %1 ; end
/* swap the limits if they were given in reverse order */
if ('minK' > 'maxK')
   set tempK 'maxK' ;
   set maxK 'minK' ;
   set minK 'tempK' ;
end
/* do not start from k = 0 */
if ('minK' == 0)
   set minK 'kStep' ;
end
/* search on every k value */
loop 'minK' + 'kStep' 'maxK'
   k 0;
   Piwe = #1 ;
   mult; bb;
   tsave* treesk#1..tre; save; ts/;
   nelsen*;
   tsave* consk#1..tre; save {strict}; ts/;
stop
/* equal weights */
piwe -;
mult; bb;
tsave* treesEW.tre; save; ts/;
nelsen*;
tsave* consEW.tre; save {strict}; ts/;
proc/;
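The argument handling of the script can be mirrored in Python to preview which k values a given call would visit. This is a sketch only: the function k_values is hypothetical, and it assumes that the TNT loop includes the upper limit.

```python
def k_values(min_k=1, max_k=50, k_step=1):
    """List the k values visited, following the script's defaults and checks."""
    if k_step <= 0:
        raise ValueError("the increment must be a positive integer")
    # Swap the limits if they were given in reverse order.
    if min_k > max_k:
        min_k, max_k = max_k, min_k
    # Never start from k = 0; start from the increment instead.
    if min_k == 0:
        min_k = k_step
    # Assumption: the loop also visits max_k when the increment lands on it.
    return list(range(min_k, max_k + 1, k_step))


print(k_values(0, 10, 5))   # limits 0 and 10, increment 5
print(k_values(10, 2, 4))   # limits given in reverse order
```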
References
De Laet (1997). "A reconsideration of three-item analysis, the use of implied weights in cladistics, and a practical application in Gentianaceae". PhD thesis, Katholieke Universiteit Leuven. Available at http://www.anagallis.be
Goloboff (1993). "Estimating character weights during tree search". Cladistics, 9: 83-91.







