dendropy.calculate.treecompare: Distances and Comparison Between Trees

Statistics, metrics, measurements, and values calculated between two trees.

dendropy.calculate.treecompare.euclidean_distance(tree1, tree2, edge_weight_attr='length', value_type=float, is_bipartitions_updated=False)

Returns the Euclidean distance (a.k.a. Felsenstein’s 2004 “branch length distance”) between two trees based on edge_weight_attr.

Both trees must share the same TaxonNamespace reference. In addition, either the bipartition bitmasks of the trees must already be correct for the current tree structures (for example, because Tree.encode_bipartitions() has been called), or is_bipartitions_updated must be False (the default) to force recalculation of the bipartitions.

Parameters:
  • tree1 (Tree object) – The first tree of the two trees being compared. This must share the same TaxonNamespace reference as tree2 and must have bipartitions encoded.
  • tree2 (Tree object) – The second tree of the two trees being compared. This must share the same TaxonNamespace reference as tree1 and must have bipartitions encoded.
  • edge_weight_attr (string) – Name of attribute on edges of trees to be used as the weight.
  • is_bipartitions_updated (bool) – If False (default), then the bipartitions on both trees will be updated before comparison. If True, then the bipartitions will only be calculated for a Tree object if they have not been calculated before, either explicitly or implicitly.
Returns:

d (float) – The Euclidean distance between tree1 and tree2.
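To make the quantity concrete, here is a minimal pure-Python sketch of a branch length distance of this kind. It is not DendroPy's implementation. It assumes each tree has been reduced to a dictionary mapping a bipartition to its edge length, with a bipartition written as a frozenset of leaf labels for readability (DendroPy itself works with bitmasks), and it assumes that a bipartition absent from one tree counts as length 0 there. The function name and the toy data are hypothetical.

```python
import math


def euclidean_distance_sketch(lengths1, lengths2):
    """Square root of the summed squared edge-length differences.

    Each argument maps a bipartition (here a frozenset of leaf labels)
    to the length of the edge that induces it. A bipartition missing
    from one tree is treated as having length 0 in that tree.
    """
    all_splits = set(lengths1) | set(lengths2)
    total = 0.0
    for split in all_splits:
        diff = lengths1.get(split, 0.0) - lengths2.get(split, 0.0)
        total += diff * diff
    return math.sqrt(total)


# Hypothetical toy data for five taxa A-E; only internal edges are listed.
tree1_lengths = {frozenset("AB"): 1.0, frozenset("DE"): 2.0}
tree2_lengths = {frozenset("AB"): 3.0, frozenset("CE"): 1.0}

distance = euclidean_distance_sketch(tree1_lengths, tree2_lengths)
print(distance)  # 3.0, since (1-3)^2 + (2-0)^2 + (0-1)^2 = 9
```

In a real comparison the terminal (leaf) edges would also enter the sum in the same way; they are left out of the toy data only to keep it short.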

Examples

import dendropy
from dendropy.calculate import treecompare
tns = dendropy.TaxonNamespace()
# Read both trees into the same TaxonNamespace so that they are comparable.
tree1 = dendropy.Tree.get(
        path="t1.nex",
        schema="nexus",
        taxon_namespace=tns)
tree2 = dendropy.Tree.get(
        path="t2.nex",
        schema="nexus",
        taxon_namespace=tns)
tree1.encode_bipartitions()
tree2.encode_bipartitions()
print(treecompare.euclidean_distance(tree1, tree2))

dendropy.calculate.treecompare.false_positives_and_negatives(reference_tree, comparison_tree, is_bipartitions_updated=False)

Counts and returns the number of false positive bipartitions (bipartitions found in comparison_tree but not in reference_tree) and false negative bipartitions (bipartitions found in reference_tree but not in comparison_tree).

Both trees must share the same TaxonNamespace reference. In addition, either the bipartition bitmasks of the trees must already be correct for the current tree structures (for example, because Tree.encode_bipartitions() has been called), or is_bipartitions_updated must be False (the default) to force recalculation of the bipartitions.

Parameters:
  • reference_tree (Tree object) – The first tree of the two trees being compared. This must share the same TaxonNamespace reference as comparison_tree and must have bipartitions encoded.
  • comparison_tree (Tree object) – The second tree of the two trees being compared. This must share the same TaxonNamespace reference as reference_tree and must have bipartitions encoded.
  • is_bipartitions_updated (bool) – If False (default), then the bipartitions on both trees will be updated before comparison. If True, then the bipartitions will only be calculated for a Tree object if they have not been calculated before, either explicitly or implicitly.
Returns:

t (tuple(int)) – A pair of integers, with first integer being the number of false positives and the second being the number of false negatives.
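The two counts are plain set differences between the bipartition sets of the two trees. The sketch below illustrates that idea in pure Python; it is not DendroPy's code. Bipartitions are written as frozensets of leaf labels purely for readability, and the function name and six-taxon toy data are hypothetical.

```python
def false_positives_and_negatives_sketch(reference_splits, comparison_splits):
    """Return (false positives, false negatives) as a pair of counts."""
    # In the comparison tree but not in the reference tree.
    false_positives = comparison_splits - reference_splits
    # In the reference tree but not in the comparison tree.
    false_negatives = reference_splits - comparison_splits
    return len(false_positives), len(false_negatives)


# Hypothetical toy data for six taxa A-F.
# The reference tree is fully resolved; the comparison tree has a polytomy.
reference = {frozenset("AB"), frozenset("ABC"), frozenset("EF")}
comparison = {frozenset("AB"), frozenset("CD")}

counts = false_positives_and_negatives_sketch(reference, comparison)
print(counts)  # (1, 2): CD is a false positive; ABC and EF are false negatives
```

The result is not symmetric in its arguments: swapping the two trees swaps the two counts.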

Examples

import dendropy
from dendropy.calculate import treecompare
tns = dendropy.TaxonNamespace()
# Read both trees into the same TaxonNamespace so that they are comparable.
tree1 = dendropy.Tree.get(
        path="t1.nex",
        schema="nexus",
        taxon_namespace=tns)
tree2 = dendropy.Tree.get(
        path="t2.nex",
        schema="nexus",
        taxon_namespace=tns)
tree1.encode_bipartitions()
tree2.encode_bipartitions()
print(treecompare.false_positives_and_negatives(tree1, tree2))

dendropy.calculate.treecompare.find_missing_bipartitions(reference_tree, comparison_tree, is_bipartitions_updated=False)

Returns a list of bipartitions that are in reference_tree, but not in comparison_tree.

Both trees must share the same TaxonNamespace reference. In addition, either the bipartition bitmasks of the trees must already be correct for the current tree structures (for example, because Tree.encode_bipartitions() has been called), or is_bipartitions_updated must be False (the default) to force recalculation of the bipartitions.

Parameters:
  • reference_tree (Tree object) – The first tree of the two trees being compared. This must share the same TaxonNamespace reference as comparison_tree and must have bipartitions encoded.
  • comparison_tree (Tree object) – The second tree of the two trees being compared. This must share the same TaxonNamespace reference as reference_tree and must have bipartitions encoded.
  • is_bipartitions_updated (bool) – If False (default), then the bipartitions on both trees will be updated before comparison. If True, then the bipartitions will only be calculated for a Tree object if they have not been calculated before, either explicitly or implicitly.
Returns:

s (list[Bipartition]) – A list of bipartitions that are in the first tree but not in the second.
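The requirement for a shared TaxonNamespace follows from the bitmask representation: a bitmask is only comparable across trees if each bit refers to the same taxon in both. The following simplified pure-Python picture shows this. It is an illustration under stated assumptions, not DendroPy's encoding: the helper names are hypothetical, one bit is assigned per taxon in a fixed shared order, and the normalization that a real implementation applies to unrooted bipartitions is ignored.

```python
def leafset_bitmask(leaf_labels, taxon_order):
    """Set one bit per leaf, using the taxon's position in the shared order."""
    mask = 0
    for label in leaf_labels:
        mask |= 1 << taxon_order.index(label)
    return mask


def find_missing_sketch(reference_masks, comparison_masks):
    """List the reference bitmasks that do not occur in the comparison tree."""
    comparison_set = set(comparison_masks)
    return [mask for mask in reference_masks if mask not in comparison_set]


# Hypothetical shared taxon order, standing in for a common TaxonNamespace.
taxa = ["A", "B", "C", "D", "E"]
reference = [leafset_bitmask("AB", taxa), leafset_bitmask("DE", taxa)]
comparison = [leafset_bitmask("AB", taxa), leafset_bitmask("CE", taxa)]

missing = find_missing_sketch(reference, comparison)
print([format(mask, "05b") for mask in missing])  # ['11000'], the D+E split
```

If the two trees used different taxon orders, the same integer could stand for different leaf sets and the comparison would be meaningless.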

dendropy.calculate.treecompare.mason_gamer_kellogg_score(tree1, tree2, is_bipartitions_updated=False)

Returns the Mason-Gamer and Kellogg score for the two trees. Reference: Mason-Gamer and Kellogg. Testing for phylogenetic conflict among molecular data sets in the tribe Triticeae (Gramineae). Systematic Biology (1996) vol. 45 (4) pp. 524.

dendropy.calculate.treecompare.robinson_foulds_distance(tree1, tree2, edge_weight_attr='length')

DEPRECATED: Use symmetric_difference() for the common unweighted Robinson-Foulds distance metric (i.e., the symmetric difference between two trees), or weighted_robinson_foulds_distance() for the RF distance as defined by Felsenstein, 2004.

dendropy.calculate.treecompare.symmetric_difference(tree1, tree2, is_bipartitions_updated=False)

Returns unweighted Robinson-Foulds distance between two trees.

Both trees must share the same TaxonNamespace reference. In addition, either the bipartition bitmasks of the trees must already be correct for the current tree structures (for example, because Tree.encode_bipartitions() has been called), or is_bipartitions_updated must be False (the default) to force recalculation of the bipartitions.

Parameters:
  • tree1 (Tree object) – The first tree of the two trees being compared. This must share the same TaxonNamespace reference as tree2 and must have bipartitions encoded.
  • tree2 (Tree object) – The second tree of the two trees being compared. This must share the same TaxonNamespace reference as tree1 and must have bipartitions encoded.
  • is_bipartitions_updated (bool) – If False (default), then the bipartitions on both trees will be updated before comparison. If True, then the bipartitions will only be calculated for a Tree object if they have not been calculated before, either explicitly or implicitly.
Returns:

d (int) – The symmetric difference (a.k.a. the unweighted Robinson-Foulds distance) between tree1 and tree2.
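Viewed as an operation on sets, the symmetric difference is the number of bipartitions that occur in exactly one of the two trees. The short pure-Python sketch below shows this and is not DendroPy's implementation. Bipartitions are written as frozensets of leaf labels for readability, and the function name and toy data are hypothetical.

```python
def symmetric_difference_sketch(splits1, splits2):
    """Count the bipartitions present in exactly one of the two trees."""
    return len(splits1 ^ splits2)


# Hypothetical toy data for six taxa A-F.
tree1_splits = {frozenset("AB"), frozenset("ABC"), frozenset("EF")}
tree2_splits = {frozenset("AB"), frozenset("CD")}

rf = symmetric_difference_sketch(tree1_splits, tree2_splits)
print(rf)  # 3: ABC and EF occur only in tree 1, CD only in tree 2

# The same number is obtained by adding the two one-sided differences.
only_in_1 = len(tree1_splits - tree2_splits)
only_in_2 = len(tree2_splits - tree1_splits)
print(only_in_1 + only_in_2)  # 3
```

Unlike the false positive and false negative counts taken separately, this value does not depend on the order of the two trees.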

Examples

import dendropy
from dendropy.calculate import treecompare
tns = dendropy.TaxonNamespace()
# Read both trees into the same TaxonNamespace so that they are comparable.
tree1 = dendropy.Tree.get(
        path="t1.nex",
        schema="nexus",
        taxon_namespace=tns)
tree2 = dendropy.Tree.get(
        path="t2.nex",
        schema="nexus",
        taxon_namespace=tns)
tree1.encode_bipartitions()
tree2.encode_bipartitions()
print(treecompare.symmetric_difference(tree1, tree2))

dendropy.calculate.treecompare.unweighted_robinson_foulds_distance(tree1, tree2, is_bipartitions_updated=False)

Alias for symmetric_difference().

dendropy.calculate.treecompare.weighted_robinson_foulds_distance(tree1, tree2, edge_weight_attr='length', is_bipartitions_updated=False)

Returns weighted Robinson-Foulds distance between two trees based on edge_weight_attr.

Both trees must share the same TaxonNamespace reference. In addition, either the bipartition bitmasks of the trees must already be correct for the current tree structures (for example, because Tree.encode_bipartitions() has been called), or is_bipartitions_updated must be False (the default) to force recalculation of the bipartitions.

Parameters:
  • tree1 (Tree object) – The first tree of the two trees being compared. This must share the same TaxonNamespace reference as tree2 and must have bipartitions encoded.
  • tree2 (Tree object) – The second tree of the two trees being compared. This must share the same TaxonNamespace reference as tree1 and must have bipartitions encoded.
  • edge_weight_attr (string) – Name of attribute on edges of trees to be used as the weight.
  • is_bipartitions_updated (bool) – If False (default), then the bipartitions on both trees will be updated before comparison. If True, then the bipartitions will only be calculated for a Tree object if they have not been calculated before, either explicitly or implicitly.
Returns:

d (float) – The edge-weighted Robinson-Foulds distance between tree1 and tree2.
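For comparison with the unweighted metric, the sketch below computes a weighted Robinson-Foulds style distance as the sum of absolute edge-length differences over all bipartitions. It is a pure-Python illustration rather than DendroPy's implementation. It assumes that each tree is given as a dictionary from bipartition (a frozenset of leaf labels, for readability) to edge length, and that a bipartition absent from a tree has length 0 there. The function name and toy data are hypothetical.

```python
def weighted_rf_sketch(lengths1, lengths2):
    """Sum of absolute edge-length differences over all bipartitions."""
    all_splits = set(lengths1) | set(lengths2)
    return sum(
        abs(lengths1.get(split, 0.0) - lengths2.get(split, 0.0))
        for split in all_splits
    )


# Hypothetical toy data for five taxa A-E; only internal edges are listed.
tree1_lengths = {frozenset("AB"): 1.0, frozenset("DE"): 2.0}
tree2_lengths = {frozenset("AB"): 3.0, frozenset("CE"): 1.0}

weighted_rf = weighted_rf_sketch(tree1_lengths, tree2_lengths)
print(weighted_rf)  # 5.0, since |1-3| + |2-0| + |0-1| = 5
```

With every edge length set to 1, a bipartition shared by both trees contributes 0 and an unshared one contributes 1, so the sum reduces to the unweighted symmetric difference.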

Examples

import dendropy
from dendropy.calculate import treecompare
tns = dendropy.TaxonNamespace()
# Read both trees into the same TaxonNamespace so that they are comparable.
tree1 = dendropy.Tree.get(
        path="t1.nex",
        schema="nexus",
        taxon_namespace=tns)
tree2 = dendropy.Tree.get(
        path="t2.nex",
        schema="nexus",
        taxon_namespace=tns)
tree1.encode_bipartitions()
tree2.encode_bipartitions()
print(treecompare.weighted_robinson_foulds_distance(tree1, tree2))