DendroPy 4 Changes and Migration Primer¶

Introduction¶

Updated for full Python 3.x compatibility; Python 2.7 remains the only supported Python 2 release.
Faster, better, stronger! Core serialization/deserialization infrastructure rewritten from the ground up, with many optimizations for speed and reliability.

Python Version Compatibility¶

Compatibility: Python 3 is fully supported. The only version of Python 2 supported is Python 2.7.
- Python 2: Python 2.7
- Python 3: Python 3.1, 3.2, 3.3, 3.4

Library-Wide Changes¶

Public Module Reorganization¶

A number of modules have been renamed, moved, or split into multiple modules. Calls to the old modules should continue to work, albeit with warnings urging you to update to the new layout.

dendropy.treecalc has been split into three submodules depending on whether the statistic or value being calculated is on a single tree, a single tree and a dataset, or two trees:

- dendropy.calculate.treemeasure: for calculation of statistics, metrics, and values on a single tree.
- dendropy.calculate.treecompare: for calculation of statistics, metrics, and values of two trees (e.g., Robinson-Foulds distances).
- dendropy.calculate.treescore: for calculation of statistics, metrics, and values of a tree with reference to a dataset under some criterion.

The functionality provided by dendropy.treesplit has been largely subsumed by the new Bipartition class.

The functionality provided by dendropy.treesum has been largely subsumed by the new TreeArray class, a high-performance class for efficiently managing and carrying out operations on large collections of large trees.

dendropy.reconcile has been moved to dendropy.model.reconcile.

dendropy.coalescent has been moved to dendropy.model.coalescent.

dendropy.popgenstat has been moved to dendropy.calculate.popgenstat.

dendropy.treesim has been moved to dendropy.simulate.treesim.

dendropy.popgensim has been moved to dendropy.simulate.popgensim.
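
The pure relocations listed above can be applied to existing source mechanically. The following is a minimal sketch, not part of DendroPy: `MOVED_MODULES` and `rewrite_module_paths` are hypothetical names introduced here, and the table covers only the five modules that moved without being split or subsumed. It rewrites fully dotted paths only, so statements such as `from dendropy import treesim` still need manual attention.

```python
import re

# Old dotted path -> new dotted path, copied from the list of moved modules.
MOVED_MODULES = {
    "dendropy.reconcile": "dendropy.model.reconcile",
    "dendropy.coalescent": "dendropy.model.coalescent",
    "dendropy.popgenstat": "dendropy.calculate.popgenstat",
    "dendropy.treesim": "dendropy.simulate.treesim",
    "dendropy.popgensim": "dendropy.simulate.popgensim",
}

# Match an old path only when it is not embedded in a longer identifier
# or dotted name (e.g. "dendropy.treesimulator" is left alone).
_OLD_PATH = re.compile(
    r"(?<![\w.])("
    + "|".join(re.escape(name) for name in MOVED_MODULES)
    + r")(?!\w)"
)


def rewrite_module_paths(source):
    """Return `source` with each old dotted module path replaced by its new one."""
    return _OLD_PATH.sub(lambda match: MOVED_MODULES[match.group(1)], source)
```

Because the new paths never contain an old path as a substring, running the helper twice over the same text leaves it unchanged.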

Behind-the-Scenes Module Reorganization¶

In contrast to the above, the following changes should be invisible to most normal usage and client code. Most of the names (classes/methods/variables) in these modules were either imported into the ‘dendropy’ namespace, which is how all public code should be accessing them, or were never exposed (or meant to be exposed) for public usage in the first place. A list of module changes:

DendroPy 3	DendroPy 4
`dendropy.dataobject.base`	`dendropy.datamodel.basemodel`
`dendropy.dataobject.taxon`	`dendropy.datamodel.taxonmodel`
`dendropy.dataobject.tree`	`dendropy.datamodel.treemodel`, `dendropy.datamodel.treecollectionmodel`
`dendropy.dataobject.char`	`dendropy.datamodel.charstatemodel`, `dendropy.datamodel.charmatrixmodel`

Unique Object Identifier (“`oid`”) Attributes Removed¶

The entire oid system (“object identifier”), i.e., the unique id assigned to every data object, has been removed. This was an implementation artifact from NEXML parsing that greatly slowed down a number of operations without any benefit or utility for most normal operations.

`TaxonSet` is now `TaxonNamespace`¶

The dendropy.TaxonSet class has been renamed TaxonNamespace, and the corresponding taxon_set attribute of phylogenetic data objects that reference a taxonomic context has been renamed taxon_namespace.
The TaxonNamespace class replaces the TaxonSet class as the manager for the Taxon objects.
The API is largely similar, with the following differences:
- Calls to the __getitem__ and __delitem__ methods (e.g., TaxonNamespace[x]) now only accept integer values as arguments (representing indexes into the internal list of Taxon objects).
- TaxonSet.has_taxon and TaxonSet.has_taxa have been replaced by TaxonNamespace.has_taxon_label and TaxonNamespace.has_taxa_labels, respectively.
- Various new methods for accessing and managing the collection of Taxon objects have been added (e.g., findall, remove_taxon, remove_taxon_label, discard_taxon_label, __delitem__).
- Numerous look-up methods formerly took a ‘case_insensitive’ argument that determined whether the look-up was case-sensitive (when retrieving, for example, a Taxon object corresponding to a particular label); if not specified, it defaulted to False, i.e., case-sensitive matching. In all cases, this has been changed to ‘case_sensitive’ with a default of True. That is, searches are still case-sensitive by default, but you now have to specify ‘case_sensitive=False’ instead of ‘case_insensitive=True’ to perform a case-insensitive search. This change was made for consistency with the rest of the library.
In most cases, a simple global search-and-replace of “TaxonSet” with “TaxonNamespace” and “taxon_set” with “taxon_namespace” should be sufficient to bring existing code into line with DendroPy 4.
For legacy support, a class called TaxonSet exists. This derives with no modifications from TaxonNamespace. Instantiating objects of this class will result in warnings being emitted. As long as usage of TaxonSet conforms to the above API change notes, old or legacy code should continue to work unchanged (albeit with some warning noise). This support is temporary and will be removed in upcoming releases: code should be updated to use TaxonNamespace as soon as expedient.
For legacy support, “taxon_set” continues to be accepted and processed as an attribute name and keyword argument synonymous with “taxon_namespace”. Usage of this will result in warnings being emitted, but code should continue to function as expected. This support is temporary and will be removed in upcoming releases: code should be updated to use “taxon_namespace” as soon as expedient.
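
The global search-and-replace described above, together with the keyword flip from ‘case_insensitive’ to ‘case_sensitive’, might be scripted roughly as follows. This is a sketch under stated assumptions rather than an official migration tool: `migrate_taxon_names` is a hypothetical helper, it only handles keyword values written as the literals True or False, and it does not attempt the has_taxon/has_taxa replacements, which need review by hand.

```python
import re

# Whole-identifier renames; \b prevents matches inside longer names
# such as "my_taxon_set_copy".
_RENAMES = [
    (re.compile(r"\bTaxonSet\b"), "TaxonNamespace"),
    (re.compile(r"\btaxon_set\b"), "taxon_namespace"),
]

# case_insensitive=<literal bool> becomes case_sensitive=<inverted bool>.
_CASE_KEYWORD = re.compile(r"\bcase_insensitive\s*=\s*(True|False)\b")


def _flip_case_keyword(match):
    inverted = "False" if match.group(1) == "True" else "True"
    return "case_sensitive=" + inverted


def migrate_taxon_names(source):
    """Apply the TaxonSet/taxon_set renames and invert case_insensitive keywords."""
    for pattern, replacement in _RENAMES:
        source = pattern.sub(replacement, source)
    return _CASE_KEYWORD.sub(_flip_case_keyword, source)
```

Reviewing the resulting diff before committing is advisable, since a purely textual rewrite cannot tell DendroPy names apart from identically named identifiers in unrelated code.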

The `Node` Class¶

Constructor now only accepts keyword arguments (and oid is not one of them!).
add_child no longer accepts pos as an argument to indicate the position at which a child should be inserted. Use insert_child instead, which takes the position as index and the node as node.

The `Edge` Class¶

Constructor now only accepts keyword arguments (and oid is not one of them!).
Because tail_node is no longer an independent attribute but a dynamic property, bound to Node._parent_node attribute of the head_node (see below), the Edge constructor does not accept tail_node as an argument.
The tail_node of an Edge object is now a dynamic property, referencing the Node._parent_node attribute of the Edge._head_node of the Edge object. So, now updating Edge._tail_node of an Edge object will set the Node._parent_node of its Edge._head_node to the new value, and vice versa. This avoids the need for independent book-keeping logic to ensure that Node._parent_node and Edge._tail_node are always synchronized to reference the same Node object and all the potential errors this might cause.
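
The design described above can be illustrated with a small stand-alone sketch. `SketchNode` and `SketchEdge` are hypothetical stand-ins written for this example, not DendroPy's actual classes; they show only the idea of storing the parent reference once, on the head node, and exposing the edge's tail node as a property over it.

```python
class SketchEdge:
    """Stand-in edge: keeps a head node and derives its tail node from it."""

    def __init__(self, head_node=None, length=None):
        self._head_node = head_node
        self.length = length

    @property
    def tail_node(self):
        # The tail node is simply the parent of the head node.
        if self._head_node is None:
            return None
        return self._head_node._parent_node

    @tail_node.setter
    def tail_node(self, node):
        # Writing through the edge updates the single stored reference.
        if self._head_node is None:
            raise ValueError("cannot set tail node of an edge without a head node")
        self._head_node._parent_node = node


class SketchNode:
    """Stand-in node: owns the only stored parent reference."""

    def __init__(self):
        self._parent_node = None
        self.edge = SketchEdge(head_node=self)


parent, child = SketchNode(), SketchNode()
child.edge.tail_node = parent          # set via the edge ...
assert child._parent_node is parent    # ... and the node sees it
```

With a single stored reference there is no second copy that could fall out of step, which is the book-keeping the text says is no longer needed.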

The `Tree` Class¶

Constructor no longer supports the stream keyword argument to construct the new Tree object from a data source. Use the factory class method get_from_stream instead.
nodes: sorting option removed; use sorted(tree.nodes()) instead.
node_set: removed; use set(tree.nodes()) instead.
edge_set: removed; use set(tree.edges()) instead.

For consistency with preorder_node_iter and postorder_node_iter, a number of iteration methods have been renamed.

DendroPy 3	DendroPy 4
`Tree.level_order_node_iter()`	`levelorder_node_iter`
`Tree.level_order_edge_iter()`	`levelorder_edge_iter`
`Node.level_order_iter()`	`levelorder_iter`
`Tree.age_order_node_iter()`	`ageorder_node_iter`
`Node.age_order_iter()`	`ageorder_iter`
`Tree.leaf_iter()`	`leaf_node_iter`

The old names are still supported for now (with warnings being emitted), but new code should start using the newer names. In addition, support for in-order or infix tree traversal has been added: inorder_node_iter, inorder_edge_iter.
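
One way to locate remaining uses of legacy names is to escalate warnings to errors while exercising the code. The sketch below uses a hypothetical stand-in class, `SketchTree`, and assumes the legacy alias emits a `DeprecationWarning`; the warning category DendroPy itself emits is not stated here, so the `emits_warning` helper deliberately catches any `Warning` subclass.

```python
import warnings


class SketchTree:
    """Stand-in with a renamed method and a warning legacy alias."""

    def __init__(self, leaves):
        self._leaves = list(leaves)

    def leaf_node_iter(self):
        # New-style name.
        return iter(self._leaves)

    def leaf_iter(self):
        # Legacy name: warn, then delegate to the new method.
        warnings.warn(
            "leaf_iter() is deprecated; use leaf_node_iter()",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.leaf_node_iter()


def emits_warning(func):
    """Return True if calling func() raises any warning when warnings are errors."""
    with warnings.catch_warnings():
        warnings.simplefilter("error")  # every warning becomes an exception
        try:
            func()
        except Warning:
            return True
    return False
```

The same effect can usually be had for a whole test run without code changes by starting the interpreter with `python -W error`.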

Instead of tree_source_iter and multi_tree_source_iter, use yield_from_files.

NEWICK-format Reading¶

The suppress_external_taxon_labels and suppress_external_node_labels keyword arguments have been replaced by suppress_leaf_taxon_labels and suppress_leaf_node_labels, respectively. This is for consistency with the rest of the library (including writing in NEWICK-format), which uses the term “leaf” rather than “external”.
The various boolean rooting directive switches (as_rooted, default_as_rooted, etc.) have been replaced by a single argument: rooting. This can take on one of the following (string) values:
- rooting=“default-unrooted”: Interpret trees following the rooting token (“[&R]” for rooted, “[&U]” for unrooted) if present; otherwise, interpret trees as unrooted.
- rooting=“default-rooted”: Interpret trees following the rooting token (“[&R]” for rooted, “[&U]” for unrooted) if present; otherwise, interpret trees as rooted.
- rooting=“force-unrooted”: Unconditionally interpret all trees as unrooted.
- rooting=“force-rooted”: Unconditionally interpret all trees as rooted.
The value of the “rooting” argument defaults to “default-unrooted”, i.e., all trees are assumed to be unrooted unless a rooting token is present that explicitly specifies the rooting state.
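
The four rules above amount to a small decision function. The following sketch restates them in code for clarity; `is_rooted` is a hypothetical helper written for this example and is not how DendroPy's parser is implemented. It assumes the rooting token, if any, has already been extracted from the tree statement and is passed in exactly as “[&R]” or “[&U]”.

```python
ROOTING_MODES = (
    "default-unrooted",
    "default-rooted",
    "force-unrooted",
    "force-rooted",
)


def is_rooted(rooting="default-unrooted", token=None):
    """Return True if a tree should be treated as rooted.

    rooting: one of ROOTING_MODES.
    token:   "[&R]", "[&U]", or None when no rooting token is present.
    """
    if rooting not in ROOTING_MODES:
        raise ValueError("unrecognized rooting directive: %r" % (rooting,))
    # The "force-*" modes ignore any token in the data.
    if rooting == "force-rooted":
        return True
    if rooting == "force-unrooted":
        return False
    # The "default-*" modes honor an explicit token ...
    if token == "[&R]":
        return True
    if token == "[&U]":
        return False
    if token is not None:
        raise ValueError("unrecognized rooting token: %r" % (token,))
    # ... and fall back to the named default otherwise.
    return rooting == "default-rooted"
```

For instance, `is_rooted("default-unrooted", "[&R]")` is True because the token wins, while `is_rooted("force-unrooted", "[&R]")` is False because the token is ignored.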

NEWICK-format Writing¶

Previously, if annotations_as_nhx was True, metadata annotations would be written out even if suppress_annotations was True. Now, suppress_annotations must be False for annotations to be written out, even if annotations_as_nhx is True.

The `DataSet` Class¶

Constructor no longer supports the stream keyword argument to construct the new DataSet object from a data source. Use the factory class method DataSet.get_from_stream instead.
Constructor only accepts one unnamed (positional) argument: either a DataSet instance to be cloned, or an iterable of TaxonNamespace, TreeList, or CharacterMatrix-derived instances to be composed (added) into the new DataSet instance.
TaxonNamespace no longer managed.


