Summary

phyloXML is an XML language designed to describe phylogenetic trees (or networks) and associated data. It provides elements for commonly used features, such as taxonomic information, gene names and identifiers, branch lengths, support values, and gene duplication and speciation events. Using these standardized elements allows interoperability between applications and databases. Furthermore, because XML itself is extensible and phyloXML provides <property> elements, the format can be extended and adapted to domain-specific applications. The structure of phyloXML is defined in the XML Schema Definition (XSD) language.
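As a rough illustration of the elements listed above, the following sketch embeds a small phyloXML document and reads it with Python's standard library. The tree, the gene names, the values, and the "example:note" property reference are invented for illustration; the helper name summarize_clades is hypothetical and is not part of forester or any other phyloXML library.

```python
import xml.etree.ElementTree as ET

# Namespace used by phyloXML documents.
NS = {"phy": "http://www.phyloxml.org"}

# Illustrative document only: names and values are made up.
EXAMPLE = """<phyloxml xmlns="http://www.phyloxml.org">
  <phylogeny rooted="true">
    <name>illustrative gene tree</name>
    <clade>
      <events><duplications>1</duplications></events>
      <clade>
        <name>geneA</name>
        <branch_length>0.12</branch_length>
        <confidence type="bootstrap">89</confidence>
        <taxonomy><scientific_name>Homo sapiens</scientific_name></taxonomy>
      </clade>
      <clade>
        <name>geneB</name>
        <branch_length>0.30</branch_length>
        <taxonomy><scientific_name>Mus musculus</scientific_name></taxonomy>
        <property ref="example:note" datatype="xsd:string" applies_to="clade">custom value</property>
      </clade>
    </clade>
  </phylogeny>
</phyloxml>
"""


def summarize_clades(xml_text):
    """Return one dict per <clade>, in document order."""
    root = ET.fromstring(xml_text)
    rows = []
    for clade in root.iterfind(".//phy:clade", NS):
        # Only direct children are read, so nested clades do not leak upward.
        length = clade.findtext("phy:branch_length", namespaces=NS)
        dups = clade.findtext("phy:events/phy:duplications", namespaces=NS)
        rows.append({
            "name": clade.findtext("phy:name", namespaces=NS),
            "branch_length": float(length) if length is not None else None,
            "species": clade.findtext("phy:taxonomy/phy:scientific_name", namespaces=NS),
            "duplications": int(dups) if dups is not None else None,
        })
    return rows


if __name__ == "__main__":
    for row in summarize_clades(EXAMPLE):
        print(row)
```

Any generic XML parser can read these elements because they live in a single namespace; a dedicated library such as forester would normally be the more convenient choice.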

Citations

Documentation

XML Schema Definition Location

http://www.phyloxml.org/1.10/phyloxml.xsd

Try it!

Locally:

1. Download the newest version of the forester libraries: » forester.jar
2. Download an example phyloXML file:
» apaf.xml (Apaf-1 gene family tree with domain architectures) or
» bcl_2.xml (Bcl-2 gene family tree with gene duplications, support values, and taxonomy data)
3. Open the file "forester.jar" (Archaeopteryx should start; alternatively, run "java -cp path\to\forester.jar org.forester.archaeopteryx.Archaeopteryx" from the command line), then use "File" | "Read tree from file..." to load "bcl_2.xml" or "apaf.xml".

Rationale

Evolutionary trees are central to a wide range of biological studies. In many of these studies, tree nodes and branches need to be associated (or annotated) with various attributes. For example, in studies concerned with organismal relationships, tree nodes are associated with taxonomic names, whereas tree branches have lengths and often support values. Gene trees used in comparative genomics or phylogenomics are usually annotated with taxonomic information, with genome-related data such as gene names and functional annotations, and with events such as gene duplications, speciations, or exon shufflings, all combined with information about the evolutionary tree itself. The data standards currently used for evolutionary trees have limited capacity to incorporate such annotations of different data types.

A well-defined XML format addresses these problems in a general and extensible manner and allows for interoperability among and between specialized and general-purpose software.
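To make the kind of annotation discussed above concrete, here is a minimal sketch that attaches a name, branch length, taxonomy, and duplication event to tree nodes and serializes them as phyloXML. It uses only Python's standard library; the helper names (q, add_clade, build_tree) and the example genes are assumptions made for illustration and do not represent the API of any phyloXML implementation.

```python
import xml.etree.ElementTree as ET

PHYLOXML_NS = "http://www.phyloxml.org"


def q(tag):
    """Qualify a tag name with the phyloXML namespace."""
    return f"{{{PHYLOXML_NS}}}{tag}"


def add_clade(parent, name=None, branch_length=None, species=None, duplications=None):
    """Append a <clade> to parent; child order follows name, branch_length, taxonomy, events."""
    clade = ET.SubElement(parent, q("clade"))
    if name is not None:
        ET.SubElement(clade, q("name")).text = name
    if branch_length is not None:
        ET.SubElement(clade, q("branch_length")).text = str(branch_length)
    if species is not None:
        taxonomy = ET.SubElement(clade, q("taxonomy"))
        ET.SubElement(taxonomy, q("scientific_name")).text = species
    if duplications is not None:
        events = ET.SubElement(clade, q("events"))
        ET.SubElement(events, q("duplications")).text = str(duplications)
    return clade


def build_tree():
    """Build a two-leaf gene tree whose root is marked as a duplication."""
    # Serialize the phyloXML namespace as the default namespace (no prefix).
    ET.register_namespace("", PHYLOXML_NS)
    root = ET.Element(q("phyloxml"))
    phylogeny = ET.SubElement(root, q("phylogeny"), rooted="true")
    top = add_clade(phylogeny, duplications=1)
    add_clade(top, name="geneA", branch_length=0.12, species="Homo sapiens")
    add_clade(top, name="geneB", branch_length=0.3, species="Mus musculus")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_tree())
```

Each annotation gets its own typed element, so a reader can pick out taxonomy or events without parsing them out of free-text node labels.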

Feature Highlights

Examples

phyloXML Support and Applications

phyloXML Google Group

The purposes of this group are (i) to inform about new developments concerning phyloXML and (ii) to provide a platform for discussing ideas regarding phyloXML.

Acknowledgements

Many of the ideas for the phyloXML format are the result of discussions among the members of the phyloXML Google Group. Ethalinda Cannon contributed to the development of ATV/Archaeopteryx and phyloXML. Additional progress on phyloXML and its implementations is due to the BioHackathon 2008 ("towards integrated web service in life science with Open Bio* libraries"). The BioPerl implementation of the phyloXML format was supported by Google Inc. as part of the Google Summer of Code™ 2008 program and was sponsored by the National Evolutionary Synthesis Center (NESCent). The BioRuby and Biopython implementations were supported by Google Inc. as part of the Google Summer of Code™ 2009 program and were likewise sponsored by NESCent.

Reference

Han M.V. and Zmasek C.M. (2009)
"phyloXML: XML for evolutionary biology and comparative genomics"
BMC Bioinformatics, 10:356

Contacts

Christian Zmasek | phyloxml -at- gmail -dot- com

© Copyright 2010-2013 CM Zmasek | All Rights Reserved | Last updated: 130731
