MEGAN can now import the latest BIOM format (2.1)



I have implemented a first version of support for the BIOM2 format, which is the format that QIIME now uses. You can now open any file ending in .biom, whether it is in BIOM1 format (the original BIOM format, based on JSON) or in BIOM2 format (the newer version of the format, based on HDF5).
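Because both variants share the .biom extension, the two have to be told apart by content rather than by file name. The following is a minimal sketch of one way to do that, not MEGAN's actual code: the function name `detect_biom_version` is hypothetical, and the sketch assumes that the HDF5 signature sits at byte offset 0 (HDF5 also permits it at offsets 512, 1024, and so on when a user block is present) and that a BIOM1 file starts with a JSON object.

```python
# First 8 bytes of an HDF5 file, as defined by the HDF5 file format.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def detect_biom_version(path):
    """Guess the BIOM variant of a file from its first bytes.

    Hypothetical helper for illustration only.
    Returns "BIOM2" for HDF5 content, "BIOM1" for content that looks
    like a JSON object, and "unknown" otherwise.
    """
    with open(path, "rb") as handle:
        head = handle.read(64)
    if head.startswith(HDF5_SIGNATURE):
        return "BIOM2"
    # Assumption: a BIOM1 file is a JSON object, so the first
    # non-whitespace byte is an opening brace.
    if head.lstrip().startswith(b"{"):
        return "BIOM1"
    return "unknown"
```

A check like this only identifies the container; it says nothing about whether the table inside is complete or valid.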

QIIME reports taxonomic assignments in a “path format” that looks like this:

k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Ruminococcaceae;g__Faecalibacterium;s__prausnitzii
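To make the structure of such a path concrete, here is a small sketch that splits it into (rank, name) pairs. It is an illustration rather than MEGAN's parser: the function name `parse_qiime_path` and the prefix table are assumptions, based only on the single-letter prefixes visible in the example above, and levels with an empty name (such as a bare `s__`) are simply skipped.

```python
# Assumed meaning of the single-letter rank prefixes in the example path.
RANK_PREFIXES = {
    "k": "kingdom",
    "p": "phylum",
    "c": "class",
    "o": "order",
    "f": "family",
    "g": "genus",
    "s": "species",
}


def parse_qiime_path(path):
    """Split a QIIME-style taxonomic path into (rank, name) pairs.

    Hypothetical helper for illustration only.
    """
    ranks = []
    for part in path.split(";"):
        prefix, separator, name = part.strip().partition("__")
        if not separator or not name:
            # Skip malformed entries and empty levels such as "s__".
            continue
        # Unknown prefixes are kept as-is rather than guessed.
        ranks.append((RANK_PREFIXES.get(prefix, prefix), name))
    return ranks


example = ("k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;"
           "f__Ruminococcaceae;g__Faecalibacterium;s__prausnitzii")
for rank, name in parse_qiime_path(example):
    print(rank, name)
```

Note that the species entry carries only the epithet ("prausnitzii"), so a real importer would still need to combine it with the genus name before looking it up.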

During import, MEGAN needs to decide which NCBI taxon this should be mapped to.

MEGAN provides a choice of two algorithms for doing this:

  • Match taxonomic path. In this mode, MEGAN will place this assignment onto the most specific taxon whose path in the NCBI taxonomy contains the QIIME path.
  • Match most specific node. In this mode, MEGAN will place the assignment onto the most specific node that can be matched, regardless of whether the path is preserved.

Because a path reported by QIIME does not always match the path in the NCBI taxonomy, in practice the first choice is more conservative, while the second choice is more specific.
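The difference between the two modes can be sketched on a toy taxonomy. This is one possible reading of the descriptions above, not MEGAN's implementation: the function names, the `TOY_PARENT` table, and the made-up family `HypotheticalFamily` are all assumptions introduced for illustration. Here "path match" keeps the deepest name whose toy lineage contains every preceding name of the QIIME path in order, while "most specific node" just takes the deepest name that exists in the taxonomy.

```python
# Toy stand-in for the NCBI taxonomy: child name -> parent name.
TOY_PARENT = {
    "Bacteria": None,
    "Firmicutes": "Bacteria",
    "Clostridia": "Firmicutes",
    "Clostridiales": "Clostridia",
    "Ruminococcaceae": "Clostridiales",
    "Faecalibacterium": "Ruminococcaceae",
}


def lineage(name, parent=TOY_PARENT):
    """Return the names from the root down to `name`."""
    chain = []
    while name is not None:
        chain.append(name)
        name = parent[name]
    return chain[::-1]


def is_subsequence(needle, haystack):
    """True if all items of `needle` occur in `haystack` in the same order."""
    remaining = iter(haystack)
    return all(item in remaining for item in needle)


def match_taxonomic_path(names, parent=TOY_PARENT):
    """Deepest name whose lineage contains the QIIME path up to that name."""
    best = None
    for depth, name in enumerate(names, start=1):
        if name in parent and is_subsequence(names[:depth], lineage(name, parent)):
            best = name
    return best


def match_most_specific_node(names, parent=TOY_PARENT):
    """Deepest name that exists in the taxonomy, ignoring the rest of the path."""
    for name in reversed(names):
        if name in parent:
            return name
    return None


# A path whose family level does not exist in the toy taxonomy.
disputed = ["Bacteria", "Firmicutes", "Clostridia", "Clostridiales",
            "HypotheticalFamily", "Faecalibacterium"]
print(match_taxonomic_path(disputed))      # Clostridiales
print(match_most_specific_node(disputed))  # Faecalibacterium
```

When the QIIME path agrees with the taxonomy, both functions return the same node; they diverge only when some level of the path cannot be reconciled, in which case the path-based mode falls back to the last consistent ancestor.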

Unfortunately, the BIOM format is underspecified and there are no examples for many of the indicated features of the format, so the current implementation in MEGAN is incomplete.

If you have any problems loading your BIOM files into MEGAN, please let me know and I will extend MEGAN’s BIOM2 parser accordingly.