The taxonomy algorithm ignores reads from a specific point onwards

Hello,

Recently, I’ve tried using a newer version of MEGAN6 Community Edition (MEGAN 6.7.6) to parse the BlastTAB output from Diamond’s BlastX mode, using NCBI nr from February 2017.

For one of the data sets, the taxonomy algorithm malfunctioned without reporting an error. The reported number of reads corresponded to the number of unique reads in the BlastTAB output (= reads with hits), but the sum of all assigned reads and the unassigned reads was not even close to that amount. A problem must have occurred during the Naive LCA algorithm, as the “Total reads” and “Alignments” after the parsing phase drastically exceeded the “Total reads” and “Alignments” after the Naive LCA algorithm phase, as stated in the Messages box.

After a specific point, reads in the file got dropped and were not represented anywhere in the taxonomy tree. The specific point is reproducible and is included within the attached segment of the BlastX/Diamond output. The last properly parsed read is 63H9Q:00896:01124. The same file could be parsed properly with MEGAN 6.5.8. Additional details about the parameters and conditions of parsing are included below. Do you have any ideas about the problem?
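
For anyone who wants to run the same check on their own file, here is a minimal sketch of how the number of unique reads, and the number of reads after the last properly parsed one, can be counted. It assumes the standard tab-separated BLAST layout with the query ID in the first column; the function names are my own and not part of MEGAN or Diamond.

```python
def read_ids_in_order(lines):
    """Return query IDs (first column of BLAST tabular output) in first-seen order."""
    seen = {}  # dicts keep insertion order, so this doubles as an ordered set
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        query_id = line.split("\t", 1)[0]
        seen.setdefault(query_id, None)
    return list(seen)


def reads_after(ids, last_parsed):
    """Return the read IDs that follow last_parsed in file order."""
    return ids[ids.index(last_parsed) + 1:]


# Tiny in-memory example; with a real file, pass the open file handle instead.
example = ["r1\ts1\t90\n", "r1\ts2\t80\n", "r2\ts1\t70\n", "r3\ts9\t60\n"]
ids = read_ids_in_order(example)
print(len(ids), "unique reads with hits")
print(len(reads_after(ids, "r1")), "reads after r1")
```

With the attached file, the second argument to `reads_after` would be `63H9Q:00896:01124`.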

MEGAN_taxonomyproblem_40000lines.tabular (2.8 MB)

Best regards,
Marko

The parameters were:

  • MinScore 100,
  • MaxExpected 0.01,
  • Min Percent Identity 0,
  • Top Percent 10,
  • Min Support Percent 0 (off),
  • Min Support 313,
  • Naive LCA algorithm
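
To make the effect of the thresholds above concrete, here is a rough sketch of how I understand the alignment filtering to work before the LCA step. This is an assumption on my part and not MEGAN's actual code: it assumes the 12-column BLAST tabular layout (e-value in column 11, bit score in column 12), that MinScore is a bit-score cutoff, and that Top Percent keeps alignments within that percentage of the best bit score for each read. The function name is hypothetical.

```python
from collections import defaultdict


def filter_alignments(lines, min_score=100.0, max_expected=0.01, top_percent=10.0):
    """Return {read_id: [(subject_id, bitscore), ...]} for alignments passing the filters."""
    by_read = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # not a standard 12-column BLAST tabular line
        query_id, subject_id = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        # MinScore and MaxExpected are applied per alignment
        if bitscore >= min_score and evalue <= max_expected:
            by_read[query_id].append((subject_id, bitscore))

    kept = {}
    for query_id, hits in by_read.items():
        best = max(score for _, score in hits)
        # Top Percent: keep hits scoring within top_percent of the best hit (assumed)
        cutoff = best * (1.0 - top_percent / 100.0)
        kept[query_id] = [(s, b) for s, b in hits if b >= cutoff]
    return kept
```

Reads that lose all their alignments here simply do not appear in the result. Min Support is applied later, on the taxonomy tree, so it is not part of this sketch.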

The database used for Diamond was newer than Nov2016, and I used the prot_acc2tax-Nov2016 mapping file (this mismatch was not a problem for MEGAN 6.5.8).

Dear Marko,

Thanks for the bug report. It took me a couple of hours to figure this one out. Your file exposed a subtle bug in the code that accesses reads and alignments in an rma6 file. It threw an exception that caused the remaining reads to be ignored. I have fixed the problem, please update to 6.7.10.
D

Thank you for the swift reply!
I have just tried parsing the same file with the freshly installed 6.7.10, but the outcome is unfortunately still the same for me.

Best regards,
Marko

Could you please provide a file for which the problem still persists? The program definitely works on the file that you provided in this link:

As you can see here:

That is strange. I used the same file that I uploaded here. After processing the file on 6.7.10 and again after uninstalling and reinstalling 6.7.10, the result was still the same (Screenshots below, 6.5.8 for comparison).

For what it’s worth, we’re using a BioLinux system and MEGAN 6.7.10 was installed using the MEGAN_Community_unix_6_7_10.sh installer. It was installed in parallel in its own directory, leaving the previous versions intact.

The screenshots:

Dear Marko,

Could you please re-download the MEGAN installer? Apparently, there was a slip-up when I uploaded the latest version of MEGAN, and the version number 6.7.10 was used twice (this shouldn’t happen, but I was boarding a flight when I uploaded the fix for the problem that you had reported, so I must have messed up). The build date on your MEGAN window title bar is March 24th; on mine it is March 25th…
D

Dear Daniel,

As far as I can tell after downloading the installer and testing the program on the example file, the link on the MEGAN download page (http://ab.inf.uni-tuebingen.de/data/software/megan6/download/MEGAN_Community_unix_6_7_10.sh) retrieves the same file as before, the one built on March 24.
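
Since both builds carry the same version number, one way to confirm that a fresh download really is identical to the earlier one is to compare checksums. A minimal sketch using only the Python standard library; the helper names are my own, and the file paths in the usage comment are placeholders.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file(path_a, path_b):
    """True if both files have identical contents (by SHA-256)."""
    return sha256_of(path_a) == sha256_of(path_b)


# Usage (placeholder paths):
# same_file("old/MEGAN_Community_unix_6_7_10.sh", "new/MEGAN_Community_unix_6_7_10.sh")
```

If this returns True, the download link is still serving the earlier build.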

Best regards,
Marko

Sorry about that. I will build a new release (6.7.11) tomorrow, and that will solve the problem.

So, I’ve tried the newest version and as far as I can tell, it works as it should.
Thank you for solving the issue!
