Problems with parsing tabular blast


#1

Hi, I have problems parsing a tabular BLAST file with MEGAN 6.5.10. When I use a provided mapping file such as prot-gi2tax-August2016X.bin, some GIs don’t get assigned even though a taxid mapping is known (e.g. gi|1062853291| should be assigned to taxid 9925, but the Inspector shows a ‘?’). On the other hand, adding an additional column with the taxids to the tabular BLAST output, as the MEGAN manual suggests (“However, if you add an additional column to this format containing the associated taxon name or numerical NCBI taxon-id for each line then MEGAN will parse these and use them as input.”), doesn’t work for me: MEGAN doesn’t recognize the format and no reads get assigned at all. This used to work in MEGAN5, so I must be doing something wrong here? Thank you very much and best regards.


#2

Please upload a few lines of a typical file that exhibits the problems and I will look into it.


#3

Here are two lines of a tabular BLAST file with an additional taxid column:

gi|548452367|ref|XP_005675727.1|||1||abcL gi|1062853286|ref|XP_017907820.1| 100.000 15 0 0 1 15 483 497 1.28e-06 56.6 9925
gi|548452367|ref|XP_005675727.1|||1||abcL gi|942074311|ref|XP_005902399.2| 100.000 15 0 0 1 15 483 497 1.28e-06 56.6 72004

Actually, it seems that parsing doesn’t work if the file name lacks an expected BLAST file ending such as “.tab”; maybe this was the problem.

After some more trying I found that a tag like ‘|tax|9925’ integrated into the subject id (e.g. ‘gi|1062853286|ref|XP_017907820.1|tax|9925’), combined with a proper file ending, works, so I’ll go with this.
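
For anyone wanting to do the same, here is a minimal sketch of how the extra taxid column could be folded into the subject id. It assumes whitespace-separated lines with the 12 standard tabular BLAST columns plus the taxid as a 13th column, as in the two lines above; the function names and the command-line usage are only illustrative, and the ‘|tax|’ tag is simply the form that worked for me, not something I checked against the manual.

```python
import sys


def embed_taxid(line: str) -> str:
    """Move a trailing taxid column into the subject id as '|tax|<taxid>'."""
    fields = line.rstrip("\n").split()
    # Assumption: 12 standard tabular BLAST columns plus one taxid column.
    # Lines with any other column count are passed through unchanged.
    if len(fields) != 13:
        return line.rstrip("\n")
    taxid = fields.pop()
    # Drop a trailing '|' on the subject id before appending the tag,
    # so 'gi|...|ref|XP_x.1|' becomes 'gi|...|ref|XP_x.1|tax|9925'.
    fields[1] = fields[1].rstrip("|") + "|tax|" + taxid
    return "\t".join(fields)


def convert(in_path: str, out_path: str) -> None:
    """Rewrite every line of in_path into out_path (hypothetical helper)."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(embed_taxid(line) + "\n")


if __name__ == "__main__" and len(sys.argv) == 3:
    # Example call: python embed_taxid.py hits_with_taxid.txt hits.tab
    convert(sys.argv[1], sys.argv[2])
```

Writing the output to a name ending in “.tab” covers the file-ending issue mentioned above.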

I still have one case in which a read should be assigned but isn’t. I’ll send you the respective file by mail. Thank you.