I’ve been quite busy and didn’t get to test this until yesterday. Unfortunately, I still didn’t manage to get MEGAN to classify anything. I have trouble testing this with the MEGAN GUI because all my data is on a GUI-less server.
I want to repeat all my steps to make clear what exactly I’m doing. I hope you can check this out and maybe see what I’m doing wrong.
STEP 1. Map reads to a reference database using the LAST aligner. This produces a fairly large .maf file. The reference database is a .fna file with headers like this:
‘>kraken:taxid|272844|NC_000868.1 Pyrococcus abyssi GE5, complete genome’
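To double-check that the taxon ID sits where I expect it in these headers, here is a minimal Python sketch. The helper name and the regular expression are my own assumptions for illustration; they are not part of MEGAN or LAST, and they only cover headers of the form shown above.

```python
import re

# Hypothetical pattern: "kraken:taxid|<digits>|<accession>" at the start of a FASTA header.
HEADER_RE = re.compile(r"^>?kraken:taxid\|(\d+)\|(\S+)")

def parse_kraken_header(header):
    """Return (taxon_id, accession), or None if the header lacks the kraken:taxid prefix."""
    match = HEADER_RE.match(header.strip())
    if match is None:
        return None
    return int(match.group(1)), match.group(2)

header = ">kraken:taxid|272844|NC_000868.1 Pyrococcus abyssi GE5, complete genome"
print(parse_kraken_header(header))  # prints (272844, 'NC_000868.1')
```

If this returns None for any header in the .fna file, the prefix used for ID parsing in STEP 4 would presumably not match that reference either.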
STEP 2. Convert the .maf file to a .daa file using the maf2daa tool. This produces a much smaller .daa file that can be loaded into MEGAN.
STEP 3. Prepare the results for analysis with MEGAN using the daa-meganizer tool, with the long-reads option (-lg) set:
/daa-meganizer -lg -i Mock_100000-bacteria-l1000-q10_2BacAr.daa -mdb /mnt/5TB/megan/megan-db/megan-map-Oct2019-ue.db
This is where I might be doing something wrong. As far as I understand it, the output of daa-meganizer says that it only managed to assign reads to two classes. The last part of the daa-meganizer output:
Class. Taxonomy: 2
Class. SEED: 2
Class. EGGNOG: 2
Class. KEGG: 2
Class. INTERPRO2GO: 2
I’m not sure whether I should be running daa-meganizer at all when mapping to a nucleotide database.
STEP 4. Run MEGAN from the command line with the following script:
set idParsing=true cName=Taxonomy prefix='kraken:taxid|';
export what=CSV format=taxonId_to_percent separator=comma counts=assigned
This results in a really small .csv file: