Joining multiple .m8 files or merging multiple .daa files into one .mx file

Hi there!

I would like to analyze my metagenomics data using your tools (DIAMOND and MEGAN).
I have 8 samples containing a total of ~235 million reads, each ~230 bp long.
I plan to split each sample into 4 smaller chunks to reduce DIAMOND processing time.

(1) If I set the -o parameter for .m8 output, will I be able to join the 4 .m8 outputs created for each sample into one large .m8 file?

(2) If I set the -a parameter instead, will I be able to join the 4 .daa outputs for each sample into one large .daa file?

(3) Still using the -a parameter, will I be able to merge the 4 .daa outputs during the conversion to blast tabular format? i.e. will I be able to convert 4 .daa files into one blast tabular file?

(4) Is there a way to join multiple DIAMOND outputs into one MEGAN input file using MEGAN?

If the answer is yes to any of these questions, could you please share how? Any help will be greatly appreciated.

Many thanks in advance,

It doesn’t make sense to split the files; this won’t speed up DIAMOND.

(1) yes

(2) no

(3) no

(4) Using the Import Blast dialog, you can select multiple input files that give rise to one output rma6 file.
However, I do not recommend this. Rather, don’t split the 8 samples into smaller chunks, but run them as is.
Then use daa-meganizer (or the equivalent File menu item in MEGAN) to meganize the 8 daa files. This will be the fastest route.

Thank you for getting back Daniel.
I am running my analysis on a cluster, and it’s nearly impossible to request all the resources I need to run DIAMOND on my files as they are. I went ahead and split the files so that I could spread out the jobs without requesting any resources, and it worked.
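
For anyone wanting to do the same, here is a minimal sketch of the splitting step. It assumes the reads are in uncompressed FASTQ with exactly four lines per record (the thread does not say which format was used), and the function name `split_fastq` and the `.partN.fastq` naming are hypothetical choices for illustration only. Records are dealt out round-robin so the chunks end up roughly equal in size.

```python
from itertools import islice


def split_fastq(fastq_path, n_chunks, prefix):
    """Distribute FASTQ records round-robin over n_chunks output files.

    Assumption: uncompressed FASTQ with exactly 4 lines per record.
    Returns the list of chunk file paths (hypothetical naming scheme).
    """
    chunk_paths = [f"{prefix}.part{i + 1}.fastq" for i in range(n_chunks)]
    outputs = [open(path, "w") for path in chunk_paths]
    try:
        with open(fastq_path) as src:
            record_index = 0
            while True:
                record = list(islice(src, 4))  # next 4-line record
                if not record:
                    break  # end of input
                if len(record) != 4:
                    raise ValueError("truncated FASTQ record at end of file")
                outputs[record_index % n_chunks].writelines(record)
                record_index += 1
    finally:
        for handle in outputs:
            handle.close()
    return chunk_paths
```

Each chunk can then be submitted to the cluster as its own DIAMOND job.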

Also, I have been able to join the .m8 outputs, and they work in MEGAN!
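
In case it helps others, joining the .m8 chunks amounts to plain concatenation, since each line of the tabular output is an independent alignment record. Below is a rough sketch under the assumption that the files are header-less tab-separated text. The name `join_m8` and the file names in the usage note are hypothetical, not part of DIAMOND or MEGAN.

```python
def join_m8(chunk_paths, merged_path):
    """Concatenate tabular (.m8) chunk files into one file, in the given order.

    Assumption: each chunk is header-less text with one alignment per line.
    Blank lines are skipped and a missing final newline is repaired so that
    records from consecutive chunks never run together.
    Returns the number of alignment lines written.
    """
    lines_written = 0
    with open(merged_path, "w") as out:
        for path in chunk_paths:
            with open(path) as src:
                for line in src:
                    if not line.strip():
                        continue  # ignore empty lines
                    out.write(line.rstrip("\n") + "\n")
                    lines_written += 1
    return lines_written
```

A call such as `join_m8(["s1.part1.m8", "s1.part2.m8"], "s1.m8")` would then produce one file per sample to import into MEGAN.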

I will look into writing a DAA merger program…

that will be really useful.
Thank you!

Has the DAA merger program been added to MEGAN6 yet?

Hi Daniel,

I have created many meganized .daa files, and I was also wondering whether the merge script is on the way.

Thanks a lot for the nice program!

Dear Sebastian,

I believe that we do have a DAA-Merger program, and I will look into providing it with the next MEGAN release.


Dear Daniel,

that sounds great!

Was there ever a solution for merging daa files? I’m running into the same issue: I have a large dataset, and the only practical solution for the alignments is to run them on a cluster in small chunks. Thanks.

We do have a DAA-merger program; I will look into adding it to the MEGAN tools.