LCA binner to non-redundant queries


#1

Hi,

I have a sample in which many reads are placed at the root of the taxonomy. When I view the alignments of these reads, the large majority map to a single species or genus, yet the reads are still binned at the root. On further inspection, I noticed that all matches are to references with the WP_ prefix in the header. According to the NCBI RefSeq documentation (https://www.ncbi.nlm.nih.gov/refseq/about/nonredundantproteins/), this prefix denotes non-redundant protein sequences that can be found in multiple organisms; these references are associated with the LCA of all the taxa in which they are found.
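To make clear what I mean by "associated with the LCA", here is a minimal sketch of a naive lowest-common-ancestor computation. The taxonomy in `PARENT` is a toy example and the helpers `lineage` and `lca` are hypothetical names of my own; this only shows how I understand the rule, not how MEGAN or NCBI actually implement it.

```python
# Toy child -> parent map. The taxa are illustrative only and are not
# taken from my sample or from the real NCBI taxonomy dump.
PARENT = {
    "Escherichia coli": "Escherichia",
    "Escherichia": "Enterobacteriaceae",
    "Salmonella enterica": "Salmonella",
    "Salmonella": "Enterobacteriaceae",
    "Enterobacteriaceae": "Bacteria",
    "Bacteria": "root",
    "Methanococcus": "Archaea",
    "Archaea": "root",
}


def lineage(taxon):
    """Return the path from the root down to the given taxon."""
    path = [taxon]
    while path[-1] != "root":
        path.append(PARENT[path[-1]])
    return path[::-1]


def lca(taxa):
    """Naive LCA: the deepest node shared by the lineages of all taxa."""
    paths = [lineage(t) for t in taxa]
    common = "root"
    for nodes in zip(*paths):
        if len(set(nodes)) != 1:
            break  # lineages diverge below the current common node
        common = nodes[0]
    return common


# Two species in the same family meet at that family.
print(lca(["Escherichia coli", "Salmonella enterica"]))
# One bacterial and one archaeal taxon only meet at the root.
print(lca(["Escherichia coli", "Methanococcus"]))
```

Under this rule, a single reference that carries taxa from very distant lineages is enough to pull the result up to the root, which is the behaviour I suspect I am seeing.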

Is there any particular reason a read is binned at the root?

MEGAN UE 6.12.3

J


#2

Hi Julian,

Can you give me access to some of the reads that get binned to the root (e.g. using the Export->Reads and Export->Matches menu items)? I will take a look at this.