How can I obtain genome length covered by reads?


#1

Hello,
In order to normalize my results, I want to obtain the gene or genome length that is covered at a depth of at least 1x (the number of bases of a reference taxon that are covered). But when I export taxonId to length, it seems to export the sum of the contig lengths, even though they sometimes partially overlap. For example, suppose that each base is one character; in one taxon I have:

case 1:
GGGGGGGGGGG Reference genome (Length =11)
–RR____RRR (Length read1=2, length read2=3)
RR____ RR (Length read3=2, length read4=2)

taxonId to length = 9 and I would like to obtain 7

case 2:
GGGGGGGGG Reference genome (Length =11)
____ RRR (Length read1=3)
___RRR (Length read2=3)
______RRR (Length read3=3)

Likewise, taxonId to length = 9 and I would like to obtain 5

For example, the covered part of the reference genome is longer in case 1 than in case 2, but exporting taxonId to length gives me the same value for both. Is it possible to calculate or obtain this information in some way?

Best regards,

Blanca
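
For illustration, the quantity asked about above (bases covered at least once) is the length of the union of the alignment intervals, whereas summing the lengths counts overlapping bases more than once. The sketch below is not MEGAN code; the function names are hypothetical, and the coordinates are assumed values chosen only to reproduce the totals quoted in the two cases (the exact positions in the diagrams are not recoverable). Intervals are taken to be 1-based and inclusive.

```python
def summed_length(intervals):
    """Sum of the individual interval lengths (overlaps counted repeatedly)."""
    return sum(end - start + 1 for start, end in intervals)


def covered_length(intervals):
    """Number of reference positions covered by at least one interval.

    Intervals are (start, end) pairs, 1-based and inclusive.
    """
    covered = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end + 1:
            # A gap begins here: close the previous merged block, if any.
            if current_end is not None:
                covered += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start + 1
    return covered


# Assumed coordinates (hypothetical), read lengths as in the post.
case1 = [(2, 3), (8, 10), (1, 2), (10, 11)]  # lengths 2, 3, 2, 2
case2 = [(4, 6), (5, 7), (6, 8)]             # lengths 3, 3, 3

print(summed_length(case1), covered_length(case1))  # 9 7
print(summed_length(case2), covered_length(case2))  # 9 5
```

Sorting first means each interval only has to be compared with the block currently being merged, so the whole calculation is a single pass after the sort.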


#2

I think that the “Export Read Lengths and Coverage” menu item will suit your needs. That item is broken in the current version of MEGAN, so I have fixed it and will upload a new release today (look for 6.14.2).


#3

Hello Daniel,

I have a couple of questions regarding this new version. For example, I exported read lengths and coverage for some taxa (using read magnitude and the weighted LCA algorithm) from this result:

  1. First, the result lists the same reads many times, not just once (NS500560_00101_FC_HK5GTBGX2:2:23103:18027:7176 -> 8 times, NS500560_00101_FC_HK5GTBGX2:4:22404:10327:13104 -> 4 times, and so on…).
    |NS500560_00101_FC_HK5GTBGX2:4:21512:3509:11261#TAAGGCGA+ATAGAGAG/2|76|10|65|
    |---|---|---|---|
    |NS500560_00101_FC_HK5GTBGX2:1:11111:22955:5814#TAAGGCGA+ATAGAGAG/1|76|10|75|
    |NS500560_00101_FC_HK5GTBGX2:2:13312:3513:10903#TAAGGCGA+ATAGAGAG/1|76|10|76|
    |NS500560_00101_FC_HK5GTBGX2:1:11111:22955:5814#TAAGGCGA+ATAGAGAG/2|76|10|76|
    |NS500560_00101_FC_HK5GTBGX2:4:22404:10327:13104#AAAGGCGA+ATAGAGAG/2|76|10|73|
    |NS500560_00101_FC_HK5GTBGX2:2:21102:3652:4274#AAAGGCGA+ATAGAGAG/1|76|10|76|
    |NS500560_00101_FC_HK5GTBGX2:2:13312:3513:10903#TAAGGCGA+ATAGAGAG/2|76|10|75|
    |NS500560_00101_FC_HK5GTBGX2:2:21102:3652:4274#AAAGGCGA+ATAGAGAG/2|76|10|68|
    |NS500560_00101_FC_HK5GTBGX2:1:11111:22955:5814#TAAGGCGA+ATAGAGAG/1|76|10|75|
    |NS500560_00101_FC_HK5GTBGX2:2:13312:3513:10903#TAAGGCGA+ATAGAGAG/1|76|10|76|
    |NS500560_00101_FC_HK5GTBGX2:1:11111:22955:5814#TAAGGCGA+ATAGAGAG/2|76|10|76|
    |NS500560_00101_FC_HK5GTBGX2:4:22404:10327:13104#AAAGGCGA+ATAGAGAG/2|76|10|73|
    |NS500560_00101_FC_HK5GTBGX2:2:21102:3652:4274#AAAGGCGA+ATAGAGAG/1|76|10|76|
    |NS500560_00101_FC_HK5GTBGX2:2:13312:3513:10903#TAAGGCGA+ATAGAGAG/2|76|10|75|
    |NS500560_00101_FC_HK5GTBGX2:2:21102:3652:4274#AAAGGCGA+ATAGAGAG/2|76|10|68|
    |NS500560_00101_FC_HK5GTBGX2:1:11111:22955:5814#TAAGGCGA+ATAGAGAG/1|76|10|75|
    |NS500560_00101_FC_HK5GTBGX2:2:13312:3513:10903#TAAGGCGA+ATAGAGAG/1|76|10|76|
    |NS500560_00101_FC_HK5GTBGX2:1:11111:22955:5814#TAAGGCGA+ATAGAGAG/2|76|10|76|
    |NS500560_00101_FC_HK5GTBGX2:4:22404:10327:13104#AAAGGCGA+ATAGAGAG/2|76|10|73|
    |NS500560_00101_FC_HK5GTBGX2:2:21102:3652:4274#AAAGGCGA+ATAGAGAG/1|76|10|76|
    |NS500560_00101_FC_HK5GTBGX2:2:13312:3513:10903#TAAGGCGA+ATAGAGAG/2|76|10|75|
    |NS500560_00101_FC_HK5GTBGX2:2:21102:3652:4274#AAAGGCGA+ATAGAGAG/2|76|10|68|
    |NS500560_00101_FC_HK5GTBGX2:1:11111:22955:5814#TAAGGCGA+ATAGAGAG/1|76|10|75|
    |NS500560_00101_FC_HK5GTBGX2:2:13312:3513:10903#TAAGGCGA+ATAGAGAG/1|76|10|76|
    |NS500560_00101_FC_HK5GTBGX2:1:11111:22955:5814#TAAGGCGA+ATAGAGAG/2|76|10|76|
    |NS500560_00101_FC_HK5GTBGX2:4:22404:10327:13104#AAAGGCGA+ATAGAGAG/2|76|10|73|
    |NS500560_00101_FC_HK5GTBGX2:2:23103:18027:7176#TAAGGCGA+ATAGAGAG/1|76|1|49|
    |NS500560_00101_FC_HK5GTBGX2:2:23103:18027:7176#TAAGGCGA+ATAGAGAG/2|76|8|63|
    |NS500560_00101_FC_HK5GTBGX2:2:23103:18027:7176#TAAGGCGA+ATAGAGAG/1|76|1|49|
    |NS500560_00101_FC_HK5GTBGX2:2:23103:18027:7176#TAAGGCGA+ATAGAGAG/2|76|8|63|
    |NS500560_00101_FC_HK5GTBGX2:2:23103:18027:7176#TAAGGCGA+ATAGAGAG/1|76|1|49|
    |NS500560_00101_FC_HK5GTBGX2:2:23103:18027:7176#TAAGGCGA+ATAGAGAG/2|76|8|63|
    |NS500560_00101_FC_HK5GTBGX2:2:23103:18027:7176#TAAGGCGA+ATAGAGAG/1|76|1|49|
    |NS500560_00101_FC_HK5GTBGX2:2:23103:18027:7176#TAAGGCGA+ATAGAGAG/2|76|8|63|
    |NS500560_00101_FC_HK5GTBGX2:4:11410:9122:1513#TAAGGCGA+ATAGAGAG/2|76|10|76|

  2. What does “number-of-alignments” mean? For the same read, it gives me different numbers.
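
One way to check whether the repeated entries in the listing under question 1 are exact duplicates, or the same read with different values, is to tally whole rows. This is only a sketch: it assumes the export is available as pipe-delimited lines like the ones pasted above, the function name is hypothetical, and the read names in the example are shortened placeholders rather than real data.

```python
from collections import Counter


def count_repeated_rows(lines):
    """Return {row: count} for every pipe-delimited row seen more than once."""
    rows = []
    for line in lines:
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # Skip blank lines and table separator rows such as |---|---|.
        if len(cells) < 2 or all(set(cell) <= set("-: ") for cell in cells):
            continue
        rows.append(tuple(cells))
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# Placeholder rows (hypothetical read names), same layout as the listing.
example = [
    "|readA/1|76|1|49|",
    "|---|---|---|---|",
    "|readA/2|76|8|63|",
    "|readA/1|76|1|49|",
    "|readB/1|76|10|76|",
]

for row, n in count_repeated_rows(example).items():
    print(n, "x", "|".join(row))  # 2 x readA/1|76|1|49
```

Because the whole row is used as the key, the two mates of a pair (`/1` and `/2`) are counted separately, and a read that appears with different values shows up as distinct rows rather than as a duplicate.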

Best regards.


#4

Hello,
Have you been able to see my reply? I have several questions.