UMI family sizes (Duplex reads)

Understanding the frequency of UMI families of different read counts

Theoretical Method

As with the simplex read pairs, we count the number of families of each discrete size for duplex reads. Duplex families consist of fragments with at least one read pair mapping to each of the top and bottom strands.
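The counting described here can be sketched in a few lines of Python. This is an illustration only, not the Marianas implementation. The `(top, bottom)` input tuples, the function name `duplex_family_size_histogram`, and the choice to define family size as the sum of the read pairs on both strands are all assumptions made for the example.

```python
from collections import Counter


def duplex_family_size_histogram(families):
    """Count duplex families of each discrete size.

    `families` is an iterable of (top, bottom) read-pair counts, one tuple
    per UMI family (a hypothetical input shape chosen for this sketch).
    Family size is ASSUMED here to be top + bottom.
    """
    histogram = Counter()
    for top, bottom in families:
        # A family is duplex only if both strands have at least one read pair.
        if top >= 1 and bottom >= 1:
            histogram[top + bottom] += 1
    # Return a plain dict ordered by family size for easy inspection.
    return dict(sorted(histogram.items()))


# Made-up example: three duplex families and two single-strand families.
example = [(3, 2), (4, 0), (1, 1), (2, 3), (0, 6)]
print(duplex_family_size_histogram(example))  # {2: 1, 5: 2}
```

The families `(4, 0)` and `(0, 6)` are skipped because one strand has no read pairs, so they do not meet the duplex definition.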

Technical Methods

  • Tools Used:

    • Marianas

    • make_umi_qc_tables.sh

  • Input

    • collapsed_R1_.fastq

    • collapsed_R2_.fastq

    • MSK-ACCESS-v1_0-A-on-target-positions.txt

    • MSK-ACCESS-v1_0-B-on-target-positions.txt

  • Output

    • family-sizes.txt

Interpretations

We expect the duplex family size distribution to peak between 5 and 15 read pairs, which gives us confidence that there are enough unique molecules for adequate error correction during the collapsing process.
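One way to check this expectation is to find the family size with the highest frequency and test whether it falls in the 5 to 15 range. The sketch below assumes the duplex counts have already been loaded into a dictionary mapping family size to number of families. It does not parse family-sizes.txt, because the file layout is not described here. The function names and the example counts are hypothetical.

```python
def peak_family_size(size_counts):
    """Return the family size with the highest frequency.

    `size_counts` maps family size -> number of families (assumed shape).
    Ties are broken in favour of the smaller family size.
    """
    if not size_counts:
        raise ValueError("size_counts is empty")
    return min(size_counts, key=lambda size: (-size_counts[size], size))


def peak_in_expected_range(size_counts, low=5, high=15):
    """Check whether the peak family size lies within [low, high]."""
    return low <= peak_family_size(size_counts) <= high


# Made-up counts for illustration only.
duplex_counts = {2: 120, 5: 300, 8: 450, 12: 280, 20: 40}
print(peak_family_size(duplex_counts))        # 8
print(peak_in_expected_range(duplex_counts))  # True
```

A sample whose peak falls below the range would return `False` from `peak_in_expected_range`, flagging it for closer inspection.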
