Access Quality Control (v1)

UMI family sizes (Simplex reads)

Understanding the frequency of UMI families of different read counts

Last updated 4 years ago

Theoretical Method

This plot shows the number of UMI families at each discrete family size for simplex reads. A simplex family consists of 3 or more read pairs, all from one of the two strands.
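As a rough illustration of what the plot tallies, the sketch below counts how many families have each size, given one read-pair count per family. The input list, the function name `simplex_family_size_histogram`, and the constant `MIN_SIMPLEX_SIZE` are hypothetical names chosen for this example; this is not the way Marianas or make_umi_qc_tables.sh produce their output.

```python
from collections import Counter

# Simplex families have 3 or more read pairs (per the definition above).
MIN_SIMPLEX_SIZE = 3


def simplex_family_size_histogram(family_sizes):
    """Return {family size: number of families} for sizes >= MIN_SIMPLEX_SIZE."""
    counts = Counter(size for size in family_sizes if size >= MIN_SIMPLEX_SIZE)
    return dict(sorted(counts.items()))


# Hypothetical read-pair counts, one entry per single-strand UMI family.
example_sizes = [1, 2, 3, 3, 3, 4, 4, 5, 7]
histogram = simplex_family_size_histogram(example_sizes)
print(histogram)
```

Families with fewer than 3 read pairs are dropped before counting, which is why the plot's x-axis starts at 3.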

Technical Methods

  • Tools Used:

    • Marianas

    • make_umi_qc_tables.sh

  • Input

    • collapsed_R1_.fastq

    • collapsed_R2_.fastq

    • MSK-ACCESS-v1_0-A-on-target-positions.txt

    • MSK-ACCESS-v1_0-B-on-target-positions.txt

  • Output

    • family-sizes.txt

Interpretations

The graph begins at a family size of 3, where the number of families should be largest, and the number of families should drop off as family size increases.
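The expected shape can be checked programmatically. The following is a minimal sketch, assuming the family-size counts have already been loaded into a dictionary mapping family size to number of families. The function names and the example counts are made up for illustration, and the layout of family-sizes.txt is not assumed here.

```python
def peaks_at_smallest_size(histogram):
    """True if the smallest family size present has the highest family count."""
    if not histogram:
        return False
    smallest = min(histogram)
    return histogram[smallest] == max(histogram.values())


def is_non_increasing(histogram):
    """True if family counts never rise as family size grows."""
    sizes = sorted(histogram)
    return all(histogram[a] >= histogram[b] for a, b in zip(sizes, sizes[1:]))


# Hypothetical counts: family size -> number of families.
expected_shape = {3: 900, 4: 610, 5: 380, 6: 150}
unexpected_shape = {3: 200, 4: 650, 5: 300}

print(peaks_at_smallest_size(expected_shape), is_non_increasing(expected_shape))
print(peaks_at_smallest_size(unexpected_shape), is_non_increasing(unexpected_shape))
```

A sample whose counts peak at a size larger than 3 departs from the expected pattern and may be worth a closer look. Small bumps in the tail of a real distribution can be ordinary noise, so a strict non-increasing check may be too conservative in practice.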