Access Quality Control (v1)

Coverage vs GC content

Awareness of coverage bias that may reduce the accuracy of downstream sequencing results

Last updated 4 years ago

Theoretical Method

Bin the GC content of each region in the BAM file into 5% intervals, then plot the mean coverage across all regions that fall into each bin.
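The binning step described above can be sketched in a few lines of Python. This is a minimal illustration, not the Waltz or tables_module.py implementation: the function names, the `(sequence, mean_coverage)` input shape, and the choice to label each bin by its lower bound are all assumptions made for the example.

```python
BIN_WIDTH_PERCENT = 5                  # 5% GC intervals, as described above
N_BINS = 100 // BIN_WIDTH_PERCENT      # 20 bins covering 0% to 100%


def gc_bin_index(sequence):
    """Return the index (0..19) of the 5% GC bin a sequence falls into."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc_count = seq.count("G") + seq.count("C")
    # Integer arithmetic avoids floating-point edge cases at bin boundaries.
    # A sequence with 100% GC is clamped into the last bin (95% to 100%).
    return min(gc_count * N_BINS // len(seq), N_BINS - 1)


def mean_coverage_by_gc_bin(regions):
    """Average coverage per GC bin.

    regions: iterable of (sequence, mean_coverage) pairs (hypothetical input).
    Returns {bin_lower_bound_percent: mean coverage of regions in that bin}.
    """
    totals = {}
    for sequence, coverage in regions:
        index = gc_bin_index(sequence)
        total, count = totals.get(index, (0.0, 0))
        totals[index] = (total + coverage, count + 1)
    return {
        index * BIN_WIDTH_PERCENT: total / count
        for index, (total, count) in sorted(totals.items())
    }


# Toy regions: (reference sequence of the region, mean coverage of the region)
example_regions = [
    ("ATAT", 100.0),   # 0% GC
    ("ATGC", 200.0),   # 50% GC
    ("AGCT", 400.0),   # 50% GC
    ("GCGC", 50.0),    # 100% GC
]
print(mean_coverage_by_gc_bin(example_regions))
```

With these toy regions the result is `{0: 100.0, 50: 300.0, 95: 50.0}`: the two 50% GC regions are averaged together, and the 100% GC region lands in the last bin. Plotting the returned values against their bin labels gives the coverage vs GC curve.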

Technical Methods

  • Tools used:
    • Waltz CountReads
    • aggregate_bam_metrics.sh
    • tables_module.py
    • plots_module.r
  • Input:
    • Standard BAM
    • Collapsed unfiltered BAM
    • ACCESS pool A BED file
  • Output:
    • sample_id-intervals.txt

Interpretations

Extreme base compositions, i.e., GC-poor or GC-rich sequences, lead to uneven read coverage, or even no coverage, across the genome. This can affect downstream small variant and copy number calling, both of which rely on consistent sequencing depth across all regions. Ideally this plot should be as flat as possible. The above example depicts a slight decrease in coverage in regions with very high GC content, but is a good result for ACCESS.
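One way to make "as flat as possible" concrete is to compare each GC bin's mean coverage with the median across bins and flag bins that fall well below it. The sketch below is purely illustrative: the function name and the 0.5 ratio threshold are assumptions for this example, not ACCESS QC criteria.

```python
from statistics import median


def flag_low_coverage_bins(bin_means, min_ratio=0.5):
    """Return GC bins whose mean coverage is below min_ratio * median coverage.

    bin_means: {gc_bin_label: mean coverage} (hypothetical input shape).
    min_ratio: illustrative threshold, not an official ACCESS cutoff.
    """
    if not bin_means:
        raise ValueError("bin_means must not be empty")
    reference = median(bin_means.values())
    if reference <= 0:
        raise ValueError("median coverage must be positive")
    return [
        gc_bin
        for gc_bin in sorted(bin_means)
        if bin_means[gc_bin] < min_ratio * reference
    ]


# Toy profile: roughly flat, with a drop in the most GC-rich bin.
example_profile = {30: 900.0, 40: 1000.0, 50: 1000.0, 60: 950.0, 80: 400.0}
print(flag_low_coverage_bins(example_profile))
```

For the toy profile the median is 950, so only the 80% bin (400x) falls below the 475x cutoff. A flat profile returns an empty list.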