Access Quality Control (v1)

Minor Contamination

Theoretical Method

The minor contamination check determines whether a patient’s sample is contaminated with a small amount of DNA from another, unrelated individual. This analysis uses the ‘fingerprint’ SNPs identified in the FP_configuration file.

The FP_configuration file contains the chromosome, Position, Allele1, and Allele2 for each ‘fingerprinting’ SNP. Allele1 and Allele2 identify the two common alleles at each SNP position; their order is arbitrary, but in most cases Allele1 is the more common variant.
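As an illustration of the file layout described above, here is a minimal sketch of loading such a configuration into memory. The tab delimiter, the header row, and the exact column names (`Chromosome`, `Position`, `Allele1`, `Allele2`) are assumptions made for this example; the real file may differ, and `read_fp_config` is a hypothetical helper, not part of fingerprinting.py.

```python
import csv
import io
from typing import Dict, NamedTuple, TextIO, Tuple


class FingerprintSNP(NamedTuple):
    chrom: str
    pos: int
    allele1: str  # usually, but not always, the more common allele
    allele2: str


def read_fp_config(handle: TextIO) -> Dict[Tuple[str, int], FingerprintSNP]:
    """Read a tab-delimited SNP table (assumed layout) keyed by (chrom, pos)."""
    reader = csv.DictReader(handle, delimiter="\t")
    snps = {}
    for row in reader:
        snp = FingerprintSNP(
            chrom=row["Chromosome"],
            pos=int(row["Position"]),
            allele1=row["Allele1"],
            allele2=row["Allele2"],
        )
        snps[(snp.chrom, snp.pos)] = snp
    return snps


# Made-up rows for illustration only; these are not real fingerprint SNPs.
example = (
    "Chromosome\tPosition\tAllele1\tAllele2\n"
    "1\t1000\tA\tG\n"
    "2\t2000\tC\tT\n"
)
snps = read_fp_config(io.StringIO(example))
```

Keying the table by (chromosome, position) makes it straightforward to look up the two expected alleles when walking through pileup records.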

The fingerprint SNPs in MSK-ACCESS-v1_0-TilingaAndFpSNPs.txt consist of the 31 SNPs designed as fingerprinting SNPs in target pool A and 279 tiling SNPs from across target pool B. X chromosome SNPs were excluded, and some other SNPs from the ACCESS panel were excluded based on a heuristic derived from a set of 49 samples.

The Minor Contamination Rate is the average (mean) minor allele frequency from homozygous fingerprint SNPs.

We define homozygous SNPs as sites with a minor allele frequency below 10% in either the Normal sequence data (if available in the same run) or the current sample's sequence data.

These calculations were done using the All Unique (unfiltered) BAMs. Allele counts are measured from Waltz pileups from Pools A and B.
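The calculation described in this section can be sketched as follows. This is an illustrative reading of the text, not the code in fingerprinting.py: it assumes allele counts have already been extracted from the pileups into `(allele1_count, allele2_count)` pairs, that the minor allele frequency is computed over those two alleles only, and that the Normal is used to decide homozygosity when it has coverage at a site, falling back to the sample otherwise. All function and variable names are hypothetical.

```python
from statistics import mean
from typing import Dict, Optional, Tuple

Counts = Tuple[int, int]  # (allele1_count, allele2_count) at one SNP

HOMOZYGOUS_MAF_CUTOFF = 0.10  # "less than 10% minor allele frequency"


def maf(counts: Counts) -> Optional[float]:
    """Minor allele frequency over the two expected alleles; None if no reads."""
    c1, c2 = counts
    total = c1 + c2
    if total == 0:
        return None
    return min(c1, c2) / total


def minor_contamination_rate(
    sample: Dict[str, Counts],
    normal: Optional[Dict[str, Counts]] = None,
) -> Optional[float]:
    """Mean sample MAF across SNPs judged homozygous (assumed interpretation)."""
    homozygous_mafs = []
    for snp_id, counts in sample.items():
        sample_maf = maf(counts)
        if sample_maf is None:
            continue  # no coverage in the sample at this SNP
        # Prefer the Normal for the homozygosity call when it covers the site.
        reference_maf = None
        if normal is not None and snp_id in normal:
            reference_maf = maf(normal[snp_id])
        if reference_maf is None:
            reference_maf = sample_maf
        if reference_maf < HOMOZYGOUS_MAF_CUTOFF:
            homozygous_mafs.append(sample_maf)
    return mean(homozygous_mafs) if homozygous_mafs else None


# Made-up counts: two homozygous sites, one heterozygous, one uncovered.
sample_counts = {
    "snp1": (998, 2),
    "snp2": (1, 999),
    "snp3": (480, 520),
    "snp4": (0, 0),
}
rate = minor_contamination_rate(sample_counts)

# With a Normal that is heterozygous at snp1, that site is excluded.
rate_with_normal = minor_contamination_rate(sample_counts, {"snp1": (500, 500)})
```

In this toy input only `snp1` and `snp2` contribute when no Normal is supplied, so the rate is the mean of their two minor allele frequencies.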

Technical Methods

  • Tools Used

    • Waltz PileupMetrics

    • fingerprinting.py

  • Input

    • output_dir: Directory to write the output files to

    • waltz_dir_A: Directory with waltz pileup files for target set A

    • waltz_dir_B: Directory with waltz pileup files for target set B

    • waltz_dir_A_duplex: Directory with waltz pileup files for Duplex target set A

    • waltz_dir_B_duplex: Directory with waltz pileup files for Duplex target set B

    • fp_config: File with information about the SNPs for analysis (MSK-ACCESS-v1_0-TilingaAndFpSNPs.txt)

    • title_file: Title File for the run

  • Output

    • FPResults/minorContamination.txt

    • MinorContaminationRate.pdf

Interpretations

Samples with a minor contamination rate greater than 0.002 are considered contaminated.
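Applying this cutoff to a set of per-sample rates might look like the sketch below. The sample names, the rates, and the `flag_contaminated` helper are made up for illustration; only the 0.002 threshold and the strict "greater than" comparison come from the text above.

```python
from typing import Dict

MINOR_CONTAMINATION_THRESHOLD = 0.002  # cutoff stated in the text


def flag_contaminated(rates: Dict[str, float]) -> Dict[str, bool]:
    """Mark each sample True when its rate is strictly above the threshold."""
    return {
        sample: rate > MINOR_CONTAMINATION_THRESHOLD
        for sample, rate in rates.items()
    }


# Hypothetical per-sample minor contamination rates.
rates = {"sample_A": 0.0004, "sample_B": 0.0031, "sample_C": 0.002}
flags = flag_contaminated(rates)
```

A sample sitting exactly at 0.002 is not flagged under this reading, since the text specifies rates of >0.002.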