Major Contamination
Theoretical Method
The major contamination plot is a bar plot of the fraction of heterozygous positions per sample. It is used to check whether a patient’s sample is contaminated with DNA from an unrelated individual. This analysis is also done using the ‘fingerprint’ SNPs in the panel. A SNP is considered heterozygous if its minor allele fraction is > 0.1.
The fraction of heterozygous positions in a sample is the number of fingerprint SNPs called heterozygous divided by the total number of fingerprint SNPs evaluated in that sample.
These calculations use the All Unique (unfiltered) BAMs. Allele counts are taken from the Waltz pileups for Pool A and Pool B.
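The calculation described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the code in fingerprinting.py: the function names, the input shape (one dictionary of allele counts per fingerprint SNP), the choice of the second most abundant allele as the minor allele, the use of major plus minor counts as the denominator, and the skipping of zero-depth positions are all hypothetical. Only the 0.1 heterozygosity cutoff comes from the text.

```python
from typing import Dict, List

# From the text: a SNP is heterozygous if its minor allele fraction is > 0.1.
HET_THRESHOLD = 0.1


def minor_allele_fraction(counts: Dict[str, int]) -> float:
    """Return minor / (major + minor) for one SNP position.

    Assumption: the minor allele is the second most abundant base, and
    only the two most abundant bases enter the denominator.
    """
    ordered = sorted(counts.values(), reverse=True)
    major = ordered[0] if ordered else 0
    minor = ordered[1] if len(ordered) > 1 else 0
    depth = major + minor
    if depth == 0:
        return 0.0
    return minor / depth


def fraction_heterozygous(sites: List[Dict[str, int]]) -> float:
    """Fraction of evaluated SNP positions that are heterozygous.

    Assumption: positions with no reads are skipped rather than counted
    as homozygous.
    """
    evaluated = [s for s in sites if sum(s.values()) > 0]
    if not evaluated:
        raise ValueError("no SNP positions with coverage")
    het = sum(1 for s in evaluated if minor_allele_fraction(s) > HET_THRESHOLD)
    return het / len(evaluated)


# Hypothetical allele counts for four fingerprint SNPs in one sample.
example_sites = [
    {"A": 50, "G": 50},   # minor allele fraction 0.50, heterozygous
    {"A": 100, "G": 0},   # minor allele fraction 0.00, homozygous
    {"C": 95, "T": 5},    # minor allele fraction 0.05, homozygous
    {"C": 80, "T": 20},   # minor allele fraction 0.20, heterozygous
]
print(fraction_heterozygous(example_sites))
```

In this made-up example two of the four positions exceed the 0.1 cutoff, so the fraction of heterozygous positions is 0.5.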
Technical Methods
Tool Used:
Waltz PileupMetrics
fingerprinting.py
Input
output_dir: Directory to write the output files to
waltz_dir_A: Directory with waltz pileup files for target set A
waltz_dir_B: Directory with waltz pileup files for target set B
waltz_dir_A_duplex: Directory with waltz pileup files for Duplex target set A
waltz_dir_B_duplex: Directory with waltz pileup files for Duplex target set B
fp_config: File with information about the SNPs for analysis (MSK-ACCESS-v1_0-TilingaAndFpSNPs.txt)
title_file: Title File for the run
Output
FPResults/majorContamination.txt
MajorContaminationRate.pdf
Interpretations
The fraction of heterozygous positions should be around 0.5. If the fraction is greater than 0.6, the sample is considered to have major contamination.
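As an illustration of the rule above, the following Python sketch flags samples whose fraction exceeds 0.6. The function name, the sample identifiers, and the input (a mapping of sample name to fraction of heterozygous positions) are hypothetical. It does not read FPResults/majorContamination.txt, whose column layout is not described here.

```python
from typing import Dict, List

# From the text: a fraction greater than 0.6 indicates major contamination.
MAJOR_CONTAMINATION_THRESHOLD = 0.6


def flag_major_contamination(
    fractions: Dict[str, float],
    threshold: float = MAJOR_CONTAMINATION_THRESHOLD,
) -> List[str]:
    """Return the sorted names of samples whose fraction exceeds the threshold."""
    return sorted(name for name, frac in fractions.items() if frac > threshold)


# Hypothetical per-sample fractions of heterozygous positions.
example_fractions = {"sample_1": 0.48, "sample_2": 0.72, "sample_3": 0.60}
print(flag_major_contamination(example_fractions))
```

The comparison is strict, so a sample at exactly 0.6 is not flagged, matching the "greater than 0.6" wording.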