Major Contamination

Theoretical Method

The major contamination plot is a bar plot of the fraction of heterozygous positions per sample. It is used to check whether a patient’s sample is contaminated with DNA from an unrelated individual. This analysis is also done using the ‘fingerprint’ SNPs in the panel. A SNP is considered heterozygous if its minor allele fraction is > 0.1.

The fraction of heterozygous positions in the sample is found using the formula below:

$$\text{Fraction heterozygous positions} = \frac{\text{Number of Heterozygous Sites}}{\text{Total Number of Fingerprint SNPs}}$$

These calculations were done using the All Unique (unfiltered) BAMs. Allele counts are measured from Waltz pileups for Pool A and Pool B.
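The calculation above can be sketched in a few lines of Python. This is a minimal illustration and not the code in fingerprinting.py: the function names, the input shape (one pair of allele counts per fingerprint SNP), and the definition of minor allele fraction as the less-supported allele's count over the sum of both counts are all assumptions made for this example. SNPs with no coverage are assumed to count as non-heterozygous while still contributing to the denominator.

```python
from typing import Iterable, Tuple

# Cutoff from the text: a SNP is heterozygous if its minor allele fraction is > 0.1.
HET_THRESHOLD = 0.1


def minor_allele_fraction(allele1_count: int, allele2_count: int) -> float:
    """Fraction of reads supporting the less-supported allele (assumed definition)."""
    total = allele1_count + allele2_count
    if total == 0:
        # Assumption: a SNP with no coverage is treated as not heterozygous.
        return 0.0
    return min(allele1_count, allele2_count) / total


def fraction_heterozygous(snp_counts: Iterable[Tuple[int, int]]) -> float:
    """Number of heterozygous sites divided by the total number of fingerprint SNPs."""
    counts = list(snp_counts)
    if not counts:
        raise ValueError("at least one fingerprint SNP is required")
    n_het = sum(
        1 for a, b in counts if minor_allele_fraction(a, b) > HET_THRESHOLD
    )
    return n_het / len(counts)
```

For example, four SNPs with allele counts (50, 50), (100, 0), (95, 5), and (80, 20) have minor allele fractions of 0.5, 0, 0.05, and 0.2, so two of the four are heterozygous.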

Technical Methods

  • Tools Used:

    • Waltz PileupMetrics

    • fingerprinting.py

Interpretations

The fraction of heterozygous positions should be around 0.5. If the fraction is greater than 0.6, the sample is considered to have major contamination.
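The interpretation rule can be expressed as a simple threshold check. The sketch below is illustrative only: the function names and sample IDs are hypothetical, and the input is assumed to be a mapping from sample ID to its fraction of heterozygous positions.

```python
from typing import Dict

# Cutoff from the text: a fraction greater than 0.6 indicates major contamination.
MAJOR_CONTAMINATION_CUTOFF = 0.6


def has_major_contamination(fraction_het: float) -> bool:
    """Return True when the heterozygous fraction exceeds the cutoff."""
    return fraction_het > MAJOR_CONTAMINATION_CUTOFF


def flag_samples(fractions: Dict[str, float]) -> Dict[str, bool]:
    """Map each sample ID to whether it is flagged for major contamination."""
    return {
        sample: has_major_contamination(value)
        for sample, value in fractions.items()
    }


# Hypothetical sample IDs and values, for illustration only.
flags = flag_samples({"sample_1": 0.48, "sample_2": 0.72})
```

The comparison is strict, so a sample at exactly 0.6 is not flagged, matching the "greater than 0.6" wording.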

Input

  • output_dir: Directory to write the output files to

  • waltz_dir_A: Directory with waltz pileup files for target set A

  • waltz_dir_B: Directory with waltz pileup files for target set B

  • waltz_dir_A_duplex: Directory with waltz pileup files for Duplex target set A

  • waltz_dir_B_duplex: Directory with waltz pileup files for Duplex target set B

  • fp_config: File with information about the SNPs for analysis (MSK-ACCESS-v1_0-TilingaAndFpSNPs.txt)

  • title_file: Title File for the run

Output

  • FPResults/majorContamination.txt

  • MajorContaminationRate.pdf