Ensure consistent coverage across ACCESS bait (or “probe”) regions
Theoretical Method
Coverage of each bait region in the ACCESS panel is collected per sample, and the distribution of these values is plotted for each sample. Each sample's values are normalized by that sample's median coverage, which aligns the peaks of all samples with one another and corrects for sample-level differences.
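The normalization described above can be sketched as follows. This is a minimal illustration, not the code in tables_module.py: it assumes that "normalized by the median" means dividing each interval's coverage by the sample median, and the function name, data layout, and sample/probe names are hypothetical.

```python
from statistics import median


def normalize_by_sample_median(coverage_by_sample):
    """Divide each interval's coverage by the median coverage of its sample.

    coverage_by_sample maps sample name -> {interval name: coverage}.
    Division by the median is an assumption about what "normalized" means here.
    """
    normalized = {}
    for sample, interval_coverage in coverage_by_sample.items():
        sample_median = median(interval_coverage.values())
        if sample_median == 0:
            # A sample whose median coverage is zero cannot be rescaled; skip it.
            continue
        normalized[sample] = {
            interval: cov / sample_median
            for interval, cov in interval_coverage.items()
        }
    return normalized


# Hypothetical toy data: sample_2 was sequenced twice as deeply as sample_1.
coverage = {
    "sample_1": {"probe_a": 900, "probe_b": 1000, "probe_c": 1100},
    "sample_2": {"probe_a": 1800, "probe_b": 2000, "probe_c": 2200},
}
normalized = normalize_by_sample_median(coverage)
```

After this step both toy samples have the same normalized values and their distributions peak at 1, so differences in peak width reflect how evenly reads are spread across baits rather than how deeply each sample was sequenced.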
Technical Methods
Tools Used
Waltz CountReads
aggregate_bam_metrics.sh
tables_module.py
plots_module.r
Input
Collapsed, unfiltered bam
ACCESS pool A bed file
Output
intervals-coverage-sum.txt (one per bam type / pool combination)
coverage_per_interval.txt (one per sample / bam type / pool combination)
coverage_per_interval_A_targets_All_Unique.txt (used for the graph above)
(DMP specific format?)
Interpretations
Each distribution should be unimodal, apart from a second peak on the low end contributed by male samples: with a single X chromosome, their X-chromosome bait regions receive roughly half the coverage of autosomal regions. Narrow peaks indicate evenly distributed coverage across all bait regions. Wider distributions indicate uneven read distribution and may be correlated with a large GC bias. Note that the provided bed file lists the start and stop coordinates of the ACCESS design probes, not the actual genomic target regions.