Compile Reads
Step 1 -- intra-patient genotyping
There are two variations:
compile_reads.R: Works with Research ACCESS and Clinical IMPACT
compile_reads_all.R: Works with Research ACCESS, Clinical ACCESS and Clinical IMPACT
The first step of the pipeline is to genotype all variants of interest in every included sample (plasma, buffy coat, DMP tumor, DMP normal, and donor samples). Once read counts have been obtained at every locus in every sample, the next step generates a table of VAFs and call statuses for each variant across all samples within a patient.
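As a rough illustration of how read counts turn into VAFs, the sketch below applies the standard definition (alternate-allele reads divided by total reads at the locus). It is not the pipeline's own code: the function name, the sample names, and the read counts are all hypothetical, and call-status logic is left out because its thresholds are not described here.

```python
from typing import Dict, Optional, Tuple


def vaf(alt_count: int, total_depth: int) -> Optional[float]:
    """Return alt_count / total_depth, or None when the locus has no coverage."""
    if alt_count < 0 or total_depth < 0:
        raise ValueError("read counts must be non-negative")
    if alt_count > total_depth:
        raise ValueError("alt_count cannot exceed total_depth")
    if total_depth == 0:
        return None  # no reads at this locus, so the VAF is undefined
    return alt_count / total_depth


# Hypothetical read counts for one variant across one patient's samples:
# sample name -> (alternate-allele reads, total reads)
counts: Dict[str, Tuple[int, int]] = {
    "plasma_duplex": (12, 1500),
    "buffy_coat": (0, 800),
    "DMP_Tumor": (90, 300),
    "donor_1": (0, 0),
}

vaf_table = {sample: vaf(alt, depth) for sample, (alt, depth) in counts.items()}
```

Returning `None` for zero coverage keeps "no reads" distinct from "reads present, variant absent", which matters when a call status is assigned later.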
Usage compile_reads.R
Rscript R/compile_reads.R -h
usage: R/compile_reads.R [-h] [-m MASTERREF] [-o RESULTSDIR]
[-pb POOLEDBAMDIR] [-fa FASTAPATH]
[-gt GENOTYPERPATH] [-dmp DMPDIR] [-mb MIRRORBAMDIR]
[-dmpk DMPKEYPATH]
optional arguments:
-h, --help show this help message and exit
-m MASTERREF, --masterref MASTERREF
File path to master reference file
-o RESULTSDIR, --resultsdir RESULTSDIR
Output directory
-pb POOLEDBAMDIR, --pooledbamdir POOLEDBAMDIR
Directory for all pooled bams [default]
-fa FASTAPATH, --fastapath FASTAPATH
Reference fasta path [default]
-gt GENOTYPERPATH, --genotyperpath GENOTYPERPATH
Genotyper executable path [default]
-dmp DMPDIR, --dmpdir DMPDIR
Directory of clinical DMP IMPACT repository [default]
-mb MIRRORBAMDIR, --mirrorbamdir MIRRORBAMDIR
Mirror BAM file directory [default]
-dmpk DMPKEYPATH, --dmpkeypath DMPKEYPATH
DMP mirror BAM key file [default]
Usage compile_reads_all.R
Rscript R/compile_reads_all.R -h
usage: R/compile_reads_all.R [-h] [-m MASTERREF] [-o RESULTSDIR]
[-pid PROJECTID] [-pb POOLEDBAMDIR]
[-fa FASTAPATH] [-gt GENOTYPERPATH] [-dmp DMPDIR]
[-mb MIRRORBAMDIR] [-mab MIRRORACCESSBAMDIR]
[-dmpk DMPKEYPATH] [-dmpak DMPACCESSKEYPATH]
optional arguments:
-h, --help show this help message and exit
-m MASTERREF, --masterref MASTERREF
File path to master reference file
-o RESULTSDIR, --resultsdir RESULTSDIR
Output directory
-pid PROJECTID, --projectid PROJECTID
Project ID for submitted jobs involved in this run
-pb POOLEDBAMDIR, --pooledbamdir POOLEDBAMDIR
Directory for all pooled bams [default]
-fa FASTAPATH, --fastapath FASTAPATH
Reference fasta path [default]
-gt GENOTYPERPATH, --genotyperpath GENOTYPERPATH
Genotyper executable path [default]
-dmp DMPDIR, --dmpdir DMPDIR
Directory of clinical DMP repository [default]
-mb MIRRORBAMDIR, --mirrorbamdir MIRRORBAMDIR
Mirror BAM file directory [default]
-mab MIRRORACCESSBAMDIR, --mirroraccessbamdir MIRRORACCESSBAMDIR
Mirror BAM file directory for MSK-ACCESS [default]
-dmpk DMPKEYPATH, --dmpkeypath DMPKEYPATH
DMP mirror BAM key file [default]
-dmpak DMPACCESSKEYPATH, --dmpaccesskeypath DMPACCESSKEYPATH
DMP mirror BAM key file for MSK-ACCESS [default]
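A typical run overrides only the master reference file and the output directory and leaves the remaining options at their defaults. The sketch below assembles such a command line from Python. The flags `-m`, `-o` and `-pid` come from the help text above; the helper name, the file paths and the project ID are placeholders, not real locations.

```python
import shlex
from typing import List, Optional


def build_command(
    script: str,
    masterref: str,
    resultsdir: str,
    project_id: Optional[str] = None,
) -> List[str]:
    """Assemble an Rscript argument list for compile_reads.R or compile_reads_all.R."""
    cmd = ["Rscript", script, "-m", masterref, "-o", resultsdir]
    # -pid is listed only in the usage of compile_reads_all.R
    if project_id is not None:
        cmd += ["-pid", project_id]
    return cmd


# Placeholder paths and project ID, for illustration only
cmd = build_command(
    "R/compile_reads_all.R",
    "/path/to/master_ref",
    "/path/to/results",
    project_id="my_project",
)
command_line = shlex.join(cmd)
print(command_line)
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root, where `R/compile_reads_all.R` resolves.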
Default
Default options can be found here
What compile_reads does
Create a sample sheet -- similar to the one for genotype-variants
| Sample_Barcode | duplex_bams | simplex_bams | standard_bam | Sample_Type | dmp_patient_id |
| --- | --- | --- | --- | --- | --- |
| plasma sample id | /duplex/bam | /simplex/bam | NA | duplex | P-xxxxxxx |
| buffy coat id | NA | NA | /unfiltered/bam | unfilterednormal | P-xxxxxxx |
| DMP Tumor ID | NA | NA | /DMP/bam | DMP_Tumor | P-xxxxxxx |
| DMP Normal ID | NA | NA | /DMP/bam | DMP_Normal | P-xxxxxxx |
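The layout of this sample sheet can be sketched in a few lines of Python. The column names and Sample_Type values are taken from the table above. The helper names, the tab delimiter, and the placeholder IDs and paths are assumptions made for illustration, and the actual script may build the sheet differently.

```python
import csv
import io

COLUMNS = [
    "Sample_Barcode", "duplex_bams", "simplex_bams",
    "standard_bam", "Sample_Type", "dmp_patient_id",
]


def make_row(barcode, sample_type, patient_id,
             duplex="NA", simplex="NA", standard="NA"):
    """Build one sample-sheet row; BAM columns that do not apply stay 'NA'."""
    values = [barcode, duplex, simplex, standard, sample_type, patient_id]
    return dict(zip(COLUMNS, values))


# Placeholder IDs and paths mirroring the four sample types in the table
rows = [
    make_row("plasma_sample_id", "duplex", "P-xxxxxxx",
             duplex="/duplex/bam", simplex="/simplex/bam"),
    make_row("buffy_coat_id", "unfilterednormal", "P-xxxxxxx",
             standard="/unfiltered/bam"),
    make_row("DMP_Tumor_ID", "DMP_Tumor", "P-xxxxxxx", standard="/DMP/bam"),
    make_row("DMP_Normal_ID", "DMP_Normal", "P-xxxxxxx", standard="/DMP/bam"),
]


def write_sample_sheet(rows, handle):
    """Write rows as tab-separated text (the delimiter is an assumption)."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


buffer = io.StringIO()
write_sample_sheet(rows, buffer)
sheet_text = buffer.getvalue()
```

Plasma samples carry duplex and simplex BAMs, while buffy coat and DMP samples use only the standard BAM column, so each row fills exactly the columns its sample type needs.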
Generate all variants of interest
DMP calls from cbio repo
ACCESS calls from SNV pipeline
Genotype with genotype-variants
Afterwards, for donor BAMs:
Collect all variants genotyped in any patient and generate a list of unique variants
Genotype with genotype-variants
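The donor step amounts to a set union over every patient's variant list. A minimal sketch follows, assuming each variant is identified by chromosome, position, reference allele and alternate allele. That key, the function name and the example coordinates are illustrative choices and are not taken from the script.

```python
from typing import Dict, Iterable, List, Tuple

# (chromosome, position, reference allele, alternate allele)
Variant = Tuple[str, int, str, str]


def unique_variants(per_patient: Dict[str, Iterable[Variant]]) -> List[Variant]:
    """Return the deduplicated union of all patients' variants, sorted for stable output."""
    seen = set()
    for variants in per_patient.values():
        seen.update(variants)
    # Chromosomes are strings, so they sort lexicographically ("17" before "7")
    return sorted(seen)


# Illustrative variant lists: DMP calls and ACCESS calls already merged per patient
per_patient = {
    "patient_A": [
        ("7", 55249071, "C", "T"),
        ("7", 55249071, "C", "T"),   # same call reported by both sources
        ("17", 7577120, "C", "T"),
    ],
    "patient_B": [
        ("12", 25398284, "C", "T"),
        ("17", 7577120, "C", "T"),   # shared with patient_A
    ],
}

donor_variants = unique_variants(per_patient)
```

Genotyping the donor BAMs against this single combined list means every donor sample is evaluated at the same loci, regardless of which patient a variant came from.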