Filter Calls

Step 2 -- filtering

The second step takes all the genotypes generated in the first step and organizes them into a patient-level variants table with VAFs and call status for each variant in each sample.

Each call is subjected to:

  1. Read depth filter (hotspot vs non-hotspot)

  2. Systematic artifact filter

  3. Germline filters

    1. If any normal exists (buffy coat and DMP normal) -- 2:1 rule

    2. If not -- ExAC frequency < 0.01% and VAF < 30%

  4. CH tag
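The two germline branches above can be sketched as follows. This is a minimal Python illustration, not the logic of filter_calls.R itself. The function name and arguments are hypothetical, and the 2:1 rule is read here as "plasma VAF is at least twice the VAF in every available normal", which is an assumption because the rule is not spelled out in this section.

```python
# Hypothetical sketch of the germline filter; names and the 2:1 reading are assumptions.
EXAC_MAX_FREQ = 0.0001      # 0.01% expressed as a fraction
MAX_VAF_NO_NORMAL = 0.30    # 30% expressed as a fraction


def passes_germline_filter(plasma_vaf, normal_vafs, exac_freq=0.0):
    """Return True if a call is kept as somatic under the rules listed above.

    plasma_vaf  -- variant allele fraction in the plasma sample (0..1)
    normal_vafs -- VAFs in the available normals (buffy coat, DMP normal); may be empty
    exac_freq   -- ExAC population frequency as a fraction (0..1)
    """
    if normal_vafs:
        # Assumed meaning of the 2:1 rule: plasma VAF >= 2 x each normal VAF.
        return all(plasma_vaf >= 2 * normal_vaf for normal_vaf in normal_vafs)
    # No normal available: fall back to population frequency and VAF thresholds.
    return exac_freq < EXAC_MAX_FREQ and plasma_vaf < MAX_VAF_NO_NORMAL
```

With this reading, a call at 1% VAF in plasma and 0.1% in the buffy coat passes, while a call at 25% in both does not.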

Usage

Rscript R/filter_calls.R -h                                         
usage: R/filter_calls.R [-h] [-m MASTERREF] [-o RESULTSDIR] [-dmpk DMPKEYPATH]
                        [-ch CHLIST] [-c CRITERIA]

optional arguments:
  -h, --help            show this help message and exit
  -m MASTERREF, --masterref MASTERREF
                        File path to master reference file
  -o RESULTSDIR, --resultsdir RESULTSDIR
                        Output directory
  -ch CHLIST, --chlist CHLIST
                        List of signed out CH calls [default]
  -c CRITERIA, --criteria CRITERIA
                        Calling criteria [default]

Default

Default options can be found here

What filter_calls.R does

Generate a reference of systematic artifacts -- any call that occurs in 2 or more donor samples (an occurrence is defined as 2 or more duplex reads).

We suggest filtering out any call with duplex_support_num >= 2.
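As a rough illustration of the artifact reference described above, the following Python sketch counts, for each variant, the donor samples in which it has at least 2 duplex reads, and flags variants seen in at least 2 such samples. The function name, the tuple layout, and the variant keys are hypothetical; only the two thresholds come from the text.

```python
from collections import defaultdict

MIN_DUPLEX_READS = 2    # an "occurrence" needs at least this many duplex reads
MIN_DONOR_SAMPLES = 2   # an artifact occurs in at least this many donor samples


def build_artifact_reference(donor_calls):
    """Return the set of variants treated as systematic artifacts.

    donor_calls -- iterable of (sample_id, variant_key, duplex_support_num)
                   tuples; this layout is an assumption for illustration.
    """
    samples_by_variant = defaultdict(set)
    for sample_id, variant_key, duplex_support_num in donor_calls:
        if duplex_support_num >= MIN_DUPLEX_READS:
            samples_by_variant[variant_key].add(sample_id)
    return {
        variant_key
        for variant_key, samples in samples_by_variant.items()
        if len(samples) >= MIN_DONOR_SAMPLES
    }
```

Using a set of sample IDs per variant means repeated rows for the same donor sample count only once.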

  1. Read in sample sheets -- reference for downstream analysis

  2. Call status annotation

    1. All calls passing the read depth/genotype filter are annotated as 'Called' or 'Genotyped'

    2. Any call not satisfying the germline filters is overwritten with 'Not Called'

      1. Calls with zero coverage in the plasma sample are annotated as 'Not Covered'

  3. Final processing

    1. Combine duplex and simplex read counts

  4. Write out table
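The call status annotation in step 2 can be pictured with a small Python sketch. The function and argument names are hypothetical. Two assumptions are not stated in the text: 'Not Covered' takes precedence over 'Not Called', and a call that fails the read depth/genotype filter is also reported as 'Not Called'.

```python
def annotate_call_status(depth_status, passes_germline, plasma_depth):
    """Return the final call status for one variant in one plasma sample.

    depth_status    -- 'Called' or 'Genotyped' if the call passed the read
                       depth/genotype filter, otherwise None (assumed encoding)
    passes_germline -- True if the call satisfies the germline filters
    plasma_depth    -- total read depth at the site in the plasma sample
    """
    if plasma_depth == 0:
        # Assumed precedence: no coverage is reported before any other status.
        return "Not Covered"
    if depth_status is None or not passes_germline:
        return "Not Called"
    return depth_status
```

For example, a 'Genotyped' call that fails the germline filters comes out as 'Not Called', and any site with zero plasma depth comes out as 'Not Covered'.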

Example of the patient level table:

| Hugo_Symbol | Start_position | Variant_Classification | Other variant descriptions | ... | C-xxxxxx-L001-d___duplex.called | C-xxxxxx-L001-d___duplex.total | C-xxxxxx-L002-d___duplex.called | C-xxxxxx-L002-d___duplex.total | C-xxxxxx-N001-d___unfilterednormal | P-xxxxxxx-T01-IM6___DMP_Tumor | P-xxxxxxx-T01-IM6___DMP_Normal |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| KRAS | xxxxxx | Missense Mutation | ... | ... | Called | 15/1500(0.01) | Not Called | 0/1800(0) | 0/200(0) | 200/800(0.25) | 1/700(0.001) |
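The count cells in the example above follow an alt/total(VAF) pattern. The Python sketch below shows one way such a cell could be built after combining duplex and simplex read counts. The function name is hypothetical. Rounding the VAF to three decimals is inferred from the example values, and whether the combined counts use the same layout as the duplex columns is also an assumption.

```python
def format_count_cell(duplex_alt, duplex_total, simplex_alt=0, simplex_total=0):
    """Combine duplex and simplex counts and format them as 'alt/total(VAF)'."""
    alt = duplex_alt + simplex_alt
    total = duplex_total + simplex_total
    # Avoid division by zero for sites with no coverage.
    vaf = alt / total if total > 0 else 0.0
    # Round to 3 decimals (assumed) and drop trailing zeros with the 'g' format.
    return f"{alt}/{total}({round(vaf, 3):g})"
```

Called with the counts from the example row, `format_count_cell(15, 1500)` reproduces the `15/1500(0.01)` cell.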
