Vardict

This repository hosts the scripts needed to filter and process variant calls in the VCF/TXT files generated by variant callers.

Callers Supported

pv is the main command of the postprocessing_variant_calls package. Run pv --help to see the commands for the supported variant callers.

Vardict

The sub-command pv vardict performs post-processing on VarDictJava output. It supports two kinds of VarDictJava input: single-sample VCFs and case-control VCFs.

To tell pv vardict which input type is being used, choose one of the following sub-commands:

  • pv vardict single for single-sample VCFs

  • pv vardict case-control for case-control VCFs

Next, specify which post-processing should be done. Right now, postprocessing_variant_calls supports filtering:

  • pv vardict single filter

  • pv vardict case-control filter

Finally, we can specify the paths and options for our filtering and run our command. Here is an example using the test data provided in this repository:

pv vardict single filter --inputVcf data/Myeloid200-1.vcf --tsampleName Myeloid200-1 -ad 1 -o data/single

Filtering accepts various options and input specifications; see pv vardict single filter --help or pv vardict case-control filter --help for details.

See example_calls.sh for more example calls.

How the repo was made

Template used: https://github.com/yxtay/python-project-template

Usage

External dependencies

  • Conda

  • Docker

  • Make

Create environment

Use Conda to create a virtual environment and activate it for the project:

conda env create -f environment.yml
conda activate pv_calls

Install dependencies

Then install the project dependencies with Poetry:

make deps-install

Updating Environment

To update the environment after the initial setup, run:

conda env update -f environment.yml

instead of conda env create, and then re-run make deps-install.

Visual representation of how this module works:

Leveraging the PyVCF package, the following filtering is performed:

Case 1: Single sample mode

Case 2: Case-control mode

Abbreviations

  • TVF - Tumor Variant Fraction

  • NVF - Normal Variant Fraction

  • tmq - tumor minimum quality

  • nmq - normal minimum quality

  • tdp - total depth

  • tad - total allele depth

    MAF

    The sub-command pv maf allows users to perform post-processing on MAF files. It has six sub-commands: annotate, concat, filter, mergetsv, subset, tag.

    INPUT AND OUTPUT DESCRIPTION

    INPUT

    At a minimum, each of these commands assumes a MAF file to be a well-defined object with the following characteristics:

    • a delimited file where the delimiter is either a '\t' or a ','

    • the file uses one of the following extensions: '.maf', '.txt', '.csv', '.tsv'

    • the delimited file includes, at a minimum, the following columns: "Chromosome","Start_Position","End_Position","Reference_Allele","Tumor_Seq_Allele2"

    However, some commands and their sub-commands may require additional columns and may use specific rules in their processing of the MAF file.
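The minimum characteristics listed above can be checked before a file is passed to pv maf. The following is a minimal sketch using only the Python standard library. The helper name check_minimum_maf is hypothetical and is not part of the pv package, and treating lines that start with '#' as comments is an assumption rather than documented behavior.

```python
import csv
from pathlib import Path

# Column names and extensions are copied from the list above.
REQUIRED_COLUMNS = [
    "Chromosome",
    "Start_Position",
    "End_Position",
    "Reference_Allele",
    "Tumor_Seq_Allele2",
]
ALLOWED_EXTENSIONS = {".maf", ".txt", ".csv", ".tsv"}


def check_minimum_maf(name, text):
    """Return a list of problems; an empty list means the minimum is met.

    `name` is the file name and `text` is the file contents.
    Hypothetical helper, not part of the pv package.
    """
    problems = []
    suffix = Path(name).suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported extension: {suffix!r}")

    # Assumption: lines starting with '#' (e.g. '#version 2.4') are comments.
    lines = [ln for ln in text.splitlines() if ln and not ln.startswith("#")]
    if not lines:
        return problems + ["no header line found"]

    # Assumption: the delimiter is a tab if the header contains one, else a comma.
    delimiter = "\t" if "\t" in lines[0] else ","
    header = next(csv.reader([lines[0]], delimiter=delimiter))
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        problems.append("missing columns: " + ", ".join(missing))
    return problems
```

Calling check_minimum_maf("sample.maf", open("sample.maf").read()) on a conforming file would return an empty list. Commands that need additional columns would still apply their own checks.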

    OUTPUT

    The output is a MAF file modified according to the operation of each command.

    USAGE

    For specifics on these criteria and rules, please find additional documentation on these commands below:

    maf concat examples:

    • pv maf concat -f path/to/maf1.maf -f path/to/maf2.maf -o output_maf

    • pv maf concat -f path/to/maf1.maf -f path/to/maf2.maf -o output_maf -h header.txt where header.txt is a header file listing the column names by which the MAFs will be row-wise concatenated. See resources/header.txt for an example.
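To illustrate what row-wise concatenation by a list of header names can look like, here is a minimal sketch using only the Python standard library. It is not the pv implementation: the function name concat_mafs is hypothetical, and dropping columns absent from the header while filling missing ones with empty strings is an assumption about the behavior.

```python
import csv
import io


def concat_mafs(maf_texts, header, delimiter="\t"):
    """Stack the rows of several delimited tables under one shared header.

    Hypothetical helper, not part of the pv package.
    Assumption: columns not named in `header` are dropped, and columns
    missing from an input are written as empty strings.
    """
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=header,
        delimiter=delimiter,
        extrasaction="ignore",  # silently drop columns not in `header`
        restval="",             # fill columns an input does not provide
        lineterminator="\n",
    )
    writer.writeheader()
    for text in maf_texts:
        # DictReader keys each row by that file's own header line,
        # so the column order may differ between inputs.
        for row in csv.DictReader(io.StringIO(text), delimiter=delimiter):
            writer.writerow(row)
    return out.getvalue()
```

Matching columns by name instead of by position keeps the rows aligned even when the input files order their columns differently.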

    maf annotate examples:

    • pv maf annotate mafbybed -m path/to/maf.maf -b path/to/maf.bed -o output/path/file -c annotation

    • pv maf annotate mafbytsv -m /path/to/maf.(tsv/csv/maf) -t path/to/tsv.tsv -sep tsv -oc hotspot -v "Yes" "No"

    maf tag examples:

    • pv maf tag cmoch -m path/to/maf.maf -o output/path/file -sep "tsv"

    • pv maf tag common_variant -m path/to/maf.maf -o output/path/file -sep "tsv"

    • pv maf tag germline_status -m path/to/maf.maf -o output/path/file -sep "tsv"

    maf filter examples:

    • pv maf filter cmo_ch -m path/to/maf.maf -o output/path/file -sep "tsv"

    • pv maf filter hotspot -m path/to/maf.maf -o output/path/file -sep "tsv"

    • pv maf filter mappable -m path/to/maf.maf -o output/path/file -sep "tsv"

    The minimum listed columns can be combined into a unique ID for each row.
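A minimal sketch of building such an ID from the five minimum columns follows. The function name make_row_id and the underscore separator are assumptions chosen for illustration; the package may compose its IDs differently.

```python
# The five minimum columns listed in the input description.
ID_COLUMNS = [
    "Chromosome",
    "Start_Position",
    "End_Position",
    "Reference_Allele",
    "Tumor_Seq_Allele2",
]


def make_row_id(row, sep="_"):
    """Join the minimum columns of one MAF row into a single ID string.

    `row` is a mapping from column name to value, such as a row produced
    by csv.DictReader. Hypothetical helper; the separator is an assumption.
    """
    return sep.join(str(row[col]) for col in ID_COLUMNS)


example = {
    "Chromosome": "7",
    "Start_Position": 140453136,
    "End_Position": 140453136,
    "Reference_Allele": "A",
    "Tumor_Seq_Allele2": "T",
}
print(make_row_id(example))  # prints 7_140453136_140453136_A_T
```

Rows that produce the same ID describe the same variant at the same position, which makes the ID usable as a key for comparing or merging MAF files.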

    Further examples:

    • pv maf -p path/to/paths.txt -o output/path/file where path/to/paths.txt is a text file listing MAF path locations. See resources/paths.txt for an example.

    • pv maf tag prevalence_in_cosmicDB -m path/to/maf.maf -o output/path/file -sep "tsv"

    • pv maf tag truncating_mut_in_TSG -m path/to/maf.maf -o output/path/file -sep "tsv"

    • pv maf filter non_common_variant -m path/to/maf.maf -o output/path/file -sep "tsv"

    • pv maf filter non_hotspot -m path/to/maf.maf -o output/path/file -sep "tsv"

    • pv maf filter not_complex -m path/to/maf.maf -o output/path/file -sep "tsv"

  • Command Description and Explanation