Command Description and Explanation

MAF

The sub-command pv maf allows users to perform post-processing on MAF files. It has six sub-commands: annotate, concat, filter, mergetsv, subset, and tag.

INPUT AND OUTPUT DESCRIPTION

INPUT

At a minimum, each of these commands assumes a MAF file to be a well-defined object with the following characteristics:

a delimited file where the delimiter is either a tab ('\t') or a comma (',')
the file uses one of the following extensions: '.maf', '.txt', '.csv', '.tsv'
the delimited file includes, at a minimum, the following columns: "Chromosome", "Start_Position", "End_Position", "Reference_Allele", "Tumor_Seq_Allele2"
the minimum columns listed above can be combined into a unique ID for each row

However, some commands and their sub-commands may require additional columns and may use specific rules in their processing of the MAF file.
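
The minimum requirements above can be checked with a short script. This is only a sketch and not the package's own validation code: the function names and the ':'-joined ID format are assumptions made for illustration.

```python
import csv
import io

# Minimum columns named in the requirements above.
REQUIRED_COLUMNS = [
    "Chromosome",
    "Start_Position",
    "End_Position",
    "Reference_Allele",
    "Tumor_Seq_Allele2",
]
ALLOWED_EXTENSIONS = (".maf", ".txt", ".csv", ".tsv")


def check_extension(filename):
    """Return True if the filename ends with one of the accepted extensions."""
    return filename.lower().endswith(ALLOWED_EXTENSIONS)


def read_maf(handle, delimiter="\t"):
    """Read a delimited MAF-like file and verify the minimum columns exist."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    fieldnames = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in fieldnames]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    return list(reader)


def row_id(row):
    """Hypothetical ID format: the five minimum columns joined with ':'."""
    return ":".join(row[c] for c in REQUIRED_COLUMNS)


# A tiny in-memory file stands in for a real MAF on disk.
example = io.StringIO(
    "Chromosome\tStart_Position\tEnd_Position\tReference_Allele\tTumor_Seq_Allele2\n"
    "1\t100\t100\tA\tT\n"
    "2\t200\t201\tCG\tC\n"
)
rows = read_maf(example)
ids = [row_id(r) for r in rows]
ids_are_unique = len(set(ids)) == len(ids)
print(ids, ids_are_unique)
```

A file that passes this check has the columns needed to build one ID per row; individual commands may still require more.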

OUTPUT

The output is a MAF file modified according to the operation of each command.

USAGE

For specifics on these criteria and rules, please find additional documentation on these commands below:

maf concat examples:

pv maf concat -f path/to/maf1.maf -f path/to/maf2.maf -o output_maf
pv maf concat -f path/to/maf1.maf -f path/to/maf2.maf -o output_maf -h header.txt, where header.txt is a header file listing the column names by which the MAFs will be row-wise concatenated. See resources/header.txt for an example.
pv maf concat -p path/to/paths.txt -o output/path/file, where path/to/paths.txt is a text file listing MAF path locations. See resources/paths.txt for an example.
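
To illustrate what row-wise concatenation restricted to a header means, here is a minimal sketch using only the standard library. It is not the implementation behind pv maf concat: the function concat_mafs, the empty-string fill for absent columns, and the fallback to the first file's columns are all assumptions.

```python
import csv
import io


def concat_mafs(handles, header=None, delimiter="\t"):
    """Stack rows from several delimited files.

    If header is given, only those columns are kept, in that order.
    Otherwise the columns of the first file are used (an assumption).
    """
    combined = []
    for handle in handles:
        reader = csv.DictReader(handle, delimiter=delimiter)
        if header is None:
            header = list(reader.fieldnames or [])
        for row in reader:
            # Columns absent from a file are filled with an empty string.
            combined.append({col: row.get(col, "") for col in header})
    return header, combined


# Two small in-memory files with one differing column each.
maf1 = io.StringIO("Chromosome\tStart_Position\tHugo_Symbol\n1\t100\tTP53\n")
maf2 = io.StringIO("Chromosome\tStart_Position\tt_depth\n2\t200\t50\n")

header, rows = concat_mafs([maf1, maf2], header=["Chromosome", "Start_Position"])
print(header)
print(rows)
```

Only the columns named in the header survive, so files with different extra columns can still be stacked into one table.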

maf annotate examples:

pv maf annotate mafbybed -m path/to/maf.maf -b path/to/maf.bed -o output/path/file -c annotation
pv maf annotate mafbytsv -m /path/to/maf.(tsv/csv/maf) -t path/to/tsv.tsv -sep tsv -oc hotspot -v "Yes" "No"
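
As a rough picture of BED-based annotation, the sketch below marks each MAF row by whether it overlaps any BED interval. It assumes that the -c option names the output column and borrows the "Yes"/"No" values from the mafbytsv example; the functions overlaps and annotate_by_bed are hypothetical and the package's actual logic may differ. BED intervals are treated as 0-based and half-open, MAF positions as 1-based and inclusive.

```python
def overlaps(start, end, bed_start, bed_end):
    """True if 1-based inclusive [start, end] overlaps 0-based half-open [bed_start, bed_end)."""
    return start - 1 < bed_end and bed_start < end


def annotate_by_bed(maf_rows, bed_intervals, column="annotation", values=("Yes", "No")):
    """Return copies of the rows with an extra column marking BED overlap."""
    hit, miss = values
    annotated = []
    for row in maf_rows:
        start = int(row["Start_Position"])
        end = int(row["End_Position"])
        found = any(
            chrom == row["Chromosome"] and overlaps(start, end, b_start, b_end)
            for chrom, b_start, b_end in bed_intervals
        )
        annotated.append({**row, column: hit if found else miss})
    return annotated


# One BED interval covering 1-based positions 100 to 150 on chromosome 1.
bed = [("1", 99, 150)]
maf_rows = [
    {"Chromosome": "1", "Start_Position": "100", "End_Position": "100"},
    {"Chromosome": "1", "Start_Position": "151", "End_Position": "151"},
    {"Chromosome": "2", "Start_Position": "100", "End_Position": "100"},
]
tags = [r["annotation"] for r in annotate_by_bed(maf_rows, bed)]
print(tags)
```

The linear scan over intervals keeps the sketch short; a real tool handling large BED files would likely use an interval index instead.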

maf tag examples:

pv maf tag cmoch -m path/to/maf.maf -o output/path/file -sep "tsv"
pv maf tag common_variant -m path/to/maf.maf -o output/path/file -sep "tsv"
pv maf tag germline_status -m path/to/maf.maf -o output/path/file -sep "tsv"
pv maf tag prevalence_in_cosmicDB -m path/to/maf.maf -o output/path/file -sep "tsv"
pv maf tag truncating_mut_in_TSG -m path/to/maf.maf -o output/path/file -sep "tsv"

maf filter examples:

pv maf filter cmo_ch -m path/to/maf.maf -o output/path/file -sep "tsv"
pv maf filter hotspot -m path/to/maf.maf -o output/path/file -sep "tsv"
pv maf filter mappable -m path/to/maf.maf -o output/path/file -sep "tsv"
pv maf filter non_common_variant -m path/to/maf.maf -o output/path/file -sep "tsv"
pv maf filter non_hotspot -m path/to/maf.maf -o output/path/file -sep "tsv"
pv maf filter not_complex -m path/to/maf.maf -o output/path/file -sep "tsv"


Vardict

This package hosts multiple scripts for filtering and processing variant calls in the VCF and txt files generated by variant callers.

Callers Supported

pv is the main command for the postprocessing_variant_calls package. Run pv --help to see the commands for the supported variant callers.

Vardict

The sub-command pv vardict allows users to perform post-processing on VarDictJava output. The two supported inputs to pv vardict from VarDictJava are single-sample and case-control VCFs.

To tell pv vardict which input type will be used, use one of the following sub-commands:

pv vardict single for single sample vcfs
pv vardict case-control for case-controlled vcfs.

Next, the user can specify what post-processing should be done. Right now, postprocessing_variant_calls supports filtering:

pv vardict single filter
pv vardict case-control filter

Finally, we can specify the paths and options for our filtering and run our command. Here is an example using the test data provided in this repository:

pv vardict single filter --inputVcf data/Myeloid200-1.vcf --tsampleName Myeloid200-1 -ad 1 -o data/single

Filtering accepts various options and input specifications; see pv vardict single filter --help or pv vardict case-control filter --help for details.

See example_calls.sh for more example calls.

How the repo was made

Template used: https://github.com/yxtay/python-project-template

Usage

External dependencies

Conda
Docker
Make

Create environment

Use Conda to create a virtual environment and activate it for the project.

conda env create -f environment.yml
conda activate pv_calls

Install dependencies

Then install project dependencies with Poetry.

make deps-install

Updating Environment

To update the environment after the initial setup, run:

conda env update -f environment.yml

Use this command instead of conda env create, and then re-run make deps-install.

Visual representation of how this module works:

Leveraging the PyVcf package, the following filtering is performed:

Case 1: Single sample mode

Case 2: Case-control mode

Abbreviations

TVF - Tumor Variant Fraction
NVF - Normal Variant Fraction
tmq - tumor minimum quality
nmq - normal minimum quality
tdp - total depth
tad - total allele depth
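
To make the abbreviations concrete, here is a small sketch of how a variant fraction relates to allele depth and total depth, and how a depth-based filter might combine them. It assumes a variant fraction is allele depth divided by total depth; the function names and every threshold are hypothetical and are not the package's defaults.

```python
def variant_fraction(allele_depth, total_depth):
    """Fraction of reads supporting the variant (assumed: tad / tdp)."""
    if total_depth <= 0:
        return 0.0
    return allele_depth / total_depth


def passes_single_sample(total_depth, allele_depth, min_depth, min_allele_depth, min_fraction):
    """Keep a call only if depth, allele depth and variant fraction all clear
    their thresholds. The thresholds are placeholders chosen by the caller."""
    return (
        total_depth >= min_depth
        and allele_depth >= min_allele_depth
        and variant_fraction(allele_depth, total_depth) >= min_fraction
    )


# Hypothetical call: 10 supporting reads out of 200.
tvf = variant_fraction(10, 200)
kept = passes_single_sample(200, 10, min_depth=20, min_allele_depth=1, min_fraction=0.01)
dropped = passes_single_sample(200, 0, min_depth=20, min_allele_depth=1, min_fraction=0.01)
print(tvf, kept, dropped)
```

In case-control mode the same fraction would be computed once for the tumor sample (TVF) and once for the normal sample (NVF); how the two are compared is not shown here.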
