Run create_report.R

This script runs the create_report.R script on multiple patients.

Requirements

run_create_report

Main Script (run_create_report.py)

Wrapper script to run create_report.R

Arguments:

repo_path Path, optional - "Base path to where the git repository is located for access_data_analysis".
script_path Path, optional - "Path to the create_report.R script, fall back if --repo is not given".

Usage

Using Generate Markdown, copy facet maf file, use template_days RMarkdown, force flag and best fit for facets

Using Generate Markdown, force flag and default fit for facets
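
The two scenarios above might be invoked roughly as sketched below. The sketch assembles each command line as a Python list, using only flags that appear in the script's `--help` output. Every path, the tumor-type label, and the `python run_create_report.py` launch form are hypothetical placeholders, not values from the original documentation.

```python
import shlex

# Hypothetical placeholder paths and label; substitute real values.
COMMON = [
    "python", "run_create_report.py",
    "--repo", "/path/to/access_data_analysis",
    "--manifest", "/path/to/manifest.tsv",
    "--variant-results", "/path/to/small_variants",
    "--cnv-results", "/path/to/cnv",
    "--tumor-type", "PLACEHOLDER_LABEL",
]

# Scenario 1: generate markdown, copy the facet maf file, use the
# template_days RMarkdown, continue past errors, use the best fit.
best_fit_cmd = COMMON + [
    "--generate-markdown",
    "--copy-facet-maf",
    "--copy-facet-dir", "/path/to/facet_maf_copies",
    "--template-days",
    "--force",
    "--best-fit",
]

# Scenario 2: generate markdown, continue past errors, default fit.
default_fit_cmd = COMMON + ["--generate-markdown", "--force"]

# Print shell-ready command lines for inspection.
print(shlex.join(best_fit_cmd))
print(shlex.join(default_fit_cmd))
```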

Submodules

check_required_columns

Check if all required columns are present in the sample manifest file

Arguments:

manifest data_frame - meta information file with information for each sample
template_days bool - True|False if template days RMarkdown will be used

Raises:

typer.Abort - if "cmo_patient_id" column not provided
typer.Abort - if "cmo_sample_id/sample_id" column not provided
typer.Abort

Returns:

list - column names for the manifest file
data_frame - data_frame with unique ids to traverse over
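
A minimal sketch of this kind of column check follows. It is not the script's implementation: it works on a plain list of column names instead of a data frame, raises `ValueError` where the script raises `typer.Abort`, and returns only the column names. The date-column check is an assumption drawn from the `--manifest` help text ("collection_date or collection_day"), including the guess that `template_days` selects `collection_day`.

```python
def check_required_columns(columns, template_days=False):
    """Return the names of the required columns, or raise if one is missing."""
    if "cmo_patient_id" not in columns:
        raise ValueError('manifest is missing the "cmo_patient_id" column')

    # The sample id may be named either cmo_sample_id or sample_id.
    sample_col = next(
        (name for name in ("cmo_sample_id", "sample_id") if name in columns),
        None,
    )
    if sample_col is None:
        raise ValueError('manifest is missing the "cmo_sample_id/sample_id" column')

    # Assumption: template_days reports use collection_day, others collection_date.
    date_col = "collection_day" if template_days else "collection_date"
    if date_col not in columns:
        raise ValueError(f'manifest is missing the "{date_col}" column')

    return ["cmo_patient_id", sample_col, date_col]


header = ["cmo_patient_id", "sample_id", "collection_date", "timepoint"]
print(check_required_columns(header))
```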

generate_repo_paths

generate_repo_path

Generate path to create_report.R and template RMarkdown file

Arguments:

repo_path pathlib.Path, optional - Path to clone of git repo access_data_analysis. Defaults to None.
script_path pathlib.Path, optional - Path to create_report.R. Defaults to None.

Raises:

typer.Abort - if neither repo_path nor script_path is given
typer.Abort - if neither repo_path nor template_path is given

Returns:

str - Path to create_report.R and path to template markdown file
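
The fallback logic described above could look something like the sketch below. Several details are assumptions rather than documented behavior: the `template_path` and `template_days` parameters, the `reports/` directory layout inside the clone, and the use of `ValueError` in place of `typer.Abort`.

```python
from pathlib import Path


def generate_repo_paths(repo_path=None, script_path=None,
                        template_path=None, template_days=False):
    """Return (script, template) paths as strings."""
    if repo_path is not None:
        # Hypothetical layout: both files live under <repo>/reports.
        reports = Path(repo_path) / "reports"
        template_name = "template_days.Rmd" if template_days else "template.Rmd"
        return str(reports / "create_report.R"), str(reports / template_name)

    # Without a repo clone, both explicit paths are required.
    if script_path is None:
        raise ValueError("either repo_path or script_path must be given")
    if template_path is None:
        raise ValueError("either repo_path or template_path must be given")
    return str(script_path), str(template_path)
```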

read_manifest

Read manifest file

Arguments:

manifest pathlib.Path - path to the manifest file

Returns:

data_frame - contents of the manifest file

get_row

Function to skip rows

Arguments:

tsv_file file - file to be read

Returns:

list - lines to be skipped
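
One plausible reading of "lines to be skipped" is the set of comment lines at the top of a tab-separated file. The sketch below assumes that skipped lines are those beginning with `#` and that the result is a list of zero-based line indices; neither detail is stated in the documentation above.

```python
import io


def get_row(tsv_file):
    """Return zero-based indices of lines that start with '#'."""
    skipped = []
    for index, line in enumerate(tsv_file):
        if line.startswith("#"):
            skipped.append(index)
    return skipped


# Made-up example content with one leading comment line.
example = io.StringIO("#version 2.4\nHugo_Symbol\tChromosome\nTP53\t17\n")
print(get_row(example))  # [0]
```

A list like this can be handed to a reader that accepts row indices to skip, so that the header row is parsed correctly.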

get_small_variant_csv

Get the path to CSV file to be used for a given patient containing all variants

Arguments:

patient_id str - patient id used to identify the csv file
csv_path pathlib.Path - base path where the csv file is expected to be present

Raises:

typer.Abort - if no csv file is returned
typer.Abort - if more than one csv file is returned

Returns:

str - path to csv file containing the variants
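
The "exactly one match" rule can be illustrated with a short sketch. It assumes the csv file name starts with the patient id and ends in `.csv`, which the documentation does not confirm, and it raises standard exceptions where the script raises `typer.Abort`.

```python
from pathlib import Path


def get_small_variant_csv(patient_id, csv_path):
    """Return the single csv under csv_path whose name starts with patient_id."""
    # Assumption: files are named <patient_id>*.csv directly under csv_path.
    matches = sorted(Path(csv_path).glob(f"{patient_id}*.csv"))
    if not matches:
        raise FileNotFoundError(f"no csv file found for {patient_id} in {csv_path}")
    if len(matches) > 1:
        raise ValueError(f"more than one csv file found for {patient_id}: {matches}")
    return str(matches[0])
```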

run_cmd

Given a system command, run it using subprocess

Arguments:

cmd str - System command to be run as a string
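
A minimal sketch of running a command string with `subprocess` is shown below. The documentation does not list a return value, so returning the `CompletedProcess` object is an assumption, as is splitting the string with `shlex` instead of passing it to a shell.

```python
import shlex
import subprocess
import sys


def run_cmd(cmd):
    """Run a command string and return the CompletedProcess."""
    # shlex.split turns the string into an argument list, so no shell is needed.
    return subprocess.run(
        shlex.split(cmd),
        capture_output=True,
        text=True,
        check=False,  # report the exit status instead of raising
    )


# Demo: run the current Python interpreter as a stand-in for Rscript.
result = run_cmd(f'{shlex.quote(sys.executable)} -c "print(1)"')
print(result.returncode, result.stdout.strip())
```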

run_multiple_cmd

Given a list of system commands, run each of them using subprocess

Arguments:

cmd list[str] - list of system commands to be run
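
For a list of commands, one simple approach is to run them one after another and collect the exit codes, as sketched below. Whether the script runs commands sequentially or in parallel is not stated, so the sequential loop and the returned list of exit codes are both assumptions.

```python
import shlex
import subprocess
import sys


def run_multiple_cmd(cmds):
    """Run each command string in order and return the list of exit codes."""
    return_codes = []
    for cmd in cmds:
        completed = subprocess.run(shlex.split(cmd), check=False)
        return_codes.append(completed.returncode)
    return return_codes


# Demo: two stand-in commands, one succeeding and one exiting with status 3.
python = shlex.quote(sys.executable)
print(run_multiple_cmd([f'{python} -c "pass"',
                        f'{python} -c "raise SystemExit(3)"']))
```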

generate_facet_maf_path

Get path of maf associated with facet-suite output

Arguments:

facet_path pathlib.Path|str - path to search for the facet file
patient_id str - patient id to be used to search, default is set to None

Returns:

str - path of the facets maf

get_maf_path

Get the path to the maf file

Arguments:

maf_path pathlib.Path - Base path of the maf file
patient_id str - DMP Patient ID for facets
sample_id

Returns:

str - Path to the maf file

get_best_fit_folder

Get the best fit folder for the given facet manifest path

Arguments:

facet_manifest_path str - manifest path to be used for determining best fit

Returns:

pathlib.Path - path to the folder containing best fit maf files

generate_create_report_cmd

Create the system command that should be run for create_report.R

Arguments:

script str - path for create_report.R
markdown bool - True|False to generate markdown output
template_file

Returns:

cmd str - system command to run for create_report.R
html_output pathlib.Path - where the output file should be written

Usage: run_create_report.py [OPTIONS]

Options:
  -r, --repo PATH                 Base path to where the git repository is
                                  located for access_data_analysis

  -s, --script PATH               Path to the create_report.R script, fall
                                  back if `--repo` is not given

  -t, --template PATH             Path to the template.Rmd or
                                  template_days.Rmd to be used with
                                  create_report.R when `--repo` is not given

  -m, --manifest FILE             File containing meta information per sample.
                                  Require following columns in the header:
                                  cmo_patient_id, sample_id, dmp_patient_id,
                                  collection_date or collection_day,
                                  timepoint. If dmp_sample_id column is given
                                  and has information that will be used to run
                                  facets. If dmp_sample_id is not given and
                                  dmp_patient_id is given than it will be used
                                  to get the Tumor sample with lowest number.
                                  If dmp_sample_id or dmp_patient_id is not
                                  given then it will run without the facet maf
                                  file  [required]

  -v, --variant-results DIRECTORY
                                  Base path for all results of small variants
                                  as generated by filter_calls.R script in
                                  access_data_analysis (Make sure only High
                                  Confidence calls are included)  [required]

  -c, --cnv-results DIRECTORY     Base path for all results of CNV as
                                  generated by CNV_processing.R script in
                                  access_data_analysis  [required]

  -f, --facet-repo DIRECTORY      Base path for all results of facets on
                                  Clinical MSK-IMPACT samples  [default: /juno
                                  /work/ccs/shared/resources/impact/facets/all
                                  /]

  -bf, --best-fit                 If this is set to True then we will attempt
                                  to parse `facets_review.manifest` file to
                                  pick the best fit for a given dmp_sample_id
                                  [default: False]

  -l, --tumor-type TEXT           Tumor type label for the report  [required]
  -cfm, --copy-facet-maf          If this is set to True then we will copy the
                                  facet maf file in the directory specified in
                                  `copy_facet_dir`  [default: False]

  -cfd, --copy-facet-dir PATH     Directory path where the facet maf file
                                  should be copied.

  -d, --template-days             If the `--repo` option is specified and if
                                  this is set to True then we will use the
                                  template_days RMarkdown file as the template
                                  [default: False]

  -gm, --generate-markdown        If given, the create_report.R will be run
                                  with `-md` flag to generate markdown
                                  [default: False]

  -ff, --force                    If this is set to True then we will not stop
                                  if an error is encountered in a given sample
                                  while running create_report.R but keep on
                                  running for the next sample  [default:
                                  False]

  --install-completion            Install completion for the current shell.
  --show-completion               Show completion for the current shell, to
                                  copy it or customize the installation.

  --help                          Show this message and exit.
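
The `--manifest` help text describes a three-way rule for locating facets results: use `dmp_sample_id` if present, otherwise fall back to `dmp_patient_id`, otherwise run without a facet maf file. A minimal sketch of that decision, assuming each manifest row is available as a dict of strings, might look like this. The function name `choose_facet_lookup` and the example ids are hypothetical.

```python
def choose_facet_lookup(row):
    """Return (column, value) to use for the facets lookup, or None."""
    # Prefer the sample id; fall back to the patient id.
    for column in ("dmp_sample_id", "dmp_patient_id"):
        value = (row.get(column) or "").strip()
        if value:
            return column, value
    # Neither id is available: run without the facet maf file.
    return None


# Made-up ids, for illustration only.
print(choose_facet_lookup({"dmp_sample_id": "SAMPLE-T01", "dmp_patient_id": "PATIENT-1"}))
print(choose_facet_lookup({"dmp_sample_id": "", "dmp_patient_id": "PATIENT-1"}))
print(choose_facet_lookup({"cmo_patient_id": "C-000001"}))
```

When only the patient id is available, the help text says the tumor sample with the lowest number is used; that selection step is not shown here.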
