Run create_report.R
This script runs the create_report.R script on multiple patients.
Wrapper script to run create_report.R
Arguments:
repo_path
Path, optional - "Base path to where the git repository is located for access_data_analysis".
script_path
Path, optional - "Path to the create_report.R script, used as a fallback if --repo is not given".
template_path
Path, optional - "Path to the template.Rmd or template_days.Rmd to be used with create_report.R when --repo is not given".
manifest
Path, required - "File containing meta information per sample. Requires the following columns in the header: cmo_patient_id, sample_id, dmp_patient_id, collection_date or collection_day, timepoint. If the dmp_sample_id column is given and populated, it will be used to run facets. If dmp_sample_id is not given and dmp_patient_id is given, then it will be used to get the Tumor sample with the lowest number. If neither dmp_sample_id nor dmp_patient_id is given, then it will run without the facet maf file".
variant_path
Path, required - "Base path for all results of small variants as generated by filter_calls.R script in access_data_analysis (Make sure only High Confidence calls are included)".
cnv_path
Path, required - "Base path for all results of CNV as generated by CNV_processing.R script in access_data_analysis".
facet_repo
Path, required - "Base path for all results of facets on Clinical MSK-IMPACT samples".
best_fit
bool, optional - "If this is set to True then we will attempt to parse the facets_review.manifest file to pick the best fit for a given dmp_sample_id".
tumor_type
str, required - "Tumor type label for the report".
copy_facet
bool, optional - "If this is set to True then we will copy the facet maf file to the directory specified in copy_facet_dir".
copy_facet_dir
Path, optional - "Directory path where the facet maf file should be copied".
template_days
bool, optional - "If the --repo option is specified and this is set to True then we will use the template_days RMarkdown file as the template".
markdown
bool, optional - "If given, create_report.R will be run with the -md flag to generate markdown".
force
bool, optional - "If this is set to True then we will not stop if an error is encountered in a given sample but keep on running for the next sample".
Using Generate Markdown, copy facet maf file, use template_days RMarkdown, force flag and best fit for facets
Using Generate Markdown, force flag and default fit for facets
Check if all required columns are present in the sample manifest file
Arguments:
manifest
data_frame - meta information file with information for each sample
template_days
bool - True|False if template days RMarkdown will be used
Raises:
typer.Abort
- if "cmo_patient_id" column not provided
typer.Abort
- if "cmo_sample_id/sample_id" column not provided
typer.Abort
- if "dmp_patient_id" column not provided
typer.Abort
- if "collection_date/collection_day" column not provided
typer.Abort
- if "timepoint" column not provided
Returns:
list
- column name for the manifest file
data_frame
- data_frame with unique ids to traverse over
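The column check described above can be sketched as follows. This is a minimal illustration that works on a plain list of column names rather than a data frame. The function name `check_required_columns` and the use of `ValueError` are assumptions made for the example (the script itself raises typer.Abort), and the rule that the days template needs collection_day while the default template needs collection_date is inferred from the argument descriptions, not confirmed by the source.

```python
def check_required_columns(columns, template_days=False):
    """Return the required manifest columns, or raise if one is missing.

    Illustrative only: the real script raises typer.Abort, not ValueError.
    """
    columns = list(columns)
    # Either cmo_sample_id or sample_id is accepted as the sample column.
    sample_col = next(
        (c for c in ("cmo_sample_id", "sample_id") if c in columns), None
    )
    if sample_col is None:
        raise ValueError('"cmo_sample_id/sample_id" column not provided')
    # Assumption: the days template uses collection_day, otherwise collection_date.
    date_col = "collection_day" if template_days else "collection_date"
    required = ["cmo_patient_id", sample_col, "dmp_patient_id", date_col, "timepoint"]
    for name in required:
        if name not in columns:
            raise ValueError(f'"{name}" column not provided')
    return required
```

Calling `check_required_columns(header, template_days=True)` on a header that only has collection_date would raise, which mirrors the abort behaviour listed under Raises.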
Generate path to create_report.R and template RMarkdown file
Arguments:
repo_path
pathlib.Path, optional - Path to clone of git repo access_data_analysis. Defaults to None.
script_path
pathlib.Path, optional - Path to create_report.R. Defaults to None.
template_path
pathlib.Path, optional - Path to template RMarkdown file. Defaults to None.
template_days
bool, optional - True|False to use days template if using repo_path. Defaults to None.
Raises:
typer.Abort
- Abort if neither repo_path nor script_path is given
typer.Abort
- Abort if neither repo_path nor template_path is given
Returns:
str
- Path to create_report.R and path to template markdown file
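A rough sketch of this fallback logic, assuming the repository clone takes priority over the explicit paths. The function name `resolve_paths`, the `reports_dir` parameter and its default value are hypothetical (the source does not say where the files live inside the repository), and `ValueError` stands in for typer.Abort.

```python
from pathlib import Path


def resolve_paths(repo_path=None, script_path=None, template_path=None,
                  template_days=False, reports_dir="reports"):
    """Return (script, template) paths as strings.

    reports_dir is a hypothetical sub-directory of the repository clone.
    """
    if repo_path is not None:
        base = Path(repo_path) / reports_dir
        # Pick the days template only when it is requested.
        template_name = "template_days.Rmd" if template_days else "template.Rmd"
        return str(base / "create_report.R"), str(base / template_name)
    # Without a repository clone, both explicit paths are needed.
    if script_path is None:
        raise ValueError("either repo_path or script_path must be given")
    if template_path is None:
        raise ValueError("either repo_path or template_path must be given")
    return str(script_path), str(template_path)
```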
Read manifest file
Arguments:
manifest
pathlib.Path - path to the manifest file to read
Returns:
data_frame
- contents of the manifest file
Function to skip rows
Arguments:
tsv_file
file - file to be read
Returns:
list
- lines to be skipped
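The source does not state which rows are skipped. One plausible reading, shown here purely as an assumption, is that comment lines starting with `#` are collected by index so that a reader such as pandas `read_csv` (via its `skiprows` argument) can ignore them. The name `rows_to_skip` and the `comment_prefix` parameter are hypothetical.

```python
def rows_to_skip(tsv_path, comment_prefix="#"):
    """Return zero-based indices of lines that start with comment_prefix.

    Assumption: "rows to skip" means comment lines; the real rule may differ.
    """
    skip = []
    with open(tsv_path) as handle:
        for index, line in enumerate(handle):
            if line.startswith(comment_prefix):
                skip.append(index)
    return skip
```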
Get the path to CSV file to be used for a given patient containing all variants
Arguments:
patient_id
str - patient id used to identify the csv file
csv_path
pathlib.Path - base path where the csv file is expected to be present
Raises:
typer.Abort
- if no csv file is returned
typer.Abort
- if more than one csv file is returned
Returns:
str
- path to csv file containing the variants
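The lookup above can be sketched with `pathlib` globbing. The file name pattern (patient id as a prefix, `.csv` suffix, searched non-recursively) is an assumption for illustration, as is the name `find_patient_csv`. The standard exceptions used here stand in for typer.Abort.

```python
from pathlib import Path


def find_patient_csv(patient_id, csv_path):
    """Return the single CSV under csv_path whose name starts with patient_id.

    Assumption: files are named "<patient_id>*.csv" directly under csv_path.
    """
    matches = sorted(Path(csv_path).glob(f"{patient_id}*.csv"))
    if not matches:
        raise FileNotFoundError(f"no csv file found for {patient_id}")
    if len(matches) > 1:
        raise ValueError(f"more than one csv file found for {patient_id}")
    return str(matches[0])
```

Failing on both zero and multiple matches keeps an ambiguous patient id from silently selecting the wrong variant table.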
Given a system command run it using subprocess
Arguments:
cmd
str - System command to be run as a string
Given a system command run it using subprocess
Arguments:
cmd
list[str] - list of system commands to be run
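Both command helpers above rely on subprocess. A minimal sketch of one way to run a single command given either as a string or as a list of arguments is shown below. The name `run_cmd` is hypothetical, and the choice to capture output and raise on a non-zero exit code is an assumption rather than the script's documented behaviour.

```python
import shlex
import subprocess


def run_cmd(cmd):
    """Run one command (string or argument list) and return its stdout."""
    # Split a string into arguments so that no shell is needed.
    args = shlex.split(cmd) if isinstance(cmd, str) else list(cmd)
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout
```

With `check=True`, a failing command raises `subprocess.CalledProcessError`, which a caller could catch to continue with the next sample when the force option is set.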
Get path of maf associated with facet-suite output
Arguments:
facet_path
pathlib.Path|str - path to search for the facet file
patient_id
str - patient id to be used to search, default is set to None
sample_id
str - sample id to be used to search, default is set to None
Returns:
str
- path of the facets maf
Get the path to the maf file
Arguments:
maf_path
pathlib.Path - Base path of the maf file
patient_id
str - DMP Patient ID for facets
sample_id
str - DMP Sample ID if any for facets
Returns:
str
- Path to the maf file
Get the best fit folder for the given facet manifest path
Arguments:
facet_manifest_path
str - manifest path to be used for determining best fit
Returns:
pathlib.Path
- path to the folder containing best fit maf files
Create the system command that should be run for create_report.R
Arguments:
script
str - path for create_report.R
markdown
bool - True|False to generate markdown output
template_file
str - path for the template file
cmo_patient_id
str - patient id from CMO
csv_file
str - path to csv file containing variant information
tumor_type
str - tumor type label
manifest
pathlib.Path - path to the manifest containing meta data
cnv_path
pathlib.Path - path to directory having cnv files
dmp_patient_id
str - patient id of the clinical msk-impact sample
dmp_sample_id
str - sample id of the clinical msk-impact sample
dmp_facet_maf
str - path to the clinical msk-impact maf file annotated for facets results
Returns:
cmd
str - system command to run for create_report.R
html_output
pathlib.Path - where the output file should be written