Get cBioPortal Variants
Script to subset records from cBioPortal-format files
Table of Contents
get_cbioportal_variants
Requirements:
pandas
typing
typer
bed_lookup (https://github.com/msk-access/python_bed_lookup)
Example command
subset_cpt
subset_cst
subset_cna
subset_sv
subset_maf
Sub-modules
read_tsv
Read a tsv file
Arguments:
mafFile - Input MAF/tsv like format file
Returns:
data_frame - A data frame containing the contents of the MAF/tsv file
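As a rough illustration of this step, the sketch below reads a tab-separated MAF-like file with pandas. It assumes that leading comment lines start with "#", which is common in MAF files but is not stated above. The name read_tsv_sketch and the in-memory sample are hypothetical and are not the script's actual code.

```python
import io

import pandas as pd


def read_tsv_sketch(handle):
    """Read a tab-separated MAF-like file into a DataFrame (hypothetical helper)."""
    # comment="#" drops header comment lines such as "#version 2.4" (assumption)
    return pd.read_csv(handle, sep="\t", comment="#", dtype=str)


# Small in-memory stand-in for a MAF file
sample = io.StringIO(
    "#version 2.4\n"
    "Hugo_Symbol\tChromosome\tStart_Position\tTumor_Sample_Barcode\n"
    "TP53\t17\t7577120\tS1\n"
    "KRAS\t12\t25398284\tS2\n"
)
maf_df = read_tsv_sketch(sample)
```

Reading every column as a string avoids pandas guessing mixed types for chromosome names such as "X"; numeric columns can be converted afterwards if needed.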
read_ids
Make a list of ids
Arguments:
sid (tuple) - Multiple ids as tuple
ids (File) - File containing multiple ids
Returns:
list - List containing all ids
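One way to combine ids passed directly with ids listed in a file is sketched below. It assumes the file holds one id per line, which the description above does not specify. The name read_ids_sketch and its parameter names are hypothetical.

```python
import os
import tempfile


def read_ids_sketch(sid=(), ids_path=None):
    """Combine ids given directly with ids listed one per line in a file."""
    id_list = list(sid)
    if ids_path is not None:
        with open(ids_path) as handle:
            # Keep non-empty lines, stripped of surrounding whitespace
            id_list.extend(line.strip() for line in handle if line.strip())
    return id_list


# Write a throwaway ids file for the demonstration
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("S3\nS4\n")

all_ids = read_ids_sketch(sid=("S1", "S2"), ids_path=tmp.name)
os.remove(tmp.name)
```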
filter_by_columns
Filter data by columns
Arguments:
sid (list) - list of columns to subset over
tsv_df (data_frame) - data_frame to subset from
Returns:
data_frame - A copy of the subset of the data_frame
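The column filter can be sketched with pandas as follows. The example assumes a CNA-style table with one column per sample; silently ignoring requested columns that are absent is a choice made for this sketch, not documented behaviour. The name filter_by_columns_sketch is hypothetical.

```python
import pandas as pd


def filter_by_columns_sketch(sid, tsv_df):
    """Return a copy of tsv_df restricted to the columns named in sid."""
    wanted = set(sid)
    # Preserve the DataFrame's own column order; skip names that are absent
    keep = [col for col in tsv_df.columns if col in wanted]
    return tsv_df[keep].copy()


cna_df = pd.DataFrame(
    {
        "Hugo_Symbol": ["TP53", "KRAS"],
        "S1": [0, 2],
        "S2": [-2, 0],
        "S3": [0, 0],
    }
)
cna_subset = filter_by_columns_sketch(["Hugo_Symbol", "S1", "S3"], cna_df)
```

Returning a copy means later edits to the subset do not touch the original data frame.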
filter_by_rows
Filter the data by rows
Arguments:
sid (list) - list of row names to subset over
tsv_df (data_frame) - data_frame to subset from
col_name (string) - name of the column to filter on, using the names in sid
Returns:
data_frame - A copy of the subset of the data_frame
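A row filter of this kind can be sketched with a boolean mask in pandas. The column name Tumor_Sample_Barcode is used only as an example of a col_name value, and filter_by_rows_sketch is a hypothetical name.

```python
import pandas as pd


def filter_by_rows_sketch(sid, tsv_df, col_name):
    """Return a copy of the rows whose col_name value appears in sid."""
    mask = tsv_df[col_name].isin(sid)
    return tsv_df[mask].copy()


maf_rows = pd.DataFrame(
    {
        "Hugo_Symbol": ["TP53", "KRAS", "EGFR"],
        "Tumor_Sample_Barcode": ["S1", "S2", "S1"],
    }
)
s1_rows = filter_by_rows_sketch(["S1"], maf_rows, "Tumor_Sample_Barcode")
```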
read_bed
Read BED file using bed_lookup
Arguments:
bedfile - File in BED format to read
Returns:
object - BED file object to use for filtering
check_if_covered
Function to check if a variant is covered in a given bed file
Arguments:
bedObj (object) - BED file object to check coverage
mafObj (data_frame) - data frame whose variants are checked for coverage, using the 'Chromosome' column for the chromosome and the 'Start_Position' column for the position
Returns:
data_frame - description
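The coverage check can be illustrated without bed_lookup. The sketch below swaps in a plain dictionary of intervals for the BED object and a list of dictionaries for the data frame, so it shows the idea only and not the library's API. It assumes BED intervals are 0-based and half-open (the standard BED convention) and that Start_Position is 1-based. All names here are hypothetical.

```python
def is_covered(intervals, chrom, pos):
    """True if 1-based position pos lies in any 0-based, half-open BED interval."""
    # A BED interval (start, end) covers 1-based positions start+1 .. end
    return any(start < pos <= end for start, end in intervals.get(chrom, []))


# Stand-in for a parsed BED file: chromosome -> list of (start, end)
bed_intervals = {
    "17": [(7577000, 7577200)],
    "12": [(25398200, 25398300)],
}

# Stand-in for MAF rows, using the column names given above
variants = [
    {"Chromosome": "17", "Start_Position": 7577120},
    {"Chromosome": "12", "Start_Position": 25400000},
]

covered = [
    v for v in variants
    if is_covered(bed_intervals, v["Chromosome"], v["Start_Position"])
]
```

A linear scan per variant is fine for small panels; an indexed lookup structure is the better fit for large BED files.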
get_row
Function to find the rows to skip when reading a file
Arguments:
tsv_file (file) - file to be read
Returns:
list - lines to be skipped
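A possible reading of this helper is sketched below: it collects the indices of lines to skip so that a reader can ignore them. Treating lines that start with "#" as the ones to skip is an assumption, since the description above does not say which lines qualify. The name get_skip_rows_sketch is hypothetical.

```python
import io


def get_skip_rows_sketch(handle):
    """Return 0-based indices of lines starting with '#' (assumed comment marker)."""
    return [index for index, line in enumerate(handle) if line.startswith("#")]


sample_tsv = io.StringIO(
    "#version 2.4\n"
    "#generated by pipeline\n"
    "Hugo_Symbol\tChromosome\n"
    "TP53\t17\n"
)
skip_rows = get_skip_rows_sketch(sample_tsv)
```

A list like this can be passed to the skiprows parameter of pandas.read_csv, which accepts a list of 0-based line numbers.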