CLI Quick Start
This guide shows you how to get started with py-gbcms using the standalone CLI for processing one or a few samples.
Processing many samples? Use the Nextflow Workflow instead for automatic parallelization on HPC clusters.
Prerequisites
Python >= 3.10
Rust toolchain (for installation from source)
BAM files with index (.bai)
Reference FASTA with index (.fai)
Variants file (VCF or MAF)
Install via pip install py-gbcms or see the project README for detailed setup instructions.
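The file prerequisites above can be checked before a run with a small shell function. This is a minimal sketch, not part of py-gbcms: `check_inputs` is a hypothetical helper, and it assumes the usual samtools naming conventions for indexes (`sample.bam.bai` or `sample.bai` next to the BAM, `reference.fa.fai` next to the FASTA).

```shell
# Hypothetical helper (not part of py-gbcms): verify inputs and their indexes exist.
# Usage: check_inputs <bam> <fasta> <variants>
check_inputs() {
  bam=$1
  fasta=$2
  variants=$3
  status=0
  # Each input file must exist.
  for f in "$bam" "$fasta" "$variants"; do
    if [ ! -f "$f" ]; then
      echo "missing file: $f" >&2
      status=1
    fi
  done
  # Assumed convention: BAM index is sample.bam.bai or sample.bai.
  if [ ! -f "$bam.bai" ] && [ ! -f "${bam%.bam}.bai" ]; then
    echo "missing BAM index for: $bam" >&2
    status=1
  fi
  # Assumed convention: FASTA index is reference.fa.fai.
  if [ ! -f "$fasta.fai" ]; then
    echo "missing FASTA index: $fasta.fai" >&2
    status=1
  fi
  return "$status"
}
```

For example, `check_inputs sample1.bam reference.fa variants.vcf` returns a non-zero status and names each missing file on standard error.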
Basic Usage
Single Sample
Count variants for one sample:
gbcms run \
--variants variants.vcf \
--bam sample1.bam \
--fasta reference.fa \
--output-dir results/

Output: results/sample1.vcf
Multiple Samples
Process multiple samples sequentially:
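One way to process several samples one after another is to repeat the single-sample command in a shell loop. This is a sketch only: `run_each_bam` and the `GBCMS` variable are hypothetical, the loop uses only the flags shown in the single-sample example, and the CLI may offer a more direct way to pass several BAM files.

```shell
# Hypothetical wrapper: one `gbcms run` per BAM, executed sequentially.
# GBCMS can be overridden (for example GBCMS=echo) to preview the commands.
run_each_bam() {
  for bam in "$@"; do
    "${GBCMS:-gbcms}" run \
      --variants variants.vcf \
      --bam "$bam" \
      --fasta reference.fa \
      --output-dir results/ || return 1   # stop at the first failure
  done
}
```

Calling `run_each_bam sample1.bam sample2.bam` would then run the two samples in order and stop if either invocation fails.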
Or use a BAM list file:
Note: The CLI processes samples sequentially. For parallel processing of many samples, use the Nextflow Workflow.
Common Options
Output Format
VCF (default):
MAF:
Custom Sample IDs
Override the sample name:
Output: results/MySampleID.vcf
Output Suffix
Add a suffix to output filenames:
Output: results/sample.genotyped.vcf
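The Output lines in this guide suggest a naming pattern of output directory, sample name, optional suffix, then format extension. The sketch below encodes that inferred pattern so a script can locate results. `expected_output` is a hypothetical helper, and the assumption that the suffix is written with its leading dot (`.genotyped`) is not confirmed by the CLI itself.

```shell
# Hypothetical helper: predict the output path from the pattern seen in this guide.
# Usage: expected_output <output-dir> <bam-or-sample-id> [suffix] [extension]
expected_output() {
  dir=${1%/}                      # drop one trailing slash from the directory
  sample=$(basename "$2" .bam)    # sample1.bam -> sample1; plain IDs pass through
  suffix=${3:-}                   # e.g. ".genotyped" (assumed to include the dot)
  ext=${4:-vcf}                   # VCF is the default output format
  printf '%s/%s%s.%s\n' "$dir" "$sample" "$suffix" "$ext"
}
```

For instance, `expected_output results/ sample.bam .genotyped` prints `results/sample.genotyped.vcf`, matching the output shown above.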
Threading
Use multiple threads for processing:
Quality Filters
Minimum mapping quality:
Minimum base quality:
Filter duplicates (default: enabled):
Filter secondary alignments:
Complete Example
Process a sample with strict filtering:
Output: genotyped_results/TumorSample.genotyped.vcf
Using Docker
Run via Docker container:
Next Steps
Many samples on HPC: See Nextflow Workflow
Usage patterns: See Usage Overview