Installation and Usage
How to install and run the workflow
Step 1: Create a virtual environment
cwltool and toil should be installed under Python 3.6, as done below. Either virtualenv or conda can be used; here we use virtualenv.
pip3 install virtualenv
python3 -m venv my_project
source my_project/bin/activate
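Before moving on, it can help to confirm that the environment is actually active in the current shell. The following is a minimal sketch, not part of the repository: venv_status is a hypothetical helper name, and it relies only on the VIRTUAL_ENV variable that the activate script exports.

```shell
# Hypothetical helper (not shipped with the workflow): report whether a
# virtual environment is active in the current shell.
# "source my_project/bin/activate" exports VIRTUAL_ENV, so an empty or
# unset value means no environment is active.
venv_status() {
    if [ -n "${VIRTUAL_ENV:-}" ]; then
        echo "active: ${VIRTUAL_ENV}"
    else
        echo "inactive"
    fi
}

venv_status
```

If this prints "inactive", run the source command again in the same shell before installing anything.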
Step 2: Clone the repository
git clone --recursive https://github.com/msk-access/uncollapsed_bam_generation.git
cd uncollapsed_bam_generation
git submodule update --recursive --remote
Step 3: Install requirements using pip
The requirements.txt file already pins the versions of cwltool and the other packages. Please install from it.
#python2
pip install -r requirements.txt
#python3
pip3 install -r requirements.txt
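A quick way to see whether the install put both runners on PATH is sketched below. check_tools is a hypothetical helper introduced only for illustration; it uses the standard command -v lookup and does not run either tool.

```shell
# Hypothetical helper: print whether each named command can be found on PATH.
check_tools() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: found"
        else
            echo "$tool: missing"
        fi
    done
}

# The two runners used in the steps that follow.
check_tools cwltool toil-cwl-runner
```

A "missing" line usually means the virtual environment is not active or the pip install did not complete.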
Once the requirements are installed, the workflow can be run with cwltool or toil, provided you have a proper input file in either JSON or YAML format. Please look at Inputs Description for more details.
Here we show how to use cwltool to run the workflow on a single machine.
Step 4: Run the workflow with a given set of inputs using cwltool on a single machine
cwltool uncollapsed_bam_generation.cwl inputs.yaml
Here we show how to run the workflow with toil-cwl-runner on a single machine.
Step 4: Run the workflow with a given set of inputs using toil on a single machine
toil-cwl-runner uncollapsed_bam_generation.cwl inputs.yaml
Here we show how to run the workflow with toil-cwl-runner on JUNO, the MSKCC internal compute cluster, which uses IBM LSF as its scheduler.
Step 4: Run the workflow with a given set of inputs using toil on JUNO (MSKCC Research Cluster)
export TMPDIR=$PWD
export TOIL_LSF_ARGS='-W 3600'
toil-cwl-runner \
--singularity \
--logFile /path/to/toil_log/cwltoil.log \
--jobStore /path/to/jobStore \
--batchSystem lsf \
--workDir /path/to/toil_log \
--outdir $PWD \
--writeLogs /path/to/toil_log \
--logLevel DEBUG \
--stats \
--retryCount 2 \
--disableCaching \
--disableChaining \
--maxLogFileSize 20000000000 \
--cleanWorkDir onSuccess \
--preserve-environment TOIL_LSF_ARGS TMPDIR \
/path/to/uncollapsed_bam_generation.cwl \
/path/to/inputs.yaml \
> toil.stdout \
2> toil.stderr &
The workflow should now be running on the specified batch system.
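Before launching the toil command on JUNO, the directories it points at need some attention. The sketch below is an assumption-laden helper, not part of the repository: prepare_toil_dirs is a hypothetical name, and its argument stands in for the elided /path/to prefix. It assumes, as we understand Toil's behaviour, that the work directory must already exist and that a fresh run expects the job store not to exist yet.

```shell
# Hypothetical helper: prepare the directories used by the toil command.
# $1 stands in for the /path/to prefix; pass your own location.
prepare_toil_dirs() {
    base="$1"
    # --workDir, --writeLogs and --logFile all point into toil_log,
    # so create it before the run starts.
    mkdir -p "$base/toil_log"
    # A leftover job store from an earlier run blocks a fresh start.
    if [ -e "$base/jobStore" ]; then
        echo "jobStore exists: remove it or resume the earlier run with --restart"
        return 1
    fi
    echo "ready"
}

# Example call, with a placeholder location:
#   prepare_toil_dirs /path/to
```

Because the command is sent to the background, progress can be followed with tail -f toil.stderr, and the submitted jobs can be listed with the LSF bjobs command.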