Installation and Usage
If you have paired-end, UMI-tagged FASTQ files, you can run the ACCESS FASTQ-to-BAM workflow with the following steps.
Step 1: Create a virtual environment.
Option (A) - if using cwltool
If you are using cwltool only, use Python 3.9 as shown below.
You can use either virtualenv or conda; this example uses conda.
conda create --name my_project python=3.9
conda activate my_project
Option (B) - if using toil (recommended for Juno HPC cluster)
If you are using toil, Python 3 is required. Please use Python 3.9 as shown below.
You can use either virtualenv or conda; this example uses conda.
conda create --name my_project python=3.9
conda activate my_project
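Before moving on, it can help to confirm that the activated environment really provides Python 3.9. The following is a minimal sketch, not part of the nucleo repository: the helper name is_python_39 is hypothetical, and it assumes a python executable is on PATH once the environment is active.

```shell
# Hypothetical helper: succeed only if the version string is Python 3.9 or 3.9.x.
is_python_39() {
    case "$1" in
        3.9|3.9.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report the interpreter found in the currently active environment, if any.
if command -v python >/dev/null 2>&1; then
    active="$(python -c 'import platform; print(platform.python_version())')"
    if is_python_39 "$active"; then
        echo "OK: Python $active is active"
    else
        echo "Found Python $active; expected 3.9.x (is my_project activated?)"
    fi
fi
```

If the second message appears, run conda activate my_project again and repeat the check.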
Step 2: Clone the repository
git clone --recursive --branch 3.0.4 https://github.com/msk-access/nucleo.git
Step 3: Install requirements using pip
The versions of cwltool and the other required packages are pinned in the requirements.txt file. Please install from that file.
#python3
cd nucleo
pip3 install -r requirements.txt
Step 4: Check that singularity and nodejs are available (for HPC)
HPC systems normally use singularity to run containers, so please make sure it is installed. On Juno, you can load it as follows:
module load singularity
We also need to make sure nodejs is installed. It can be installed using conda:
conda install -c conda-forge nodejs
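A quick way to confirm both tools are reachable is to look them up on PATH. This is a minimal sketch: the helper name missing_tools is hypothetical, and it assumes the executables are named singularity and node, which is what the singularity module and the conda-forge nodejs package normally provide.

```shell
# Hypothetical helper: print the name of each listed command that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Check the two tools this step asks for.
missing="$(missing_tools singularity node)"
if [ -n "$missing" ]; then
    echo "Not found on PATH:" $missing
else
    echo "singularity and node are both available"
fi
```

Anything reported as missing should be loaded or installed as described above before running the workflow.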
Step 5: Generate an inputs file
Next, you must generate a valid inputs file in either JSON or YAML format.
For details on how to create this file, please follow this example (there is a minimal example of what needs to be filled in at the end of the page):
Inputs Description
It's also possible to create and fill in a "template" inputs file using this command:
cwltool --make-template nucleo.cwl > inputs.yaml
This command does not always succeed, and the cause is not yet known. If it fails, you can use Rabix to generate the template inputs file instead.
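Because template generation can fail, it may be worth checking that the command actually produced a non-empty file before editing it. The sketch below is an illustration only: the helper name template_ok is hypothetical, and it assumes you are in the nucleo directory with cwltool installed (otherwise it does nothing).

```shell
# Hypothetical helper: succeed only if the given file exists and is non-empty.
template_ok() {
    [ -s "$1" ]
}

# Generate the template only when cwltool and nucleo.cwl are present, then verify it.
if command -v cwltool >/dev/null 2>&1 && [ -f nucleo.cwl ]; then
    if cwltool --make-template nucleo.cwl > inputs.yaml && template_ok inputs.yaml; then
        echo "Template written to inputs.yaml"
    else
        echo "Template generation failed; consider using Rabix instead" >&2
    fi
fi
```

Note that a failed run can leave an empty inputs.yaml behind because of the shell redirect, which is why the check tests for a non-empty file and not just for the file's existence.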
Once the requirements are installed and the inputs file is ready, we can run the workflow using cwltool or toil.
Step 6: Run the workflow
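As a rough sketch of what a run command might look like, the helpers below print a cwltool and a toil-cwl-runner command line so they can be reviewed before being executed. The function names, the paths (./outputs, ./jobstore), and the choice of lsf as the batch system are assumptions made for illustration; adjust them to your cluster and check the options against the versions installed from requirements.txt.

```shell
# Hypothetical helper: print a cwltool command line.
# Arguments: output directory, inputs file.
cwltool_cmd() {
    echo "cwltool --singularity --outdir $1 nucleo.cwl $2"
}

# Hypothetical helper: print a toil-cwl-runner command line for an LSF cluster.
# Arguments: job store path, output directory, inputs file.
toil_cmd() {
    echo "toil-cwl-runner --singularity --batchSystem lsf --jobStore $1 --outdir $2 nucleo.cwl $3"
}

# Print both variants with placeholder paths (assumed, not prescribed).
cwltool_cmd ./outputs inputs.yaml
toil_cmd ./jobstore ./outputs inputs.yaml
```

Run the printed command that matches the option you chose in Step 1, from inside the nucleo directory.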
Your workflow should now be running on the specified batch system. See outputs for a description of the resulting files once it has completed.