FAQ

Are CWL Directory inputs and outputs supported by Beagle?

Yes. However, to use any of the files inside a directory in downstream workflows, those files should also be declared as separate outputs of type File.

How can I use the files from one pipeline as inputs to another?

Query the FileRepository for matching File objects, and use them in the Operator as inputs:

<code example>

Why are there multiple run_ids associated with an instance of an operator?

One operator can produce multiple runs. If an operator is connected to a second operator, the second one either aggregates all runs produced by the first operator into a single run_ids list (aggregate), or creates a separate operator run for each run from the first operator (foreach). Whether aggregate or foreach is used is specified in the operator trigger.
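
The difference between the two trigger types can be sketched in plain Python. This is only an illustration of the grouping behavior described above, not Beagle's actual implementation; the function name and the string values "aggregate" and "foreach" are assumptions made for the example.

```python
from typing import List


def downstream_run_ids(upstream_run_ids: List[str], trigger_type: str) -> List[List[str]]:
    """Return one run_ids list per instance of the second operator.

    Hypothetical helper: it only models how run ids are grouped.
    """
    if trigger_type == "aggregate":
        # One downstream instance that receives every upstream run id.
        return [list(upstream_run_ids)]
    if trigger_type == "foreach":
        # One downstream instance per upstream run, each with a single run id.
        return [[run_id] for run_id in upstream_run_ids]
    raise ValueError(f"Unknown trigger type: {trigger_type}")
```

With three upstream runs, aggregate yields a single instance whose run_ids list has three entries, while foreach yields three instances with one entry each.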

Can I set arbitrary metadata to be associated with a run so that it can be queried after pipeline completion?

Yes. Use the “tags” field of the APICreateRunSerializer to add this information. It will be stored along with the successful run.

How can I see what is currently in the RabbitMQ queue?

Use the RabbitMQ admin panel <link>. You can view and clear queues from there.

How can I go from a Port object to the metadata associated with the File that it represents?

The port has a field called db_value. Its location field contains a UUID that can be used to query for the File itself.
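
As a rough sketch, the UUID can be pulled out of the location string with a regular expression. The shape of db_value and the "bid://" prefix in the sample value are assumptions made for illustration, and the helper name is hypothetical. The pattern only looks for a UUID anywhere in the string, so it does not depend on the exact prefix.

```python
import re
import uuid
from typing import Optional

# Matches a canonical 8-4-4-4-12 hexadecimal UUID anywhere in a string.
UUID_PATTERN = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def file_id_from_db_value(db_value: dict) -> Optional[uuid.UUID]:
    """Return the UUID embedded in db_value["location"], or None if absent."""
    location = db_value.get("location", "")
    match = UUID_PATTERN.search(location)
    return uuid.UUID(match.group(0)) if match else None


# Made-up sample value; a real port's db_value may contain more fields.
example_db_value = {
    "class": "File",
    "location": "bid://0f8fad5b-d9cb-469f-a165-70867728950e",
}
file_id = file_id_from_db_value(example_db_value)
```

The resulting file_id can then be used in a File query, for example as a primary-key lookup if the UUID is the File's id in your deployment.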

How can I save specific metadata about a pipeline output after the run is successful?

This is a common use case. If your pipeline is written in CWL, each output of type Record is stored as a single output Port. You can therefore have your pipeline output Record objects with separate fields for the File and for any other information that should be associated with it. If you are not using CWL, you can use the “tags” field of the APICreateRunSerializer to include this metadata. Note, however, that it will be associated with the run as a whole and cannot be linked to individual output files.

What would be the best way to query for all files that have come from a specific pipeline?

Query for Pipeline -> Runs -> Ports -> Files

Port.objects.filter(run__app__name="pipeline_name")

I’m seeing an error in the logs: “Failed to create run. Failed to resolve CWL Command”

The CWL is probably invalid. Run the “cwltool --validate” command on it to find and resolve any CWL errors.
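
If you want to run the validation from a script, a minimal sketch using Python's subprocess module might look like the following. It assumes cwltool is installed and on the PATH. The helper names and the file name workflow.cwl are hypothetical.

```python
import subprocess
from typing import List, Tuple


def build_validate_command(cwl_path: str) -> List[str]:
    """Build the argument list for validating a single CWL document."""
    return ["cwltool", "--validate", cwl_path]


def validate_cwl(cwl_path: str) -> Tuple[bool, str]:
    """Run cwltool --validate and return (is_valid, combined output).

    Raises FileNotFoundError if the cwltool executable is not installed.
    """
    result = subprocess.run(
        build_validate_command(cwl_path),
        capture_output=True,
        text=True,
    )
    # cwltool exits with a non-zero status when validation fails.
    return result.returncode == 0, result.stdout + result.stderr
```

Calling validate_cwl("workflow.cwl") returns a boolean and the tool's output, which usually points at the offending line of the document.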

I’m seeing an error: “fatal: could not read Username for 'https://github.com': No such device or address”

Use the “git@” (SSH) form of the repository address instead of the “https://” form so that Beagle can find your pipeline on GitHub.
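
For reference, the conversion between the two address forms can be sketched as a small Python helper. The function name and the example repository are hypothetical, and appending ".git" follows common convention rather than anything Beagle requires.

```python
from urllib.parse import urlparse


def https_to_ssh(url: str) -> str:
    """Convert https://host/org/repo into git@host:org/repo.git.

    Anything that is not an https URL is returned unchanged.
    """
    parsed = urlparse(url)
    if parsed.scheme != "https" or not parsed.hostname:
        return url
    path = parsed.path.strip("/")
    if not path.endswith(".git"):
        path += ".git"
    return f"git@{parsed.hostname}:{path}"


# Hypothetical repository used only to show the two forms.
ssh_url = https_to_ssh("https://github.com/example-org/example-pipeline")
```

Here ssh_url is git@github.com:example-org/example-pipeline.git, which is the form to register for the pipeline.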

What is the difference between the “path” and “location” fields, and when should each be used?

A: ?

MSK-specific FAQ for JUNO

Q: Where are the Beagle and Ridgeback logs located?

They are here:

Staging Beagle: /srv/services/staging_voyager/logs

Staging Ridgeback: /srv/services/staging_voyager/celery/ridgeback_worker.log

Prod Beagle:

Prod Ridgeback:

Q: How do I access the staging and production instances of Beagle and Ridgeback?

Include host / port / user / pass for each environment
