Greetings.
I've set up a VM on Google (from scratch, starting from a base Ubuntu image) and I'm trying to run the example RSEM workflow via:
SMC-RNA-Challenge/script/dream_runner.py \
test dryrun1 \
SMC-RNA-Challenge/SMC-RNA-Examples/workflow/smcIsoform-rsem-workflow.cwl isoform
This essentially ends up running the following:
cwl-runner --cachedir cwl-cache \
SMC-RNA-Challenge/SMC-RNA-Examples/workflow/smcIsoform-rsem-workflow.cwl \
dream_runner_input_rtdTnE.json
The process runs fine until it tries to actually run RSEM, where it fails with the error message:
Could not locate a Bowtie index corresponding to basename "/var/lib/cwl/stg306ef69b-4356-4fe2-9cb4-9c87c1020888/GRCh37"
After digging into the internals, I think I've figured out what's causing the problem. The RSEM/Bowtie index, which dream_runner.py provides somewhat secretively, is found at ~/.synapseCache/327/8741327/rsem_index.tar.gz. Part of the rsem cwl pipeline involves untar'ing this file and caching the results in the cwl-cache/... directory. This tar sub-workflow (tar.cwl) provides a list of all the untar'd output files as input to the next sub-workflow, which runs rsem (rsem.cwl). When the rsem cwl is executed, it launches the Docker container containing rsem and mounts each of the untar'd and cached files as a separate volume. When the bowtie command runs inside the rsem container, it needs to find all the index files in a single directory, and it can't, because each index file is mounted under its own staging directory (see the --volume arguments in the output below). Ideally, there would be a way to just mount the directory containing all the index files and to pass this directory name as an input parameter to the rsem workflow.
Would someone please advise on how to move forward with this?
The entire output and error messages are shown below. Thanks in advance!
============== output below ====================
```
/usr/local/bin/cwl-runner 1.0.20160726135535
Unknown hint file:///home/bhaas/SMC-RNA-Challenge/SMC-RNA-Examples/workflow/synData
[job tar] Output of job will be cached in /home/bhaas/cwl-cache/af32f8229ab806905775aa40754adcf4
[job tar] /home/bhaas/cwl-cache/af32f8229ab806905775aa40754adcf4$ tar \
xvzf \
/tmp/tmpaTnttj/stg4432c7ab-1114-4338-8fe3-2c36fc7515b7/rsem_index.tar.gz
rsem_references/
rsem_references/GRCh37.n2g.idx.fa
rsem_references/GRCh37.chrlist
rsem_references/GRCh37.seq
rsem_references/GRCh37.transcripts.fa
rsem_references/GRCh37.3.ebwt
rsem_references/GRCh37.rev.1.ebwt
rsem_references/GRCh37.2.ebwt
rsem_references/GRCh37.rev.2.ebwt
rsem_references/GRCh37.idx.fa
rsem_references/GRCh37.ti
rsem_references/GRCh37.grp
rsem_references/GRCh37.1.ebwt
rsem_references/GRCh37.4.ebwt
[step tar] completion status is success
[job gunzip2] Output of job will be cached in /home/bhaas/cwl-cache/b99265d69c2341207ce40de4030869fe
[job gunzip2] /home/bhaas/cwl-cache/b99265d69c2341207ce40de4030869fe$ gunzip \
-c \
/tmp/tmpgaEiOc/stg85aae06a-8eb2-4d2e-a5c5-83ad6b74033a/dryrun1_mergeSort_2.fq.gz > /home/bhaas/cwl-cache/b99265d69c2341207ce40de4030869fe/dryrun1_mergeSort_2.fq
[step gunzip2] completion status is success
[job gunzip1] /home/bhaas/cwl-cache/637576ca668fda3c0716b6772ebf1dd8$ gunzip \
-c \
/tmp/tmp4SC_fS/stgec40f5ea-a1e5-4be6-a698-c44828affd06/dryrun1_mergeSort_1.fq.gz > /home/bhaas/cwl-cache/637576ca668fda3c0716b6772ebf1dd8/dryrun1_mergeSort_1.fq
[step gunzip1] completion status is success
[job rsem] /home/bhaas/cwl-cache/3db90522ae21d8c92f9ed1bb202c3cb1$ docker \
run \
-i \
--volume=/home/bhaas/cwl-cache/af32f8229ab806905775aa40754adcf4/rsem_references/GRCh37.1.ebwt:/var/lib/cwl/stg67e7b317-b1f9-47d3-9c19-8dc1ef193c34/GRCh37.1.ebwt:ro \
--volume=/home/bhaas/cwl-cache/af32f8229ab806905775aa40754adcf4/rsem_references/GRCh37.ti:/var/lib/cwl/stge23702bd-2698-4519-b0ba-f644848f2bc4/GRCh37.ti:ro \
--volume=/home/bhaas/cwl-cache/af32f8229ab806905775aa40754adcf4/rsem_references/GRCh37.grp:/var/lib/cwl/stgab302161-835e-4ab8-b135-03681e842400/GRCh37.grp:ro \
--volume=/home/bhaas/cwl-cache/af32f8229ab806905775aa40754adcf4/rsem_references/GRCh37.2.ebwt:/var/lib/cwl/stgeaffa9cc-393b-4660-9359-037dbd2f248c/GRCh37.2.ebwt:ro \
--volume=/home/bhaas/cwl-cache/b99265d69c2341207ce40de4030869fe/dryrun1_mergeSort_2.fq:/var/lib/cwl/stg3ba3ac6b-cfd8-4623-9276-863a030be1d4/dryrun1_mergeSort_2.fq:ro \
--volume=/home/bhaas/cwl-cache/af32f8229ab806905775aa40754adcf4/rsem_references/GRCh37.chrlist:/var/lib/cwl/stg3065b565-d886-4c57-a20a-696c9257f251/GRCh37.chrlist:ro \
--volume=/home/bhaas/cwl-cache/af32f8229ab806905775aa40754adcf4/rsem_references/GRCh37.idx.fa:/var/lib/cwl/stg50e650da-be0f-4aa7-917a-10e008b16406/GRCh37.idx.fa:ro \
--volume=/home/bhaas/cwl-cache/af32f8229ab806905775aa40754adcf4/rsem_references/GRCh37.seq:/var/lib/cwl/stg9eb9c5f4-f445-49cf-8005-de4fde47ca96/GRCh37.seq:ro \
--volume=/home/bhaas/cwl-cache/af32f8229ab806905775aa40754adcf4/rsem_references/GRCh37.3.ebwt:/var/lib/cwl/stg306ef69b-4356-4fe2-9cb4-9c87c1020888/GRCh37.3.ebwt:ro \
--volume=/home/bhaas/cwl-cache/af32f8229ab806905775aa40754adcf4/rsem_references/GRCh37.rev.1.ebwt:/var/lib/cwl/stg03a6732a-1ddc-4652-aadf-db769d94482f/GRCh37.rev.1.ebwt:ro \
--volume=/home/bhaas/cwl-cache/af32f8229ab806905775aa40754adcf4/rsem_references/GRCh37.n2g.idx.fa:/var/lib/cwl/stgc6b3e971-3a97-4dd5-9bae-8b22fabd72c3/GRCh37.n2g.idx.fa:ro \
--volume=/home/bhaas/cwl-cache/af32f8229ab806905775aa40754adcf4/rsem_references/GRCh37.transcripts.fa:/var/lib/cwl/stg37eb0850-e519-4640-8b1a-dfa7aff3f2f3/GRCh37.transcripts.fa:ro \
--volume=/home/bhaas/cwl-cache/af32f8229ab806905775aa40754adcf4/rsem_references/GRCh37.rev.2.ebwt:/var/lib/cwl/stg84df3cc6-143c-4377-a760-f16d72677565/GRCh37.rev.2.ebwt:ro \
--volume=/home/bhaas/cwl-cache/af32f8229ab806905775aa40754adcf4/rsem_references/GRCh37.4.ebwt:/var/lib/cwl/stg04d9492a-ab4b-4a5f-ae97-dd97336ba7ce/GRCh37.4.ebwt:ro \
--volume=/home/bhaas/cwl-cache/637576ca668fda3c0716b6772ebf1dd8/dryrun1_mergeSort_1.fq:/var/lib/cwl/stg96700e7b-4798-4f61-b31a-a3fdb5577add/dryrun1_mergeSort_1.fq:ro \
--volume=/home/bhaas/cwl-cache/3db90522ae21d8c92f9ed1bb202c3cb1:/var/spool/cwl:rw \
--volume=/tmp/tmpDXk50P:/tmp:rw \
--workdir=/var/spool/cwl \
--read-only=true \
--user=1001 \
--rm \
--env=TMPDIR=/tmp \
--env=HOME=/var/spool/cwl \
dreamchallenge/rsem \
rsem-calculate-expression \
--paired-end \
--strand-specific \
-p \
8 \
/var/lib/cwl/stg96700e7b-4798-4f61-b31a-a3fdb5577add/dryrun1_mergeSort_1.fq \
/var/lib/cwl/stg3ba3ac6b-cfd8-4623-9276-863a030be1d4/dryrun1_mergeSort_2.fq \
/var/lib/cwl/stg306ef69b-4356-4fe2-9cb4-9c87c1020888/GRCh37 \
rsemOut
bowtie -q --phred33-quals -n 2 -e 99999999 -l 25 -I 1 -X 1000 --norc -p 8 -a -m 200 -S /var/lib/cwl/stg306ef69b-4356-4fe2-9cb4-9c87c1020888/GRCh37
-1 /var/lib/cwl/stg96700e7b-4798-4f61-b31a-a3fdb5577add/dryrun1_mergeSort_1.fq -2 /var/lib/cwl/stg3ba3ac6b-cfd8-4623-9276-863a030be1d4/dryrun1_merg
eSort_2.fq | samtools view -S -b -o rsemOut.temp/rsemOut.bam -
**Could not locate a Bowtie index corresponding to basename "/var/lib/cwl/stg306ef69b-4356-4fe2-9cb4-9c87c1020888/GRCh37"**
Command: bowtie -q --phred33-quals -n 2 -e 99999999 -l 25 -I 1 -X 1000 --norc -p 8 -a -m 200 -S -1 /var/lib/cwl/stg96700e7b-4798-4f61-b31a-a3fdb557
7add/dryrun1_mergeSort_1.fq -2 /var/lib/cwl/stg3ba3ac6b-cfd8-4623-9276-863a030be1d4/dryrun1_mergeSort_2.fq /var/lib/cwl/stg306ef69b-4356-4fe2-9cb4-
9c87c1020888/GRCh37
rsem-parse-alignments /var/lib/cwl/stg306ef69b-4356-4fe2-9cb4-9c87c1020888/GRCh37 rsemOut.temp/rsemOut rsemOut.stat/rsemOut rsemOut.temp/rsemOut.ba
m 3 -tag XM
Cannot open /var/lib/cwl/stg306ef69b-4356-4fe2-9cb4-9c87c1020888/GRCh37.grp! It may not exist.
"rsem-parse-alignments /var/lib/cwl/stg306ef69b-4356-4fe2-9cb4-9c87c1020888/GRCh37 rsemOut.temp/rsemOut rsemOut.stat/rsemOut rsemOut.temp/rsemOut.b
am 3 -tag XM" failed! Plase check if you provide correct parameters/options for the pipeline!
Error while running job: Error collecting output for parameter 'output': Did not find output file with glob pattern: '[u'rsemOut.isoforms.results']
'
[job rsem] completed permanentFail
Output is missing expected field file:///home/bhaas/SMC-RNA-Challenge/SMC-RNA-Examples/workflow/smcIsoform-rsem-workflow.cwl#rsem/output
[step rsem] completion status is permanentFail
Workflow error, try again with --debug for more information:
Output for workflow not available
```
Created by Brian Haas (bhaas)
Using cwlVersion: "v1.0", with tar.cwl declaring a 'Directory' type as its output and rsem.cwl taking the index as a 'Directory' type (instead of a file array in both cases), solves this problem for me.
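For anyone hitting the same error, here is a rough sketch of what that change might look like. This is not the actual tar.cwl / rsem.cwl from the repository: the input names (`input`, `fastq1`, `fastq2`, `index`) are hypothetical, and only the command-line arguments, the `rsem_references` folder name, the `GRCh37` basename, and the `dreamchallenge/rsem` image are taken from the log above.

```yaml
# tar.cwl (sketch): return the extracted index as one Directory
cwlVersion: v1.0
class: CommandLineTool
baseCommand: [tar, xvzf]
inputs:
  input:                      # hypothetical input name
    type: File
    inputBinding:
      position: 1
outputs:
  output:
    type: Directory           # was an array of File
    outputBinding:
      glob: rsem_references   # top-level folder inside rsem_index.tar.gz
```

```yaml
# rsem.cwl (sketch): take the index as a Directory so all files share one mount
cwlVersion: v1.0
class: CommandLineTool
hints:
  DockerRequirement:
    dockerPull: dreamchallenge/rsem
baseCommand: [rsem-calculate-expression]
arguments:
  - "--paired-end"
  - "--strand-specific"
  - prefix: "-p"
    valueFrom: "8"
  - valueFrom: rsemOut
    position: 4
inputs:
  fastq1:                     # hypothetical input names
    type: File
    inputBinding:
      position: 1
  fastq2:
    type: File
    inputBinding:
      position: 2
  index:
    type: Directory           # was an array of File
    inputBinding:
      position: 3
      # Pass <staged directory>/GRCh37 as the index basename
      valueFrom: $(self.path)/GRCh37
outputs:
  output:
    type: File
    outputBinding:
      glob: rsemOut.isoforms.results
```

With the index declared as a Directory, the runner should stage it as a single volume, so bowtie can find every GRCh37.* file next to the basename it is given.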
~brian