I'm aware that the small turnaround queue is not yet available, so I submitted a two-container job (preprocess + train) using a small number of images.
The problem is that this job has been pending for over 24 hours, which leads me to believe something went wrong.
Can someone check what is going on?
A snapshot with the details is below:
[pending job](https://www.dropbox.com/s/bu4j13kp9z8649u/img6.png?dl=0)
Created by Jose Costa Pereira josecp

Jose: We just added several additional servers to clear a backlog of submissions. Your pending submissions should now have been processed. We are also working on improved tools for tracking submissions. Thank you for your patience.

Hello Bruce. This week I've made only two submissions.
The first one ran, and I got a notification for the log file. Looking at the log file, I could see that there was an error when generating the LMDB files for Caffe, i.e. when executing './create_lmdb.sh':
```
(output omitted)
STDOUT: Image...hsqh5pls.jpeg ...Done.
STDOUT: Image...298uvj4j.jpeg ...Done.
STDOUT: Image...x9w1a39o.jpeg ...Done.
STDOUT: Image...8v6v1yx5.jpeg ...Done.
STDOUT: Image...fnkwwsm2.jpeg ...Done.
STDERR: /opt/caffe/.build_release/tools/convert_imageset: error while loading shared libraries: libcudart.so.6.5: cannot open shared object file: No such file or directory
STDOUT: Creating val lmdb...
```
Subsequently, I got a notification for job terminated. So this one must not be pending.
The second job I submitted this week (still showing here: https://www.synapse.org/#!Synapse:syn4224222/wiki/406828) is still pending (48 hours now). By the way, this job is mostly an attempt to solve the problem I detected with my previous submission: it looks up where CUDA's shared libraries (libcudart.so.6.5) are.
I have not submitted anything else this week, and if there are any pending jobs from previous weeks (other than the one mentioned above), you're welcome to terminate them. I would gladly do that myself if I could see them. The only job I'm interested in right now is the one I gave the suggestive name find_libcuda.so (ID 7354918). I'm basically trying to reverse engineer which CUDA libraries are installed on your systems, and where.
If you can shed some light on this, I would greatly appreciate it.
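For reference, here is a minimal sketch of the kind of lookup such a job could run inside the container. It is not the actual contents of find_libcuda.so; the function name `find_shared_libs` and the list of candidate directories are assumptions made for illustration, and the real install location on the challenge servers is unknown.

```python
import fnmatch
import os


def find_shared_libs(pattern, roots):
    """Return sorted, de-duplicated paths under roots whose file name matches pattern."""
    matches = set()
    for root in roots:
        if not os.path.isdir(root):
            continue  # skip directories that do not exist on this machine
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                if fnmatch.fnmatch(name, pattern):
                    matches.add(os.path.join(dirpath, name))
    return sorted(matches)


if __name__ == "__main__":
    # Candidate roots are guesses, not known locations on the challenge servers.
    candidate_roots = ["/usr/local", "/opt", "/usr/lib", "/usr/lib64"]
    ld_path = os.environ.get("LD_LIBRARY_PATH", "")
    candidate_roots += [p for p in ld_path.split(":") if p]

    # Printed to STDOUT so that it shows up in the job's log file.
    print("LD_LIBRARY_PATH =", ld_path or "(unset)")
    for path in find_shared_libs("libcudart.so*", candidate_roots):
        print(path)
```

If the output lists a different version than libcudart.so.6.5, or a directory that is not on LD_LIBRARY_PATH, that would explain why convert_imageset fails to load the library.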
PS: I couldn't find what the "see below" comment in your previous post was referring to.

Jose: You have created several submissions (see below). The last one is enqueued for processing. We are working to add more servers to process the backlog.
${leaderboard?path=%2Fevaluation%2Fsubmission%2Fquery%3Fquery%3Dselect%2B%2A%2Bfrom%2Bevaluation%5F7213944%2BWHERE%2BuserId%253D%253D%25223345451%2522&paging=true&queryTableResults=true&showIfLoggedInOnly=false&pageSize=100&showRowNumber=false&jsonResultsKeyName=rows&columnConfig0=epochdate%2C%2CcreatedOn%3B%2CNONE&columnConfig1=none%2C%2Cstatus%3B%2CNONE&columnConfig2=none%2C%2CobjectId%3B%2CNONE&columnConfig3=userid%2C%2CuserId%3B%2CNONE&columnConfig4=none%2C%2Cname%3B%2CNONE&columnConfig5=synapseid%2C%2CentityId%3B%2CNONE&columnConfig6=none%2C%2CversionNumber%3B%2CNONE&columnConfig7=epochdate%2C%2CcreatedOn%3B%2CNONE}