Hello,
approximately how many CPU cores and how much memory will be available to an individual Docker container?
Kind regards,
Dominik

Hi @brian.white,
Thank you very much for the generous upgrade. I think we can work with that!
I can confirm that our container runs on 8 cores in parallel locally, but not yet in the Challenge queue. See https://www.synapse.org/#!Synapse:syn15589870/discussion/threadId=5784 for details.
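For anyone who wants to probe this from inside their own container, a minimal check (assuming a Linux base image; note that `os.sched_getaffinity` reflects cpuset restrictions but not CFS quotas such as `docker run --cpus`):

```python
# Probe the CPUs actually usable inside the container (Linux only).
import os

print("CPUs visible to the OS:", os.cpu_count())
# sched_getaffinity reflects cpuset/affinity limits (e.g. --cpuset-cpus),
# though not CFS quota limits such as docker run --cpus.
print("CPUs usable by this process:", len(os.sched_getaffinity(0)))
```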
Best,
Dominik

Hi @djo and others who may have long-running jobs,
Thank you for the additional detail, Dominik.
Can you confirm that your code will be able to use multiple CPUs on one node?
Our current thinking is to allow each model access to 8 CPUs with ~4 GB per CPU (a total of 31 GB on the node). The total number of samples in leaderboard round 1 is 332 (across 7 datasets) and in round 2 is 322 (across 6 datasets). The validation data will be similarly sized. Please note that we expect batch effects across these datasets.
Am I understanding correctly that ~300 samples run in parallel across 8 CPUs would take 300 samples × 20 minutes/sample ÷ 8 CPUs = 750 minutes, or about 12.5 hours?
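For reference, the back-of-the-envelope arithmetic behind that estimate (the 20 minutes/sample figure comes from Dominik's post):

```python
# Back-of-the-envelope runtime estimate for one leaderboard round.
samples = 300            # ~332 in round 1, ~322 in round 2
minutes_per_sample = 20  # figure quoted by Dominik
cpus = 8

total_minutes = samples * minutes_per_sample / cpus
print(f"{total_minutes:.0f} min = {total_minutes / 60:.1f} h")  # 750 min = 12.5 h
```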
We will impose a 2-day time limit.
Does that seem reasonable?
Each round and each sub-challenge would run on a node with 48 CPUs and 188 GB, hence allowing 6 submissions to run in parallel. Our expectation is that at most a few submissions will be long-running and that the rest will proceed relatively quickly through the queue.
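(The packing arithmetic: 6 submissions × 8 CPUs = 48 CPUs, and 6 × ~31 GB ≈ 186 GB, which fits within the node's 188 GB.)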
Some other participants were interested in using raw FASTQs (and presumably doing alignments). We would be interested in hearing from others who have long-running jobs so that we can provision resources appropriately. We can revisit this after the first leaderboard round ... and, possibly, during the round if the need arises.
Thank you,
Brian

To elaborate on our requirements:
We assume cell type characteristics are probability density functions over the whole expression vector, featuring multivariate dependencies such as pairwise co-expression. With the resulting model, we hope to introduce a novel approach to the competition. Computing the model was made feasible only by heavy compression of ~50,000 genes down to fewer than 20 dimensions. I believe we cannot sacrifice any more accuracy without making the model predictions meaningless, and hence cannot reduce the algorithm's time requirements under our base assumption.
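As a rough illustration of such a compression step (PCA is used here purely as a stand-in; our actual reduction method is not detailed in this thread):

```python
# Illustrative only: compress ~50,000-gene expression vectors to <20 dims.
# PCA is a stand-in; the actual compression method may differ.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
expression = rng.lognormal(size=(332, 50_000))        # samples x genes, toy data

pca = PCA(n_components=19)
compressed = pca.fit_transform(np.log1p(expression))  # log-transform, then reduce
print(compressed.shape)                               # (332, 19)
```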
Please consider my argument during the decision-making on time constraints.

Hi @brian.white,
Thank you for the reply.
A runtime of half an hour is close to impossible for our algorithm to achieve. We apply [Automatic Differentiation Variational Inference](https://arxiv.org/abs/1603.00788) (ADVI), which relies on a numerical optimization process. This is already a much faster alternative to Hamiltonian (hybrid) Monte Carlo for exploring the rather large parameter space. The algorithm can comfortably run for several hours to achieve convergence and starts to produce reasonable results only after around 20 minutes per sample per CPU.
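As a toy illustration of this optimization-based inference (shown with PyMC3 purely as an example; it is not necessarily the tooling in our container, and the real model is far larger):

```python
# Toy ADVI run: optimize the ELBO instead of sampling with MCMC.
import numpy as np
import pymc3 as pm

data = np.random.default_rng(1).normal(loc=2.0, scale=1.5, size=500)

with pm.Model():
    mu = pm.Normal("mu", mu=0.0, sigma=10.0)
    sigma = pm.HalfNormal("sigma", sigma=5.0)
    pm.Normal("obs", mu=mu, sigma=sigma, observed=data)

    approx = pm.fit(n=30_000, method="advi")  # stochastic ELBO optimization
    trace = approx.sample(1_000)              # draw from the fitted approximation

print(trace["mu"].mean(), trace["sigma"].mean())
```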
In light of previous announcements, we implemented continuous updating of the output, so that even if the container is stopped, valid results will still have been written. Premature termination will not yield accurate predictions, however, and samples that have not yet started will contain only zeros.
We could run all samples in parallel to ensure values are produced for each one, but that requires up to 1 GB of RAM per sample.
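The update pattern is roughly the following (paths and format are illustrative; the atomic replace ensures a stopped container never leaves a half-written file):

```python
# Periodically write current estimates so a stopped container still
# leaves valid, if premature, output. Paths/format are illustrative.
import json
import os
import tempfile

def checkpoint(predictions, path="output/predictions.json"):
    """Atomically replace the output file with the current estimates."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path))
    with os.fdopen(fd, "w") as handle:
        json.dump(predictions, handle)
    os.replace(tmp, path)  # atomic on POSIX: no partially written file

# inside the optimization loop, e.g.:
# if step % 100 == 0:
#     checkpoint(current_predictions)
```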
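Sketched with a process pool (the worker body is a placeholder; the point is that peak memory scales with the number of concurrently running samples):

```python
# Placeholder per-sample job; in reality this would run the ADVI fit
# (~1 GB RAM per concurrently running sample in our case).
from multiprocessing import Pool

def infer(sample_id):
    return sample_id, sum(i * i for i in range(100_000))

if __name__ == "__main__":
    sample_ids = range(300)
    with Pool(processes=300) as pool:  # all samples at once: ~300 GB peak
        results = dict(pool.map(infer, sample_ids))
    print(len(results), "samples finished")
```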
Could more resources be made available?
Kind regards,
Dominik

Hi @djo,
We are still resolving this ... we are thinking something along the lines of a 1/2-hour maximum execution time, 4 CPUs, and 16 GB (per model execution). I know you have been corresponding with @andrewelamb elsewhere about this; let's move that thread here.
Does this seem reasonable? It may not be if you are doing alignment. Are you? Note that our validation data will have at least 100 samples.
Thanks,
Brian @andrewelamb thank you for the replay! @djo
There will be an announcement soon about this.