Hi,
I'm looking to take advantage of the local storage of pre-processed data, and I was wondering if there is a way/command to run multiple pre-processing scripts in parallel on multiple CPUs on your servers. For example, I have a pre-processing script I would like to run on each individual training image, so I have as many pre-processing scripts as there are images, and I'd estimate they each take about 1 hour to run. Can this be parallelized, or should I just run each script individually?
Thanks!
I can recommend two approaches: GNU Parallel and xargs.
Here is an example on how to convert all the DICOM images in a directory to PNG images using [GNU Parallel](https://www.gnu.org/software/parallel/) and [ImageMagick](http://www.imagemagick.org/script/index.php):
```bash
find /data/ -name "*.dcm" | parallel 'convert {} {/.}.png'
```
"{}" and "{/.}" are placeholders defined by GNU Parallel:
{}: absolute path to a DICOM image (e.g. "/data/image.dcm")
{/.}: "/" means "without the path to the directory and "." means "without the extension" (if {} represents "/data/image.dcm" then {/.} is set to "image")
By default, GNU Parallel uses all available CPU cores.
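If you prefer to leave some cores free for other work, the -j option limits the number of concurrent jobs (the 4 below is just an example value):
```bash
# Run at most 4 conversions at a time
find /data/ -name "*.dcm" | parallel -j 4 'convert {} {/.}.png'
```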
You can also place a list of commands in a text file (one per line) and use the following command to run them in parallel:
```bash
parallel < commands.txt
```
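For your use case, commands.txt could simply contain one invocation of your pre-processing script per training image; the script name and paths below are only hypothetical placeholders:
```bash
# commands.txt -- one pre-processing command per image (hypothetical names/paths)
python preprocess.py /data/train/image001.dcm
python preprocess.py /data/train/image002.dcm
python preprocess.py /data/train/image003.dcm
```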
See the [GNU Parallel Tutorial](https://www.gnu.org/software/parallel/parallel_tutorial.html) for more detailed information.
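For completeness, here is a rough sketch of the xargs alternative mentioned above. The -P option of xargs sets the number of parallel processes, but xargs has no built-in filename manipulation like {/.}, so a small shell wrapper strips the directory and extension (the job count of 4 is again just an example):
```bash
# -print0/-0 handle filenames with spaces; -P 4 runs up to 4 processes at a time.
# The sh -c wrapper strips the directory and the .dcm extension for the output name.
find /data/ -name "*.dcm" -print0 |
  xargs -0 -P 4 -I {} sh -c 'convert "$1" "$(basename "$1" .dcm).png"' _ {}
```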