I recently submitted the same job twice by accident, so the two submissions are exactly identical. I canceled one of them within 10 minutes to avoid wasting my time quota. However, I found something interesting when comparing the two log files: one submission reported an IOError during a Python import, but the other one didn't. (By the way, this IOError does not seem to be fatal; both submissions continued to run my program.) Since the two submissions are identical, I am wondering how this can happen. The only factor that might be different is the underlying server. I would appreciate any ideas on this problem. @thomas.yu @tschaffter @brucehoff

I have attached the first several lines of both log files:

Log file 1: **with IOError**

STDOUT: Fri Jan 20 18:52:46 UTC 2017
STDOUT: copy val.lst to /scratch folder
STDOUT: done
STDOUT: total 116K
STDOUT: -rw-r--r--. 1 root root 37K Jan 20 18:52 image_list.txt
STDOUT: -rw-r--r--. 1 root root 73K Jan 20 18:52 val.lst
STDOUT: total 120K
STDOUT: -rw-r--r--. 1 root root 37K Jan 20 18:52 image_list.txt
STDOUT: drwxr-xr-x. 2 root root 4.0K Jan 20 18:52 images_crop
STDOUT: -rw-r--r--. 1 root root 73K Jan 20 18:52 val.lst
STDOUT: Fri Jan 20 18:52:46 UTC 2017
STDOUT: start to preprocess the .dcm to png
STDERR: parallel: Warning: $SHELL not set. Using /bin/sh.
STDERR: Traceback (most recent call last):
STDERR: File "main_dcm_crop_png.py", line 6, in <module>
STDERR: from pylab import *
STDERR: File "/usr/lib/pymodules/python2.7/pylab.py", line 1, in <module>
STDERR: from matplotlib.pylab import *
STDERR: File "/usr/lib/pymodules/python2.7/matplotlib/pylab.py", line 226, in <module>
STDERR: import matplotlib.finance
STDERR: File "/usr/lib/pymodules/python2.7/matplotlib/finance.py", line 23, in <module>
STDERR: from matplotlib.collections import LineCollection, PolyCollection
STDERR: File "/usr/lib/pymodules/python2.7/matplotlib/collections.py", line 23, in <module>
STDERR: import matplotlib.backend_bases as backend_bases
STDERR: File "/usr/lib/pymodules/python2.7/matplotlib/backend_bases.py", line 50, in <module>
STDERR: import matplotlib.textpath as textpath
STDERR: File "/usr/lib/pymodules/python2.7/matplotlib/textpath.py", line 11, in <module>
STDERR: import matplotlib.font_manager as font_manager
STDERR: File "/usr/lib/pymodules/python2.7/matplotlib/font_manager.py", line 1356, in <module>
STDERR: _rebuild()
STDERR: File "/usr/lib/pymodules/python2.7/matplotlib/font_manager.py", line 1343, in _rebuild
STDERR: pickle_dump(fontManager, _fmcache)
STDERR: File "/usr/lib/pymodules/python2.7/matplotlib/font_manager.py", line 939, in pickle_dump
STDERR: with open(filename, 'wb') as fh:
**STDERR: IOError: [Errno 2] No such file or directory: '/tmp/matplotlib-root/fontList.cache'**
STDOUT: Convert /trainingData/495822.dcm to /scratch/images_crop/495822.png
STDOUT: AcqTime: ; Manu: HOLOGIC, Inc.; Reso: ['0.0700', '0.0700']; Rows: 3328; Cols: 2560
STDERR: libdc1394 error: Failed to initialize libdc1394
STDOUT: Convert /trainingData/494052.dcm to /scratch/images_crop/494052.png
STDOUT: AcqTime: ; Manu: HOLOGIC, Inc.; Reso: ['0.0700', '0.0700']; Rows: 3328; Cols: 2560

Log file 2: **No IOError**

STDOUT: Fri Jan 20 18:54:16 UTC 2017
STDOUT: copy val.lst to /scratch folder
STDOUT: done
STDOUT: total 116K
STDOUT: -rw-r--r--. 1 root root 37K Jan 20 18:54 image_list.txt
STDOUT: -rw-r--r--. 1 root root 73K Jan 20 18:54 val.lst
STDOUT: total 120K
STDOUT: -rw-r--r--. 1 root root 37K Jan 20 18:54 image_list.txt
STDOUT: drwxr-xr-x. 2 root root 4.0K Jan 20 18:54 images_crop
STDOUT: -rw-r--r--. 1 root root 73K Jan 20 18:54 val.lst
STDOUT: Fri Jan 20 18:54:16 UTC 2017
STDOUT: start to preprocess the .dcm to png
STDERR: parallel: Warning: $SHELL not set. Using /bin/sh.
STDOUT: Convert /trainingData/495847.dcm to /scratch/images_crop/495847.png
STDOUT: AcqTime: ; Manu: HOLOGIC, Inc.; Reso: ['0.0700', '0.0700']; Rows: 3328; Cols: 2560
STDERR: libdc1394 error: Failed to initialize libdc1394
STDOUT: Convert /trainingData/495818.dcm to /scratch/images_crop/495818.png
STDOUT: AcqTime: ; Manu: HOLOGIC, Inc.; Reso: ['0.0700', '0.0700']; Rows: 3328; Cols: 2560
STDERR: libdc1394 error: Failed to initialize libdc1394

Created by Bibo Shi darrylbobo
> The only factor that might be different is the underlying server.

Not true. You could have a race condition in your code. https://en.wikipedia.org/wiki/Race_condition
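One plausible reading of the logs, which I have not verified: the `parallel:` warning shows that GNU parallel launches the conversion script, so several Python processes may import `pylab` at about the same time. Each of them may then try to rebuild the font cache under the shared `/tmp/matplotlib-root/` directory, and whether one of them fails depends on timing. A minimal sketch of a workaround is to give every process its own directory through matplotlib's `MPLCONFIGDIR` environment variable before the import. The helper name `use_private_mpl_config_dir` is hypothetical, and I am assuming the matplotlib version in the image stores `fontList.cache` in the directory that `MPLCONFIGDIR` names.

```python
import os
import tempfile


def use_private_mpl_config_dir():
    """Point matplotlib at a fresh, process-private config directory.

    Hypothetical helper, not part of the original script. It has to run
    before the first matplotlib or pylab import in the process.
    """
    # mkdtemp creates a uniquely named directory in one step, so two
    # workers started at the same moment cannot end up with the same path.
    config_dir = tempfile.mkdtemp(prefix="mplconfig-")
    # Assumption: matplotlib reads this variable when it is imported.
    os.environ["MPLCONFIGDIR"] = config_dir
    return config_dir


config_dir = use_private_mpl_config_dir()
# The line `from pylab import *` in main_dcm_crop_png.py would follow here.
```

The trade-off is that every process rebuilds its own font cache, which costs some startup time but removes the shared file that the processes would otherwise compete for.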

Two exactly same submission, but one report ERR