Hi, I ran my preprocessing in the second round and stored the result in the cache. Now, when I run my training script, it reports that none of the preprocessed files exist. However, when I run ``` ls -l preprocessing ``` in train.sh to list the files in the preprocessing directory, the output shows that the missing files **do** exist. Can anyone help me with this issue? My submission ID is 8264686.

Created by Dan Chen Error202
> However, still I think it is a bug?

Yes, your code may be buggy. Best of luck in the challenge.
Hi @brucehoff, thanks for your quick reply. After putting '/' in front of the paths, my program works. I used relative paths because my program is not in the root directory on my own computer, and I sometimes want to test it there. However, I still think it is a bug, because even a relative path should work. Dan
Since the files are clearly mounted (and other users have no problem accessing their mounted preprocessing data), I feel strongly that the problem is in your code. In your current submission the file `run_cnn_k_mil_new.py` uses relative paths:
```
...
preprocesspath = 'preprocessedData/'
img_fnames = [x for x in os.listdir(preprocesspath) if x.endswith(img_ext)]
print('number of pickle files: ' + str(len(img_fnames)))
labelpath = 'preprocessedData/metadata/image_labels.txt'
...
```
I don't know why you insist on using relative paths rather than absolute ones. Simply put `/` in front of your path definitions, e.g., change 'preprocessedData/' to '/preprocessedData/'.
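For illustration only, here is a minimal sketch of listing the files through absolute paths. The mount point `/preprocessedData` comes from this thread; the helper names `list_pickles` and `describe_missing` and the `.pickle` default for `img_ext` are assumptions made for the example, not part of the submitted code.

```python
import os

# Mount point reported for the container in this thread.
PREPROCESS_DIR = '/preprocessedData'


def list_pickles(directory, img_ext='.pickle'):
    """Return sorted absolute paths of files in `directory` ending with `img_ext`.

    `img_ext` defaults to '.pickle' here as an assumption; the original
    script defines its own value elsewhere.
    """
    directory = os.path.abspath(directory)
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.endswith(img_ext)
    )


def describe_missing(path):
    """Build a message showing how `path` resolves from the current directory."""
    return 'missing: %s (cwd=%s, resolved=%s)' % (
        path, os.getcwd(), os.path.abspath(path))


# Only touch the mount when it exists, so the sketch also runs on a laptop.
if os.path.isdir(PREPROCESS_DIR):
    img_fnames = list_pickles(PREPROCESS_DIR)
    print('number of pickle files: ' + str(len(img_fnames)))
```

Because every returned path is absolute, it no longer matters which directory the training script is started from. For local testing, a different directory can be passed to `list_pickles` instead of changing the paths inside the code.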
I can inspect your running container using 'docker exec' and verify that a specific file is correctly mounted:
```
docker exec e98d01acef14 ls -al /preprocessedData/510577227.pickle
-rw-r--r--. 1 root root 134880 Feb 2 20:11 /preprocessedData/510577227.pickle
```
I see that you are running another submission right now (ID 8309950), using the preprocessed data. The 'docker inspect' command shows that the folder containing your preprocessed data, `/data/dataset4`, is correctly mounted as `/preprocessedData`.
```
ubuntu@ip-172-31-59-223:~$ docker inspect e98d01acef14
[
    {
        "Id": "e98d01acef14711c3f7a708671206f5d04f8d6552cc92346f393201a6b238316",
        ...
        "Mounts": [
            ...
            {
                "Type": "bind",
                "Source": "/data/dataset4",
                "Destination": "/preprocessedData",
                "Mode": "ro",
                "RW": false,
                "Propagation": ""
            },
            ...
        ],
        ...
    }
]
```
> the absolute path is /preprocessedData/510577227.pickle

That is surprising!

> do you think re-running my preprocess script will work?

No, the preprocessed data is there: I see 317619 files, including `510577227.pickle`.
```
[dreamuser@bm15 ~]$ cd /data/dataset4
[dreamuser@bm15 dataset4]$ ls -1 | wc -l
317619
[dreamuser@bm15 dataset4]$ ls -al 510577227.pickle
-rw-r--r--. 1 root root 134880 Feb 2 14:11 510577227.pickle
```
The question is why your code does not seem to find it. I see that it took 71 hours to compute this preprocessed data, so we don't want to regenerate the files. (Also, I would not expect regenerating the files to solve the problem.)
Hi @brucehoff, the absolute path is /preprocessedData/510577227.pickle
Try printing ``` os.path.abspath(filepath) ``` https://docs.python.org/2/library/os.path.html
Hi @brucehoff, I also printed all the files in the main directory, and preprocessedData is in the current directory. Dan
Since your file path 'preprocessedData/510577227.pickle' does not start with '/', it is a relative file path, not an absolute one. The absolute path is therefore the current directory joined with the relative path. Say you run your code after changing the current directory to '/foo/bar'. Then when you try to access 'preprocessedData/510577227.pickle' you will be looking in '/foo/bar/preprocessedData/510577227.pickle', which may not exist. (It certainly is not the preprocessing directory.) Have you checked what the current directory is at the point of your 'path check'? Can you use absolute file paths rather than relative paths in your code?
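As a rough illustration of the rule above, the sketch below joins a relative path onto two different current directories. The helper name `resolve` is made up for this example, and `posixpath` is used so the result matches the Linux container regardless of where the snippet is run.

```python
import posixpath


def resolve(relative_path, cwd):
    """Return the absolute path that `relative_path` refers to when the
    current directory is `cwd` (POSIX rules, as inside the container)."""
    return posixpath.normpath(posixpath.join(cwd, relative_path))


rel = 'preprocessedData/510577227.pickle'

# Started from '/', the relative path happens to hit the mounted folder.
print(resolve(rel, '/'))         # /preprocessedData/510577227.pickle

# Started from anywhere else, it points somewhere different.
print(resolve(rel, '/foo/bar'))  # /foo/bar/preprocessedData/510577227.pickle
```

In a running script, printing `os.getcwd()` next to `os.path.abspath(filepath)` right before the `open` call shows which of these two cases applies.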
Hi @brucehoff @thomas.yu, yes, this contradiction is exactly my problem. In the train shell script, I used
```
ls -l preprocessedData/
```
and the result showed that all the files I need are in this directory. However, in my Python code, when I try to open these files with
```
with open(filepath, 'r') as f:
```
it raises the error
```
IOError: [Errno 2] No such file or directory: 'preprocessedData/510577227.pickle'
```
(510577227.pickle is the first file I want to open in the preprocessedData directory.) This error appeared in my submission with run ID 8263242. After that, I revised my code and added a path check:
```
if not os.path.exists(filepath):
    print 'warning file ' + filepath + 'does not exist'
else:
    with open(filepath, 'r') as f:
```
After running this code, my running log shows that none of the files exist. This version's run ID is 8264686. You can check my running log to see the problem. I have wasted too much time on this issue. Could you please solve it? Also, do you think re-running my preprocess script will work?
@thomas.yu, I don't think this is a question about our caching of results. @Error202, can you explain a little more? Your statement sounds contradictory. You first say that your train script does not find your preprocessing files, but you then say that `ls -l` does list the files. Can you clarify this seeming contradiction?
Dear Dan, We do our best to cache your preprocessing results and only remove the cached files if administrative tasks require it; such tasks do not occur very often. Apologies for the inconvenience. Best, Thomas
Dear Dan, Thanks in advance for your patience. I have forwarded your issue to other challenge organizers. Best, Thomas

Cached file is in the directory, but can not be opened by training script