since i see that all teams have trouble making a valid submission, i wrote a short sanity-check script: it reports the errors it finds in your submission.
**
usage:
perl sanity_check_single.pl input_file
**
input file is a single file in your submission, e.g. 1_ppi_anonym_v2.txt
**
when you win the challenge and if this is helpful, don't forget to acknowledge me, LOL.
Created by Yuanfang Guan (yuanfang.guan)
added this function and updated. thanks a ton for your suggestion, dibya!!
I may be wrong (feel free to correct me), but I believe that the scoring script errors out when there are no valid modules. For example, when submitting today, I received the error: submission/3_signal_anonym_directed_v3.txt has no module in size range 3-100
hi, Dibya,
it's my great pleasure, really!
originally i had the code die if it saw a cluster outside the required range. based on daniel's suggestion, i changed it to print a warning line instead (i count 3 and 100 as valid, i.e. 3 <= size <= 100; is that what is done on the evaluation side, daniel?). it still passes even when zero valid clusters are found (and prints a list of warnings), to be consistent with the leaderboard.
thanks and good luck for your submission.
yuanfang
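For anyone who wants to see the warn-instead-of-die behaviour spelled out, here is a minimal Python sketch. It is not the Perl script itself; the function name `check_module_sizes` and the dict-of-lists input are assumptions made for illustration. It treats 3 and 100 as valid, as described above, and adds one extra warning when no module at all falls in range.

```python
def check_module_sizes(modules, min_size=3, max_size=100):
    """Return (n_valid, warnings) for a dict of module name -> list of genes.

    Hypothetical helper: out-of-range modules produce a warning line
    instead of aborting the whole check.
    """
    warnings = []
    n_valid = 0
    for name, genes in modules.items():
        size = len(genes)
        if min_size <= size <= max_size:  # both bounds are inclusive
            n_valid += 1
        else:
            warnings.append(
                f"WARNING: module {name} has {size} genes, "
                f"outside {min_size}-{max_size}"
            )
    if n_valid == 0:
        warnings.append(f"WARNING: no module in size range {min_size}-{max_size}")
    return n_valid, warnings


# Toy data: one module too small, one valid, one too large.
modules = {
    "m1": ["g1", "g2"],
    "m2": ["g1", "g2", "g3"],
    "m3": [f"g{i}" for i in range(101)],
}
n_valid, warnings = check_module_sizes(modules)
for line in warnings:
    print(line)
```

Returning the warnings rather than exiting lets the caller decide whether zero valid modules should count as a failure.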
This is really helpful! One thing that would be nice to have is checking whether there are any valid clusters (e.g. a cluster w/ 3 < size < 100). From what I see currently, the Perl test passes even if the submission files have no communities w/ valid sizes.
updated to warning. i think that makes more sense and works with more methods.
We did some tests and the script is consistent with our validation of the submission format. A minor difference is that modules outside the size range (3-100) are simply ignored in the challenge submissions, i.e. they do not lead to a formatting error. Maybe this could be changed to a WARNING in your script, but either way it's good to be aware that there are such modules in a submission.
Thanks again,
Daniel
That's really useful, thanks for sharing! We'll have a closer look later and let you know if we have any feedback.
--daniel
UPDATE:
i changed the scripts to take submission files (zip files) directly:
http://guanlab.ccmb.med.umich.edu/yuanfang/sanity_check.tar
they check the following:
1. whether each cluster size is ok, 3-100.
2. whether any genes are repeated.
3. whether any non-existing genes appear.
4. whether the zip file contains wrongly named files or is missing one of the networks.
5. wrong separators (reported as wrong genes).
i think i have covered everything, but i cannot test it very thoroughly, so if anyone finds anything that i didn't cover, please send a message.
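To make the repeated-gene, non-existing-gene and separator checks concrete, here is a rough Python sketch of the same idea. It is not a port of the Perl scripts. It assumes each line is one module, tab-separated, with the module name in the first column and genes after it (adjust `n_leading` if your files carry extra leading columns), and it treats any gene that appears twice anywhere in the file as repeated. The name `check_module_lines` and the toy gene IDs are made up.

```python
def check_module_lines(lines, known_genes, n_leading=1):
    """Return a list of error strings for one submission file.

    Assumed layout (hypothetical): tab-separated, the first `n_leading`
    fields are not genes, the remaining fields are gene IDs.
    """
    errors = []
    seen = {}  # gene -> name of the module where it first appeared
    for lineno, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        name = fields[0]
        genes = fields[n_leading:]
        if not genes:
            # Typical symptom of a space- or comma-separated line.
            errors.append(f"line {lineno}: no genes found (wrong separator?)")
            continue
        for gene in genes:
            if gene not in known_genes:
                # A partly wrong separator glues several IDs into one
                # field, so it also shows up here as an unknown gene.
                errors.append(f"line {lineno}: unknown gene {gene!r}")
            elif gene in seen:
                errors.append(
                    f"line {lineno}: gene {gene!r} already used in module {seen[gene]}"
                )
            else:
                seen[gene] = name
    return errors


known = {"g1", "g2", "g3", "g4"}
lines = ["m1\tg1\tg2\tg3\n", "m2\tg3\tg4\tg9\n", "m3 g1 g2\n"]
errors = check_module_lines(lines, known)
print("\n".join(errors) if errors else "success")
```

In this toy input the second line reuses `g3` and contains the unknown `g9`, and the third line uses spaces instead of tabs, so three errors are reported.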
it runs like:
perl sanity_check_sc1.pl sc1.zip
perl sanity_check_sc2.pl sc2.zip
//
e.g. this is what i run
perl sanity_check_sc1.pl ../submission_sc1/2016_8_16/i_support_trump.zip
perl sanity_check_sc2.pl ../submission_sc2/2016_8_16/nicebug.zip
//
if it is successful it prints success, otherwise it reports the above errors. there are some temp files that i named with my id, so that they won't overlap with any files that one needs to use.
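As an illustration of the zip-level check (wrongly named or missing files), here is a small Python sketch that reads the archive listing directly and so needs no temp files. It is not how the Perl scripts work. The function `check_zip_members` is hypothetical, and the expected file list is supplied by the caller: the two names below are just the ones mentioned in this thread, not the full list of networks. It also assumes only base names matter, ignoring any folder inside the zip.

```python
import io
import zipfile


def check_zip_members(zip_source, expected_names):
    """Compare the file names inside a zip with the expected names.

    `zip_source` may be a path or a file-like object. Hypothetical helper:
    only base names are compared, directory entries are ignored.
    """
    with zipfile.ZipFile(zip_source) as zf:
        present = {
            n.rsplit("/", 1)[-1] for n in zf.namelist() if not n.endswith("/")
        }
    expected = set(expected_names)
    errors = [f"missing file: {n}" for n in sorted(expected - present)]
    errors += [f"unexpected file: {n}" for n in sorted(present - expected)]
    return errors


# Build a toy submission in memory: one expected file, one stray file.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("1_ppi_anonym_v2.txt", "m1\tg1\tg2\tg3\n")
    zf.writestr("notes.txt", "scratch")

# Illustrative, incomplete list of expected names.
expected = ["1_ppi_anonym_v2.txt", "3_signal_anonym_directed_v3.txt"]
problems = check_zip_members(buf, expected)
print("\n".join(problems) if problems else "success")
```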