Is there any way to get logs of training and test metrics on our submitted image? Presumably the model will show up on the leaderboard ranked by the test metrics, but it would be very useful if we could get these (plus training metrics) in the log files directly. Also, is there any way to get metrics on test data samples? I'm interested in model variance over resampling, but there doesn't seem to be any support for this, and no way to resample directly in our inference script.

Created by Ivan Brugere (@ivanbrugere)
Hi @trberg, I think this is less important than I originally thought. I am currently estimating variance at training/model-selection time by resampling validation sets. I think this is a better way to do it.
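A minimal sketch of the kind of validation-set resampling described above, assuming scikit-learn and NumPy are available in the training container; the variable names in the usage comment (`val_labels`, `model`, `val_X`) are hypothetical placeholders for a participant's own data and model.

```python
# Estimate AUC variance by bootstrap-resampling a held-out validation set.
import numpy as np
from sklearn.metrics import roc_auc_score

def bootstrap_auc(y_true, y_score, n_resamples=100, seed=0):
    """Return AUCs computed on bootstrap resamples of the validation set."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    aucs = []
    for _ in range(n_resamples):
        idx = rng.integers(0, len(y_true), size=len(y_true))
        # Skip degenerate resamples that contain only one class.
        if len(np.unique(y_true[idx])) < 2:
            continue
        aucs.append(roc_auc_score(y_true[idx], y_score[idx]))
    return np.array(aucs)

# Hypothetical usage with your own validation split and trained model:
# aucs = bootstrap_auc(val_labels, model.predict_proba(val_X)[:, 1])
# print(f"AUC mean={aucs.mean():.3f}, sd={aucs.std():.3f}")
```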
Hi @ivanbrugere,

> Will we see anything that is reported in STDERR, or is this at a different execution scope (i.e. the parent script which calls our script)? I presume we can't write to the logs we get sent by the docker pipeline; otherwise, we could query the data that's in scope and leak it.

That's correct, we don't report any user-generated logs.

> This seems like an inefficient process vs. reporting some set of scoring metrics in logs we receive.

We may have a possible solution and we'd like your feedback. I'd also like to know if I completely missed the point of your question.

**Proposed solution**

**Training step**

If we mount an "output" folder to the training container (currently this is only done in testing/inference), participants can do internal splitting, resampling, training, etc. in the training container and write their sampled predictions to this new "output" folder. If they follow the same format as the inference predictions.csv file (person_id, score), we can generate AUCs from these multiple prediction files. We would expect one or more files named predictions_[some number].csv, each representing model predictions on a sample of the persons in the person table. We would return the results to participants through a log folder reporting the AUC generated on each sample (e.g. AUC_1: 0.75, AUC_2: 0.82, etc.).

**Testing/inference step**

We would still expect a predictions.csv file that counts as the official prediction file, but we would also look for files named predictions_[some number].csv, which we would use to evaluate the samples just as in the training step. We would return the sampled AUCs to participants as a log file. If you're interested in how the resampled models trained in the previous step perform, you can save as many models as you'd like in the model folder during the training stage and try each of them out in this step.

If we built this out, would this be sufficient as a way to test model variance over resampling?

Let us know your thoughts,

Best,
Tim
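A minimal sketch of what a participant's training script might do under this proposal, writing one predictions_[k].csv per resample to the mounted "output" folder in the (person_id, score) format. The input paths (`/train/person.csv`, `/train/goldstandard.csv`), the 20% holdout scheme, and `train_and_score` are hypothetical placeholders; only the output folder and file-naming convention come from the proposal above.

```python
import numpy as np
import pandas as pd
from pathlib import Path

OUTPUT_DIR = Path("/output")   # folder the organizers propose to mount into training
N_RESAMPLES = 5

# Hypothetical input locations for the person table and training labels.
person = pd.read_csv("/train/person.csv")
labels = pd.read_csv("/train/goldstandard.csv")

def train_and_score(train_ids, val_df):
    """Placeholder for participant training code; returns one score per validation person."""
    return np.random.uniform(size=len(val_df))

for k in range(1, N_RESAMPLES + 1):
    val = person.sample(frac=0.2, random_state=k)  # hold out a different 20% per resample
    train_ids = person.loc[~person["person_id"].isin(val["person_id"]), "person_id"]
    preds = pd.DataFrame({
        "person_id": val["person_id"],
        "score": train_and_score(train_ids, val),
    })
    # Naming follows the proposed convention so each file gets its own AUC reported back.
    preds.to_csv(OUTPUT_DIR / f"predictions_{k}.csv", index=False)
```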
Hi @trberg,

I understand we can do anything with the training data labels/metrics, including splitting into training/validation and resampling on the latter. This may be a good estimate of model variance, but the logging problem seems the same for training and test.

Will we see anything that is reported in STDERR, or is this at a different execution scope (i.e. the parent script which calls our script)? I presume we can't write to the logs we get sent by the docker pipeline; otherwise, we could query the data that's in scope and leak it.

I didn't quite understand that the log files could be checked by the admins. This seems like an inefficient process vs. reporting some set of scoring metrics in the logs we receive.
Hi @ivanbrugere,

> Is there any way to get logs of training and test metrics on our submitted image? Presumably the model will show up on the leaderboard ranked by the test metrics, but it would be very useful if we could get these (plus training metrics) in the log files directly.

At the moment, we won't be automatically returning the training or test logs to participants during the leaderboard phase. However, you can write your metrics to the log files by simply printing them in your scripts (stdout is collected from the submitted docker models). If you send the challenge admins the submission id, we can return the portion of the logs that includes the metrics.

> Also, is there any way to get metrics on test data samples? I'm interested in model variance over resampling, but there doesn't seem to be any support for this, and no way to resample directly in our inference script.

Unfortunately, we don't currently support this.
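A minimal sketch of the stdout route described above, assuming scikit-learn in the container; the `[METRICS]` prefix is just a suggestion to make the lines easy for the admins to find and return, not a format the pipeline requires.

```python
from sklearn.metrics import roc_auc_score

def log_metrics(split_name, y_true, y_score):
    """Print an easy-to-grep metric line; stdout is collected by the pipeline."""
    auc = roc_auc_score(y_true, y_score)
    print(f"[METRICS] {split_name}_auc={auc:.4f}", flush=True)

# Hypothetical usage inside a participant's train.py, with their own splits:
# log_metrics("train", y_train, train_scores)
# log_metrics("val", y_val, val_scores)
```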
