**Question**: Is it a problem if the model is not exactly deterministic? Some algorithms may depend on the order of execution of threads, for example when taking a minimum and multiple minima are available.
**Answer**: Ideally, the end result of an experiment should always be reproducible. However, what is important for us is that your method always produces the same result when applied to the data of a given subject to generate a prediction. Imagine a scenario in a production environment: this would be a problem if your method generated different results when run on the same data of a subject. When you submit your inference method to the prediction submission queues of Sub-Challenges 1 and 2, the data from all the leaderboard subjects are provided at once. This saves time by avoiding stopping and restarting your container for each subject to evaluate (some methods take a long time to initialize). What we request is that your method generates the same prediction for subject A when 1) only the data from subject A are given as input to your container and 2) the data from subject A are given as input to your container along with the data of other subjects.
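As an illustration of this requirement, here is a minimal Python sketch of a check you could run locally. Everything in it is an assumption made for the example: `predict_all`, `predict_all_leaky`, `is_batch_invariant` and the toy feature lists are hypothetical names and data, not part of the Challenge infrastructure.

```python
from typing import Callable, Dict, List, Tuple

Scores = Tuple[float, float]  # (SL, SR) for the left and right breast


def predict_all(subjects: Dict[str, List[float]]) -> Dict[str, Scores]:
    """Hypothetical stand-in for a container's inference entry point.

    Each subject is scored from its own data only, so the result cannot
    depend on which other subjects happen to be in the batch.
    """
    out = {}
    for sid in sorted(subjects):  # fixed iteration order
        feats = subjects[sid]
        left = min(1.0, max(0.0, sum(feats) / len(feats)))
        right = min(1.0, max(0.0, max(feats)))
        out[sid] = (left, right)
    return out


def predict_all_leaky(subjects: Dict[str, List[float]]) -> Dict[str, Scores]:
    """Counter-example: rescaling by the batch maximum leaks across subjects."""
    raw = {sid: max(feats) for sid, feats in subjects.items()}
    top = max(raw.values())
    return {sid: (r / top, r / top) for sid, r in raw.items()}


def is_batch_invariant(
    predict: Callable[[Dict[str, List[float]]], Dict[str, Scores]],
    subjects: Dict[str, List[float]],
    subject_id: str,
) -> bool:
    """True if the subject gets the same scores alone and within the batch."""
    alone = predict({subject_id: subjects[subject_id]})[subject_id]
    together = predict(subjects)[subject_id]
    return alone == together


subjects = {"A": [0.2, 0.4], "B": [0.1, 0.9], "C": [0.3, 0.3]}
print(is_batch_invariant(predict_all, subjects, "A"))
print(is_batch_invariant(predict_all_leaky, subjects, "A"))
```

The first predictor passes the check because subject A is scored without looking at B or C. The second fails because its output for A changes with the largest raw value in the batch.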
Created by Thomas Yu thomas.yu
> also, i do need a small scratch space, probably 5kb, to put some text files in the testing phase.
Simply save your file anywhere in the filesystem of your Docker container. The size of a Docker container grows as needed. The maximum size that a container can reach is 10 GB (default Docker settings).
The output requested since the beginning of this Challenge is defined [here](https://www.synapse.org/#!Synapse:syn4224222/wiki/401743).
> Output: Two scores (SL, SR), each between 0 and 1, indicating the likelihood that the subject was tissue-diagnosed with cancer within one year from the given screening exam, in the left (L) and right (R) breast respectively.
The training set represents the existing data and the test set is related to the future subjects. It only makes sense that the output of your predictor is relative to the training data and not to the test data.
> can you please just give me a one word answer, yes or no, that in the first phase the prediction values can be bigger than 1, or the result will be normalized within test set to be between 0 and 1 (thus different between separate training 1 individual and training all).
For the **first round** of the Challenge, the system will accept positive confidence values larger than 1. These confidence values must have been generated independently for each test subject. Starting from Round 2, your predictor will be expected to generate predictions in the requested format.
ok...... can you please just give me a one word answer, yes or no, that **in the first phase** the prediction values can be bigger than 1, or the result will be normalized within test set to be between 0 and 1 (thus different between separate training 1 individual and training all).
this is something new and unexpected that just happened 2 days ago. now it is already christmas, it is impossible for anyone to redesign anything now in the first phase. one cannot change the rules every day and request us to meet the rules. if this regulation has to be made it can be made in the 2nd phase.
also, i do need a small scratch space, probably 5kb, to put some text files in the testing phase.
You have two different questions.
> as i said, then you need to allow big values, 2, 3, even 100, because it is quite likely that a test patient is 100 times more likely to have cancer than the maximal of the biggest in the training.
You have to take an extra step to normalize the confidence of your predictions based on the existing data (training set) in order to generate the predictions that we describe [here](https://www.synapse.org/#!Synapse:syn4224222/wiki/401749).
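One possible way to do this, shown only as a sketch, is to map each raw confidence to its empirical rank among raw scores computed on the training set. The function names `fit_reference` and `to_unit_interval` and the numbers are hypothetical. The sorted reference list is assumed to be saved with your model during the training phase, so the training data themselves are not needed at prediction time.

```python
from bisect import bisect_right
from typing import List, Sequence


def fit_reference(train_scores: Sequence[float]) -> List[float]:
    """Training phase: keep the sorted raw scores of the training subjects."""
    return sorted(train_scores)


def to_unit_interval(raw_score: float, reference: List[float]) -> float:
    """Prediction phase: fraction of training scores <= the raw score.

    The result is always in [0, 1] and depends only on the test subject's
    own raw score and the stored training reference.
    """
    return bisect_right(reference, raw_score) / len(reference)


# Illustrative raw confidences from four training subjects.
reference = fit_reference([0.5, 2.0, 3.5, 10.0])

print(to_unit_interval(3.0, reference))    # 0.5
print(to_unit_interval(100.0, reference))  # 1.0
```

A raw value far above anything seen in training, such as 100, is simply mapped to 1. One limitation is that all test scores above the training maximum are tied at 1, so a method that needs to rank such extremes may prefer a smooth monotone mapping fitted on the training scores.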
> also, the case you said is completely different from the set up here. in that case, we can check all training examples, whichever one we want, but in this case, we cannot, as training data is no longer mounted. i just need to get access to a whole set, any whole set is fine, test or training.
a) "i just need to get access to a whole set, any whole set is fine, test or training."
You definitely cannot compare the data of a test subject A with the data of the other test subjects. As stated in my first answer, you have to compare the data of test subject A to the existing/training data.
b) "we cannot, as training data is no longer mounted."
The goal of this Challenge is to develop a method that captures features/markers for the development of breast cancer. We are then asking the method, which you can see as a representation of the existing/training data, to generate a prediction from the data of a new/test subject. So far, we are still missing the justification for why you need to have access to the training data during the predictions phase, even though you have access to them during the training phase.
i have two submissions that are exactly the same. yet, one time i got an error but it somehow still ran through, and the other time i didn't, and the prediction values are completely different, even though my method is completely deterministic. the only explanation is that the mounted data is different. but anyway, can you deduct my quota for these submissions? because obviously they are useless to me and i probably have to waste another few hours to figure out why the same submission can be so different
also, the case you said is completely different from the set up here. in that case, we can check all training examples, whichever one we want, but in this case, we cannot, as training data is no longer mounted. i just need to get access to a whole set, any whole set is fine, test or training.
as i said, then you need to allow big values, 2, 3, even 100, because it is quite likely that a test patient is 100 times more likely to have cancer than the maximal of the biggest in the training.
Hi Yuanfang,
> the prediction value of subject A would depend on the value of all other patients observed in test set.
This is definitely not something that we can have in a production environment. It would be like telling a woman who comes for an exam: "Please come back in a month to get your results; that is the time we need to collect data from other subjects". Your approach definitely makes sense if you use the data from the _training_ subjects (representing data collected in the _past_) instead of the test subjects (data not available at the time of the exam). What about processing the data from the training subjects and learning or listing features during the training phase, and then individually comparing/ranking the features of a test subject against the data that you have extracted from the training set?
hello
i think this is impossible because i am going to use unsupervised learning and relative ranking in the testing stage, and the prediction value of subject A would depend on the value of all other patients observed in test set.
> Imagine a scenario in a production environment: this would be a problem if your method generate different results when run on the same data of a subject
in that case the values will be clustered by this test population, otherwise, you need to allow values above 1, it can be expected to have values above 1, 2, in that case. you are using auc to evaluate, and you don't allow ranking based method, what is the foundation of this?
my method will be deterministic, but it has to take in all testing examples at once.
can you please confirm at least in phase 1 it is ok? i didn't attempt to develop supervised method as the queue is so long.