Dear Committee, our model uses the stats.norm function at the end of its pipeline to transform the resulting prediction distribution into a uniform rather than a Gaussian form. Would this procedure be allowed in this challenge? Best regards, Yeojin
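For concreteness, a minimal sketch of the kind of post-processing meant here, assuming roughly Gaussian raw outputs; the array name, the batch-wise standardization, and the placeholder data are illustrative, not the actual submission code:

```python
import numpy as np
from scipy import stats

# Placeholder predictions; in practice these would be the model's raw outputs.
preds = np.random.default_rng(0).normal(loc=5.0, scale=2.0, size=1000)

# Standardize and push through the normal CDF so the scores become ~uniform on [0, 1].
# Note that the mean/std here are computed over the whole batch of predictions.
z = (preds - preds.mean()) / preds.std()
uniform_scores = stats.norm.cdf(z)
```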

Created by Yeojin Shin syjssj95
@penzard The extent of the transformation is not something we can quantify, so it is hard for me to tell everyone where we will have to draw the line. If someone is using a secondary network, we will most likely disallow it. We obviously wouldn't allow another network stacked on top of a network, or an SVM/random forest applied to the feature space of the neural network. If you are applying a transformation like StandardScaler (where you learn/calculate the std and mean, or something else, using training data), I am sure that is something everybody would want to know about. However, remember that fitting such StandardScaler-like models to transform predictions could overfit your method to the public leaderboard data.
@muntakimrafi Say I take my neural network's predictions for the whole training set and train some transformation on them; for the sake of simplicity, a StandardScaler (it won't change the Pearson and Spearman metrics, but other transforms can). Now I have the following pipeline for any sequence: 1. Predict expression for the sequence using the neural network. 2. Transform the prediction with the StandardScaler (again, I take this transformation only as an example). From my point of view, this is a very degenerate but genuine example of an ensemble: one model takes the predictions of the previous one to produce new predictions, and this second model has trainable parameters (mean and SD). But according to your answers it looks like I can still use this scheme. Moreover, if that is true, the next question arises: to what extent can such a transformation be used? Can I use a linear model to rescale predictions? Some polynomial-based model? A random forest?
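A minimal sketch of that two-step pipeline, assuming sklearn is available; the network is replaced by a stand-in function and all names here are illustrative:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

def nn_predict(sequences):
    # Stand-in for the trained seq2expression network (illustrative only).
    return rng.normal(size=len(sequences))

train_sequences = ["ACGT"] * 100  # placeholder training sequences

# Step 1 output on the training data, used to fit the transformation.
train_preds = nn_predict(train_sequences).reshape(-1, 1)

# The "degenerate ensemble" member: a StandardScaler whose mean and SD are
# learned from the training predictions only.
scaler = StandardScaler().fit(train_preds)

def predict_expression(seq):
    # Step 2: each sequence is scored and rescaled independently, using only
    # parameters fitted on the training data.
    raw = nn_predict([seq]).reshape(-1, 1)
    return scaler.transform(raw).item()
```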
@muntakimrafi Cool, once again thank you so much for carefully clarifying all the burning questions. It was (and is) great fun to participate! @syjssj95 you might check **muntakimrafi**'s comment above, as it may be related to your original question.
@ivan.kulakovskiy Yes, you are correct. Score scaling for a particular single sequence is allowed if it does not rely on predictions for any other sequences in the test set. You should be able to predict expression even if I give you only a single sequence. @penzard, I am not sure how this would work. If your neural network predicts x as the expression, what does it mean to transform this prediction? Let's say your network predicts expression in bins first and then transforms this bin prediction into an actual expression value. You can do that, and you can fine-tune the bin2expression layer too, if you want. But if you first train a seq2exp network, then extract the features from that first model and use another model to make predictions on those features, then it feels like you have two different models with two separate tasks.
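For example, the bin2expression step could be as simple as a weighted sum over bin centers; the bin layout and the uniform softmax output below are hypothetical:

```python
import numpy as np

# Hypothetical setup: the network outputs a probability distribution over
# K expression bins, and bin2expression collapses it to a single value.
bin_centers = np.linspace(0.0, 17.0, 18)   # illustrative bin centers
bin_probs = np.full(18, 1.0 / 18)          # stand-in for a softmax output

# Expected expression under the predicted bin distribution; this mapping
# could itself be a small fine-tunable layer.
expression = float(np.dot(bin_probs, bin_centers))
```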
@muntakimrafi If I train a supervised transform using the training data, then I have two models: 1) the neural network and 2) a supervised model used to transform the neural network's output. I use the neural network for prediction and then use the supervised model to transform its predictions. That is an ensemble by definition, isn't it? If I use an unsupervised transformation instead, I use the predictions for other test sequences to transform the prediction for the given one. This may amount to quantiles, log, x^2 (whatever somebody wants), etc. Such a procedure no longer produces the expression you asked us to predict, but a completely different score whose only aim is to maximize the Pearson and/or Spearman metrics.
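For instance, a rank/quantile transform over the whole test set is exactly such a case; the array of predictions below is a placeholder:

```python
import numpy as np
from scipy import stats

# Placeholder test-set predictions.
test_preds = np.random.default_rng(1).normal(size=1000)

# The transformed score for each sequence depends on the predictions for all
# other test sequences. Being monotone, it leaves Spearman unchanged, but the
# result is a quantile score rather than an expression value.
quantile_scores = stats.rankdata(test_preds) / len(test_preds)
```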
@muntakimrafi I am super sorry, but I still feel confused. (1) Using an arbitrary function f(pred[i]) => new_pred[i] is allowed if (2) it can be applied independently of any other pred[n], n != i. I.e. score scaling for a particular single sequence is allowed if and only if it does not rely on predictions for any other sequences in the test set, am I right?
@penzard What kind of supervised transformation do you have in mind? Using the model's predictions on test sequences to further modify the expression prediction would fall under pseudo-labeling the test sequences, which is not allowed. But you are free to fine-tune the final layers (stepwise training) and/or apply any transformation that you came up with using information from the training data or the underlying biology (test-time augmentation, which is allowed).
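As one illustration of a per-sequence transformation in that spirit, a sketch of test-time augmentation that averages over the reverse complement of a single input; the helper names and the stand-in network are assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

def nn_predict(seq):
    # Stand-in for the trained seq2expression network (illustrative only).
    return rng.normal()

def reverse_complement(seq):
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))

def tta_predict(seq):
    # Uses only the given sequence; no predictions for other test sequences.
    return float(np.mean([nn_predict(seq), nn_predict(reverse_complement(seq))]))
```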
@muntakimrafi Also, the task is stated as follows: train a model to predict EXPRESSION. Any additional transformation, especially one that uses information about the predictions for other test sequences, solves a different task.
@muntakimrafi But a supervised transformation on top of a trained model is an ENSEMBLE, isn't it?
@syjssj95 Can you clarify your post-processing a bit more? Do you first make predictions on all test sequences and then apply the post-processing? If that is what you are doing, then what happens if I give your model only sequences with high or low expression? @ivan.kulakovskiy Test-time augmentation is allowed ([Rule 8](https://www.synapse.org/#!Synapse:syn28469146/wiki/617562)). I think such unsupervised transformations would fall under that. A supervised transformation with something that is not complex would be fine too.
@muntakimrafi But this basically means that the model cannot produce predictions for the validation data if the sequences are supplied one by one, i.e. the model or the post-processing code must have/consider the whole validation set at the end of the pipeline to transform the resulting distribution of the prediction scores. Do you consider that OK? I.e. do you allow a generic unsupervised transformation of the final prediction scores? One could expect that a monotonic transform won't affect the resulting Spearman metric, but the Pearson score can be significantly altered. Do you allow a supervised transformation as well? I.e. one could train a fairly basic model to recalibrate/rescale the scores towards whatever target distribution.
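A quick numeric check of the monotonic-transform point, assuming scipy; the exp transform, noise level, and synthetic data are arbitrary choices for illustration:

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

rng = np.random.default_rng(3)
truth = rng.normal(size=1000)
preds = truth + rng.normal(scale=0.5, size=1000)

# exp is strictly increasing, so Spearman is identical before and after,
# while Pearson can change noticeably.
for name, scores in [("raw", preds), ("exp-transformed", np.exp(preds))]:
    print(name,
          "pearson=%.3f" % pearsonr(truth, scores)[0],
          "spearman=%.3f" % spearmanr(truth, scores)[0])
```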
Yeah sure.

Would changing the final distribution be allowed for our model?