Dear Everyone,
For the sake of the challenge, I would be more than glad if a "uniform" labeling scheme could be adopted for both datasets.
The existing labels of the two datasets match each other neither in the number of categories nor in their direction.
This makes it difficult to train a single model (using the common smart-watch accelerometer data and the meta-information).
Also, if missing (NA) labels can be imputed based on information from other labels for the same sample, that would be great.
Finally, I believe that the validation measure should take class imbalance into consideration in order to be reliable.
Sincerely,
Created by Antonio Sanchez Niklaus @Niklaus-
The variables were harmonized to have the same "direction" across datasets; however, the categories presented to the subjects were different. We don't feel that it is appropriate to introduce additional categories in this case. For the purposes of your modeling, you are welcome to renormalize the labels as you see fit. However, you'll want to reverse-transform your predictions at the end so that they are on each dataset's original scale.
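As a rough illustration of renormalizing labels and then reversing the transform, here is a minimal Python sketch. It assumes a simple min-max rescaling onto [0, 1]; the category ranges `scale_a` and `scale_b` and the helper names are hypothetical and chosen only for the example, so please check the actual ranges in each dataset's documentation.

```python
import numpy as np

def normalize(labels, lo, hi):
    """Map labels from the original [lo, hi] scale onto [0, 1]; NaN stays NaN."""
    return (np.asarray(labels, dtype=float) - lo) / (hi - lo)

def denormalize(preds, lo, hi):
    """Map predictions from [0, 1] back onto the original [lo, hi] scale."""
    clipped = np.clip(np.asarray(preds, dtype=float), 0.0, 1.0)
    return clipped * (hi - lo) + lo

# Hypothetical category ranges, for illustration only.
scale_a = (0, 4)  # assumed range for the first dataset
scale_b = (0, 3)  # assumed range for the second dataset

# Labels from both datasets now share a common [0, 1] scale for training.
train_a = normalize([0, 2, 4], *scale_a)
train_b = normalize([0, 3, np.nan], *scale_b)

# A model prediction on the common scale is mapped back per dataset.
pred_common = 0.5
pred_a = float(denormalize(pred_common, *scale_a))
pred_b = float(denormalize(pred_common, *scale_b))
```

The same prediction on the common scale maps back to a different value for each dataset, which is why the reverse transform has to use the range of the dataset the sample came from.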
You are more than welcome to impute missing labels in your analysis. Note that for variables that are completely missing for a particular individual, you are not expected to predict values for that individual. So for example, subject 1006 is missing values for dyskinesia. Therefore, you are not expected to predict dyskinesia for that individual.
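One possible way to work out which subject and variable pairs need predictions is sketched below in Python with pandas. The column names (`subject_id`, `dyskinesia`, `tremor`) and the toy values are assumptions made for the example, not the actual file layout.

```python
import pandas as pd

nan = float("nan")

# Toy label table; column names and values are hypothetical.
labels = pd.DataFrame({
    "subject_id": [1006, 1006, 1007, 1007],
    "dyskinesia": [nan, nan, 1.0, nan],
    "tremor": [2.0, nan, 0.0, 1.0],
})

# For each subject and variable: True if at least one label is observed.
has_any = labels.set_index("subject_id").notna().groupby(level=0).any()

# Variables that are completely missing for a subject can be skipped.
to_predict = {
    subject: [var for var in has_any.columns if has_any.loc[subject, var]]
    for subject in has_any.index
}
```

In this toy table, subject 1006 has no observed dyskinesia values, so only `tremor` is listed for that subject, while subject 1007 keeps both variables because a partly missing variable still has to be predicted.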
Regarding the metric, please see my response in a previous posting: https://www.synapse.org/#!Synapse:syn20825169/discussion/threadId=6644.
Best,
Solly