Challenge Participants,
The webinar [slides](syn10656772) and [recording](https://drive.google.com/file/d/0B4Gply5UVfcjeWhmaWc2cUM3SFU/view?usp=sharing) are now available for those who missed the live event. Below is a transcript of the Q&A. If you have questions that have not been answered here, please post them to the discussion board.
**Webinar Q&A**
**Q:** In subchallenge 2, how many neurologists are involved? Is each subject scored twice (on and off) by the same person? (Kely Norel)
**A:** There are multiple neurologists, but each patient is scored by the same neurologist.
**Q:** In subchallenge 2, for the OFF state, how "OFF" is it? Overnight with no meds? Just a few hours with no meds? (Kely Norel)
**A:** There are varying degrees of medication status for each task, so there is no clear on/off state. Instead, please focus on the clinical scores (tremor, bradykinesia, dyskinesia) for each task.
**Q:** For the L-DOPA data, is it just 19 subjects? (Kely Norel)
**A:** Yes, there are 19 subjects in the training data.
**Q:** For subchallenge 2, how was the "worse" side determined? We see for 13_BOS worse data for the "better" side, based on the sensor used. (Kely Norel)
**A:** The worse side was determined during a clinical exam prior to testing.
**Q:** So the sampling frequency is not global? It is just around 90? (Junrui Di)
**A:** That is correct. The sampling frequency is consistent within one task but can vary between tasks and participants. It is around 90 Hz, though.
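A minimal R sketch of estimating the per-record sampling rate from the timestamps (assuming a numeric timestamp vector in seconds, as in the mPower accelerometer files):

```r
# Estimate the sampling rate (Hz) of one record from its timestamps.
# The median of successive differences is robust to occasional
# dropped or delayed samples.
estimate_hz <- function(timestamps) {
  1 / median(diff(timestamps))
}
```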
**Q:** Could you please explain what the additional training data is, and why it is provided? (Avinash Bukkittu)
**A:** The training data will be supplemented with additional data, collected at later time points, from the same individuals whose data have already been released. We are providing these data because there were some changes in later versions of the application, and we will use data from these later versions if there is a tie.
**Q:** Based on the timestamp, the order of the events is outbound, return, and then rest. But from the demonstration example, I thought it should be walking out, then standing still, and then walking back. So are "rest" and "stand still" not the same? (Junrui Di)
**A:** The rest and stand-still exercises are identical and are also sometimes called the balance test. The order can differ slightly between versions of the app - the instructions changed between early and late versions.
**Q:** Are we only allowed to make a single submission? (Naozumi Hiranuma)
**A:** No, you are allowed 2 submissions for Subchallenge 1. The number of submissions for Subchallenge 2 is to be determined.
**Q:** So, to understand: given the data provided, you are asking us to generate some transformations that you will then test against some model? If so, what model will you be training/testing our features on? (Rajeswari Sivakumar)
**A:** An ensemble model will be built from your features on the training data. This ensemble model, consisting of random forests, elastic net, KNN, SVM, and neural networks, will then be used to predict the PD/control labels on the test data.
**Q:** Are there limitations to the language used? You've mentioned python and R, but can we use other languages? (Peter Brooks)
**A:** No, though we prefer you use open source software whenever possible.
**Q:** Do I understand it correctly that age and gender information will not be included in the test data? (Cedric Simillion)
**A:** Correct. You will only receive walking test sensor data and metadata in the test set.
**Q:** What is the expected dimensionality of the feature vector evaluated in challenge 1? (Patrick Schwab)
**A:** There is no upper limit on the number of features generated, but you must provide at least one. Missing values are not allowed.
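As an illustration, a quick R sanity check that a feature table meets these requirements; the data frame `feature_df` and its layout (recordId first, features after) are assumptions for this example:

```r
# Sanity-check a feature submission: at least one feature column,
# all feature columns numeric, and no missing values anywhere.
check_features <- function(feature_df) {
  stopifnot(ncol(feature_df) >= 2)                    # recordId + >= 1 feature
  stopifnot(all(sapply(feature_df[-1], is.numeric)))  # numeric features only
  stopifnot(!anyNA(feature_df))                       # no missing values
  invisible(TRUE)
}
```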
**Q:** Hi, for the test data, what is the meaning of using the median to train and predict? (Xinlin Song)
**A:** Predictors will be summarized within healthCode and medTimepoint using the median value prior to model fitting.
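As a rough sketch of what that summarization might look like in R (the column names and values are illustrative, not the challenge's actual schema):

```r
library(dplyr)

# Toy per-record feature table: one row per recordId.
features <- data.frame(
  healthCode   = c("a", "a", "a", "b"),
  medTimepoint = c("pre", "pre", "post", "pre"),
  feat1        = c(0.1, 0.3, 0.2, 0.5),
  feat2        = c(1.2, 1.4, 1.1, 0.9)
)

# Collapse to one row per healthCode/medTimepoint via the median.
summarized <- features %>%
  group_by(healthCode, medTimepoint) %>%
  summarize(across(c(feat1, feat2), median), .groups = "drop")
```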
**Q:** Is this machine learning or Bayesian statistics? (Monte Shaffer)
Machine learning is generally about finding an outcome without understanding the model or mechanism. Support vector machines optimize without understanding what is optimized. If the goal is to extract PD characteristics from the wearable data, it seems to me like this is more of a Bayesian statistics problem - a model based on some theoretical foundations (e.g., cadence, sway, amplitude/frequency of tremor, and so on).
**A:** Ideally, we would get interpretable features out of this challenge, but the mechanisms for generating features can make them difficult to interpret to begin with. We have chosen to use machine learning methods to score models during the competitive phase of the challenge. We will explore whether there are interpretable characteristics among the predictors in the community phase.
**Q:** For the tremor clinical scores, do higher scores indicate worse symptoms?
**A:** Yes.
**Q:** Will the test set of the mPower challenge contain records with missing files, such as records without balance test data?
**A:** Potentially, yes.
**Q:** In subchallenge 1, are the age and gender the only external features (not extracted by us) you will use?
**A:** For Subchallenge 1, yes.
**Q:** Just to make it clear, for the L-dopa data, do you want a prediction per acceleration sample or per activity?
**A:** You should submit features for each recordId in the training and test sets.
**Q:** Do we need to extract features for all the records in the mPower training set, or can we use a subset as the training set? Thanks!
**A:** You should submit features for each recordId in the training and test sets.
**Q:** Is it the same clinician that scores both times for the same subject?
**A:** Yes.
**Q:** How are the predictions of the different models in the evaluation ensemble in subchallenge 1 combined to form the final AUROC score - averaged over all models, or the highest taken?
**A:** Using a simple linear greedy optimization on AUC, via the caretEnsemble function in the caretEnsemble R package.
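For the curious, a rough sketch of fitting such an ensemble with the caretEnsemble package in R. The data frame `train`, the outcome column `diagnosis`, and the tuning settings are illustrative; this is not the organizers' exact scoring code, and the precise combining behavior depends on the package version:

```r
library(caret)
library(caretEnsemble)

# Cross-validation setup that saves class probabilities so AUC ("ROC")
# can be computed for each base learner.
ctrl <- trainControl(method = "cv", number = 5,
                     classProbs = TRUE,
                     summaryFunction = twoClassSummary,
                     savePredictions = "final")

# Base learners mirroring the model families named in the webinar.
models <- caretList(
  diagnosis ~ ., data = train,
  trControl  = ctrl,
  metric     = "ROC",
  methodList = c("rf", "glmnet", "knn", "svmRadial", "nnet")
)

# Combine the base learners with a linear weighting of their predictions.
ens <- caretEnsemble(models)
summary(ens)
```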
**Q:** Can you repeat whether the demographic data for subchallenge 1 will be released? Is the ground truth information going to be released?
**A:** No demographics, outcomes, or medication information will be released for the testing data until after the close of the challenge.
**Q:** Do the organizers have any existing scoring benchmarks for the features that teams should be targeting?
**A:** There is example code for feature extraction available on the challenge website.
**Q:** For mPower, so the sampling frequency is not the same for all subjects?
**A:** That is correct. The sampling frequency is consistent within one task but can vary between tasks and participants. It is around 90 Hz, though.
**Q:** For the L-dopa subchallenge, exactly which information will be available? Do we just get the acceleration data, or do we also get to know which activity is being performed, with which device, on which side of the body?
**A:** You will have access to the device, activity and side.
**Q:** So: the software versions are different with respect to the activities, not only the signal processing - correct?
**A:** Yes.
**Q:** For the mPower data, for Parkinson's patients, are we supposed to see higher amplitude in the signal?
**A:** Not necessarily, but there are probably features that would have higher amplitude for PD patients.
**Q:** What are the metrics used to measure feature performance? For example, accuracy in diagnosing PD patients?
**A:** For subchallenge 1, we will predict Parkinson's diagnosis. For subchallenge 2, we will predict tremor, bradykinesia, and dyskinesia severity scores.
**Q:** Will feature selection be done or should we perform this ourselves?
**A:** We will fit the models to ensure uniform methodology across submissions.
**Q:** Will taking the median across healthCodes mean that the different medTimepoints will be averaged out?
**A:** No, we will factor in medTimepoint, excluding some medication timepoints from the analysis.
**Q:** What is a "feature"? Can a final true/false prediction be considered a "feature"?
**A:** In theory, yes, but such a prediction might not make for the best feature, since it would be missing the other covariates included in the model. Furthermore, it would make a poor feature for severity prediction (in the collaborative phase). Features must be numeric, so if you choose to predict the outcome rather than submit features, please code your submissions accordingly.
**Q:** Where can we find a reader for the mPower .tmp files?
**A:** They are text files containing JSON data.
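For example, they can be parsed with any JSON library; a minimal R sketch (the file name is illustrative):

```r
library(jsonlite)

# Each .tmp file is plain-text JSON; for the accelerometer records this
# typically yields a data frame with timestamp, x, y, and z columns.
accel <- fromJSON("accel_walking_outbound.json.items.tmp")
head(accel)
```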
**Q:** Will the drug name and dose information for each subject be available in the data?
**A:** No, these data have not been collected.
**Q:** For the L-dopa data, the drinking activity, for example, has accelerometer readings from both the right and left hands. Do you want two predictions for that, or a single aggregated prediction?
**A:** Individual predictions for each limb separately.
**Q:** Is it acceptable to build a classifier based on all my extracted features and report my final prediction as a "feature" for your ensemble classifier?
**A:** In theory, yes, but such a prediction might not make for the best feature, since it would be missing the other covariates included in the model. Furthermore, it would make a poor feature for severity prediction (in the collaborative phase).
**Q:** So the feature for each healthCode is actually the median of the features of all records with the same healthCode? Will each record or each healthCode be put into the machine learning algorithm?
**A:** Predictors will be summarized within healthCode and medTimepoint prior to model fitting, providing a single set of measures per healthCode, but you have to provide individual features for each recordId.
**Q:** Concerning mPower: can features be extracted from the time-series measurements as well as the metadata (e.g., PhoneInfo)? Or are we only allowed to extract features from the time-series measurements?
**A:** You will have access to phone version, app version and other walking test metadata but not medTimePoint or demographic data.
**Q:** Sorry, I think I did not get my point across. For the L-dopa data, during the testing stage, do we have information about which task is being performed, which hand is used to collect data, and which device is mounted on that hand?
**A:** Yes.
**Q:** How do you label each activity? Based on the participant who took it? For example, if a PD patient performs well on an activity, we still label the activity as a PD activity, correct?
**A:** For subchallenge 2, all participants are PD patients. The outcomes we will predict are the symptom severity scores. For subchallenge 1, we are predicting PD status. Severity of symptoms is not taken into account.