**[Full text of proposal](https://www.synapse.org/Portal/filehandle?ownerId=syn5659209&ownerType=ENTITY&xsrfToken=1EA1466FCA55F7EAE33833333900F1BC&fileName=Idea7.pdf&preview=false&wikiId=414654)**

### Anonymous Reviews and Authors' Response

#### Anonymous Review 1

_**Impact:** The proposal aims to develop new predictors of T2D, its comorbidities, and its progression. These could lead to useful discoveries but seem unlikely to provide revolutionary insights into T2D._

_**Feasibility:** The data to be collected is described only at a very general level. It is not clear how the planned cohort of patients would be gathered and how the privacy of the data would be handled. There is no justification for the indicated budget, which seems rather optimistic compared to the 100k-patient cohort used in the modelling example. Overall, the goals are extremely ambitious for such a relatively small project, especially considering the complexity of the studied disease._

_**Overall evaluation:** The underlying idea of predicting T2D occurrence and development is clearly worthwhile. However, even after reading the proposal, it remains unclear how this could be achieved in practice. The idea of modelling disease progression looks interesting. It is, however, unclear why the progression should be linear as indicated, what the significance of typical vs. atypical trajectories is, and whether the proposed model would bring any added clinical value._

#### Anonymous Review 2

_**Impact:** Hard to assess. See full review._

_**Feasibility:** Might work, but some details need to be fleshed out._

_**Overall evaluation:** The goal of this proposal is to predict type 2 diabetes status, progression, and comorbidities using EHR data.
I'm not particularly familiar with the EHR literature, but this seems to have been attempted before, both for T2D_ https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3540444/ _and for other diseases/disorders_ http://ajp.psychiatryonline.org/doi/abs/10.1176/appi.ajp.2016.16010077?journalCode=ajp _The proposal itself is very light on details and contains only a preliminary analysis on simulated data, but the existence of papers already implementing parts of the proposal makes it somewhat more concrete and plausible. I think the major obstacle in terms of feasibility is access to actual EHR data. Based on my experience, hospitals and provider networks are often (understandably) unwilling to broadly share health records, and in cases where they do, they require computation to be performed on their own systems. This could seriously complicate the organization of the challenge._

_The impact of the proposal is hard to forecast and depends entirely on the quality and signal in the EHR data gathered._

**Response:** We thank both reviewers for their constructive feedback. We agree that collecting large-scale EHR data is neither easy nor inexpensive. The budget estimate is based on our previous experience in purchasing claims data from health insurers. We are confident that the proposed project could result in real-world applications for improving the quality of care and treatment of T2D patients. The large volume of data could potentially carry stronger signals than any study conducted to date.

