Hi @trberg, In the de-identified data set that we are using in the data challenge, "dates of service are algorithmically shifted to protect patient privacy" (https://ncats.nih.gov/n3c/about/data-overview#access-requirements). 1) Can you tell us whether the order of patients was preserved with respect to diagnosis, out-patient visits, hospitalisations etc.? That is, can we find out which children entered the data set early in the pandemic and which ones later on? 2) For each patient individually, have all dates been shifted by the same amount, i.e. are the time lengths between different events preserved, including from date of birth (e.g. to determine age at diagnosis etc.)? Thanks! Manuela

Created by Manuela Zucknick m.zucknick
Hi @ggggfan, Yes, the dates are shifted within a limited range of time: plus or minus 180 days. Thank you, @trberg
Hi @trberg, I am just wondering if the dates in the dataset are shifted/shuffled within a limited range of time? I just found that the dates of COVID diagnosis still have predictive value in my Task 1 model. Thanks! Jifan
Hi @m.zucknick, 1. No, the order of patients is not preserved, so we can't know when during the pandemic the children entered the data set. 2. Yes, for each patient, all the dates have been shifted by the same amount, so the time lengths between events have been preserved. I don't believe date of birth has been shifted, but only the year of birth is in the de-identified data anyway. Thank you and apologies for the delayed response, @trberg
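
For anyone who wants to see what the answers above imply in practice, here is a minimal sketch. It is not the actual N3C de-identification code. It assumes one random offset per patient, drawn uniformly from plus or minus 180 days, and the names `shift_patient_dates`, `patient_a` and `patient_b` are made up for illustration.

```python
import random
from datetime import date, timedelta

MAX_SHIFT_DAYS = 180  # range stated in this thread: plus or minus 180 days


def shift_patient_dates(events, rng):
    """Shift all of one patient's dates by the same offset.

    Assumption: the offset is drawn uniformly at random; the thread
    only says that it is constant per patient and within the range.
    """
    offset = timedelta(days=rng.randint(-MAX_SHIFT_DAYS, MAX_SHIFT_DAYS))
    return {name: day + offset for name, day in events.items()}


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical patients: A is diagnosed two months before B.
patient_a = {"diagnosis": date(2020, 4, 1), "hospitalisation": date(2020, 4, 10)}
patient_b = {"diagnosis": date(2020, 6, 1), "hospitalisation": date(2020, 6, 3)}

shifted_a = shift_patient_dates(patient_a, rng)
shifted_b = shift_patient_dates(patient_b, rng)

# Within a patient, the time between events is unchanged.
gap_before = patient_a["hospitalisation"] - patient_a["diagnosis"]
gap_after = shifted_a["hospitalisation"] - shifted_a["diagnosis"]
print("within-patient gap preserved:", gap_before == gap_after)

# Across patients, the order may or may not survive, because the two
# offsets are independent and can differ by up to 360 days.
print("A still before B:", shifted_a["diagnosis"] < shifted_b["diagnosis"])
```

Under these assumptions, features such as days from diagnosis to hospitalisation remain usable, while the calendar position of a diagnosis is only known to within about six months either way.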

Data de-identification and shifts in dates