Dear Moderators, could you please let me know where I can find the actual quantity of each cell type in the training dataset, so that I can train a model for the Coarse-grained Challenge? Or do we need to extract this information from the corresponding GSE series files? Thank you.

Created by Harpreet Kaur harpreet
Was the test data used for the leaderboard also generated in the same manner?
Hi @harpreet, I'm sorry for the late reply.

We are not providing any proportions, percentages, or ground truth quantifying cell types. The training data that we directed you to are from purified samples, i.e., each sample represents the expression of only one cell type, such as memory B cells. We have annotated these samples to the best of our ability, given time constraints, but you should verify any annotations we provide.

These training data could be used in at least two ways: (1) to provide you with prototypes for the expression of individual cell types and (2) to generate your own admixtures that you can use for training and/or evaluation. Dominik provides a helpful response in this [thread](https://www.synapse.org/#!Synapse:syn15589870/discussion/threadId=5907) that will give you more information on creating your own synthetic admixtures.

I apologize that you will have very little time left in this round to develop and test your method. We will both extend the next round to 4 weeks and add a 3rd leaderboard round. This should give you plenty of opportunities for further development and testing. I will make an announcement to the discussion forum shortly.

Best, Brian
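For anyone who wants a starting point for option (2) above, here is a minimal sketch of building synthetic admixtures from purified profiles. It is not the organizers' procedure and may differ from what the linked thread describes. It assumes the purified expression values are on a linear (non-log) scale, that one representative profile per cell type has already been chosen, and that mixing proportions are drawn from a flat Dirichlet distribution. The function name `make_admixtures` and the toy data are hypothetical.

```python
import numpy as np


def make_admixtures(profiles, n_mixtures, rng=None):
    """Mix purified cell-type profiles into synthetic bulk samples.

    profiles   : array of shape (n_genes, n_cell_types), linear-scale
                 expression, one column per purified cell type (assumed).
    n_mixtures : number of synthetic samples to generate.
    rng        : seed or numpy Generator, for reproducibility.

    Returns (mixtures, proportions) with shapes
    (n_genes, n_mixtures) and (n_mixtures, n_cell_types).
    """
    rng = np.random.default_rng(rng)
    n_genes, n_types = profiles.shape

    # Each row is one mixture's cell-type proportions: non-negative, sums to 1.
    proportions = rng.dirichlet(np.ones(n_types), size=n_mixtures)

    # A bulk sample is modeled as the proportion-weighted sum of the profiles.
    mixtures = profiles @ proportions.T
    return mixtures, proportions


# Toy example: 4 genes, 3 hypothetical cell types (values are made up).
toy_profiles = np.array([
    [10.0, 0.5, 1.0],
    [0.2, 8.0, 0.3],
    [1.5, 1.0, 12.0],
    [5.0, 5.0, 5.0],
])
toy_mixtures, toy_props = make_admixtures(toy_profiles, n_mixtures=6, rng=0)
```

The returned `proportions` serve as the known ground truth for the generated `mixtures`, so the pair can be used to train or evaluate a deconvolution method. Real data would likely also need noise and sample-to-sample variation, which this sketch leaves out.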

Regarding quantity of each cell type in training Data