It seems that the following mixture components, taken from the training set data file (Mixure_Definitions_Training_set.csv), do not have corresponding chemical data in at least one of the chemical descriptor files (Dragon_Descriptors.csv, Mordred_Descriptors.csv, Morgan_Fingerprint.csv).
The mixture component CIDs in question are:
7284, 11173, 84682, 5284503, 5318042, 11002307 (missing from Dragon set)
66328, 78605, 11002307, 19789253, 25137858 (missing from Mordred and Morgan sets)
while 11002307 is missing from all of the sets.
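A check like the one described above could be sketched as follows. This is only an illustration: the `CID` column name, the helper names `read_cids` and `missing_by_file`, and the toy CSV contents are assumptions, not the actual layout of the challenge files.

```python
import csv
import io


def read_cids(csv_text, cid_column="CID"):
    """Collect the integer CIDs found in one column of a CSV document.

    The column name "CID" is an assumption about the file layout.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {int(row[cid_column]) for row in reader if row[cid_column].strip()}


def missing_by_file(mixture_cids, descriptor_cids):
    """Map each descriptor file name to the sorted mixture CIDs it lacks."""
    return {
        name: sorted(mixture_cids - cids)
        for name, cids in descriptor_cids.items()
    }


# Toy data for illustration only; these are not values from the real files.
mixture_cids = {1, 2, 3, 4}
descriptor_cids = {
    "Dragon_Descriptors.csv": read_cids("CID,d1\n1,0.5\n2,0.7\n"),
    "Mordred_Descriptors.csv": read_cids("CID,d1\n1,0.1\n2,0.2\n3,0.3\n"),
}

report = missing_by_file(mixture_cids, descriptor_cids)
print(report)
```

With the real files, the CSV text would be read from disk and the mixture CIDs gathered from every component column of the mixture definitions file.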
Thank you.
Created by Jeremy Kotlyar jnk327

Hi, thanks for your interest and question.
11002307 was wrongly parsed and should be two compounds, 11002 and 307.
We don't have access to all descriptors for the Dragon set, as this is legacy software, so some CIDs might be missing.
@JakeAlbrecht @gaia.andreoletti can you check why 66328, 78605, 19789253, and 25137858 were not generated in the Mordred and Morgan sets?
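Until the data file is corrected, the 11002307 fix mentioned in this reply could be applied on the user side with a small correction table. This is a minimal sketch: `CORRECTIONS` and `fix_components` are hypothetical names, and only the 11002307 mapping comes from this thread.

```python
# Hypothetical correction table: wrongly parsed CID -> the CIDs it stands for.
# The single entry is the one reported in this thread.
CORRECTIONS = {11002307: (11002, 307)}


def fix_components(cids, corrections=CORRECTIONS):
    """Return the component list with each mis-parsed CID expanded in place."""
    fixed = []
    for cid in cids:
        # CIDs without a correction are kept unchanged.
        fixed.extend(corrections.get(cid, (cid,)))
    return fixed


print(fix_components([7284, 11002307]))  # [7284, 11002, 307]
```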