Hi,
As suggested in the "bonus hint" on https://www.synapse.org/#!Synapse:syn18666641/wiki/594209, I've attempted to map the Drug Screening data to the Drug Target Explorer dataset. The instructions in the related Data Walkthrough (https://www.synapse.org/#!Synapse:syn20552339) are clear, but the commands listed there do not seem to exist for me. I am on Windows, where the most recent rpy2 version available is 2.9.5, whereas the latest release appears to be 3.1.0. So I suspect the suggested commands are missing from the rpy2 version available to me.
Would it be possible for you to do the mapping described in the Walkthrough and then export the dataframe in csv format? To be specific, this would involve stepping through the first few commands, up through the creation of the targets_filt dataframe (i.e. targets_filt = targets.query(...).filter(...) and so on), and perhaps tweaking that step to include the mean_pchembl value in the output too. Then the targets_filt dataframe could be exported with a command like:
with open('drug_target_map.csv', 'w') as f:
    targets_filt.to_csv(path_or_buf=f, index=False)
Possibly the indented line would need to be something like: pd.DataFrame.to_csv(targets_filt, ...)
or, alternatively, targets_filt might need to be converted to a pandas dataframe before csv export with something like:
df = pandas2ri.ri2py(targets_filt)
then df could be exported to csv with the lines above.
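For anyone following along, here is a minimal sketch of the export step. It assumes targets_filt is already a pandas DataFrame; the rows below are made-up stand-ins rather than real Drug Target Explorer data, and only the column names come from this thread. One detail that matters on Windows: the pandas documentation asks for a text file handle to be opened with newline='', otherwise the csv can end up with blank lines between rows. Passing a path instead of a handle avoids the issue.

```python
import pandas as pd

# Stand-in for targets_filt; the values are invented for illustration only
targets_filt = pd.DataFrame({
    "hugo_gene": ["GENE_A", "GENE_B"],
    "std_name": ["drug_1", "drug_2"],
    "mean_pchembl": [7.2, 6.5],
})

# Option 1: pass a path and let pandas open and close the file
targets_filt.to_csv("drug_target_map.csv", index=False)

# Option 2: pass an open handle; newline="" disables newline translation
with open("drug_target_map_handle.csv", "w", newline="") as f:
    targets_filt.to_csv(f, index=False)

# Read the file back to confirm the columns survived the round trip
round_trip = pd.read_csv("drug_target_map.csv")
print(list(round_trip.columns))
```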
It seems likely that others would benefit from this as well.
Thank you,
Jeff
P.S. In case it's useful, when I use rpy2 on Windows, I get the following error when running the command to create the targets_filt dataframe as specified in the Walkthrough:
AttributeError: 'DataFrame' object has no attribute 'query'
When I try to simply see the head of targets, I get the following:
pd.DataFrame.head(targets)
AttributeError: 'DataFrame' object has no attribute 'iloc'
targets seems to be some kind of R dataframe, so when I try to convert to a pandas dataframe, I get the following:
df = pandas2ri.ri2py(targets)
ValueError: Buffer for this type not yet supported.
The same occurs even if getting only the hugo_gene and std_name columns from targets.
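A possible reading of the two AttributeErrors above, offered as a guess from the error text rather than a diagnosis: targets is an rpy2 DataFrame, not a pandas one. Both classes are named DataFrame, so the message looks as if pandas itself were missing 'query' or 'iloc'. The sketch below reproduces that behaviour with a hypothetical stand-in class (no rpy2 or R needed) and shows an isinstance check that tells the two apart.

```python
import pandas as pd


class DataFrame:
    """Hypothetical stand-in: not a pandas object, but also named DataFrame."""


targets = DataFrame()

# Check what kind of object you actually have before calling pandas methods
is_pandas = isinstance(targets, pd.DataFrame)
print(type(targets).__module__, type(targets).__name__, is_pandas)

# Calling a pandas method on a non-pandas object fails inside pandas itself
message = ""
try:
    pd.DataFrame.head(targets)
except AttributeError as err:
    message = str(err)
    print(message)
```

If the isinstance check comes back False for the real targets object, the pandas calls from the Walkthrough will not work until the object has been converted.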
I even attempted to print targets and then redirect the console output to a file, but the number of written rows is limited so data is lost, and setting various "max lines" properties found online seems to have no effect.
So it seems like it would be much easier if someone with access to the right tools could quickly export this kind of data.
Thank you for your time.
Created by Jeff Green jeff_green
Thank you, Robert, that looks great!
Hi Jeff - So sorry for the delay on this.
Attached: https://www.synapse.org/#!Synapse:syn20773923
Hi Robert,
Yes, if you don't mind, it would be great if you could export the csv version.
Unfortunately, we won't get a chance to talk with Dr. Gutmann today, but thank you for the recommendation.
Jeff
(/@David Gutmann on slack)
Hi Jeff, are you all set here? Thanks!
By the way, you might be interested in running your ideas by one of our remote mentors. David Gutmann is available for the next few hours (9-1 PT today) on slack and has a lot of expertise in the area your team is working in!
This is correct. Thanks Francis!
If you are still having issues @jeff_green let me know and I can get you a csv format version when I get to the venue after 9.
Best,
Robert
Or rather grab @lars.ericson
rpy2 is terrible on Windows. Robert rewrote the notebook to use something else called "feather". If you update the notebooks from GitHub, you should find them using feather.
Also I posted a Conda environment list for Windows here: https://www.synapse.org/#!Synapse:syn18666641/discussion/threadId=5904
If you're in person at the Hackathon and still have problems, just grab me and I will help you sort it out.
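For reference, a rough sketch of how a feather file can be turned into a csv with pandas alone, with no rpy2 involved. This is not taken from the updated notebook; the function name and the file names in the usage line are hypothetical, and pd.read_feather needs the pyarrow package installed.

```python
import pandas as pd


def feather_to_csv(feather_path, csv_path):
    """Read a feather file into pandas and write it back out as csv."""
    # read_feather relies on pyarrow being installed in the environment
    df = pd.read_feather(feather_path)
    # Passing a path (not a handle) lets pandas manage newlines on Windows
    df.to_csv(csv_path, index=False)
    return df
```

Usage would look like feather_to_csv("targets.feather", "drug_target_map.csv"), with the first argument replaced by whatever feather file the notebook actually reads.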