Hello,
I have tried to follow the instructions in the [Synapse User Guide](https://docs.synapse.org/articles/getting_started.html) to download the data, but I cannot do it. The support page does not contain any other information related to data download.
How do I automate the process instead of manually navigating to and downloading 65000 files?
Any help is appreciated.
Thank you,
Sai.
Created by Sai Bharadwaj Appakaya (saibharadwaj)

Hello @salima, I have the same error message "synapseclient.core.exceptions.SynapseHTTPError: 403 Client Error".
Have you ever dealt with this problem?
Any help is appreciated.
Thank you,
Liu.

Hello @saibharadwaj,
I have followed your advice; however, I always get this error message. Can you please help me with it?
Error:
"synapseclient.core.exceptions.SynapseHTTPError: 403 Client Error:
There are unmet access requirements that must be met to read content in the requested container"

Hi @BrianMBot,
I have the same issue as @saibharadwaj.
Could you please explain how I can match the audio_audio.m4a column in the .csv file with the names of the downloaded audio files?
Best regards,
Nina

Hi @tilias,
I have been trying to do as you did.
results = syn.tableQuery('SELECT * FROM syn5511444 WHERE medTimepoint='Immediately before Parkinson medication' LIMIT 100 OFFSET '+str(offset))
It shows that there is some syntax error and later it shows 'column names with spaces should be given in double-quotes'. When I give those, I end up with some other error.
Any help?

Hi @saibharadwaj,
The column "audio_audio.m4a" in the Excel sheet (this sheet can be downloaded in CSV format from the 'Tables' tab of 'Voice Activity' using 'Export Table' under 'Download Options') just contains a code like 5404133, but I don't know how to map it to a .tmp file.
I don't know what's wrong. I have run into this error many times.
How can I solve it?
Error messages:
>>> import synapseclient
>>> syn = synapseclient.Synapse()
>>> syn.login('chenyson', '123456abc1')
Exception in version check: ('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))
Welcome, chuangsen xie!
>>> for offset in range(0,65000,500):
...     results = syn.tableQuery('SELECT * FROM syn5511444 LIMIT 500 OFFSET '+str(offset))
...     file_map = syn.downloadTableColumns(results,['audio_audio.m4a'],'D:/voiceTools/mpower')
...
Downloading 0 files, 500 cached locally
Downloading 0 files, 500 cached locally
Downloading 0 files, 500 cached locally
Downloading 0 files, 500 cached locally
Downloading 0 files, 500 cached locally
Downloading 0 files, 500 cached locally
Downloading 0 files, 500 cached locally
Downloading 0 files, 500 cached locally
Downloading 0 files, 500 cached locally
Downloading 0 files, 500 cached locally
Downloading 0 files, 500 cached locally
Downloading 500 files, 0 cached locally
Processing FileHandleId :5595915 [####################]100.00% 500/500 Done...
Downloading [--------------------]1.30% 8.0MB/613.1MB (226.4kB/s) table_file_download.zip.synapse_download_75283341
Downloading [#-------------------]2.61% 16.0MB/613.1MB (252.9kB/s) table_file_download.zip.synapse_download_75283341
Downloading [#-------------------]3.91% 24.0MB/613.1MB (251.9kB/s) table_file_download.zip.synapse_download_75283341
Downloading [#-------------------]5.22% 32.0MB/613.1MB (306.6kB/s) table_file_download.zip.synapse_download_75283341
Downloading [#-------------------]6.52% 40.0MB/613.1MB (263.1kB/s) table_file_download.zip.synapse_download_75283341
Downloading [##------------------]7.83% 48.0MB/613.1MB (250.3kB/s) table_file_download.zip.synapse_download_75283341
Downloading [##------------------]9.13% 56.0MB/613.1MB (247.5kB/s) table_file_download.zip.synapse_download_75283341
Downloading [##------------------]10.44% 64.0MB/613.1MB (253.6kB/s) table_file_download.zip.synapse_download_75283341
Downloading [##------------------]11.74% 72.0MB/613.1MB (264.7kB/s) table_file_download.zip.synapse_download_75283341
Downloading [###-----------------]13.05% 80.0MB/613.1MB (274.5kB/s) table_file_download.zip.synapse_download_75283341
Downloading [###-----------------]14.35% 88.0MB/613.1MB (268.5kB/s) table_file_download.zip.synapse_download_75283341
Downloading [--------------------]1.30% 8.0MB/613.1MB (131.3kB/s) table_file_download.zip.synapse_download_75283341
Downloading [#-------------------]2.61% 16.0MB/613.1MB (124.4kB/s) table_file_download.zip.synapse_download_75283341
Downloading [#-------------------]3.91% 24.0MB/613.1MB (118.6kB/s) table_file_download.zip.synapse_download_75283341
Downloading [#-------------------]5.22% 32.0MB/613.1MB (95.6kB/s) table_file_download.zip.synapse_download_75283341
Downloading [--------------------]1.30% 8.0MB/613.1MB (264.0kB/s) table_file_download.zip.synapse_download_75283341
Downloading [#-------------------]2.61% 16.0MB/613.1MB (218.0kB/s) table_file_download.zip.synapse_download_75283341
Downloading [#-------------------]3.91% 24.0MB/613.1MB (196.3kB/s) table_file_download.zip.synapse_download_75283341
Downloading [#-------------------]5.22% 32.0MB/613.1MB (202.1kB/s) table_file_download.zip.synapse_download_75283341
Downloading [#-------------------]6.52% 40.0MB/613.1MB (212.3kB/s) table_file_download.zip.synapse_download_75283341
Downloading [##------------------]7.83% 48.0MB/613.1MB (207.2kB/s) table_file_download.zip.synapse_download_75283341
Downloading [##------------------]9.13% 56.0MB/613.1MB (238.9kB/s) table_file_download.zip.synapse_download_75283341
Downloading [##------------------]10.44% 64.0MB/613.1MB (230.0kB/s) table_file_download.zip.synapse_download_75283341
Downloading [##------------------]11.74% 72.0MB/613.1MB (221.6kB/s) table_file_download.zip.synapse_download_75283341
Downloading [--------------------]1.30% 8.0MB/613.1MB (141.4kB/s) table_file_download.zip.synapse_download_75283341
Downloading [#-------------------]2.61% 16.0MB/613.1MB (180.4kB/s) table_file_download.zip.synapse_download_75283341
Downloading [#-------------------]3.91% 24.0MB/613.1MB (196.9kB/s) table_file_download.zip.synapse_download_75283341
Downloading [#-------------------]5.22% 32.0MB/613.1MB (208.6kB/s) table_file_download.zip.synapse_download_75283341
Downloading [#-------------------]6.52% 40.0MB/613.1MB (234.4kB/s) table_file_download.zip.synapse_download_75283341
Downloading [##------------------]7.83% 48.0MB/613.1MB (240.6kB/s) table_file_download.zip.synapse_download_75283341
Downloading [--------------------]1.30% 8.0MB/613.1MB (97.5kB/s) table_file_download.zip.synapse_download_75283341
Downloading [#-------------------]2.61% 16.0MB/613.1MB (79.7kB/s) table_file_download.zip.synapse_download_75283341
Downloading [#-------------------]3.91% 24.0MB/613.1MB (99.7kB/s) table_file_download.zip.synapse_download_75283341
Downloading [#-------------------]5.22% 32.0MB/613.1MB (112.8kB/s) table_file_download.zip.synapse_download_75283341
Downloading [#-------------------]6.52% 40.0MB/613.1MB (125.1kB/s) table_file_download.zip.synapse_download_75283341
Traceback (most recent call last):
File "D:\ProgramData\Anaconda3\lib\site-packages\urllib3\response.py", line 436, in _error_catcher
yield
File "D:\ProgramData\Anaconda3\lib\site-packages\urllib3\response.py", line 518, in read
data = self._fp.read(amt) if not fp_closed else b""
File "D:\ProgramData\Anaconda3\lib\http\client.py", line 458, in read
n = self.readinto(b)
File "D:\ProgramData\Anaconda3\lib\http\client.py", line 502, in readinto
n = self.fp.readinto(b)
File "D:\ProgramData\Anaconda3\lib\socket.py", line 669, in readinto
return self._sock.recv_into(b)
File "D:\ProgramData\Anaconda3\lib\ssl.py", line 1241, in recv_into
return self.read(nbytes, buffer)
File "D:\ProgramData\Anaconda3\lib\ssl.py", line 1099, in read
return self._sslobj.read(len, buffer)
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\ProgramData\Anaconda3\lib\site-packages\requests\models.py", line 751, in generate
for chunk in self.raw.stream(chunk_size, decode_content=True):
File "D:\ProgramData\Anaconda3\lib\site-packages\urllib3\response.py", line 575, in stream
data = self.read(amt=amt, decode_content=decode_content)
File "D:\ProgramData\Anaconda3\lib\site-packages\urllib3\response.py", line 540, in read
raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
File "D:\ProgramData\Anaconda3\lib\contextlib.py", line 131, in __exit__
self.gen.throw(type, value, traceback)
File "D:\ProgramData\Anaconda3\lib\site-packages\urllib3\response.py", line 454, in _error_catcher
raise ProtocolError("Connection broken: %r" % e, e)
urllib3.exceptions.ProtocolError: ("Connection broken: ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None)", ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 3, in <module>
File "D:\ProgramData\Anaconda3\lib\site-packages\synapseclient\client.py", line 3582, in downloadTableColumns
zipfilepath = self._downloadFileHandle(response['resultZipFileHandleId'], table.tableId, 'TableEntity',
File "D:\ProgramData\Anaconda3\lib\site-packages\synapseclient\client.py", line 1845, in _downloadFileHandle
downloaded_path = self._download_from_url_multi_threaded(fileHandleId,
File "D:\ProgramData\Anaconda3\lib\site-packages\synapseclient\client.py", line 1890, in _download_from_url_multi_threaded
multithread_download.download_file(self, request)
File "D:\ProgramData\Anaconda3\lib\site-packages\synapseclient\core\multithread_download\download_threads.py", line 232, in download_file
downloader.download_file(download_request)
File "D:\ProgramData\Anaconda3\lib\site-packages\synapseclient\core\multithread_download\download_threads.py", line 297, in download_file
self._write_chunks(request, completed_futures, transfer_status)
File "D:\ProgramData\Anaconda3\lib\site-packages\synapseclient\core\multithread_download\download_threads.py", line 372, in _write_chunks
chunk_data = chunk_response.content
File "D:\ProgramData\Anaconda3\lib\site-packages\requests\models.py", line 829, in content
self._content = b''.join(self.iter_content(CONTENT_CHUNK_SIZE)) or b''
File "D:\ProgramData\Anaconda3\lib\site-packages\requests\models.py", line 754, in generate
raise ChunkedEncodingError(e)
requests.exceptions.ChunkedEncodingError: ("Connection broken: ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None)", ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))
>>> .
File "<stdin>", line 1
.
^
SyntaxError: invalid syntax

I think I found it. Thanks!
downloadTableColumns(table, columns, downloadLocation=None, **kwargs)
Bulk download of table-associated files.
Parameters
table: table query result
columns: a list of column names as strings
downloadLocation: directory into which to download the files
Returns
a dictionary from file handle ID to path in the local file system.

How do I change the save location on my laptop?

Hello @Sharanyaa,
Here is how I did it:
SELECT * FROM syn5511444 where medTimepoint like 'I don%'
which means:
Select all rows where the column medTimepoint has a value with the prefix 'I don'. In this example the right-hand side of the LIKE keyword is a pattern (not a regular expression) in which '%' matches zero or more characters.
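When the query is built in Python, the SQL quotes also have to survive Python's own string quoting. Here is a minimal sketch of one way to do that. The helper names `sql_literal` and `med_timepoint_query` are made up for illustration and are not part of synapseclient. Doubling an embedded single quote is standard SQL escaping; the sketch does not escape a '%' that is already inside the value. The resulting string is what would be passed to `syn.tableQuery`.

```python
def sql_literal(value):
    """Wrap value in single quotes, doubling any single quote inside it."""
    return "'" + value.replace("'", "''") + "'"


def med_timepoint_query(table_id, value, prefix=False, limit=None, offset=None):
    """Build a SELECT on medTimepoint (hypothetical helper, for illustration only).

    prefix=False -> exact match:   medTimepoint = 'value'
    prefix=True  -> prefix match:  medTimepoint LIKE 'value%'
    """
    operator = "LIKE" if prefix else "="
    pattern = value + "%" if prefix else value
    query = f"SELECT * FROM {table_id} WHERE medTimepoint {operator} {sql_literal(pattern)}"
    if limit is not None:
        query += f" LIMIT {limit}"
    if offset is not None:
        query += f" OFFSET {offset}"
    return query


# Prefix match, as in the LIKE example above
print(med_timepoint_query("syn5511444", "I don", prefix=True))
# Exact match on a value that contains an apostrophe
print(med_timepoint_query("syn5511444", "I don't take Parkinson medications",
                          limit=100, offset=0))
```

Using double quotes (or an f-string) for the Python string keeps the single quotes free for the SQL value, which avoids the Python syntax error you get when both use single quotes.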
You can find Synapse SQL commands at this link: https://rest-docs.synapse.org/rest/org/sagebionetworks/repo/web/controller/TableExamples.html

Hi @Sharanyaa
To escape a single quote, write it twice:
E.g., `SELECT * FROM syn5511444 where medTimepoint='I don''t take Parkinson medications'`

The query for selecting medTimepoint works for the following statement:
E.g., SELECT * FROM syn5511444 where medTimepoint='Another time'
but SELECT * FROM syn5511444 where medTimepoint='I don't take Parkinson medications' shows an error, since a single quote comes in the middle of don't. Double quotes also do not work for the string. What should I do?

Hi @saibharadwaj,
I have a similar question to @chisty2996. Once you submit the use data statement, how long do they take to accept or reject the data request?
thanks
Catalina

Hi @chisty2996,
Can you please explain what you mean by profile validation? I am not always familiar with the terminology.
Sai.

How many days does the profile validation take?
I need to access the mPower data but cannot do so. Any help?

Hi @saibharadwaj,
No, I didn't try it. My downloads are based on your first code with a slight modification to query data based on the status of the patient, and I have changed '100' to '500' too.
Like this:
SELECT * FROM syn5511444 where medTimepoint='Another time'
SELECT * FROM syn5511444 where medTimepoint='Immediately before Parkinson medication'

Hi @tilias,
No, I did not. Is there an issue with that code?

Hi @saibharadwaj
Thank you for your kind answers, but have you checked this code published on GitHub to download the mPower data with Python?
https://github.com/Sage-Bionetworks/mPower-sdata/blob/master/examples/mPower-bootstrap.py
Thanks.

Hello,
Here is the code I used to rename the downloaded .tmp files to .m4a and have their file names match the folder names/codes from the CSV file.
```
import csv
import os
import shutil

# Destination folder for the renamed .m4a files (must already exist)
dest = "C:\\Users\\Sai Bharadwaj A\\Desktop\\dest"
# '.' is the current directory, expected to hold the numbered sub-folders
d = '.'
names = []

# First level: the numbered sub-folders
A = [os.path.join(d, o) for o in os.listdir(d)
     if os.path.isdir(os.path.join(d, o))]
for folder in A:
    # Second level: one folder per code from the CSV file
    A1_names = [o for o in os.listdir(folder)
                if os.path.isdir(os.path.join(folder, o))]
    for name in A1_names:
        sub = os.path.join(folder, name)
        A2 = [os.path.join(sub, f) for f in os.listdir(sub) if f.endswith('.tmp')]
        if A2:
            # Copy the first .tmp file and name it after its folder/code
            shutil.copy(A2[0], os.path.join(dest, name + '.m4a'))
            names.append(A2)

# Save the folder structure for future reference
# (opening with "w" creates tmp.csv if it does not exist)
csvfile = "C:\\Users\\Sai Bharadwaj A\\Desktop\\tmp.csv"
# Each entry of names is the list of .tmp paths found in one folder
with open(csvfile, "w") as output:
    writer = csv.writer(output, lineterminator='\n')
    for val in names:
        writer.writerow([val])
```
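An alternative to walking the cache folders, sketched under the assumption that `file_map` is the dictionary returned by `syn.downloadTableColumns` (file handle ID to local path, as its documentation describes). `export_as_m4a` is a hypothetical helper name, not part of synapseclient:

```python
import os
import shutil


def export_as_m4a(file_map, dest_dir):
    """Copy every downloaded file to dest_dir/<file handle ID>.m4a.

    file_map is assumed to map file handle IDs to local paths.
    Returns a dict from file handle ID (as a string) to the new path.
    """
    os.makedirs(dest_dir, exist_ok=True)
    exported = {}
    for handle_id, path in file_map.items():
        target = os.path.join(dest_dir, f"{handle_id}.m4a")
        shutil.copy(path, target)  # copy, so the cache itself stays untouched
        exported[str(handle_id)] = target
    return exported
```

Calling `export_as_m4a(file_map, dest)` after each batch of the download loop should give files named like `5404133.m4a`, which can then be matched against the codes in the `audio_audio.m4a` column of the CSV.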
Sai.

Hi @tilias,
It came down to 81.3 GB. So, you're looking at one more day. Also, as you might have already seen, once they are downloaded, the file names are different. The folder contains two tmp files, one of which is the .m4a file. I will try to get the code that can automatically rename the files and put them all in one folder. As each file has a unique code, it should be easy to map them to the CSV file.
For anyone else out there who has some coding experience, please try to increase the '100' to '500'; hopefully that should reduce the total download time.
Sai.

Thank you @saibharadwaj for the solution.
However, the data has been downloading for almost a day and has reached almost 40 GB, but it is not finished yet. Do you have any idea how large it is in total?
Thank you.

Thank you @BrianMBot for the reply.
I have already visited those sites, but they were not of much help. However, I found a way to download the data and wanted to share the procedure that worked for me here.
The process shown [here](https://python-docs.synapse.org/build/html/index.html) helped me get as far as logging into the system. After successfully logging in, I found a way to download the audio files automatically. Here is the code for that:
```
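# Assumes `syn` is the logged-in synapseclient.Synapse() object from the step above
for offset in range(0, 65000, 100):
    # Query 100 rows at a time, then download the audio files for those rows
    results = syn.tableQuery('SELECT * FROM syn5511444 LIMIT 100 OFFSET ' + str(offset))
    file_map = syn.downloadTableColumns(results, ['audio_audio.m4a'])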
```
This downloads files in packets (each packet containing 100 files). I am working on a Windows laptop; for me they are getting downloaded into "C:\Users\\.synapseCache\".
In this folder, after the download is complete, you can find a thousand numbered sub-folders (named 0 to 999), each containing further sub-folders with names matching the codes under the column "audio_audio.m4a" in the Excel sheet (this sheet can be downloaded in CSV format from the 'Tables' tab of 'Voice Activity' using 'Export Table' under 'Download Options').
The process does not have to run uninterrupted. Even when the laptop went into sleep mode or, in the worst case, restarted, I could run the same code and continue the download.
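Since re-running the loop picks up where it stopped, it can also be wrapped in a retry so that a dropped connection (like the ConnectionResetError / ChunkedEncodingError reported in this thread) does not need a manual restart. This is only a sketch: `run_with_retries` and `download_all` are made-up names, `syn` is assumed to be a logged-in `synapseclient.Synapse()` object, and catching `OSError` is assumed to cover these network errors (as far as I know, the exceptions in `requests` derive from it).

```python
import time


def run_with_retries(task, attempts=5, wait_seconds=30, retry_on=(OSError,)):
    """Call task() until it succeeds; re-raise after the last failed attempt."""
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except retry_on as err:
            if attempt == attempts:
                raise
            print(f"Attempt {attempt} failed ({err!r}); retrying in {wait_seconds}s")
            time.sleep(wait_seconds)


def download_all(syn, total=65000, batch=100, **retry_kwargs):
    """Download the audio column batch by batch, retrying each batch on errors."""
    file_map = {}
    for offset in range(0, total, batch):
        # Bind the current offset so each retry repeats the same batch
        def fetch(offset=offset):
            results = syn.tableQuery(
                f"SELECT * FROM syn5511444 LIMIT {batch} OFFSET {offset}")
            return syn.downloadTableColumns(results, ["audio_audio.m4a"])

        file_map.update(run_with_retries(fetch, **retry_kwargs))
    return file_map
```

Usage would be `file_map = download_all(syn)` after logging in. Batches that were already fetched should be reported as cached locally rather than downloaded again.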
Hope that helps.
Sai.

Hi Sai,
There are a couple of examples linked from the FAQ section on [How can I access the data programmatically](https://www.synapse.org/#!Synapse:syn4993293/wiki/394516). An example in Python is located [here](https://github.com/Sage-Bionetworks/mPower-sdata/blob/master/examples/mPower-bootstrap.py), and general Python client documentation for Synapse Tables is located [here](https://python-docs.synapse.org/build/html/Table.html).
Hope this helps!
-Brian