I have tried several times to download the data with the supplied Python script, and I hit this error each time:
requests.exceptions.ChunkedEncodingError: ("Connection broken: error(104, 'Connection reset by peer')", error(104, 'Connection reset by peer'))
Each time, the download starts and then this error interrupts it after several hours.
Has anyone else run into this problem?
Created by wmmthu

Dear Bharata,
Please refer to the post above by `chris.bare`. I think he addresses the same issue that you are running into. The python client should realize that you have downloaded the data already (even if you specify a download directory). Your download will start up where the connection failed, so feel free to run the script again if you encounter the connection error. Thank you for your participation.
Best,
Thomas

Hi Bharata,
If you continue to run into problems, you can download the DNase data directly, without the BAM files, here: https://www.synapse.org/#!Synapse:syn6403553
Cheers,
Jim

Yes, I have found the cache. I modified the code a bit so that I can choose which folder to download to and inspect the downloaded files. However, I still get the "connection broken" error when downloading the DNase files, and I don't know how to handle it. For now I will try downloading the files one by one.

Bharata,
The script will download files to the local cache unless you define the destination directory in the code.
Please see the comments in the code or the documentation on the Python script for instructions on how to do this. Please note that not all of the data is needed to address the challenge; specifically, the BAM files are summarized in data files that are downloadable. The original scripts listed all the data for download, but the scripts have since been updated with more documentation on how to download the data, and they specifically note the essential data, which is also defined in the Data Description section of the Challenge pages.

Same error here. I have been trying to download since Friday with no success. Another problem is that I cannot see any files that have been downloaded. How does this script work? Does it download the files to temporary storage first? I cannot see any downloaded files in my directory.

Thanks for your reply.
I restarted the download and found your suggestion helpful.

Regarding download errors:
This challenge provides a lot of data. The Python client contains logic to retry routine failures, but that logic isn't up to the size of the job. It was written to have a good probability of success on a single download of up to 1 or 2 GB, but here we're asking it to work on a couple hundred such files. We're reworking the file-download section of the client to add more robust error handling and recovery.
In the meantime, the pragmatic solution is to retry the whole download. It should quickly confirm the files that have already been downloaded, then resume downloading at the beginning of the file where the error occurred.
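If you would rather not restart the script by hand, the retry can be automated. The following is only a minimal sketch, not part of the supplied script or the Synapse client: `download_with_retries` and its parameters are hypothetical names, and `download_all` stands for whatever function in your copy of the script runs the full download. It relies on the fact that the `requests` exceptions, including `ChunkedEncodingError`, derive from `IOError`/`OSError`, so catching `OSError` should cover the error reported in this thread.

```python
import time


def download_with_retries(download_all, max_attempts=5, wait_seconds=30,
                          retry_on=(OSError,), sleep=time.sleep):
    """Call download_all() until it succeeds or max_attempts is reached.

    download_all is a hypothetical zero-argument callable that runs the
    whole download. Files that are already present are expected to be
    skipped by the client itself; this wrapper only restarts the job.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return download_all()
        except retry_on as err:
            if attempt == max_attempts:
                # Out of attempts: let the last error propagate.
                raise
            print(f"Attempt {attempt} failed ({err}); "
                  f"retrying in {wait_seconds}s")
            sleep(wait_seconds)


# Simulated download that fails twice with the same error as above,
# so the sketch can be tried without any network access.
calls = {"count": 0}


def flaky_download():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionResetError(104, "Connection reset by peer")
    return "done"


result = download_with_retries(flaky_download, wait_seconds=0)
print(result, "after", calls["count"], "attempts")
```

In real use you would pass your own download function in place of `flaky_download` and choose a longer `wait_seconds`, so that a brief network outage has time to clear before the next attempt.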
I encountered the same problem as well.