First, with latest synapser release. Next, with latest python client, installed via pip. Trace shown below. Before this complete failure, the "Downloading ..." message looped between 0 and 2% complete, restarting at 0 several times. Python 3.7.1 on ubuntu, good fast connection to the internet (South Lake Union, Seattle). Rumor has it that these sorts of synapse download failures are not uncommon. - Paul Shannon 206.658.3789 >>> x = syn.get('syn11714133', downloadLocation=".") Downloading [--------------------]1.72% 264.0MB/14.9GB (891.9kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis01_2017-12-08_19.recalibrated_variants.vcf.gz.synapse_download_22409246 Traceback (most recent call$ File "/users/pshannon/anaconda3/lib/python3.7/site-packages/urllib3/response.py", line 360, in _error_catcher yield File "/users/pshannon/anaconda3/lib/python3.7/site-packages/urllib3/response.py", line 442, in read data = self._fp.read(amt) File "/users/pshannon/anaconda3/lib/python3.7/http/client.py", line 447, in read n = self.readinto(b) File "/users/pshannon/anaconda3/lib/python3.7/http/client.py", line 491, in readinto n = self.fp.readinto(b) File "/users/pshannon/anaconda3/lib/python3.7/socket.py", line 589, in readinto return self._sock.recv_into(b) File "/users/pshannon/anaconda3/lib/python3.7/ssl.py", line 1052, in recv_into return self.read(nbytes, buffer) File "/users/pshannon/anaconda3/lib/python3.7/ssl.py", line 911, in read return self._sslobj.read(len, buffer) ConnectionResetError: [Errno 104] Connection reset by peer

Created by Paul Shannon paul-shannon
Using docker.synapse.org/syn25326461/synpy-1128, downloading large VCF files is, alas, **still** unreliable:
```
>>> x = syn.get("syn11714079", downloadLocation="/tmp")
Downloading [####################]99.98% 41.9GB/41.9GB (47.9MB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis01_2017-12-08_3.recalibrated_variants.vcf.gz.synapse_download_74048101
```
This has been frozen for 30 minutes at 99.98% complete. There is plenty of disk space on the receiving end.
@jordank Great it works now! Thanks a lot Jordan!! Best, Jiali
@jzhuang_denovo Test version 2.3.1.368 does not actually include the change. It is a later test build of 2.3.1, and the final 2.3.1 release ultimately did not include this change (it was included preliminarily in some earlier 2.3.1 builds, including the latest that was available as of 3/23, but is now scheduled for 2.4 instead). You can try the change by installing an earlier 2.3.1 build that includes it, e.g. 2.3.1.326 (using the == qualifier). There is not yet a 2.4 test build.
```
pip3 install --upgrade --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple "synapseclient==2.3.1.326"
```
Could you try this and see whether it resolves the issue for you? Thanks.
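Since the exact build matters here, it may help to confirm which build pip actually installed before retrying the download. The following is a minimal sketch using the standard library's `importlib.metadata` (available from Python 3.8; older interpreters would need another approach). The helper name `installed_version` is hypothetical and not part of synapseclient.

```python
from importlib import metadata


def installed_version(package="synapseclient"):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed in this environment.
        return None


# Prints the installed build string, or None when synapseclient
# is not installed in the current environment.
print(installed_version())
```

Running this in the same environment (or container) that runs the download shows whether the `==` pin took effect.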
I encountered a very similar situation to Paul's, and using version 2.3.1.368, installed with Jordan's command, doesn't resolve it. Below are the error messages:
[jzhuang@localhost AMP]$ synapse --debug get syn10507730 Welcome, Jiali Zhuang! 2021-04-19 16:42:18,797 [client:434 - INFO]: Welcome, Jiali Zhuang! Downloading [--------------------]1.96% 8.0MB/407.9MB (931.4kB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_downloDownloading [#-------------------]3.92% 16.0MB/407.9MB (966.9kB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_downlDownloading [#-------------------]5.88% 24.0MB/407.9MB (974.7kB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_downlDownloading [##------------------]7.85% 32.0MB/407.9MB (964.3kB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_downlDownloading [##------------------]9.81% 40.0MB/407.9MB (978.9kB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_downlDownloading [##------------------]11.77% 48.0MB/407.9MB (988.6kB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_downDownloading [###-----------------]13.73% 56.0MB/407.9MB (1001.1kB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_dowDownloading [###-----------------]15.69% 64.0MB/407.9MB (1010.6kB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_dowDownloading [####----------------]17.62% 71.9MB/407.9MB (1013.1kB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_dowDownloading [####----------------]19.59% 79.9MB/407.9MB (1015.1kB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_dowDownloading [####----------------]21.55% 87.9MB/407.9MB (1018.1kB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_dowDownloading [#####---------------]23.51% 95.9MB/407.9MB (1019.9kB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_dowDownloading [#####---------------]25.47% 103.9MB/407.9MB (1.0MB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_downlDownloading [#####---------------]27.43% 111.9MB/407.9MB (1.0MB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_downlDownloading [######--------------]29.39% 119.9MB/407.9MB (1.0MB/s) 
MSSM_all_quant_counts-matrix.txt.gz.synapse_downlDownloading [######--------------]31.35% 127.9MB/407.9MB (1.0MB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_downlDownloading [#######-------------]33.31% 135.9MB/407.9MB (1.0MB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_downlDownloading [#######-------------]35.28% 143.9MB/407.9MB (1.0MB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_downlDownloading [#######-------------]37.24% 151.9MB/407.9MB (1.0MB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_downlDownloading [########------------]39.20% 159.9MB/407.9MB (1.0MB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_downlDownloading [########------------]41.16% 167.9MB/407.9MB (1.0MB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_downlDownloading [#########-----------]43.12% 175.9MB/407.9MB (1.0MB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_downlDownloading [#########-----------]45.08% 183.9MB/407.9MB (1.0MB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_downlDownloading [#########-----------]47.04% 191.9MB/407.9MB (1.0MB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_downlDownloading [##########----------]49.01% 199.9MB/407.9MB (1.0MB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_downlDownloading [##########----------]50.97% 207.9MB/407.9MB (1.0MB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_downlDownloading [###########---------]52.93% 215.9MB/407.9MB (1.0MB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_downlDownloading [###########---------]54.89% 223.9MB/407.9MB (1.0MB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_downlDownloading [###########---------]56.85% 231.9MB/407.9MB (1.0MB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_downlDownloading [############--------]58.81% 239.9MB/407.9MB (1.0MB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_downlDownloading [############--------]60.77% 247.9MB/407.9MB (1.0MB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_downlDownloading [#############-------]62.73% 255.9MB/407.9MB (1.0MB/s) 
MSSM_all_quant_counts-matrix.txt.gz.synapse_downlDownloading [#############-------]64.70% 263.9MB/407.9MB (1.0MB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_downlDownloading [#############-------]66.66% 271.9MB/407.9MB (1.0MB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_downlDownloading [##############------]68.62% 279.9MB/407.9MB (1.0MB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_downlDownloading [##############------]70.58% 287.9MB/407.9MB (1.0MB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_downlDownloading [###############-----]72.54% 295.9MB/407.9MB (1.0MB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_downlDownloading [###############-----]74.50% 303.9MB/407.9MB (1.0MB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_downlDownloading [###############-----]76.46% 311.9MB/407.9MB (1.0MB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_download_73673569 2021-04-19 16:47:27,810 [client:1868 - DEBUG]: Retrying download on error: [] after progressing 0 bytes Traceback (most recent call last): File "/opt/biomarker/anaconda3/lib/python3.7/site-packages/urllib3/response.py", line 331, in _error_catcher yield File "/opt/biomarker/anaconda3/lib/python3.7/site-packages/urllib3/response.py", line 413, in read data = self._fp.read(amt) File "/opt/biomarker/anaconda3/lib/python3.7/http/client.py", line 447, in read n = self.readinto(b) File "/opt/biomarker/anaconda3/lib/python3.7/http/client.py", line 491, in readinto n = self.fp.readinto(b) File "/opt/biomarker/anaconda3/lib/python3.7/socket.py", line 589, in readinto return self._sock.recv_into(b) File "/opt/biomarker/anaconda3/lib/python3.7/ssl.py", line 1049, in recv_into return self.read(nbytes, buffer) File "/opt/biomarker/anaconda3/lib/python3.7/ssl.py", line 908, in read return self._sslobj.read(len, buffer) ConnectionResetError: [Errno 104] Connection reset by peer During handling of the above exception, another exception occurred: Traceback (most recent call last): File 
"/opt/biomarker/anaconda3/lib/python3.7/site-packages/requests/models.py", line 753, in generate for chunk in self.raw.stream(chunk_size, decode_content=True): File "/opt/biomarker/anaconda3/lib/python3.7/site-packages/urllib3/response.py", line 465, in stream data = self.read(amt=amt, decode_content=decode_content) File "/opt/biomarker/anaconda3/lib/python3.7/site-packages/urllib3/response.py", line 430, in read raise IncompleteRead(self._fp_bytes_read, self.length_remaining) File "/opt/biomarker/anaconda3/lib/python3.7/contextlib.py", line 130, in __exit__ self.gen.throw(type, value, traceback) File "/opt/biomarker/anaconda3/lib/python3.7/site-packages/urllib3/response.py", line 349, in _error_catcher raise ProtocolError('Connection broken: %r' % e, e) urllib3.exceptions.ProtocolError: ("Connection broken: ConnectionResetError(104, 'Connection reset by peer')", ConnectionResetError(104, 'Connection reset by peer')) During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/opt/biomarker/anaconda3/lib/python3.7/site-packages/synapseclient/client.py", line 1851, in _downloadFileHandle expected_md5=fileHandle.get('contentMd5')) File "/opt/biomarker/anaconda3/lib/python3.7/site-packages/synapseclient/client.py", line 1892, in _download_from_url_multi_threaded multithread_download.download_file(self, request) File "/opt/biomarker/anaconda3/lib/python3.7/site-packages/synapseclient/core/multithread_download/download_threads.py", line 232, in download_file downloader.download_file(download_request) File "/opt/biomarker/anaconda3/lib/python3.7/site-packages/synapseclient/core/multithread_download/download_threads.py", line 297, in download_file self._write_chunks(request, completed_futures, transfer_status) File "/opt/biomarker/anaconda3/lib/python3.7/site-packages/synapseclient/core/multithread_download/download_threads.py", line 372, in _write_chunks chunk_data = chunk_response.content File 
"/opt/biomarker/anaconda3/lib/python3.7/site-packages/requests/models.py", line 831, in content self._content = b''.join(self.iter_content(CONTENT_CHUNK_SIZE)) or b'' File "/opt/biomarker/anaconda3/lib/python3.7/site-packages/requests/models.py", line 756, in generate raise ChunkedEncodingError(e) requests.exceptions.ChunkedEncodingError: ("Connection broken: ConnectionResetError(104, 'Connection reset by peer')", ConnectionResetError(104, 'Connection reset by peer'))
@paul-shannon Not yet. A resolution for this will be in the 2.4 version release. There is a patch release (2.3.1) to be released shortly, but as I do not yet have an explanation for your most recent stack (the md5 mismatch error) encountered using this fix, I wasn't able to include it in that release.
Hi @jordank, has your fix (as seen in the docker image synpy-1128) made it into the standard synapse docker image, so that I can switch back to that? Thank you. - Paul
@jordank Another result: very close, but a last-minute failure. I tried the new docker image `docker.synapse.org/syn25326461/synpy-1128` from my home, on my laptop. Though very slow (< 2M/sec), it seemed to run robustly. However, there seems to have been one restart, and then a failure, apparently when comparing MD5s on the downloaded file. Full trace below.
```
x = syn.get('syn11714133', downloadLocation="/tmp")
Downloading [##------------------]12.18% 1.8GB/14.9GB (1.5MB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis01_2017-12-08_19.recalibrated_variants.vcf.gz.synapse_download_74048232
Downloading [####################]99.97% 14.9GB/14.9GB (1.7MB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis01_2017-12-08_19.recalibrated_variants.vcf.gz.synapse_download_74048232
Traceback (most recent call last):
  File "", line 1, in
  File "/synapsePythonClient/synapseclient/client.py", line 713, in get
    return self._getWithEntityBundle(entityBundle=bundle, entity=entity, **kwargs)
  File "/synapsePythonClient/synapseclient/client.py", line 829, in _getWithEntityBundle
    self._download_file_entity(downloadLocation, entity, ifcollision, submission)
  File "/synapsePythonClient/synapseclient/client.py", line 891, in _download_file_entity
    downloadPath = self._downloadFileHandle(entity.dataFileHandleId, objectId, objectType, downloadPath)
  File "/synapsePythonClient/synapseclient/client.py", line 1840, in _downloadFileHandle
    expected_md5=fileHandle.get('contentMd5'))
  File "/synapsePythonClient/synapseclient/client.py", line 1895, in _download_from_url_multi_threaded
    filename=temp_destination, md5=actual_md5, expected_md5=expected_md5
synapseclient.core.exceptions.SynapseMd5MismatchError: Downloaded file /tmp/NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis01_2017-12-08_19.recalibrated_variants.vcf.gz.synapse_download_74048232's md5 789f9902c131511ef417139c6de7d78f does not match expected MD5 of d2193b7fd96e3feb02566c578de3a26c
```
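When a download ends with an MD5 mismatch like the one above, the partial file left on disk can be checked independently of the client. The sketch below computes a file's MD5 in chunks using only the standard library's `hashlib`. The helper names `file_md5` and `md5_matches` and the chunk size are illustrative assumptions, not part of synapseclient.

```python
import hashlib


def file_md5(path, chunk_size=8 * 1024 * 1024):
    """Return the hex MD5 of a file, reading in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_matches(path, expected_md5):
    """Compare a file's MD5 with an expected hex digest (case-insensitive)."""
    return file_md5(path) == expected_md5.strip().lower()
```

For the failure above, one would call `md5_matches` on the `.synapse_download_74048232` file with the expected value `d2193b7fd96e3feb02566c578de3a26c`. A mismatch confirms that the bytes on disk differ from what Synapse recorded, though it does not say which part of the file is wrong.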
Good progress, @jordank. The new client runs about 10-15x faster on our Linux server: 45M/sec (new) vs 3-6M/sec (old), and it completes the download without error. On my laptop, from home, it runs at 1.5M/s and, at 14% complete, has not yet failed and restarted. @abby.vanderlinden - how long did it take for you to download 14.9G?
I ran this on my local machine and the download completed with no issues. Thanks, Jordan!
@paul-shannon I've uploaded the image to the Synapse docker registry. You can download and run the image as follows:
```
# login to synapse docker registry
docker login -u <your-synapse-username> docker.synapse.org
Password:

# pull the image
docker pull docker.synapse.org/syn25326461/synpy-1128

# run the image in an interactive shell
docker run -ti docker.synapse.org/syn25326461/synpy-1128 /bin/bash
```
Thanks!
@jordank Would it be too much to ask you to create a new docker image with these changes? That would be easier for me, and would ensure that version conflicts and uncertainties do not interfere.
Thanks @abby.vanderlinden. Interestingly, Abby's error indicates a different exact cause (that error seems to be typically associated with an SSL protocol mismatch), whereas the error from Paul's stack suggests one of the concurrent download connections timing out. I can't directly reproduce or fully explain either of them individually, but seeing two different intermittent underlying causes resulting in an error at the same spot in the code has me thinking about addressing this in a different way. I think that a change in the following test version of the Synapse client will make the download more robust to an unexpected error in one of the individual file part downloads. This test version is available from our test.pypi release and can be installed with, e.g.:
```
pip3 install --upgrade --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple "synapseclient>=2.3.1.316"
```
@abby.vanderlinden and/or @paul-shannon, could I impose on you to install the above version and reattempt the download? (Again, I apologize that I'm unable to reliably reproduce this myself to confirm ahead of time that it will fix the issue.) Thanks, Jordan
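To illustrate the general idea of making a multi-part download robust to a failure in one part, here is a minimal sketch in which only the failed part is retried instead of the whole download restarting. This is not the Synapse client's actual implementation. `fetch_part` is a hypothetical callable that returns the bytes for an inclusive byte range, and the sketch holds all parts in memory, which a real client handling multi-gigabyte files would not do.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_part_with_retry(fetch_part, start, end, attempts=3):
    """Fetch bytes [start, end] via fetch_part, retrying only this part."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch_part(start, end)
        except OSError:
            # ConnectionResetError is a subclass of OSError. Give up only
            # after the final attempt for this part.
            if attempt == attempts:
                raise


def download_in_parts(fetch_part, total_size, part_size, max_workers=4):
    """Download total_size bytes as independent parts and join them in order."""
    ranges = [(start, min(start + part_size, total_size) - 1)
              for start in range(0, total_size, part_size)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the order of `ranges`, not completion.
        parts = pool.map(lambda r: fetch_part_with_retry(fetch_part, *r), ranges)
        return b"".join(parts)
```

With this structure, a single reset connection costs one part's worth of re-download rather than the progress made so far.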
Ok, here's the debug output from my download attempt: synapse --debug get syn11714177 Downloading [--------------------]0.45% 8.0MB/1.7GB (526.3kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis01Downloading [--------------------]0.89% 16.0MB/1.7GB (512.1kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis0Downloading [--------------------]1.34% 24.0MB/1.7GB (517.7kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis0Downloading [--------------------]1.79% 32.0MB/1.7GB (549.9kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis0Downloading [--------------------]2.23% 40.0MB/1.7GB (555.0kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis0Downloading [#-------------------]2.68% 48.0MB/1.7GB (573.4kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis0Downloading [#-------------------]3.13% 56.0MB/1.7GB (572.4kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis0Downloading [#-------------------]3.57% 64.0MB/1.7GB (568.7kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis0Downloading [#-------------------]4.02% 72.0MB/1.7GB (581.5kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis0Downloading [#-------------------]4.47% 80.0MB/1.7GB (586.3kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis0Downloading [#-------------------]4.91% 88.0MB/1.7GB (599.1kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis0Downloading [#-------------------]5.36% 96.0MB/1.7GB (585.3kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis0Downloading [#-------------------]5.81% 104.0MB/1.7GB (583.9kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysisDownloading [#-------------------]6.25% 112.0MB/1.7GB (572.0kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysisDownloading [#-------------------]6.70% 120.0MB/1.7GB (580.8kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysisDownloading [#-------------------]7.15% 128.0MB/1.7GB (580.7kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysisDownloading [##------------------]7.59% 136.0MB/1.7GB (584.9kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysisDownloading 
[##------------------]8.04% 144.0MB/1.7GB (599.7kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysisDownloading [##------------------]8.49% 152.0MB/1.7GB (604.4kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysisDownloading [##------------------]8.93% 160.0MB/1.7GB (610.8kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysisDownloading [##------------------]9.38% 168.0MB/1.7GB (613.8kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysisDownloading [##------------------]9.83% 176.0MB/1.7GB (619.0kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysisDownloading [##------------------]10.27% 184.0MB/1.7GB (618.0kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysiDownloading [##------------------]10.72% 192.0MB/1.7GB (621.4kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysiDownloading [##------------------]11.17% 200.0MB/1.7GB (624.5kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysiDownloading [##------------------]11.61% 208.0MB/1.7GB (618.6kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysiDownloading [##------------------]12.06% 216.0MB/1.7GB (622.3kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysiDownloading [###-----------------]12.51% 224.0MB/1.7GB (620.8kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysiDownloading [###-----------------]12.95% 232.0MB/1.7GB (613.6kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysiDownloading [###-----------------]13.40% 240.0MB/1.7GB (607.7kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysiDownloading [###-----------------]13.85% 248.0MB/1.7GB (604.4kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysiDownloading [###-----------------]14.29% 256.0MB/1.7GB (605.7kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysiDownloading [###-----------------]14.74% 264.0MB/1.7GB (608.6kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysiDownloading [###-----------------]15.19% 272.0MB/1.7GB (609.0kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysiDownloading [###-----------------]15.63% 280.0MB/1.7GB (610.0kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysiDownloading 
[###-----------------]16.08% 288.0MB/1.7GB (607.6kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysiDownloading [###-----------------]16.53% 296.0MB/1.7GB (605.6kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysiDownloading [###-----------------]16.97% 304.0MB/1.7GB (601.8kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysiDownloading [###-----------------]17.42% 312.0MB/1.7GB (600.7kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysiDownloading [####----------------]17.87% 320.0MB/1.7GB (602.1kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysiDownloading [####----------------]18.31% 328.0MB/1.7GB (606.8kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysiDownloading [####----------------]18.76% 336.0MB/1.7GB (604.5kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysiDownloading [####----------------]19.21% 344.0MB/1.7GB (601.6kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysiDownloading [####----------------]19.65% 352.0MB/1.7GB (601.0kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysiDownloading [####----------------]20.10% 360.0MB/1.7GB (602.3kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysiDownloading [####----------------]20.55% 368.0MB/1.7GB (603.1kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysiDownloading [####----------------]20.99% 376.0MB/1.7GB (604.5kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis01_2017-12-08_Y.recalibrated_variants.vcf(1).gz.synapse_download_74048637 2021-03-23 10:41:46,677 [client:1856 - DEBUG]: Retrying download on error: [] after progressing 0 bytes Traceback (most recent call last): File "/usr/local/lib/python3.9/site-packages/urllib3/response.py", line 438, in _error_catcher yield File "/usr/local/lib/python3.9/site-packages/urllib3/response.py", line 519, in read data = self._fp.read(amt) if not fp_closed else b"" File "/usr/local/Cellar/python@3.9/3.9.2_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/http/client.py", line 458, in read n = self.readinto(b) File 
"/usr/local/Cellar/python@3.9/3.9.2_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/http/client.py", line 502, in readinto n = self.fp.readinto(b) File "/usr/local/Cellar/python@3.9/3.9.2_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/socket.py", line 704, in readinto return self._sock.recv_into(b) File "/usr/local/Cellar/python@3.9/3.9.2_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/ssl.py", line 1241, in recv_into return self.read(nbytes, buffer) File "/usr/local/Cellar/python@3.9/3.9.2_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/ssl.py", line 1099, in read return self._sslobj.read(len, buffer) ConnectionResetError: [Errno 54] Connection reset by peer During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/usr/local/lib/python3.9/site-packages/requests/models.py", line 753, in generate for chunk in self.raw.stream(chunk_size, decode_content=True): File "/usr/local/lib/python3.9/site-packages/urllib3/response.py", line 576, in stream data = self.read(amt=amt, decode_content=decode_content) File "/usr/local/lib/python3.9/site-packages/urllib3/response.py", line 541, in read raise IncompleteRead(self._fp_bytes_read, self.length_remaining) File "/usr/local/Cellar/python@3.9/3.9.2_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/contextlib.py", line 135, in __exit__ self.gen.throw(type, value, traceback) File "/usr/local/lib/python3.9/site-packages/urllib3/response.py", line 455, in _error_catcher raise ProtocolError("Connection broken: %r" % e, e) urllib3.exceptions.ProtocolError: ("Connection broken: ConnectionResetError(54, 'Connection reset by peer')", ConnectionResetError(54, 'Connection reset by peer')) During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/usr/local/lib/python3.9/site-packages/synapseclient/client.py", line 1836, in _downloadFileHandle downloaded_path = 
self._download_from_url_multi_threaded(fileHandleId, File "/usr/local/lib/python3.9/site-packages/synapseclient/client.py", line 1881, in _download_from_url_multi_threaded multithread_download.download_file(self, request) File "/usr/local/lib/python3.9/site-packages/synapseclient/core/multithread_download/download_threads.py", line 232, in download_file downloader.download_file(download_request) File "/usr/local/lib/python3.9/site-packages/synapseclient/core/multithread_download/download_threads.py", line 297, in download_file self._write_chunks(request, completed_futures, transfer_status) File "/usr/local/lib/python3.9/site-packages/synapseclient/core/multithread_download/download_threads.py", line 372, in _write_chunks chunk_data = chunk_response.content File "/usr/local/lib/python3.9/site-packages/requests/models.py", line 831, in content self._content = b''.join(self.iter_content(CONTENT_CHUNK_SIZE)) or b'' File "/usr/local/lib/python3.9/site-packages/requests/models.py", line 756, in generate raise ChunkedEncodingError(e) requests.exceptions.ChunkedEncodingError: ("Connection broken: ConnectionResetError(54, 'Connection reset by peer')", ConnectionResetError(54, 'Connection reset by peer'))
Hey Paul and Jordan, I did see the restart error downloading one of the vcfs from this study using the command line client (latest version, Python 3.9.2). I will say that my internet connection where I am right now is not great, which may be part of it on my end. I'm going to try downloading the same file in debug mode today and I'll let you know how it goes.
@jordank Check with Abby. She saw exactly the same failure, with multiple restarts, using the command line client. I don't know about her client-side configuration. Might be a clue there.
Unless otherwise specified, a docker container is able to use all the CPUs of the host machine. Running the Synapse Python client inside a container would therefore give the same download concurrency as running it on the host machine, and so could have the same outcome if this is the cause. This theory does not straightforwardly explain why you might be experiencing the same phenomenon in synapser, however.
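One way to see what the Python interpreter reports, on the host or inside the container, is sketched below. It uses only the standard library. `visible_cpus` is a hypothetical helper name, and `os.sched_getaffinity` is available on Linux but not on macOS, so the sketch falls back to the plain CPU count there. How the client derives its default concurrency from these numbers is not shown here.

```python
import os


def visible_cpus():
    """Return (cpu_count, usable_cpus) as seen by this Python process."""
    count = os.cpu_count() or 1
    try:
        # Linux only: the set of CPUs this process is allowed to run on.
        usable = len(os.sched_getaffinity(0))
    except AttributeError:
        # Platforms without sched_getaffinity (e.g. macOS).
        usable = count
    return count, usable


print(visible_cpus())
```

Comparing the output on the host and inside the container shows whether the container really sees the same processors.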
@jordank I saw the same phenomenon with single-threaded synapser. I see the same phenomenon with your Docker image on my macOS laptop, with 4 cores. I am not sure your analysis covers all these circumstances. But I do appreciate your efforts!
@paul-shannon Thanks, the full stack is a bit more illuminating and confirms that the request on which the connection reset occurred was directly between the Python client and AWS S3. Most Synapse stored files, including this one, are stored in S3. Requests to Synapse infrastructure handle the bookkeeping, while the actual bytes are transferred directly between the clients and S3. We don't have much control over how S3 handles its connections (and don't have logs that would specifically explain the reason for S3 to reset a connection).

I had initially thought that this was not concurrency related, since in your initial post you indicated that this error was also occurring in synapser, and synapser is single threaded (as is R). However, the above stack and the fact that your machine has 88 processors are making me think that it is related to the concurrency of the download after all. The Python client is multi threaded and downloads multiple parts of the requested file concurrently, with the default concurrency scaling according to the number of processors on the machine. A connection reset at this point in the stack might indicate that the client was not able to read bytes from all of its open connections in time before S3 began resetting some of the connections.

When you indicated in an above post that your machine had 88 processors, I tried to reproduce this on an EC2 M5.24xlarge with 96 processors and had no issues; however, an EC2 instance might have different performance characteristics given its throughput to S3. It might be that some constraint is preventing this number of download parts from being served concurrently on your machine (relative bandwidth, a ulimit on the number of threads per process or requests, differences in how the threads are scheduled). If this is the cause, the concurrency of a download can be explicitly lowered, e.g.
```
import synapseclient

syn = synapseclient.login()
# lower the number of concurrent download threads
syn.max_threads = 10
x = syn.get('syn11714133', downloadLocation=".")
```
If this is the explanation, then a lower concurrency could allow the client to serve all of its threads successfully. max_threads can also be set through the [.synapseConfig configuration file](https://python-docs.synapse.org/build/html/news.html#id16). Again, I'm not able to reproduce this, but it is the best explanation I can think of right now given the evidence.
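If lowering the concurrency helps only some of the time, it could be combined with a bounded retry around the whole call. The following is a minimal sketch under stated assumptions: `get_with_retries` and its parameters are hypothetical and not part of synapseclient. It relies only on the `max_threads` attribute and the `get(..., downloadLocation=...)` call shown in this thread, and it retries on `OSError`, which covers `ConnectionResetError` and the requests exceptions seen in the stacks above.

```python
import time


def get_with_retries(syn, entity_id, download_location=".", max_threads=10,
                     attempts=3, wait_seconds=5.0):
    """Call syn.get() with lowered concurrency, retrying on connection errors.

    `syn` is expected to be a logged-in synapseclient.Synapse object (or any
    object with a `max_threads` attribute and a compatible `get` method).
    This helper is a hypothetical illustration, not a synapseclient API.
    """
    syn.max_threads = max_threads
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return syn.get(entity_id, downloadLocation=download_location)
        except OSError as err:
            # ConnectionResetError and requests' ChunkedEncodingError both
            # derive from OSError.
            last_error = err
            if attempt < attempts:
                time.sleep(wait_seconds)
    raise last_error


# Example usage (requires synapseclient and valid credentials):
#   import synapseclient
#   syn = synapseclient.login()
#   x = get_with_retries(syn, "syn11714133", download_location=".")
```

A wrapper like this retries the whole `syn.get` call. It does not address the cause of the resets and would not help with an MD5 mismatch, which raises a different exception.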
@jordank Here's another thought. What do your server logs say? Maybe some insight can be gleaned there.
@jordank By "run a dozen clients" I mean only, "Can you stress test your server? Can you do so (maybe you already are) from a remote host?" My big beefy linux box has 64 cores, copious memory, lots of disk space, and was under very light loads during my test.
## "Connection reset by peer"

Which in this case seems to be that your server closed the connection. Perhaps the client has a retry loop? That's what I'd guess - and that the server keeps failing. My diagnosis may be all wrong.

- Paul

    synapsePythonClient> synapse --debug get syn11714133
    Synapse username (leave blank if using an auth token): paul-shannon
    Password, api key, or auth token for user paul-shannon
    Welcome, Paul Shannon!
    2021-03-23 00:47:03,371 [client:428 - INFO]: Welcome, Paul Shannon!
    Downloading [####----------------]22.06% 3.3GB/14.9GB (5.7MB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis01_2017-12-08_19.recalibrated_variants.vcf.gz.synapse_download_74048232
    2021-03-23 00:56:57,986 [client:1857 - DEBUG]: Retrying download on error: [] after progressing 0 bytes
    Traceback (most recent call last):
      File "/usr/local/lib/python3.6/dist-packages/urllib3-1.26.3-py3.6.egg/urllib3/response.py", line 438, in _error_catcher
        yield
      File "/usr/local/lib/python3.6/dist-packages/urllib3-1.26.3-py3.6.egg/urllib3/response.py", line 519, in read
        data = self._fp.read(amt) if not fp_closed else b""
      File "/usr/lib/python3.6/http/client.py", line 463, in read
        n = self.readinto(b)
      File "/usr/lib/python3.6/http/client.py", line 507, in readinto
        n = self.fp.readinto(b)
      File "/usr/lib/python3.6/socket.py", line 586, in readinto
        return self._sock.recv_into(b)
      File "/usr/lib/python3.6/ssl.py", line 1012, in recv_into
        return self.read(nbytes, buffer)
      File "/usr/lib/python3.6/ssl.py", line 874, in read
        return self._sslobj.read(len, buffer)
      File "/usr/lib/python3.6/ssl.py", line 631, in read
        v = self._sslobj.read(len, buffer)
    ConnectionResetError: [Errno 104] Connection reset by peer

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/usr/local/lib/python3.6/dist-packages/requests-2.25.1-py3.6.egg/requests/models.py", line 753, in generate
        for chunk in self.raw.stream(chunk_size, decode_content=True):
      File "/usr/local/lib/python3.6/dist-packages/urllib3-1.26.3-py3.6.egg/urllib3/response.py", line 576, in stream
        data = self.read(amt=amt, decode_content=decode_content)
      File "/usr/local/lib/python3.6/dist-packages/urllib3-1.26.3-py3.6.egg/urllib3/response.py", line 541, in read
        raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
      File "/usr/lib/python3.6/contextlib.py", line 99, in __exit__
        self.gen.throw(type, value, traceback)
      File "/usr/local/lib/python3.6/dist-packages/urllib3-1.26.3-py3.6.egg/urllib3/response.py", line 455, in _error_catcher
        raise ProtocolError("Connection broken: %r" % e, e)
    urllib3.exceptions.ProtocolError: ("Connection broken: ConnectionResetError(104, 'Connection reset by peer')", ConnectionResetError(104, 'Connection reset by peer'))

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/usr/local/lib/python3.6/dist-packages/synapseclient-2.3.0-py3.6.egg/synapseclient/client.py", line 1840, in _downloadFileHandle
        expected_md5=fileHandle.get('contentMd5'))
      File "/usr/local/lib/python3.6/dist-packages/synapseclient-2.3.0-py3.6.egg/synapseclient/client.py", line 1881, in _download_from_url_multi_threaded
        multithread_download.download_file(self, request)
      File "/usr/local/lib/python3.6/dist-packages/synapseclient-2.3.0-py3.6.egg/synapseclient/core/multithread_download/download_threads.py", line 232, in download_file
        downloader.download_file(download_request)
      File "/usr/local/lib/python3.6/dist-packages/synapseclient-2.3.0-py3.6.egg/synapseclient/core/multithread_download/download_threads.py", line 297, in download_file
        self._write_chunks(request, completed_futures, transfer_status)
      File "/usr/local/lib/python3.6/dist-packages/synapseclient-2.3.0-py3.6.egg/synapseclient/core/multithread_download/download_threads.py", line 372, in _write_chunks
        chunk_data = chunk_response.content
      File "/usr/local/lib/python3.6/dist-packages/requests-2.25.1-py3.6.egg/requests/models.py", line 831, in content
        self._content = b''.join(self.iter_content(CONTENT_CHUNK_SIZE)) or b''
      File "/usr/local/lib/python3.6/dist-packages/requests-2.25.1-py3.6.egg/requests/models.py", line 756, in generate
        raise ChunkedEncodingError(e)
    requests.exceptions.ChunkedEncodingError: ("Connection broken: ConnectionResetError(104, 'Connection reset by peer')", ConnectionResetError(104, 'Connection reset by peer'))
    Downloading [#-------------------]3.24% 496.0MB/14.9GB (6.1MB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis01_2017-12-08_19.recalibrated_variants.vcf.gz.synapse_download_74048232
        data = self._fp.read(amt) if not fp_closed else b""
      File "/usr/lib/python3.6/http/client.py", line 463, in read
        n = self.readinto(b)
      File "/usr/lib/python3.6/http/client.py", line 507, in readinto
        n = self.fp.readinto(b)
      File "/usr/lib/python3.6/socket.py", line 586, in readinto
        return self._sock.recv_into(b)
      File "/usr/lib/python3.6/ssl.py", line 1012, in recv_into
        return self.read(nbytes, buffer)
      File "/usr/lib/python3.6/ssl.py", line 874, in read
        return self._sslobj.read(len, buffer)
      File "/usr/lib/python3.6/ssl.py", line 631, in read
        v = self._sslobj.read(len, buffer)
    ConnectionResetError: [Errno 104] Connection reset by peer

    During handling of the above exception, another exception occurred:
@paul-shannon Can you clarify what you mean by running a dozen clients? Are you also running some other activity that isn't represented by the syn.get download?
@jordank Is there any chance you could fire up a dozen clients yourself and try to reproduce the error on your end? Maybe it is not out of place for me to recall that Abby saw this same problem, which suggests it is not specific to my use. I will go ahead with the approach you suggest. Let me know if you will be doing some investigation of your own as well. - Paul
Sorry for all the back and forth, but can you try the download from the command line with the debug option (either in the container or outside, since you seem to be getting a similar reset either way)? The download starting over indicates that some unexpected exception was encountered, forcing the restart. Running the command line client with the debug option should cause it to log that exception to the console.

```
synapse --debug get syn11714133
```
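To illustrate why the debug option matters: a client that recovers from a dropped connection by retrying will usually hide the underlying exception unless debug logging is switched on. The following is only a minimal sketch of that pattern, not the synapseclient implementation. The names `download_with_retry`, `fetch`, and `make_flaky_fetch` are hypothetical and exist only for this example.

```python
import logging

logger = logging.getLogger("download_sketch")


def download_with_retry(fetch, max_attempts=5):
    """Call fetch() until it succeeds, logging each swallowed error at DEBUG.

    `fetch` is a hypothetical zero-argument callable that returns the
    downloaded bytes, or raises ConnectionError (ConnectionResetError is a
    subclass) when the connection drops.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except ConnectionError as err:
            # Invisible at the default log level; shown only when DEBUG is on.
            logger.debug("Retrying download on error: %r (attempt %d of %d)",
                         err, attempt, max_attempts)
    raise RuntimeError(f"download failed after {max_attempts} attempts")


def make_flaky_fetch(failures):
    """Build a fake fetch that resets the connection `failures` times."""
    state = {"calls": 0}

    def flaky_fetch():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise ConnectionResetError(104, "Connection reset by peer")
        return b"payload"

    return flaky_fetch


if __name__ == "__main__":
    # Assumption for illustration: DEBUG logging plays the role of --debug.
    logging.basicConfig(level=logging.DEBUG)
    print(download_with_retry(make_flaky_fetch(failures=2)))
```

Without the `logging.basicConfig(level=logging.DEBUG)` line, the two simulated resets would pass silently and only the final result would appear, which is analogous to a progress bar that quietly restarts.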
The docker 14.9G vcf.gz download just spontaneously restarted after being about 40% complete.

    >>> x = syn.get('syn11714133', downloadLocation="/tmp")
    Downloading [#-------------------]6.32% 968.0MB/14.9GB (2.3MB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis01_2017-12-08_19.recalibrated_variants.vcf.gz.synapse_download_74048232

This is with your latest docker image:

    sagebionetworks/synapsepythonclient latest 5272bfe96931 11 days ago 211MB

I cannot be certain, but I have no reason to believe that the ISB's fast internet pipe is at fault.
@jordank I am now 33% complete on a 15G vcf.gz file, so I am optimistic. In case a future reader is curious about the specifics of the docker "syn.get", here are my notes:

```
docker run -it \
  --entrypoint /bin/bash \
  -v /tmp/docker-paul:/tmp \
  sagebionetworks/synapsepythonclient

# --- sample bash and python run within synapsepythonclient container
export PS1='DOCKER.\W> '
python3
import synapseclient
syn = synapseclient.login("paul-shannon", password="passwordGoesHere")
x = syn.get('syn1899498', downloadLocation="/tmp")  # quick and easy
```
I'll give it a try.
@paul-shannon I did not push the image, but the Dockerfile and its context files are linked in the [gist](https://gist.github.com/jkiang13/967d8140b32ef25c5718c16c07577b10#file-dockerfile). That image was my attempt to replicate your environment, however. You can instead pull the synapseclient docker image from [Docker Hub](https://hub.docker.com/r/sagebionetworks/synapsepythonclient), e.g.

```
docker pull sagebionetworks/synapsepythonclient
```

Running the client in a container started from that image gives a clean install without any other dependencies, and would be a good data point whether it reproduces the issue or completes successfully.
@jordank Perhaps a better approach, less fuss, faster resolution, would be for me to use your docker image. Is it publicly available?
@paul-shannon I created the following [Dockerfile](https://gist.github.com/jkiang13/967d8140b32ef25c5718c16c07577b10#file-dockerfile) as an attempt to recreate the environment above as closely as I could, but could not replicate the error. That suggests the issue is not related to a particular interfering dependency or anything else environmental that was replicated in the Docker image. Could you try running the download from the command line using the debug option? If the download progress bar is being reset, it suggests an intermediate exception that synapse is trying to recover from. Running the command line download with the debug option should log to the console any intermediate errors that caused the restart, e.g.

```
synapse --debug get syn11714133
```
Here you go. (I looked for a file upload capability here, without success. It'd be a handy feature. What follows is from cut and paste.) - Paul haleesi.vcf> synapse --version Synapse Client 2.3.0 khaleesi.vcf> conda list # packages in environment at /users/pshannon/anaconda3: # # Name Version Build Channel _ipyw_jlab_nb_ext_conf 0.1.0 py37_0 alabaster 0.7.12 py37_0 anaconda 2018.12 py37_0 anaconda-client 1.7.2 py37_0 anaconda-navigator 1.9.6 py37_0 anaconda-project 0.8.2 py37_0 asn1crypto 0.24.0 py37_0 astroid 2.1.0 py37_0 astropy 3.1 py37h7b6447c_0 atomicwrites 1.2.1 py37_0 attrs 18.2.0 py37h28b3542_0 babel 2.6.0 py37_0 backcall 0.1.0 py37_0 backports 1.0 py37_1 backports.os 0.1.1 py37_0 backports.shutil_get_terminal_size 1.0.0 py37_2 beautifulsoup4 4.6.3 py37_0 bitarray 0.8.3 py37h14c3975_0 bkcharts 0.2 py37_0 blas 1.0 mkl blaze 0.11.3 py37_0 bleach 3.0.2 py37_0 blosc 1.14.4 hdbcaa40_0 bokeh 1.0.2 py37_0 boto 2.49.0 py37_0 bottleneck 1.2.1 py37h035aef0_1 bzip2 1.0.6 h14c3975_5 ca-certificates 2018.03.07 0 cairo 1.14.12 h8948797_3 certifi 2018.11.29 py37_0 cffi 1.11.5 py37he75722e_1 chardet 3.0.4 py37_1 click 7.0 py37_0 cloudpickle 0.6.1 py37_0 clyent 1.2.2 py37_1 colorama 0.4.1 py37_0 conda 4.5.12 py37_0 conda-build 3.17.6 py37_0 conda-env 2.6.0 1 conda-verify 3.1.1 py37_0 contextlib2 0.5.5 py37_0 cryptography 2.4.2 py37h1ba5d50_0 curl 7.63.0 hbc83047_1000 cycler 0.10.0 py37_0 cython 0.29.2 py37he6710b0_0 cytoolz 0.9.0.1 py37h14c3975_1 dask 1.0.0 py37_0 dask-core 1.0.0 py37_0 datashape 0.5.4 py37_1 dbus 1.13.2 h714fa37_1 decorator 4.3.0 py37_0 defusedxml 0.5.0 py37_1 Deprecated 1.2.12 distributed 1.25.1 py37_0 docutils 0.14 py37_0 entrypoints 0.2.3 py37_2 et_xmlfile 1.0.1 py37_0 expat 2.2.6 he6710b0_0 fastcache 1.0.2 py37h14c3975_2 filelock 3.0.10 py37_0 flask 1.0.2 py37_1 flask-cors 3.0.7 py37_0 fontconfig 2.13.0 h9420a91_0 freetype 2.9.1 h8a8886c_1 fribidi 1.0.5 h7b6447c_0 future 0.17.1 py37_0 get_terminal_size 1.0.0 haa9412d_0 gevent 1.3.7 py37h7b6447c_1 glib 
2.56.2 hd408876_0 glob2 0.6 py37_1 gmp 6.1.2 h6c8ec71_1 gmpy2 2.0.8 py37h10f8cd9_2 graphite2 1.3.12 h23475e2_2 greenlet 0.4.15 py37h7b6447c_0 gst-plugins-base 1.14.0 hbbd80ab_1 gstreamer 1.14.0 hb453b48_1 h5py 2.8.0 py37h989c5e5_3 harfbuzz 1.8.8 hffaf4a1_0 hdf5 1.10.2 hba1933b_1 heapdict 1.0.0 py37_2 html5lib 1.0.1 py37_0 icu 58.2 h9c2bf20_1 idna 2.8 py37_0 imageio 2.4.1 py37_0 imagesize 1.1.0 py37_0 importlib_metadata 0.6 py37_0 intel-openmp 2019.1 144 ipykernel 5.1.0 py37h39e3cac_0 ipython 7.2.0 py37h39e3cac_0 ipython_genutils 0.2.0 py37_0 ipywidgets 7.4.2 py37_0 isort 4.3.4 py37_0 itsdangerous 1.1.0 py37_0 jbig 2.1 hdba287a_0 jdcal 1.4 py37_0 jedi 0.13.2 py37_0 jeepney 0.4 py37_0 jinja2 2.10 py37_0 jpeg 9b h024ee3a_2 jsonschema 2.6.0 py37_0 jupyter 1.0.0 py37_7 jupyter_client 5.2.4 py37_0 jupyter_console 6.0.0 py37_0 jupyter_core 4.4.0 py37_0 jupyterlab 0.35.3 py37_0 jupyterlab_server 0.2.0 py37_0 keyring 12.0.2 keyrings.alt 3.1 kiwisolver 1.0.1 py37hf484d3e_0 krb5 1.16.1 h173b8e3_7 lazy-object-proxy 1.3.1 py37h14c3975_2 libarchive 3.3.3 h5d8350f_5 libcurl 7.63.0 h20c2e04_1000 libedit 3.1.20170329 h6b74fdf_2 libffi 3.2.1 hd88cf55_4 libgcc-ng 8.2.0 hdf63c60_1 libgfortran-ng 7.3.0 hdf63c60_0 liblief 0.9.0 h7725739_1 libpng 1.6.35 hbc83047_0 libsodium 1.0.16 h1bed415_0 libssh2 1.8.0 h1ba5d50_4 libstdcxx-ng 8.2.0 hdf63c60_1 libtiff 4.0.9 he85c1e1_2 libtool 2.4.6 h7b6447c_5 libuuid 1.0.3 h1bed415_2 libxcb 1.13 h1bed415_1 libxml2 2.9.8 h26e45fe_1 libxslt 1.1.32 h1312cb7_0 llvmlite 0.26.0 py37hd408876_0 locket 0.2.0 py37_1 lxml 4.2.5 py37hefd8a0e_0 lz4-c 1.8.1.2 h14c3975_0 lzo 2.10 h49e0be7_2 markupsafe 1.1.0 py37h7b6447c_0 matplotlib 3.0.2 py37h5429711_0 mccabe 0.6.1 py37_1 mistune 0.8.4 py37h7b6447c_0 mkl 2019.1 144 mkl-service 1.1.2 py37he904b0f_5 mkl_fft 1.0.6 py37hd81dba3_0 mkl_random 1.0.2 py37hd81dba3_0 more-itertools 4.3.0 py37_0 mpc 1.1.0 h10f8cd9_1 mpfr 4.0.1 hdf1c602_3 mpmath 1.1.0 py37_0 msgpack-python 0.5.6 py37h6bb024c_1 multipledispatch 0.6.0 py37_0 
navigator-updater 0.2.1 py37_0 nbconvert 5.4.0 py37_1 nbformat 4.4.0 py37_0 ncurses 6.1 he6710b0_1 networkx 2.2 py37_1 nltk 3.4 py37_1 nose 1.3.7 py37_2 notebook 5.7.4 py37_0 numba 0.41.0 py37h962f231_0 numexpr 2.6.8 py37h9e4a6bb_0 numpy 1.15.4 py37h7e9f1db_0 numpy-base 1.15.4 py37hde5b4d6_0 numpydoc 0.8.0 py37_0 odo 0.5.1 py37_0 olefile 0.46 py37_0 openpyxl 2.5.12 py37_0 openssl 1.1.1a h7b6447c_0 packaging 18.0 py37_0 pandas 0.23.4 py37h04863e7_0 pandoc 1.19.2.1 hea2e7c5_1 pandocfilters 1.4.2 py37_1 pango 1.42.4 h049681c_0 parso 0.3.1 py37_0 partd 0.3.9 py37_0 patchelf 0.9 he6710b0_3 path.py 11.5.0 py37_0 pathlib2 2.3.3 py37_0 patsy 0.5.1 py37_0 pcre 8.42 h439df22_0 pep8 1.7.1 py37_0 pexpect 4.6.0 py37_0 pickleshare 0.7.5 py37_0 pillow 5.3.0 py37h34e0f95_0 pip 21.0.1 pip 18.1 py37_0 pixman 0.34.0 hceecf20_3 pkginfo 1.4.2 py37_1 pluggy 0.8.0 py37_0 ply 3.11 py37_0 prometheus_client 0.5.0 py37_0 prompt_toolkit 2.0.7 py37_0 psutil 5.4.8 py37h7b6447c_0 ptyprocess 0.6.0 py37_0 py 1.7.0 py37_0 py-lief 0.9.0 py37h7725739_1 pycodestyle 2.4.0 py37_0 pycosat 0.6.3 py37h14c3975_0 pycparser 2.19 py37_0 pycrypto 2.6.1 py37h14c3975_9 pycurl 7.43.0.2 py37h1ba5d50_0 pyflakes 2.0.0 py37_0 pygments 2.3.1 py37_0 pylint 2.2.2 py37_0 pyodbc 4.0.25 py37he6710b0_0 pyopenssl 18.0.0 py37_0 pyparsing 2.3.0 py37_0 pyqt 5.9.2 py37h05f1152_2 pysocks 1.6.8 py37_0 pytables 3.4.4 py37ha205bf6_0 pytest 4.0.2 py37_0 pytest-arraydiff 0.3 py37h39e3cac_0 pytest-astropy 0.5.0 py37_0 pytest-doctestplus 0.2.0 py37_0 pytest-openfiles 0.3.1 py37_0 pytest-remotedata 0.3.1 py37_0 python 3.7.1 h0371630_7 python-dateutil 2.7.5 py37_0 python-libarchive-c 2.8 py37_6 pytz 2018.7 py37_0 pywavelets 1.0.1 py37hdd07704_0 pyyaml 3.13 py37h14c3975_0 pyzmq 17.1.2 py37h14c3975_0 qt 5.9.7 h5867ecd_1 qtawesome 0.5.3 py37_0 qtconsole 4.4.3 py37_0 qtpy 1.5.2 py37_0 readline 7.0 h7b6447c_5 requests 2.25.1 requests 2.21.0 py37_0 rope 0.11.0 py37_0 ruamel_yaml 0.15.46 py37h14c3975_0 scikit-image 0.14.1 py37he6710b0_0 
scikit-learn 0.20.1 py37hd81dba3_0 scipy 1.1.0 py37h7c811a0_2 seaborn 0.9.0 py37_0 SecretStorage 2.3.1 secretstorage 3.1.0 py37_0 send2trash 1.5.0 py37_0 setuptools 40.6.3 py37_0 simplegeneric 0.8.1 py37_2 singledispatch 3.4.0.3 py37_0 sip 4.19.8 py37hf484d3e_0 six 1.12.0 py37_0 snappy 1.1.7 hbae5bb6_3 snowballstemmer 1.2.1 py37_0 sortedcollections 1.0.1 py37_0 sortedcontainers 2.1.0 py37_0 sphinx 1.8.2 py37_0 sphinxcontrib 1.0 py37_1 sphinxcontrib-websupport 1.1.0 py37_1 spyder 3.3.2 py37_0 spyder-kernels 0.3.0 py37_0 sqlalchemy 1.2.15 py37h7b6447c_0 sqlite 3.26.0 h7b6447c_0 statsmodels 0.9.0 py37h035aef0_0 sympy 1.3 py37_0 synapseclient 2.3.0 tblib 1.3.2 py37_0 terminado 0.8.1 py37_1 testpath 0.4.2 py37_0 tk 8.6.8 hbc83047_0 toolz 0.9.0 py37_0 tornado 5.1.1 py37h7b6447c_0 tqdm 4.28.1 py37h28b3542_0 traitlets 4.3.2 py37_0 unicodecsv 0.14.1 py37_0 unixodbc 2.3.7 h14c3975_0 urllib3 1.24.1 py37_0 wcwidth 0.1.7 py37_0 webencodings 0.5.1 py37_1 werkzeug 0.14.1 py37_0 wheel 0.32.3 py37_0 widgetsnbextension 3.4.2 py37_0 wrapt 1.10.11 py37h14c3975_2 wurlitzer 1.0.2 py37_0 xlrd 1.2.0 py37_0 xlsxwriter 1.1.2 py37_0 xlwt 1.3.0 py37_0 xz 5.2.4 h14c3975_4 yaml 0.1.7 had09818_2 zeromq 4.2.5 hf484d3e_1 zict 0.1.3 py37_0 zlib 1.2.11 h7b6447c_3 zstd 1.3.7 h0b5b093_0 khaleesi.vcf> pip list Package Version ---------------------------------- ---------- alabaster 0.7.12 anaconda-client 1.7.2 anaconda-navigator 1.9.6 anaconda-project 0.8.2 asn1crypto 0.24.0 astroid 2.1.0 astropy 3.1 atomicwrites 1.2.1 attrs 18.2.0 Babel 2.6.0 backcall 0.1.0 backports.os 0.1.1 backports.shutil-get-terminal-size 1.0.0 beautifulsoup4 4.6.3 bitarray 0.8.3 bkcharts 0.2 blaze 0.11.3 bleach 3.0.2 bokeh 1.0.2 boto 2.49.0 Bottleneck 1.2.1 certifi 2018.11.29 cffi 1.11.5 chardet 3.0.4 Click 7.0 cloudpickle 0.6.1 clyent 1.2.2 colorama 0.4.1 conda 4.5.12 conda-build 3.17.6 conda-verify 3.1.1 contextlib2 0.5.5 cryptography 2.4.2 cycler 0.10.0 Cython 0.29.2 cytoolz 0.9.0.1 dask 1.0.0 datashape 0.5.4 decorator 4.3.0 
defusedxml 0.5.0 Deprecated 1.2.12 distributed 1.25.1 docutils 0.14 entrypoints 0.2.3 et-xmlfile 1.0.1 fastcache 1.0.2 filelock 3.0.10 Flask 1.0.2 Flask-AutoIndex 0.6.2 Flask-Cors 3.0.7 Flask-Silk 0.2 future 0.17.1 gevent 1.3.7 glob2 0.6 gmpy2 2.0.8 greenlet 0.4.15 h5py 2.8.0 heapdict 1.0.0 html5lib 1.0.1 idna 2.8 imageio 2.4.1 imagesize 1.1.0 importlib-metadata 0.6 ipykernel 5.1.0 ipython 7.2.0 ipython-genutils 0.2.0 ipywidgets 7.4.2 isort 4.3.4 itsdangerous 1.1.0 jdcal 1.4 jedi 0.13.2 jeepney 0.4 Jinja2 2.10 jsonschema 2.6.0 jupyter 1.0.0 jupyter-client 5.2.4 jupyter-console 6.0.0 jupyter-core 4.4.0 jupyterlab 0.35.3 jupyterlab-server 0.2.0 keyring 12.0.2 keyrings.alt 3.1 kiwisolver 1.0.1 lazy-object-proxy 1.3.1 libarchive-c 2.8 lief 0.9.0 llvmlite 0.26.0 locket 0.2.0 lxml 4.2.5 MarkupSafe 1.1.0 matplotlib 3.0.2 mccabe 0.6.1 mistune 0.8.4 mkl-fft 1.0.6 mkl-random 1.0.2 more-itertools 4.3.0 mpmath 1.1.0 msgpack 0.5.6 multipledispatch 0.6.0 navigator-updater 0.2.1 nbconvert 5.4.0 nbformat 4.4.0 networkx 2.2 nltk 3.4 nose 1.3.7 notebook 5.7.4 numba 0.41.0 numexpr 2.6.8 numpy 1.15.4 numpydoc 0.8.0 odo 0.5.1 olefile 0.46 openpyxl 2.5.12 packaging 18.0 pandas 0.23.4 pandocfilters 1.4.2 parso 0.3.1 partd 0.3.9 path.py 11.5.0 pathlib2 2.3.3 patsy 0.5.1 pep8 1.7.1 pexpect 4.6.0 pickleshare 0.7.5 Pillow 5.3.0 pip 21.0.1 pkginfo 1.4.2 pluggy 0.8.0 ply 3.11 prometheus-client 0.5.0 prompt-toolkit 2.0.7 psutil 5.4.8 ptyprocess 0.6.0 py 1.7.0 pycodestyle 2.4.0 pycosat 0.6.3 pycparser 2.19 pycrypto 2.6.1 pycurl 7.43.0.2 pyflakes 2.0.0 Pygments 2.3.1 pylint 2.2.2 pyodbc 4.0.25 pyOpenSSL 18.0.0 pyparsing 2.3.0 PySocks 1.6.8 pytest 4.0.2 pytest-arraydiff 0.3 pytest-astropy 0.5.0 pytest-doctestplus 0.2.0 pytest-openfiles 0.3.1 pytest-remotedata 0.3.1 python-dateutil 2.7.5 pytz 2018.7 PyWavelets 1.0.1 PyYAML 3.13 pyzmq 17.1.2 QtAwesome 0.5.3 qtconsole 4.4.3 QtPy 1.5.2 requests 2.25.1 rope 0.11.0 ruamel-yaml 0.15.46 scikit-image 0.14.1 scikit-learn 0.20.1 scipy 1.1.0 seaborn 0.9.0 
SecretStorage 2.3.1 Send2Trash 1.5.0 setuptools 40.6.3 simplegeneric 0.8.1 singledispatch 3.4.0.3 six 1.12.0 snowballstemmer 1.2.1 sortedcollections 1.0.1 sortedcontainers 2.1.0 Sphinx 1.8.2 sphinxcontrib-websupport 1.1.0 spyder 3.3.2 spyder-kernels 0.3.0 SQLAlchemy 1.2.15 statsmodels 0.9.0 sympy 1.3 synapseclient 2.3.0 tables 3.4.4 tblib 1.3.2 terminado 0.8.1 testpath 0.4.2 toolz 0.9.0 tornado 5.1.1 tqdm 4.28.1 traitlets 4.3.2 unicodecsv 0.14.1 urllib3 1.24.1 wcwidth 0.1.7 webencodings 0.5.1 Werkzeug 0.14.1 wheel 0.32.3 widgetsnbextension 3.4.2 wrapt 1.10.11 wurlitzer 1.0.2 xlrd 1.2.0 XlsxWriter 1.1.2 xlwt 1.3.0 zict 0.1.3 khaleesi.vcf> lsb_release -a No LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 18.04.4 LTS Release: 18.04 Codename: bionic khaleesi.vcf> openssl version OpenSSL 1.1.1a 20 Nov 2018 khaleesi.vcf> df -k . Filesystem 1K-blocks Used Available Use% Mounted on /dev/mapper/khaleesi--vg-local 8587847648 5342818616 3245029032 63% /local khaleesi.vcf> khaleesi.vcf> khaleesi.vcf> cat /proc/cpuinfo | grep processor | wc -l 88
Hi @paul-shannon, I tried downloading this particular file in a few Ubuntu environments (e.g. a 20.04 container from my home connection and an 18.04 EC2 instance) and was not immediately able to reproduce this; the download completed successfully, e.g.

```
(py371_synapse230) ubuntu@ip-10-11-58-208:~/venvs$ python
Python 3.7.1 (default, Mar 20 2021, 18:54:52)
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import synapseclient
>>> syn = synapseclient.login()
Welcome, Jordan K!!
>>> x = syn.get('syn11714133', downloadLocation=".")
Downloading [####################]100.00% 14.9GB/14.9GB (141.7MB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis01_2017-12-08_19.recalibrated_variants.vcf.gz.synapse_download_22409246 Done...
```

Could you show the output of the following commands so that I can try reproducing this in a more exact environment?

```
synapse --version
# if you are running in a conda environment
conda list
pip list
lsb_release -a
openssl version
df -k .
cat /proc/cpuinfo | grep processor | wc -l
```

Older versions of the synapseclient (< 2.2) could exhibit download restarts when the available disk space on the download volume was exhausted, but if you are running the latest version (2.3.0) this should not be the issue. Thanks!
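Part of the same information can also be gathered from inside the Python session that runs the download, which may be useful because the OpenSSL build that Python links against can differ from the one the `openssl` command reports. This is a rough sketch using only the standard library; `environment_report` is a hypothetical helper name, and it does not replace the package listings from `conda list` or `pip list`.

```python
import os
import platform
import shutil
import ssl


def environment_report(download_dir="."):
    """Collect interpreter, OS, OpenSSL, disk, and CPU details in one dict."""
    usage = shutil.disk_usage(download_dir)
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        # OpenSSL version used by Python's ssl module, not the CLI binary.
        "openssl": ssl.OPENSSL_VERSION,
        # Free space on the volume that holds download_dir, in gigabytes.
        "free_disk_gb": round(usage.free / 1e9, 1),
        # Logical CPU count; may be None if it cannot be determined.
        "cpu_count": os.cpu_count(),
    }


if __name__ == "__main__":
    for key, value in environment_report(".").items():
        print(f"{key}: {value}")
```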
@jordank could you please have a look?

multiple failures downloading 14G vcf file