There are a bunch of files of a particular type that I want to download from Synapse. How can I automate this?

Created by Kenneth Daily kdaily
Hi Kenny: Actually, the command line client has this functionality built into the `get` command. You can use the `-q` flag to get all the files that match a query. From the command line, run:

```
synapse get -q 'select id from file where parentId=="syn7067015" and fileType=="csv"'
```

To get more help on this command, run:

```
synapse get -h
```

You would only need to resort to the `xargs` approach if there are no annotations that you want to filter on. For more on this topic, see the documentation on [annotation_and_query](http://docs.synapse.org/articles/annotation_and_query.html).

In general, there are two steps that work with any of the Synapse clients:

1. Perform a Synapse query to get the Synapse IDs of all files of interest. For example, filtering on `parentId` will identify all files in a specific `Folder` or `Project`. You can also add restrictions based on the annotations the files have. For this example, let's assume that the files of interest have a `fileType` annotation, and I'm interested in those that are comma-separated, i.e., `csv`.
2. Iterate over the ID column and use the Synapse client again to `get` each of those files.

### Command line client

> *UPDATE*: As noted by @larssono, the Synapse command line client has this functionality built in. You can query and download in the same command using the `-q` parameter to `synapse get`:
>
> ```
> synapse get -q 'select id from file where parentId=="syn7067015" and fileType=="csv"'
> ```
>
> I've left the original explanation here for other uses.

If you are using a UNIX-like operating system (Mac OS X, Ubuntu, Fedora, CentOS, etc.), this can be accomplished using the [Synapse command line client](http://docs.synapse.org/python/CommandLineClient.html) along with common command line utilities (`xargs` and `tail`).

Here's an example:

```
synapse query 'select id from file where parentId=="syn7067015" and fileType=="csv"' | tail -n +2 | xargs -I{} synapse get {}
```

There are some things to consider. First, our current query system does not allow the `OR` operator, although more complex `AND` queries are fine. To get the effect of an `OR`, you could perform separate queries, temporarily store the resulting Synapse IDs in a file, and use that file with the second step from above. You can also perform a broader query and use other command line tools (like `awk` or `grep`) to do the filtering.

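As a rough sketch of the separate-queries workaround described above, the helper functions below run one query per file type, strip the header row with `tail` as in the example, and merge the IDs. The function names, the `ids.txt` file, and the `tsv` file type are made up for illustration; only `synapse query` and `synapse get` come from the command line client.

```shell
# Sketch only: PARENT_ID and the file types passed in are illustrative values.
PARENT_ID="syn7067015"

# Drop the header row that `synapse query` prints, leaving one ID per line.
strip_header() {
  tail -n +2
}

# Run one query per file type (a stand-in for the unsupported OR)
# and print the combined, de-duplicated list of Synapse IDs.
collect_ids() {
  for file_type in "$@"; do
    synapse query "select id from file where parentId==\"${PARENT_ID}\" and fileType==\"${file_type}\"" | strip_header
  done | sort -u
}

# Download every ID read from standard input, one `synapse get` per ID.
download_ids() {
  xargs -I{} synapse get {}
}

# Example usage (uncomment to run against Synapse):
# collect_ids csv tsv > ids.txt
# download_ids < ids.txt
```

Writing the IDs to a file first also makes it easy to inspect or hand-edit the list before anything is downloaded.
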
Second, you can parallelize the downloads using the `-P` parameter to `xargs`, but be aware that you may quickly saturate your network connection and cause everything to slow down.

If you want to do a recursive download, see https://www.synapse.org/#!SynapseForum:threadId=504.
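
For the parallel download mentioned above, a minimal sketch might look like the following. The function name `download_parallel`, the default of 4 simultaneous jobs, and the `ids.txt` file are assumptions for illustration; `-P` is available in both GNU and BSD `xargs`.

```shell
# Read Synapse IDs from standard input (one per line) and download them
# with up to N `synapse get` processes at a time (N defaults to 4 here).
download_parallel() {
  xargs -P "${1:-4}" -I{} synapse get {}
}

# Example usage (uncomment to run against Synapse):
# download_parallel 4 < ids.txt
```

Starting with a small number of jobs and increasing it only while throughput keeps improving is one way to avoid saturating the connection.
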
