We currently have 249,034 runs from the Sequence Read Archive (SRA) preprocessed and ready for searching. You can retrieve a list of all of the SRA IDs. PARTIE identified these IDs as the runs most likely to be metagenomic.
We have also provided a set of IDs for the TARA Oceans Project and the Human Microbiome Project. These sets are smaller, and thus will process faster.
As you will recall, a project (SRP) has one or more samples; a sample (SRS) has one or more experiments; and an experiment (SRX) has one or more runs (SRR). In searchSRA we use the SRR IDs as the primary key.
We have combined all the SRR IDs that are available in searchSRA into a tab-separated text file that connects the runs with 18,599 projects (SRP identifiers) and includes each project's title and abstract. You should be able to open this file in Excel, LibreOffice, or a similar spreadsheet program, as well as in Python or R.
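As a rough illustration of working with that file in Python, here is a minimal sketch that groups run IDs by project. The column names (`SRR`, `SRP`, `Title`, `Abstract`) and the sample rows are assumptions made up for this example, so check the header of the real file and adjust `run_col` and `project_col` to match.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data: the real file's column names and order may differ.
SAMPLE_TSV = (
    "SRR\tSRP\tTitle\tAbstract\n"
    "SRR0000001\tSRP000001\tExample project A\tFirst placeholder abstract\n"
    "SRR0000002\tSRP000001\tExample project A\tFirst placeholder abstract\n"
    "SRR0000003\tSRP000002\tExample project B\tSecond placeholder abstract\n"
)


def runs_by_project(handle, run_col="SRR", project_col="SRP"):
    """Group run IDs by project ID, reading tab-separated rows from handle."""
    projects = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        projects[row[project_col]].append(row[run_col])
    return dict(projects)


# An in-memory handle stands in for the downloaded file here.
grouped = runs_by_project(io.StringIO(SAMPLE_TSV))
for project, runs in grouped.items():
    print(project, len(runs), ",".join(runs))
```

To use it on the downloaded file, replace the `io.StringIO(...)` handle with `open(path, newline="")`, where `path` is wherever you saved the file.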
You can always find out more about the SRA Metadata and Hidden SRA Metadata on our blogs.
We have created some tools to help you analyze the data that you generate at SearchSRA.org.
There are detailed instructions at the Git Repository, but essentially we recommend using the filter_reads.sh script, which will do everything for you.
Check the Git Repository, and be sure to cite our work!
Search SRA Gateway User Manual
Gateway Access
Getting Started
Save & Launch an Experiment
Other Gateway Features
What do I do if I only find a few or no results?
If you are performing DNA searches, it may be that the DNA sequence is too divergent to match using bowtie. There are two suggestions:
When I tried this with an example sequence, I got three hits, none with a good E-value:
Recall that in searchSRA one of our heuristics is to search only 100,000 sequences. That data set has 10<sup>10</sup> sequences, so if only 3 sequences in 10<sup>10</sup> match, we would not report that in searchSRA anyway.
This suggests that solution #1 (using protein searches) would be a better option for searching the entire SRA (which, alas, you cannot do at NCBI!).
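The arithmetic behind the sampling argument above can be sketched as follows. It assumes the 100,000 sequences behave like a uniform random sample of the data set, which is a simplification made for this example and not a description of how searchSRA selects reads.

```python
import math

total_sequences = 10**10      # size of the data set quoted above
matching_sequences = 3        # hits found when searching the whole data set
sampled_sequences = 100_000   # sequences searched by the searchSRA heuristic

# Fraction of sequences in the data set that match the query.
hit_fraction = matching_sequences / total_sequences

# Expected number of matches among the sampled sequences.
expected_hits = hit_fraction * sampled_sequences

# Chance of seeing at least one match, assuming independent uniform draws
# (an assumption of this sketch). log1p/expm1 keep precision for tiny values.
p_at_least_one = -math.expm1(sampled_sequences * math.log1p(-hit_fraction))

print(f"expected hits in sample: {expected_hits:.1e}")
print(f"P(at least one hit):     {p_at_least_one:.1e}")
```

Under that assumption, both numbers come out at roughly 3 in 100,000, so a match this rare would almost never show up in the subset that searchSRA searches.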