Introducing Gene Retriever, the first data mining solution to retrieve all genes associated with a list of articles



The life sciences research community often asks: “What genes are referenced in the results of my PubMed search?” Gene Retriever, a collaboration between Acumenta Biotech and Sidra Medicine, answers this critical question by creating gene lists from the results of any PubMed search.

Gene Retriever combines software written by the Advanced Application team in Sidra Medicine’s Biomedical Informatics department with the Literature Lab™ database and Gene Thesaurus™. It is an easy-to-use application for Mac or Windows computers.

Gene Retriever allows users to retrieve all genes associated with a list of articles.

How it works


Gene Retriever processes a list of PubMed IDs that you submit and produces an analysis of the genes mentioned in the title, text and MeSH tags of each record.

Results are ranked and presented in a spreadsheet to enable quick and comprehensive analyses. Hyperlinks are added within the spreadsheet to enable instant review of the genes or PubMed IDs of interest.


How accurate is Gene Retriever?

Very accurate. At the core of the Literature Lab™ database is the Acumenta Biotech Gene Thesaurus™, a repository of gene, protein and pathway nomenclature gathered from the major genomic databases and human-curated to produce searches with high precision and high recall. The Gene Thesaurus is unique because it is constantly updated on the Literature Lab platform, and human curation resolves ambiguous terminology, alias redundancy and generic terms.


Analyzing your PubMed search with Gene Retriever


We recommend tailoring your PubMed search strategy to be specific to your subject of interest. A focused search makes the results easier to explore and to interpret, at least tentatively.

After running your PubMed search, click on the “Send to” link and select “File”. This expands the window. Select “PMID List” in the “Format” drop-down menu and click “Create File”. The file of PubMed IDs will be downloaded to the location that you specify on your computer.

Put this file in the Gene Retriever Input File directory, select it and click “RUN”. The analysis begins immediately, and the results are placed in the Gene Retriever Output File directory. A list of thousands of PMIDs might take a little while to run. You can move the cursor over the Gene Retriever screen to confirm that it is at work.
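If you want to check a PMID file before running it, a short script can help. The sketch below is not part of Gene Retriever. It assumes the downloaded file contains one numeric PubMed ID per line, and the function name `parse_pmids` is hypothetical. It skips blank lines, drops repeated IDs and stops at the first entry that is not a number.

```python
import io
from typing import Iterable, List


def parse_pmids(lines: Iterable[str]) -> List[str]:
    """Return the unique PubMed IDs found in `lines`, in their original order.

    Assumption: the PMID list holds one numeric ID per line.
    """
    pmids: List[str] = []
    seen = set()
    for line_number, raw in enumerate(lines, start=1):
        value = raw.strip()
        if not value:
            continue  # ignore blank lines
        if not (value.isascii() and value.isdigit()):
            raise ValueError(f"line {line_number}: not a PubMed ID: {value!r}")
        if value not in seen:
            seen.add(value)
            pmids.append(value)
    return pmids


# Illustrative in-memory sample; with a real file, pass the open file object:
#     with open("my_pmid_list.txt") as handle:   # hypothetical file name
#         pmids = parse_pmids(handle)
sample = io.StringIO("31452104\n\n29625055\n31452104\n")
print(parse_pmids(sample))
```

The function accepts any iterable of lines, so the same check works on an open file or on a list of strings.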

The Literature Lab database starts on January 1, 1990. If your search returns abstracts published earlier, or you only want to see the genes from a specific time frame, you can add a date range to your search.

Put your search in parentheses and append the date syntax shown below, adjusting the dates to match your interest.

( your search ) AND "1990/01/01"[EDAT] : "2019/03/31"[EDAT]
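If you build searches in a script, the same date restriction can be generated automatically. The following is a minimal sketch, not part of Gene Retriever. The function name `add_date_range` and the constant `LITERATURE_LAB_START` are hypothetical, and moving an earlier start date forward to January 1, 1990 is an assumption based on the database start date given above.

```python
from datetime import date

# Start of Literature Lab coverage, as stated in the text above.
LITERATURE_LAB_START = date(1990, 1, 1)


def add_date_range(search: str, start: date, end: date) -> str:
    """Wrap a PubMed search in parentheses and append an [EDAT] date range."""
    # Assumption: dates before the database start add nothing, so move them forward.
    start = max(start, LITERATURE_LAB_START)
    if start > end:
        raise ValueError("the end date must not be earlier than the start date")
    return (
        f"( {search.strip()} ) AND "
        f'"{start:%Y/%m/%d}"[EDAT] : "{end:%Y/%m/%d}"[EDAT]'
    )


# Example: a start date before 1990 is moved to 1990/01/01.
print(add_date_range("inflammatory bowel disease", date(1985, 6, 1), date(2019, 3, 31)))
```

The resulting string can be pasted into the PubMed search box in the same way as the manually written version.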


The large number of factors involved in the pathogenesis of certain diseases makes the identification of a targeted, manageable-sized list of genes involved in these interactions extremely difficult. 

Portion of the immune response network in patients with inflammatory bowel diseases.


Gene Retriever helps identify genes that are highly associated with a particular disease, allowing the construction of a gene panel that contains putative disease signatures. These signatures can be used as biological markers of disease to classify patients for personalized care.

How do I get Gene Retriever?

It’s easy. Gene Retriever is provided with any license of the Literature Lab™ database.

Now you can easily have something you have always wanted but couldn’t get: an easy, thorough and accurate retrieval of the genes in the results of your PubMed search.

Click here for more information!