ToppGene Suite

A one-stop portal for gene list enrichment analysis and candidate gene prioritization
based on functional annotations and protein interactions network

Functional Enrichment API

The Endpoint

The ToppGene API endpoint is at the following URL.

https://toppgene.cchmc.org/API

ToppGene supports both HTTP POST and HTTP PUT methods. Calling these endpoints with the GET method returns an HTML page describing how to use the endpoints.

Sample Enrichment Call

https://toppgene.cchmc.org/API/enrich

This is about the most minimal API call possible.

curl -H 'Content-Type: text/json' -d '{"Genes":[2]}' https://toppgene.cchmc.org/API/enrich

The request contains a single gene, A2M (alpha-2-macroglobulin), which has Entrez ID 2. With no categories given, the API assumes you want to run all the categories with default cutoffs and limits. All genes must be converted to human Entrez IDs before running the enrichment method.

A more complicated example follows. Each Type must be one of the valid feature types listed under Enumerated Parameter Values.

{
  "Genes": [1482,4205,2626,9421,9464,6910,6722],
  "Categories": [
    {
      "Type": "GeneOntologyMolecularFunction",
      "PValue": 0.05,
      "MinGenes": 1,
      "MaxGenes": 1500,
      "MaxResults": 10,
      "Correction": "FDR"
    },
    {
      "Type": "GeneOntologyBiologicalProcess",
      "PValue": 0.05,
      "MinGenes": 1,
      "MaxGenes": 1500,
      "MaxResults": 10,
      "Correction": "FDR"
    },
    {
      "Type": "GeneOntologyCellularComponent",
      "PValue": 0.05,
      "MinGenes": 1,
      "MaxGenes": 1500,
      "MaxResults": 10,
      "Correction": "FDR"
    }
  ]
}
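
The request body above can also be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper names build_payload and enrich are hypothetical, and it assumes the endpoint accepts the JSON body with the same Content-Type header used in the curl example. Only the standard library is used.

```python
import json
import urllib.request

ENRICH_URL = "https://toppgene.cchmc.org/API/enrich"


def build_payload(genes, types, p_value=0.05, min_genes=1, max_genes=1500,
                  max_results=10, correction="FDR"):
    """Build a request body shaped like the JSON example (hypothetical helper)."""
    return {
        "Genes": list(genes),
        "Categories": [
            {
                "Type": category_type,
                "PValue": p_value,
                "MinGenes": min_genes,
                "MaxGenes": max_genes,
                "MaxResults": max_results,
                "Correction": correction,
            }
            for category_type in types
        ],
    }


def enrich(payload, url=ENRICH_URL, timeout=60):
    """POST the payload and return the decoded JSON response (needs network access)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "text/json"},  # same header as the curl example
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    payload = build_payload(
        [1482, 4205, 2626, 9421, 9464, 6910, 6722],
        ["GeneOntologyMolecularFunction",
         "GeneOntologyBiologicalProcess",
         "GeneOntologyCellularComponent"],
    )
    # Print the body only; call enrich(payload) to actually send it.
    print(json.dumps(payload, indent=2))
```

Keeping payload construction separate from the network call makes it easy to inspect or log the request before sending it.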

The response from the API is a collection containing a list of Annotations and some additional internal diagnostic information. Most of the information returned should be self-explanatory. PValue is the uncorrected p-value, regardless of which cutoff and correction were used in the request. QValueFDRBH is the p-value corrected with the False Discovery Rate method described in Yoav Benjamini and Yosef Hochberg (1995). QValueFDRBY is the False Discovery Rate correction described in Yoav Benjamini and Daniel Yekutieli (2001). QValueBonferroni is the correction method described by C. E. Bonferroni (1936). TotalGenes is the number of genes for all annotations in the category. GenesInTerm is the number of genes in the specific annotation. GenesInQuery is the number of genes supplied in the request. GenesInTermInQuery is the number of genes in the intersection of the term and the query.

{
 "Annotations": [
  {
   "Category": "GeneOntologyMolecularFunction",
   "ID": "GO:0008134",
   "Name": "transcription factor binding",
   "PValue": 2.83896024455281e-11,
   "QValueFDRBH": 8.104603920477282e-10,
   "QValueFDRBY": 3.5967939150627736e-9,
   "QValueBonferroni": 1.3343113149398206e-9,
   "TotalGenes": 18661,
   "GenesInTerm": 584,
   "GenesInQuery": 7,
   "GenesInTermInQuery": 7,
   "Source": " ",
   "URL": " ",
   "Genes": [
    {
     "Entrez": 6722,
     "Symbol": "SRF"
    },
    {
     "Entrez": 2626,
     "Symbol": "GATA4"
    },
    {
     "Entrez": 9464,
     "Symbol": "HAND2"
    },
    {
     "Entrez": 1482,
     "Symbol": "NKX2-5"
    },
    {
     "Entrez": 9421,
     "Symbol": "HAND1"
    },
    {
     "Entrez": 4205,
     "Symbol": "MEF2A"
    },
    {
     "Entrez": 6910,
     "Symbol": "TBX5"
    }
   ]
  },
  {
   "Category": "GeneOntologyMolecularFunction",
   "ID": "GO:0000977",
   "Name": "RNA polymerase II regulatory region sequence-specific DNA binding",
   "PValue": 4.8395136150006095e-11,
   "QValueFDRBH": 8.104603920477282e-10,
   "QValueFDRBY": 3.5967939150627736e-9,
   "QValueBonferroni": 2.2745713990502866e-9,
   "TotalGenes": 18661,
   "GenesInTerm": 630,
   "GenesInQuery": 7,
   "GenesInTermInQuery": 7,
   "Source": " ",
   "URL": " ",
   "Genes": [
    {
     "Entrez": 6722,
     "Symbol": "SRF"
    },
    {
     "Entrez": 2626,
     "Symbol": "GATA4"
    },
    {
     "Entrez": 9464,
     "Symbol": "HAND2"
    },
    {
     "Entrez": 1482,
     "Symbol": "NKX2-5"
    },
    {
     "Entrez": 9421,
     "Symbol": "HAND1"
    },
    {
     "Entrez": 4205,
     "Symbol": "MEF2A"
    },
    {
     "Entrez": 6910,
     "Symbol": "TBX5"
    }
   ]
  }
]}
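
As an illustration of reading these fields, here is a small Python sketch that filters annotations by their Benjamini-Hochberg q-value and reports what fraction of the query genes fall in each term. The function name significant_terms and the 0.05 default cutoff are assumptions made for this example; the embedded sample is a trimmed copy of the response shown above.

```python
import json

# Trimmed copy of the sample response (gene lists and some fields omitted).
SAMPLE = """
{"Annotations": [
  {"Category": "GeneOntologyMolecularFunction", "ID": "GO:0008134",
   "Name": "transcription factor binding",
   "PValue": 2.83896024455281e-11, "QValueFDRBH": 8.104603920477282e-10,
   "GenesInTerm": 584, "GenesInQuery": 7, "GenesInTermInQuery": 7},
  {"Category": "GeneOntologyMolecularFunction", "ID": "GO:0000977",
   "Name": "RNA polymerase II regulatory region sequence-specific DNA binding",
   "PValue": 4.8395136150006095e-11, "QValueFDRBH": 8.104603920477282e-10,
   "GenesInTerm": 630, "GenesInQuery": 7, "GenesInTermInQuery": 7}
]}
"""


def significant_terms(response, max_q=0.05):
    """Return (ID, Name, q-value, overlap fraction) tuples, ordered by PValue."""
    rows = []
    annotations = sorted(response.get("Annotations", []),
                         key=lambda ann: ann["PValue"])
    for ann in annotations:
        if ann["QValueFDRBH"] <= max_q:
            # Fraction of the submitted genes that are annotated to this term.
            overlap = ann["GenesInTermInQuery"] / ann["GenesInQuery"]
            rows.append((ann["ID"], ann["Name"], ann["QValueFDRBH"], overlap))
    return rows


for term_id, name, q_value, overlap in significant_terms(json.loads(SAMPLE)):
    print(f"{term_id}\t{q_value:.3g}\t{overlap:.0%}\t{name}")
```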

Normally, results are returned with as little whitespace as possible. Passing pretty=true as a query parameter will indent results.

Alternative Formats

Passing as=xml as a query parameter will serialize the results as XML.

curl -H 'Content-Type: application/json' -d '{"Genes":[2]}' 'https://toppgene.cchmc.org/API/enrich?as=xml'

Below are typical results.

<EnrichmentResult>
 <results>
  <result>
   <category>GeneOntologyMolecularFunction</category>
   <id>GO:0043120</id>
   <name>tumor necrosis factor binding</name>
   <pValue>0.00010717539252987508</pValue>
   <qValueFDR_BH>0</qValueFDR_BH>
   <qValueFDR_BY>0</qValueFDR_BY>
   <qValueBonferroni>0</qValueBonferroni>
   <totalGenes>18661</totalGenes>
   <genesInTerm>2</genesInTerm>
   <genesInQuery>1</genesInQuery>
   <genesInTermInQuery>1</genesInTermInQuery>
   <source> </source>
   <url></url>
   <genes>
    <geneId>2</geneId>
    <symbol>A2M</symbol>
   </genes>
  </result>
  <result>
   <category>GeneOntologyMolecularFunction</category>
   <id>GO:0019959</id>
   <name>interleukin-8 binding</name>
   <pValue>0.0001607630887948125</pValue>
   <qValueFDR_BH>0</qValueFDR_BH>
   <qValueFDR_BY>0</qValueFDR_BY>
   <qValueBonferroni>0</qValueBonferroni>
   <totalGenes>18661</totalGenes>
   <genesInTerm>3</genesInTerm>
   <genesInQuery>1</genesInQuery>
   <genesInTermInQuery>1</genesInTermInQuery>
   <source> </source>
   <url></url>
   <genes>
    <geneId>2</geneId>
    <symbol>A2M</symbol>
   </genes>
  </result>
 </results>
</EnrichmentResult>
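
An XML response of this shape can be read with Python's standard xml.etree.ElementTree module. The sketch below is illustrative only: parse_results is a hypothetical helper, the embedded sample is a trimmed copy of the first result above, and since the source does not show how several genes are laid out inside the genes element, the code simply collects every symbol child it finds there.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the first <result> from the sample response.
SAMPLE_XML = """<EnrichmentResult>
 <results>
  <result>
   <category>GeneOntologyMolecularFunction</category>
   <id>GO:0043120</id>
   <name>tumor necrosis factor binding</name>
   <pValue>0.00010717539252987508</pValue>
   <genesInTerm>2</genesInTerm>
   <genes>
    <geneId>2</geneId>
    <symbol>A2M</symbol>
   </genes>
  </result>
 </results>
</EnrichmentResult>"""


def parse_results(xml_text):
    """Convert each <result> element into a plain dictionary."""
    root = ET.fromstring(xml_text)
    parsed = []
    for result in root.iter("result"):
        genes = result.find("genes")
        symbols = [s.text for s in genes.iter("symbol")] if genes is not None else []
        parsed.append({
            "category": result.findtext("category"),
            "id": result.findtext("id"),
            "name": result.findtext("name"),
            "p_value": float(result.findtext("pValue")),
            "genes_in_term": int(result.findtext("genesInTerm")),
            "symbols": symbols,
        })
    return parsed


for row in parse_results(SAMPLE_XML):
    print(row["id"], row["name"], row["p_value"], row["symbols"])
```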

You can also submit the enrichment request as XML.

curl -H 'Content-Type: text/xml' -d '<Enrich><genes><gene>2</gene></genes></Enrich>' 'https://toppgene.cchmc.org/API/enrich'
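
The XML body in the curl command can be generated rather than written by hand. This Python sketch uses a hypothetical helper, build_enrich_xml, and reproduces the element names from the example; repeating the gene element for several IDs is an assumption, since the source only shows a single gene.

```python
import xml.etree.ElementTree as ET


def build_enrich_xml(entrez_ids):
    """Serialize Entrez IDs into an <Enrich> request body (hypothetical helper)."""
    root = ET.Element("Enrich")
    genes = ET.SubElement(root, "genes")
    for entrez_id in entrez_ids:
        # int() rejects non-numeric input early; the API expects Entrez IDs.
        ET.SubElement(genes, "gene").text = str(int(entrez_id))
    return ET.tostring(root, encoding="unicode")


print(build_enrich_xml([2]))
# prints: <Enrich><genes><gene>2</gene></genes></Enrich>
```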

Enumerated Parameter Values

This is a list of the valid feature types (Categories).

GeneOntologyMolecularFunction
GeneOntologyBiologicalProcess
GeneOntologyCellularComponent
HumanPheno
MousePheno
Domain
Pathway
Pubmed
Interaction
Cytoband
TFBS
GeneFamily
Coexpression
CoexpressionAtlas
ToppGene
Computational
MicroRNA
Drug
Disease

This is a list of valid gene accession types. Currently only ENTREZ works. See the Gene Symbol Lookup API for additional tools for conversion of symbols to Entrez.

HGNC
ENSEMBL
ENTREZ
UNIPROT
REFSEQ

This is a list of valid p-value correction methods. The FDR method is the Benjamini and Hochberg method.

none
FDR
Bonferroni
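
Because these values are enumerated, a request can be checked locally before it is sent. The following Python sketch is one possible approach under stated assumptions: validate_categories is a hypothetical helper, and it assumes the API matches the values exactly as spelled in the lists above, which the source does not confirm.

```python
# Values copied from the enumerated lists in this section.
VALID_TYPES = {
    "GeneOntologyMolecularFunction", "GeneOntologyBiologicalProcess",
    "GeneOntologyCellularComponent", "HumanPheno", "MousePheno", "Domain",
    "Pathway", "Pubmed", "Interaction", "Cytoband", "TFBS", "GeneFamily",
    "Coexpression", "CoexpressionAtlas", "ToppGene", "Computational",
    "MicroRNA", "Drug", "Disease",
}
VALID_CORRECTIONS = {"none", "FDR", "Bonferroni"}


def validate_categories(categories):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for index, category in enumerate(categories):
        category_type = category.get("Type")
        if category_type not in VALID_TYPES:
            problems.append(f"Categories[{index}]: unknown Type {category_type!r}")
        correction = category.get("Correction")
        # Correction is treated as optional here; only a supplied value is checked.
        if correction is not None and correction not in VALID_CORRECTIONS:
            problems.append(f"Categories[{index}]: unknown Correction {correction!r}")
    return problems


print(validate_categories([{"Type": "Pathway", "Correction": "FDR"},
                           {"Type": "GO", "Correction": "BH"}]))
```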

Compression

The API provides limited support for transparent compression (RFC 2616) if the client supplies an Accept-Encoding: gzip header.

Modern implementations of curl support this with the --compressed option.
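
Clients that do not decompress automatically need to handle the response themselves. Python's urllib.request is one such client, so the sketch below sends the Accept-Encoding: gzip header and decompresses the body only when the server reports a gzip Content-Encoding. The helper names decode_body and enrich_compressed are hypothetical, and the sketch assumes the server signals compression through that response header.

```python
import gzip
import json
import urllib.request


def decode_body(raw, content_encoding):
    """Decompress raw bytes only if the response was gzip-encoded."""
    if (content_encoding or "").strip().lower() == "gzip":
        return gzip.decompress(raw)
    return raw


def enrich_compressed(payload, url="https://toppgene.cchmc.org/API/enrich",
                      timeout=60):
    """POST the payload, asking for a gzip response (needs network access)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "text/json", "Accept-Encoding": "gzip"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        raw = response.read()
        encoding = response.headers.get("Content-Encoding")
    return json.loads(decode_body(raw, encoding))


if __name__ == "__main__":
    # Offline round trip showing that both compressed and plain bodies decode.
    body = json.dumps({"Annotations": []}).encode("utf-8")
    print(decode_body(gzip.compress(body), "gzip") == body)
    print(decode_body(body, None) == body)
```

Checking the response header rather than assuming compression keeps the client working when the server chooses to reply uncompressed.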