Functional Enrichment API
The Endpoint
The ToppGene API endpoint is at the following URL.
https://toppgene.cchmc.org/API
ToppGene supports both HTTP POST and HTTP PUT methods. Calling these endpoints with the GET method returns an HTML page describing how to use the endpoints.
Sample Enrichment Call
https://toppgene.cchmc.org/API/enrich
This is about the most minimal API call possible.
curl -H 'Content-Type: text/json' -d '{"Genes":[2]}' https://toppgene.cchmc.org/API/enrich
It submits a single gene, A2M (alpha-2-macroglobulin), which has Entrez accession ID 2. Because the request names no categories, the API runs all categories with default cutoffs and limits. All genes must be converted to human Entrez IDs before running the enrichment method.
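The same minimal call can be sketched in Python with only the standard library. This is an illustration rather than an official client: the helper names build_enrich_request and send are hypothetical, and the Content-Type value simply mirrors the curl command above.

```python
import json
import urllib.request

ENRICH_URL = "https://toppgene.cchmc.org/API/enrich"


def build_enrich_request(entrez_ids, content_type="text/json"):
    """Build (but do not send) a POST request carrying a list of Entrez IDs."""
    body = json.dumps({"Genes": list(entrez_ids)}).encode("utf-8")
    return urllib.request.Request(
        ENRICH_URL,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the parsed JSON response (needs network access)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# A2M has Entrez ID 2; the request is only constructed here, not sent.
request = build_enrich_request([2])
```

Calling send(request) would perform the actual HTTP call; it is kept separate so the request can be inspected first.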
A more complicated example follows. The Type of each category must be one of the feature types listed under Enumerated Parameter Values.
{
"Genes": [1482,4205,2626,9421,9464,6910,6722],
"Categories": [
{
"Type": "GeneOntologyMolecularFunction",
"PValue": 0.05,
"MinGenes": 1,
"MaxGenes": 1500,
"MaxResults": 10,
"Correction": "FDR"
},
{
"Type": "GeneOntologyBiologicalProcess",
"PValue": 0.05,
"MinGenes": 1,
"MaxGenes": 1500,
"MaxResults": 10,
"Correction": "FDR"
},
{
"Type": "GeneOntologyCellularComponent",
"PValue": 0.05,
"MinGenes": 1,
"MaxGenes": 1500,
"MaxResults": 10,
"Correction": "FDR"
}
]
}
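A request body like the one above can be assembled programmatically. The following is a minimal sketch: the helper names category and build_payload are hypothetical, and the keyword defaults are the values used in the example above, not necessarily the API's own defaults. The sets of valid values are copied from the Enumerated Parameter Values section.

```python
import json

# Feature types and correction methods listed under Enumerated Parameter Values.
VALID_TYPES = {
    "GeneOntologyMolecularFunction", "GeneOntologyBiologicalProcess",
    "GeneOntologyCellularComponent", "HumanPheno", "MousePheno", "Domain",
    "Pathway", "Pubmed", "Interaction", "Cytoband", "TFBS", "GeneFamily",
    "Coexpression", "CoexpressionAtlas", "ToppGene", "Computational",
    "MicroRNA", "Drug", "Disease",
}
VALID_CORRECTIONS = {"none", "FDR", "Bonferroni"}


def category(type_name, p_value=0.05, min_genes=1, max_genes=1500,
             max_results=10, correction="FDR"):
    """Return one entry for the Categories list, rejecting unknown values early."""
    if type_name not in VALID_TYPES:
        raise ValueError(f"unknown category type: {type_name}")
    if correction not in VALID_CORRECTIONS:
        raise ValueError(f"unknown correction method: {correction}")
    return {
        "Type": type_name,
        "PValue": p_value,
        "MinGenes": min_genes,
        "MaxGenes": max_genes,
        "MaxResults": max_results,
        "Correction": correction,
    }


def build_payload(entrez_ids, type_names, **options):
    """Combine a gene list with one category entry per requested type."""
    return {
        "Genes": list(entrez_ids),
        "Categories": [category(name, **options) for name in type_names],
    }


payload = build_payload(
    [1482, 4205, 2626, 9421, 9464, 6910, 6722],
    ["GeneOntologyMolecularFunction", "GeneOntologyBiologicalProcess",
     "GeneOntologyCellularComponent"],
)
body = json.dumps(payload)  # string to send as the request body
```

Validating the Type and Correction values locally catches typos before a request is made.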
The response from the API is a collection containing a list of Annotations and some additional internal diagnostic information. Most of the information returned should be self-explanatory; the statistical fields are described below.
PValue is the uncorrected p-value, regardless of what cutoff and correction were used in the request.
QValueFDRBH is the corrected p-value using the False Discovery Rate method described in Yoav Benjamini and Yosef Hochberg (1995).
QValueFDRBY is the corrected p-value using the False Discovery Rate method described in Yoav Benjamini and Daniel Yekutieli (2001).
QValueBonferroni is the corrected p-value using the method described by C. E. Bonferroni (1936).
TotalGenes is the number of genes for all annotations in the category.
GenesInTerm is the number of genes in the specific annotation.
GenesInQuery is the number of genes supplied in the request.
GenesInTermInQuery is the size of the intersection of the genes in the term and the genes in the query.
{
"Annotations": [
{
"Category": "GeneOntologyMolecularFunction",
"ID": "GO:0008134",
"Name": "transcription factor binding",
"PValue": 2.83896024455281e-11,
"QValueFDRBH": 8.104603920477282e-10,
"QValueFDRBY": 3.5967939150627736e-9,
"QValueBonferroni": 1.3343113149398206e-9,
"TotalGenes": 18661,
"GenesInTerm": 584,
"GenesInQuery": 7,
"GenesInTermInQuery": 7,
"Source": " ",
"URL": " ",
"Genes": [
{
"Entrez": 6722,
"Symbol": "SRF"
},
{
"Entrez": 2626,
"Symbol": "GATA4"
},
{
"Entrez": 9464,
"Symbol": "HAND2"
},
{
"Entrez": 1482,
"Symbol": "NKX2-5"
},
{
"Entrez": 9421,
"Symbol": "HAND1"
},
{
"Entrez": 4205,
"Symbol": "MEF2A"
},
{
"Entrez": 6910,
"Symbol": "TBX5"
}
]
},
{
"Category": "GeneOntologyMolecularFunction",
"ID": "GO:0000977",
"Name": "RNA polymerase II regulatory region sequence-specific DNA binding",
"PValue": 4.8395136150006095e-11,
"QValueFDRBH": 8.104603920477282e-10,
"QValueFDRBY": 3.5967939150627736e-9,
"QValueBonferroni": 2.2745713990502866e-9,
"TotalGenes": 18661,
"GenesInTerm": 630,
"GenesInQuery": 7,
"GenesInTermInQuery": 7,
"Source": " ",
"URL": " ",
"Genes": [
{
"Entrez": 6722,
"Symbol": "SRF"
},
{
"Entrez": 2626,
"Symbol": "GATA4"
},
{
"Entrez": 9464,
"Symbol": "HAND2"
},
{
"Entrez": 1482,
"Symbol": "NKX2-5"
},
{
"Entrez": 9421,
"Symbol": "HAND1"
},
{
"Entrez": 4205,
"Symbol": "MEF2A"
},
{
"Entrez": 6910,
"Symbol": "TBX5"
}
]
}
]}
Normally, results are returned with as little whitespace as possible. Passing pretty=true as a query parameter will indent results.
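Once decoded, the JSON response is straightforward to post-process. The sketch below assumes the response has the shape shown in the sample above; the function name summarize is hypothetical, and the sample dictionary is trimmed to a few fields and two genes per annotation for brevity.

```python
# Trimmed copy of the sample response above (fewer fields, two genes each).
sample = {
    "Annotations": [
        {
            "Category": "GeneOntologyMolecularFunction",
            "ID": "GO:0008134",
            "Name": "transcription factor binding",
            "PValue": 2.83896024455281e-11,
            "QValueFDRBH": 8.104603920477282e-10,
            "Genes": [{"Entrez": 6722, "Symbol": "SRF"},
                      {"Entrez": 2626, "Symbol": "GATA4"}],
        },
        {
            "Category": "GeneOntologyMolecularFunction",
            "ID": "GO:0000977",
            "Name": "RNA polymerase II regulatory region sequence-specific DNA binding",
            "PValue": 4.8395136150006095e-11,
            "QValueFDRBH": 8.104603920477282e-10,
            "Genes": [{"Entrez": 6722, "Symbol": "SRF"},
                      {"Entrez": 2626, "Symbol": "GATA4"}],
        },
    ]
}


def summarize(result, max_q=0.05):
    """Return (PValue, ID, Name, symbols) tuples for annotations passing max_q."""
    rows = []
    for annotation in result.get("Annotations", []):
        if annotation["QValueFDRBH"] <= max_q:
            symbols = sorted(gene["Symbol"] for gene in annotation["Genes"])
            rows.append((annotation["PValue"], annotation["ID"],
                         annotation["Name"], symbols))
    # Most significant (smallest uncorrected p-value) first.
    rows.sort(key=lambda row: row[0])
    return rows


rows = summarize(sample)
```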
Alternative Formats
Passing as=xml as a query parameter will serialize the results as XML.
curl -H 'Content-Type: application/json' -d '{"Genes":[2]}' 'https://toppgene.cchmc.org/API/enrich?as=xml'
Typical results are shown below.
<EnrichmentResult>
<results>
<result>
<category>GeneOntologyMolecularFunction</category>
<id>GO:0043120</id>
<name>tumor necrosis factor binding</name>
<pValue>0.00010717539252987508</pValue>
<qValueFDR_BH>0</qValueFDR_BH>
<qValueFDR_BY>0</qValueFDR_BY>
<qValueBonferroni>0</qValueBonferroni>
<totalGenes>18661</totalGenes>
<genesInTerm>2</genesInTerm>
<genesInQuery>1</genesInQuery>
<genesInTermInQuery>1</genesInTermInQuery>
<source> </source>
<url></url>
<genes>
<geneId>2</geneId>
<symbol>A2M</symbol>
</genes>
</result>
<result>
<category>GeneOntologyMolecularFunction</category>
<id>GO:0019959</id>
<name>interleukin-8 binding</name>
<pValue>0.0001607630887948125</pValue>
<qValueFDR_BH>0</qValueFDR_BH>
<qValueFDR_BY>0</qValueFDR_BY>
<qValueBonferroni>0</qValueBonferroni>
<totalGenes>18661</totalGenes>
<genesInTerm>3</genesInTerm>
<genesInQuery>1</genesInQuery>
<genesInTermInQuery>1</genesInTermInQuery>
<source> </source>
<url></url>
<genes>
<geneId>2</geneId>
<symbol>A2M</symbol>
</genes>
</result>
</results>
</EnrichmentResult>
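An XML result of this shape can be read with Python's standard xml.etree.ElementTree module. This is a sketch under the assumption that responses follow the element names shown above; parse_results is a hypothetical helper, and the embedded sample is a trimmed copy of the first result.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the first <result> from the sample above.
SAMPLE_XML = """<EnrichmentResult>
<results>
<result>
<category>GeneOntologyMolecularFunction</category>
<id>GO:0043120</id>
<name>tumor necrosis factor binding</name>
<pValue>0.00010717539252987508</pValue>
<genesInTerm>2</genesInTerm>
<genes>
<geneId>2</geneId>
<symbol>A2M</symbol>
</genes>
</result>
</results>
</EnrichmentResult>"""


def parse_results(xml_text):
    """Convert each <result> element into a plain dictionary."""
    root = ET.fromstring(xml_text)
    parsed = []
    for result in root.iter("result"):
        parsed.append({
            "category": result.findtext("category"),
            "id": result.findtext("id"),
            "name": result.findtext("name"),
            "p_value": float(result.findtext("pValue")),
            "genes_in_term": int(result.findtext("genesInTerm")),
            "symbols": [s.text for s in result.findall("genes/symbol")],
        })
    return parsed


results = parse_results(SAMPLE_XML)
```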
You can also submit the enrichment request as XML.
curl -H 'Content-Type: text/xml' -d '<Enrich><genes><gene>2</gene></genes></Enrich>' 'https://toppgene.cchmc.org/API/enrich'
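The XML request body can also be generated rather than typed by hand. The sketch below reproduces the body from the curl command above; build_xml_body is a hypothetical helper, and sending several gene elements in one request is an assumption, since the source shows only a single gene.

```python
import xml.etree.ElementTree as ET


def build_xml_body(entrez_ids):
    """Serialize Entrez IDs as <Enrich><genes><gene>...</gene></genes></Enrich>."""
    root = ET.Element("Enrich")
    genes = ET.SubElement(root, "genes")
    for entrez_id in entrez_ids:
        ET.SubElement(genes, "gene").text = str(entrez_id)
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(root, encoding="unicode")


xml_body = build_xml_body([2])
```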
Enumerated Parameter Values
This is a list of the valid feature types (Categories).
GeneOntologyMolecularFunction
GeneOntologyBiologicalProcess
GeneOntologyCellularComponent
HumanPheno
MousePheno
Domain
Pathway
Pubmed
Interaction
Cytoband
TFBS
GeneFamily
Coexpression
CoexpressionAtlas
ToppGene
Computational
MicroRNA
Drug
Disease
This is a list of valid gene accession types. Currently only ENTREZ works. See the Gene Symbol Lookup API for additional tools for conversion of symbols to Entrez.
HGNC
ENSEMBL
ENTREZ
UNIPROT
REFSEQ
This is a list of valid p-value correction methods. The FDR method is the Benjamini and Hochberg method.
none
FDR
Bonferroni
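For readers unfamiliar with these corrections, the sketch below shows the textbook definitions of the Bonferroni, Benjamini-Hochberg, and Benjamini-Yekutieli adjustments. It is not taken from ToppGene's implementation, and all function names are hypothetical. The final lines check the definitions against values copied from the first annotation of the sample JSON response.

```python
import math


def harmonic_number(m):
    """Sum of 1/i for i = 1..m, the extra factor used by Benjamini-Yekutieli."""
    return sum(1.0 / i for i in range(1, m + 1))


def bonferroni(pvalues):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues, yekutieli=False):
    """Step-up FDR adjustment; yekutieli=True applies the BY harmonic factor."""
    m = len(pvalues)
    factor = harmonic_number(m) if yekutieli else 1.0
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m * factor / rank)
        adjusted[index] = running_min
    return adjusted


# Values copied from the first annotation of the sample JSON response.
sample_p = 2.83896024455281e-11
sample_bonferroni = 1.3343113149398206e-9
sample_bh = 8.104603920477282e-10
sample_by = 3.5967939150627736e-9

# Number of tests implied by the Bonferroni value (an inference, not documented).
implied_tests = round(sample_bonferroni / sample_p)
# BY should equal BH times the harmonic number of the test count.
consistent_by = math.isclose(
    sample_bh * harmonic_number(implied_tests), sample_by, rel_tol=1e-6
)
```

In the sample response, QValueBonferroni is 47 times PValue, which suggests (though the source does not state it) that 47 terms were tested in that category.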
Compression
The API provides limited support for transparent compression (RFC 2616) if the client supplies an Accept-Encoding: gzip header. Modern implementations of curl support this with the --compressed option.
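A client that does not decompress automatically can request gzip and decode the body itself. The following is a minimal standard-library sketch; the helper names are hypothetical, and it assumes the server marks compressed responses with a Content-Encoding: gzip header, as transparent compression normally does.

```python
import gzip
import json
import urllib.request

ENRICH_URL = "https://toppgene.cchmc.org/API/enrich"


def build_gzip_request(entrez_ids):
    """Build a POST request that advertises gzip support to the server."""
    body = json.dumps({"Genes": list(entrez_ids)}).encode("utf-8")
    return urllib.request.Request(
        ENRICH_URL,
        data=body,
        headers={"Content-Type": "application/json", "Accept-Encoding": "gzip"},
        method="POST",
    )


def decode_body(raw, content_encoding):
    """Decompress the body if it was gzip-encoded, then parse it as JSON."""
    if (content_encoding or "").lower() == "gzip":
        raw = gzip.decompress(raw)
    return json.loads(raw.decode("utf-8"))


def fetch(request, timeout=30):
    """Send the request and decode the response (needs network access)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return decode_body(response.read(),
                           response.headers.get("Content-Encoding"))


gzip_request = build_gzip_request([2])
```

Checking the Content-Encoding header before decompressing keeps the client working when the server chooses to send an uncompressed response.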