What is Reactome?

Reactome is a free, curated database of human pathways and molecular interactions. It provides two REST APIs (the Content Service and the Analysis Service) and a Python client to support pathway retrieval and analysis.

How do I perform overrepresentation analysis?

Submit a gene/protein list to the Analysis Service's identifiers endpoint to obtain enriched pathways, along with statistical metrics and p-values. Interpret results to highlight biologically relevant pathways.

Do I need to install reactome2py?

No, it is optional. reactome2py is a Python client that wraps the API calls for easier access, and it can be installed with pip. The package is not actively maintained, so call the REST endpoints directly when you need the latest features.

reactome-database

npx machina-cli add skill Microck/ordinary-claude-skills/reactome-database --openclaw

Reactome Database

Overview

Reactome is a free, open-source, curated pathway database with 2,825+ human pathways. Use its REST APIs and Python client to query biological pathways, run overrepresentation and expression analyses, map genes to pathways, and explore molecular interactions for systems biology research.

When to Use This Skill

This skill should be used when:

Performing pathway enrichment analysis on gene or protein lists
Analyzing gene expression data to identify relevant biological pathways
Querying specific pathway information, reactions, or molecular interactions
Mapping genes or proteins to biological pathways and processes
Exploring disease-related pathways and mechanisms
Visualizing analysis results in the Reactome Pathway Browser
Conducting comparative pathway analysis across species

Core Capabilities

Reactome provides two main API services and a Python client library:

1. Content Service - Data Retrieval

Query and retrieve biological pathway data, molecular interactions, and entity information.

Common operations:

Retrieve pathway information and hierarchies
Query specific entities (proteins, reactions, complexes)
Get participating molecules in pathways
Access database version and metadata
Explore pathway compartments and locations

API Base URL: https://reactome.org/ContentService

2. Analysis Service - Pathway Analysis

Perform computational analysis on gene lists and expression data.

Analysis types:

Overrepresentation Analysis: Identify statistically significant pathways from gene/protein lists
Expression Data Analysis: Analyze gene expression datasets to find relevant pathways
Species Comparison: Compare pathway data across different organisms

API Base URL: https://reactome.org/AnalysisService

3. reactome2py Python Package

Python client library that wraps Reactome API calls for easier programmatic access.

Installation:

uv pip install reactome2py

Note: The reactome2py package (version 3.0.0, released January 2021) is functional but not actively maintained. For the most up-to-date functionality, consider using direct REST API calls.

Querying Pathway Data

Using Content Service REST API

The Content Service is a REST API that returns data as JSON or plain text, depending on the endpoint.

Get database version:

import requests

response = requests.get("https://reactome.org/ContentService/data/database/version")
version = response.text
print(f"Reactome version: {version}")

Query a specific entity:

import requests

entity_id = "R-HSA-69278"  # Example pathway ID
response = requests.get(f"https://reactome.org/ContentService/data/query/{entity_id}")
data = response.json()

Get participating molecules in a pathway:

import requests

event_id = "R-HSA-69278"
response = requests.get(
    f"https://reactome.org/ContentService/data/participants/{event_id}/participatingPhysicalEntities"
)
molecules = response.json()
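The requests above share a base URL and the same failure modes (timeouts and non-2xx responses). A minimal sketch of a shared helper follows. The names `content_url` and `fetch_json` are hypothetical, and the 30-second timeout is an arbitrary assumption rather than a Reactome recommendation. The session is passed in by the caller so the helper works with `requests.Session()` or with a stub in tests.

```python
CONTENT_SERVICE = "https://reactome.org/ContentService"


def content_url(*parts):
    """Join path segments onto the Content Service base URL."""
    return "/".join([CONTENT_SERVICE, *(p.strip("/") for p in parts)])


def fetch_json(session, *parts, timeout=30.0):
    """GET a Content Service resource through `session` and decode JSON.

    `session` is any object with a requests-style .get() method, such as
    requests.Session(). Non-2xx responses raise instead of being parsed.
    """
    response = session.get(content_url(*parts), timeout=timeout)
    response.raise_for_status()
    return response.json()


# Usage (performs a network request):
#   import requests
#   with requests.Session() as session:
#       pathway = fetch_json(session, "data", "query", "R-HSA-69278")
```

Endpoints that return plain text, such as the database version, should still be read through `response.text` rather than this JSON helper.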

Using reactome2py Package

import reactome2py
from reactome2py import content

# Query pathway information
pathway_info = content.query_by_id("R-HSA-69278")

# Get database version
version = content.get_database_version()

For detailed API endpoints and parameters, refer to references/api_reference.md in this skill.

Performing Pathway Analysis

Overrepresentation Analysis

Submit a list of gene/protein identifiers to find enriched pathways.

Using REST API:

import requests

# Prepare identifier list
identifiers = ["TP53", "BRCA1", "EGFR", "MYC"]
data = "\n".join(identifiers)

# Submit analysis
response = requests.post(
    "https://reactome.org/AnalysisService/identifiers/",
    headers={"Content-Type": "text/plain"},
    data=data
)

result = response.json()
token = result["summary"]["token"]  # Save token to retrieve results later

# Access pathways
for pathway in result["pathways"]:
    print(f"{pathway['stId']}: {pathway['name']} (p-value: {pathway['entities']['pValue']})")
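Printing every pathway is rarely useful, because most entries in a result are not significant. The sketch below keeps only pathways under a false discovery rate cutoff. It assumes each pathway's `entities` object carries an `fdr` field next to `pValue`; the function name `significant_pathways`, the 0.05 default, and the sample data are illustrative assumptions, not real analysis output.

```python
def significant_pathways(result, fdr_threshold=0.05):
    """Return (stId, name, pValue, fdr) tuples at or below the FDR cutoff,
    sorted from most to least significant."""
    hits = [
        (p["stId"], p["name"], p["entities"]["pValue"], p["entities"]["fdr"])
        for p in result.get("pathways", [])
        if p["entities"]["fdr"] <= fdr_threshold
    ]
    return sorted(hits, key=lambda hit: hit[3])


# Hand-written sample in the shape of an Analysis Service response;
# the IDs, names, and numbers are placeholders.
sample = {
    "pathways": [
        {"stId": "R-HSA-0000001", "name": "Pathway A",
         "entities": {"pValue": 0.0004, "fdr": 0.01}},
        {"stId": "R-HSA-0000002", "name": "Pathway B",
         "entities": {"pValue": 0.2, "fdr": 0.6}},
    ]
}

for st_id, name, p_value, fdr in significant_pathways(sample):
    print(f"{st_id}: {name} (p={p_value:.2g}, FDR={fdr:.2g})")
```

Filtering on FDR rather than the raw p-value accounts for the many pathways tested in a single analysis.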

Retrieve analysis by token:

# Token is valid for 7 days
response = requests.get(f"https://reactome.org/AnalysisService/token/{token}")
results = response.json()

Expression Data Analysis

Analyze gene expression datasets with quantitative values.

Input format (TSV with header starting with #):

#Gene	Sample1	Sample2	Sample3
TP53	2.5	3.1	2.8
BRCA1	1.2	1.5	1.3
EGFR	4.5	4.2	4.8

Submit expression data:

import requests

# Read TSV file
with open("expression_data.tsv", "r") as f:
    data = f.read()

response = requests.post(
    "https://reactome.org/AnalysisService/identifiers/",
    headers={"Content-Type": "text/plain"},
    data=data
)

result = response.json()
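When expression values are already in memory rather than in a file, the request body can be generated directly. This is a minimal sketch; `to_expression_tsv` is a hypothetical helper, and the gene values are copied from the input format example above.

```python
def to_expression_tsv(sample_names, values_by_gene):
    """Serialize expression values into a '#'-headed, tab-separated body."""
    lines = ["\t".join(["#Gene", *sample_names])]
    for gene, values in values_by_gene.items():
        if len(values) != len(sample_names):
            raise ValueError(
                f"{gene}: expected {len(sample_names)} values, got {len(values)}"
            )
        # str(float(...)) always uses '.' as the decimal separator,
        # independent of the system locale.
        lines.append("\t".join([gene, *(str(float(v)) for v in values)]))
    return "\n".join(lines)


data = to_expression_tsv(
    ["Sample1", "Sample2", "Sample3"],
    {"TP53": [2.5, 3.1, 2.8], "BRCA1": [1.2, 1.5, 1.3]},
)
print(data)
```

The resulting string can be passed as `data=` in the POST request in place of the file contents.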

Species Projection

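Project identifiers from other species onto their human equivalents, so that results are reported against human pathways only, by posting to the /identifiers/projection/ endpoint:

import requests

# Same body format as above: one identifier per line, or an expression TSV
data = "\n".join(["TP53", "BRCA1", "EGFR", "MYC"])

response = requests.post(
    "https://reactome.org/AnalysisService/identifiers/projection/",
    headers={"Content-Type": "text/plain"},
    data=data
)
result = response.json()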

Visualizing Results

Analysis results can be visualized in the Reactome Pathway Browser by constructing URLs with the analysis token:

token = result["summary"]["token"]
pathway_id = "R-HSA-69278"
url = f"https://reactome.org/PathwayBrowser/#/{pathway_id}&DTAB=AN&ANALYSIS={token}"
print(f"View results: {url}")

Working with Analysis Tokens

Analysis tokens are valid for 7 days
Tokens allow retrieval of previously computed results without re-submission
Store tokens to access results across sessions
Use GET /token/{TOKEN} endpoint to retrieve results
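Storing a token across sessions is only useful if the script also knows when the token has expired. The sketch below persists the token with a timestamp and returns it only while it is younger than seven days. The helper names and the JSON file layout are assumptions, and the age is measured from the local save time, which only approximates when the server created the token.

```python
import json
import time
from pathlib import Path

TOKEN_TTL_SECONDS = 7 * 24 * 60 * 60  # tokens are valid for 7 days


def save_token(path, token, now=None):
    """Write the token and its save time to a small JSON file."""
    record = {"token": token, "created": time.time() if now is None else now}
    Path(path).write_text(json.dumps(record))


def load_token(path, now=None):
    """Return the stored token, or None if it is missing or older than 7 days."""
    file = Path(path)
    if not file.exists():
        return None
    record = json.loads(file.read_text())
    age = (time.time() if now is None else now) - record["created"]
    return record["token"] if age < TOKEN_TTL_SECONDS else None


# Usage:
#   save_token("reactome_token.json", result["summary"]["token"])
#   token = load_token("reactome_token.json")  # None means: re-submit the analysis
```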

Data Formats and Identifiers

Supported Identifier Types

Reactome accepts various identifier formats:

UniProt accessions (e.g., P04637)
Gene symbols (e.g., TP53)
Ensembl IDs (e.g., ENSG00000141510)
EntrezGene IDs (e.g., 7157)
ChEBI IDs for small molecules

The system automatically detects identifier types.

Input Format Requirements

For overrepresentation analysis:

Plain text list of identifiers (one per line)
OR single column in TSV format
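Gene lists exported from spreadsheets often contain stray whitespace, blank lines, and repeated entries. A minimal sketch of a clean-up step that produces the one-per-line body follows; `prepare_identifiers` is a hypothetical helper, not part of any Reactome client.

```python
def prepare_identifiers(raw_identifiers):
    """Strip whitespace, drop blanks, and de-duplicate while keeping order."""
    seen = set()
    cleaned = []
    for identifier in raw_identifiers:
        identifier = identifier.strip()
        if identifier and identifier not in seen:
            seen.add(identifier)
            cleaned.append(identifier)
    return "\n".join(cleaned)


body = prepare_identifiers(["TP53", " BRCA1 ", "", "TP53", "EGFR"])
print(body)
```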

For expression analysis:

TSV format with mandatory header row starting with "#"
Column 1: identifiers
Columns 2+: numeric expression values
Use period (.) as decimal separator
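The requirements above can be checked locally before submission, which gives clearer error messages than a rejected request. This is a sketch only: `validate_expression_tsv` is a hypothetical helper, and it checks just the rules listed here (header marker, consistent column count, numeric values with a period as decimal separator).

```python
def validate_expression_tsv(text):
    """Return a list of problems found in an expression TSV; empty if none.

    Line numbers in messages count non-blank lines, starting at 1.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return ["input is empty"]
    problems = []
    if not lines[0].startswith("#"):
        problems.append("header row must start with '#'")
    width = len(lines[0].split("\t"))
    for number, line in enumerate(lines[1:], start=2):
        fields = line.split("\t")
        if len(fields) != width:
            problems.append(
                f"line {number}: expected {width} columns, got {len(fields)}"
            )
            continue
        for value in fields[1:]:
            try:
                float(value)  # rejects decimal commas such as "2,5"
            except ValueError:
                problems.append(f"line {number}: {value!r} is not a number")
    return problems
```

For example, a file with the header `Gene` instead of `#Gene` and a value written as `2,5` yields two messages, one for each problem.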

Output Format

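Analysis Service responses are JSON objects containing:

summary: Analysis metadata, including the token
pathways: Array of enriched pathways, each with its stable ID (stId), name, and an entities object holding the statistical metrics
identifiersNotFound: Number of submitted identifiers that could not be mapped
Statistical values (inside each pathway's entities object): pValue and fdr (false discovery rate)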

Helper Scripts

This skill includes scripts/reactome_query.py, a helper script for common Reactome operations:

# Query pathway information
python scripts/reactome_query.py query R-HSA-69278

# Perform overrepresentation analysis
python scripts/reactome_query.py analyze gene_list.txt

# Get database version
python scripts/reactome_query.py version

Additional Resources

API Documentation: https://reactome.org/dev
User Guide: https://reactome.org/userguide
Documentation Portal: https://reactome.org/documentation
Data Downloads: https://reactome.org/download-data
reactome2py Docs: https://reactome.github.io/reactome2py/

For comprehensive API endpoint documentation, see references/api_reference.md in this skill.

Current Database Statistics (Version 94, September 2025)

2,825 human pathways
16,002 reactions
11,630 proteins
2,176 small molecules
1,070 drugs
41,373 literature references

Source

git clone https://github.com/Microck/ordinary-claude-skills

The skill file is located at skills_all/claude-scientific-skills/scientific-skills/reactome-database/SKILL.md within the repository.

Overview

Reactome provides free, curated human pathways and tools to query data, run enrichment and expression analyses, map genes to pathways, and explore molecular interactions. Using the Content Service and Analysis Service via REST (and the reactome2py Python client), researchers can perform pathway-centric analyses for systems biology studies.

How This Skill Works

Two main services power the skill: Content Service for data retrieval (pathways, reactions, interactions) and Analysis Service for pathway analyses (overrepresentation and expression data). The reactome2py Python package wraps these calls for easier use. Start by querying data with the Content Service endpoints, then run enrichment or expression analyses with the Analysis Service, and optionally visualize results in the Reactome Pathway Browser.

When to Use It

Perform pathway enrichment analysis on gene or protein lists.
Analyze gene expression data to identify relevant biological pathways.
Query specific pathway information, reactions, or molecular interactions.
Map genes or proteins to biological pathways and processes.
Explore disease-related pathways and mechanisms, and compare pathways across species.

Quick Start

Step 1: Choose the API (Content Service for data retrieval or Analysis Service for enrichment/expression analyses) at https://reactome.org/ContentService or https://reactome.org/AnalysisService.
Step 2: Submit your gene/protein identifiers (one per line) for mapping or run an overrepresentation analysis against a gene list.
Step 3: Retrieve and interpret the results, then visualize them in the Pathway Browser or continue scripting with the reactome2py client.

Best Practices

Use consistent gene/protein identifiers (e.g., HGNC symbols, Entrez IDs, UniProt IDs) when submitting lists.
Choose the appropriate analysis type (Overrepresentation vs. Expression Data) based on your data and goals.
Fetch the database version before analysis to ensure reproducibility and proper interpretation.
Handle API limits and pagination when retrieving large pathway or molecule lists.
Leverage reactome2py for streamlined calls but cross-check results with direct REST endpoints for edge cases.
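The reproducibility advice above is easiest to follow when the database version is stored together with the result it applies to. A minimal sketch, assuming `result` is a parsed Analysis Service response and `database_version` is the text returned by the version endpoint; the function name and the output keys are hypothetical.

```python
from datetime import datetime, timezone


def with_provenance(result, database_version):
    """Bundle an analysis result with the metadata needed to reproduce it."""
    return {
        "reactome_version": str(database_version).strip(),
        "token": result["summary"]["token"],
        "retrieved_at": datetime.now(timezone.utc).isoformat(),
        "pathways": result["pathways"],
    }


# Usage:
#   record = with_provenance(result, version)
#   json.dump(record, open("analysis_record.json", "w"))
```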

Example Use Cases

Enrich a cancer-related gene list to identify top Reactome pathways involved in tumor biology.
Map a set of differentially expressed genes from RNA-seq to enriched pathways and visualize results.
Query a specific pathway’s participating molecules and reactions to study mechanism details.
Perform species comparison to see how pathway involvement differs between human and mouse.
Export results and open them in the Reactome Pathway Browser for interactive exploration.
