# Ensembl Genomics Scraper (Genes, Variants, Sequences) (`parseforge/ensembl-genomics-scraper`) Actor

Query the Ensembl genome reference for 20+ species. Look up genes by symbol or stable ID, list features in a genomic region, fetch DNA sequence, or resolve human variants (rsIDs). Returns biotype, coordinates, transcript IDs, descriptions, and assembly metadata.

- **URL**: https://apify.com/parseforge/ensembl-genomics-scraper.md
- **Developed by:** [ParseForge](https://apify.com/parseforge) (community)
- **Categories:** Developer tools, Other
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $18.00 / 1,000 result items

This Actor is paid per event. You are not charged for Apify platform usage, only a fixed price for specific events.
Since this Actor supports Apify Store discounts, the higher your subscription plan, the lower the price.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
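
As a rough illustration of the per-result pricing, the sketch below estimates the charge for a run from the listed figure of $18.00 per 1,000 result items. It assumes a flat rate with result items as the only billed event; the actual rate depends on your plan's discount, so treat the output as an estimate. `estimate_cost` is a hypothetical helper, not part of any Apify library.

```python
from decimal import Decimal

# Listed price from the Pricing section: $18.00 per 1,000 result items.
# Assumption: flat rate, no plan discount applied.
PRICE_PER_1000_ITEMS = Decimal("18.00")


def estimate_cost(result_items: int, price_per_1000: Decimal = PRICE_PER_1000_ITEMS) -> Decimal:
    """Estimate the charge in USD for a number of result items (hypothetical helper)."""
    if result_items < 0:
        raise ValueError("result_items must be non-negative")
    cost = price_per_1000 * result_items / 1000
    # Round to whole cents for display.
    return cost.quantize(Decimal("0.01"))


if __name__ == "__main__":
    for n in (10, 250, 1000):
        print(f"{n} items -> ${estimate_cost(n)}")
```

Passing a different `price_per_1000` lets you plug in the rate shown for your own subscription plan.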

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows (PowerShell)
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

![ParseForge Banner](https://github.com/ParseForge/apify-assets/blob/ad35ccc13ddd068b9d6cba33f323962e39aed5b2/banner.jpg?raw=true)

## 🧬 Ensembl Genomics Scraper

> 🚀 **Export genes, variants, and DNA sequences in seconds.** Look up by gene symbol, stable ID, chromosomal region, or human rsID across **20+ species**. Returns biotype, coordinates, transcript IDs, sequence, allele frequencies, and assembly metadata.

> 🕒 **Last updated:** 2026-05-23 · **📊 30 fields** per record · **🧬 20+ species** · **🔁 5 modes** · **🧫 Ensembl genome reference**

The **Ensembl Genomics Scraper** queries the public Ensembl genome reference, the de facto open browser for vertebrate, model-organism, and select non-vertebrate genomes. It returns **up to 30 structured fields per record**, including stable ID, display name, object type, biotype, species, chromosome, start, end, strand, assembly, description, canonical transcript, source, logic name, molecule type, sequence length and sequence, variant name, variant class, minor allele and frequency, ancestral allele, allele string, most-severe consequence, mappings, evidence, synonyms, mode, query, and the scrape timestamp.

The catalog spans **20+ reference species** including human, mouse, rat, zebrafish, fruit fly, roundworm, baker's yeast, thale cress, chicken, pig, cow, dog, cat, horse, sheep, rhesus macaque, chimpanzee, western clawed frog, medaka, and mosquito. This Actor returns gene lookups, region overlaps, sequence fetches, and human variant resolutions in one run.

| 🎯 Target Audience | 💡 Primary Use Cases |
|---|---|
| Bioinformaticians, pharma research, genetics labs, academic researchers, computational biology students, biotech startups, precision-medicine teams | Gene annotation pipelines, variant impact analysis, comparative genomics, target identification, rsID resolution for GWAS, sequence retrieval for primer design |

---

### 📋 What the Ensembl Genomics Scraper does

Five query workflows in a single Actor:

- 🧬 **Lookup by gene symbol.** Resolve `BRCA2`, `TP53`, `EGFR`, etc. to Ensembl stable IDs, coordinates, biotype, and canonical transcript.
- 🆔 **Lookup by stable ID.** Pass `ENSG00000139618` or `ENST00000380152` for any Ensembl-supported species.
- 🗺️ **Overlap region.** Return all gene features inside `chromosome:start-end` (e.g. `7:140424943-140624564`).
- 🧪 **Sequence by ID.** Fetch the raw DNA, cDNA, or protein sequence for any Ensembl stable ID.
- 🧬 **Variation by rsID.** Resolve human dbSNP rsIDs (e.g. `rs56116432`, `rs1042522`) to allele frequencies, consequences, and ancestral alleles.

Each record bundles the relevant Ensembl-native fields, the species, the mode used, the original query string, and a collection timestamp.

> 💡 **Why it matters:** the Ensembl genome browser is the most widely cited open genome reference in life sciences. Hand-coding a REST client means handling rate limits, schema-per-endpoint quirks, and pagination. This Actor delivers consistent records you can pipe straight into BI tools, notebooks, or pipelines.

---

### 🎬 Full Demo

_🚧 Coming soon: a 3-minute walkthrough showing how to go from sign-up to a downloaded dataset._

---

### ⚙️ Input

<table>
<thead>
<tr><th>Input</th><th>Type</th><th>Default</th><th>Behavior</th></tr>
</thead>
<tbody>
<tr><td>maxItems</td><td>integer</td><td>10</td><td>Records to return. Free plan caps at 10, paid plan at 1,000,000.</td></tr>
<tr><td>mode</td><td>string</td><td>"lookupSymbol"</td><td>One of lookupSymbol, lookupId, overlapRegion, sequence, variation.</td></tr>
<tr><td>species</td><td>string</td><td>"homo_sapiens"</td><td>Ensembl species slug. Used by the lookupSymbol, overlapRegion, and sequence modes; ignored by lookupId and variation.</td></tr>
<tr><td>symbols</td><td>array</td><td>["BRCA2","TP53","EGFR","MYC","KRAS"]</td><td>Gene symbols for lookupSymbol mode.</td></tr>
<tr><td>stableIds</td><td>array</td><td>[]</td><td>Ensembl stable IDs for lookupId or sequence.</td></tr>
<tr><td>region</td><td>string</td><td>""</td><td>Genomic region chr:start-end for overlapRegion.</td></tr>
<tr><td>rsids</td><td>array</td><td>[]</td><td>dbSNP rsIDs for variation mode (human-only).</td></tr>
</tbody>
</table>

**Example: human cancer-gene panel.**

```json
{
    "maxItems": 5,
    "mode": "lookupSymbol",
    "species": "homo_sapiens",
    "symbols": ["BRCA1", "BRCA2", "TP53", "EGFR", "KRAS"]
}
````

**Example: all genes overlapping the BRCA2 locus.**

```json
{
    "maxItems": 100,
    "mode": "overlapRegion",
    "species": "homo_sapiens",
    "region": "13:32315086-32400266"
}
```

**Example: resolve a panel of human variant rsIDs.**

```json
{
    "maxItems": 10,
    "mode": "variation",
    "rsids": ["rs1042522", "rs56116432", "rs17878362"]
}
```

> ⚠️ **Good to Know:** the Ensembl species slug follows the `genus_species` convention (e.g. `homo_sapiens`, `mus_musculus`). The `variation` mode is human-only (dbSNP). For coordinate-based queries, regions must follow `chr:start-end` with assembly coordinates matching the current Ensembl release for that species.
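
The two format rules in the note above can be checked before a run is started. The following is a minimal sketch: `parse_region`, `is_species_slug`, and both regular expressions are hypothetical helpers written for this example, and the check that `start` is at least 1 assumes Ensembl's 1-based coordinates. It validates only the shape of the strings, not whether the chromosome or species exists in Ensembl.

```python
import re
from typing import NamedTuple

# chromosome:start-end, e.g. "13:32315086-32400266" or "X:1000000-2000000".
REGION_RE = re.compile(r"^(?P<chromosome>[^:\s]+):(?P<start>\d+)-(?P<end>\d+)$")
# genus_species slug, e.g. "homo_sapiens": lowercase words joined by underscores.
SPECIES_RE = re.compile(r"^[a-z]+(_[a-z]+)+$")


class Region(NamedTuple):
    chromosome: str
    start: int
    end: int


def parse_region(text: str) -> Region:
    """Split a chromosome:start-end string into its parts, or raise ValueError."""
    match = REGION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"region {text!r} is not in chromosome:start-end form")
    start, end = int(match["start"]), int(match["end"])
    # Assumption: coordinates are 1-based and start must not exceed end.
    if start < 1 or end < start:
        raise ValueError(f"region {text!r} must satisfy 1 <= start <= end")
    return Region(match["chromosome"], start, end)


def is_species_slug(text: str) -> bool:
    """Return True if text looks like a genus_species slug."""
    return SPECIES_RE.match(text) is not None


if __name__ == "__main__":
    print(parse_region("13:32315086-32400266"))
    print(is_species_slug("mus_musculus"), is_species_slug("Homo sapiens"))
```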

***

### 📊 Output

Each record contains **up to 30 fields** depending on the mode. Download the dataset as CSV, Excel, JSON, or XML.

#### 🧾 Schema

| Field | Type | Example |
|---|---|---|
| 🆔 `stableId` | string | `"ENSG00000139618"` |
| 🏷️ `displayName` | string | `"BRCA2"` |
| 🔧 `objectType` | string | `"Gene"` |
| 🧬 `biotype` | string | `"protein_coding"` |
| 🐾 `species` | string | `"homo_sapiens"` |
| 🧭 `chromosome` | string | `"13"` |
| ▶️ `start` | number | `32315086` |
| ⏹️ `end` | number | `32400266` |
| ↔️ `strand` | number | `1` |
| 📐 `assemblyName` | string | `"GRCh38"` |
| 📝 `description` | string \| null | `"BRCA2 DNA repair associated"` |
| 🧾 `canonicalTranscript` | string | `"ENST00000380152.8"` |
| 🏛️ `source` | string | `"ensembl_havana"` |
| 🧠 `logicName` | string | `"ensembl_havana_gene_homo_sapiens"` |
| 🧪 `molecule` | string | `"dna"` |
| 📏 `sequenceLength` | number | `84981` |
| 🧬 `sequence` | string | `"ATG..."` |
| 🆔 `variantName` | string | `"rs1042522"` |
| 🏷️ `varClass` | string | `"SNP"` |
| 🔡 `minorAllele` | string | `"C"` |
| 📊 `minorAlleleFreq` | number | `0.3401` |
| 🌳 `ancestralAllele` | string | `"G"` |
| 🧬 `alleleString` | string | `"C/G"` |
| ⚠️ `mostSevereConsequence` | string | `"missense_variant"` |
| 🗺️ `mappings` | array | `[ ... ]` |
| 🔬 `evidence` | array | `["Frequency","1000Genomes"]` |
| 🔗 `synonyms` | array | `["NM_000546.6:c.215C>G"]` |
| 🔧 `mode` | string | `"lookupSymbol"` |
| 🔎 `query` | string | `"BRCA2"` |
| 🕒 `scrapedAt` | string (ISO 8601) | `"2026-05-23T10:00:00.000Z"` |

#### 📦 Sample records

<details>
<summary><strong>🧬 BRCA2 gene lookup</strong></summary>

```json
{
    "stableId": "ENSG00000139618",
    "displayName": "BRCA2",
    "objectType": "Gene",
    "biotype": "protein_coding",
    "species": "homo_sapiens",
    "chromosome": "13",
    "start": 32315086,
    "end": 32400266,
    "strand": 1,
    "assemblyName": "GRCh38",
    "description": "BRCA2 DNA repair associated",
    "canonicalTranscript": "ENST00000380152.8",
    "source": "ensembl_havana",
    "mode": "lookupSymbol",
    "query": "BRCA2",
    "scrapedAt": "2026-05-23T10:00:00.000Z"
}
```

</details>

<details>
<summary><strong>🧪 Variation: rs1042522 (TP53 P72R)</strong></summary>

```json
{
    "variantName": "rs1042522",
    "varClass": "SNP",
    "minorAllele": "C",
    "minorAlleleFreq": 0.3401,
    "ancestralAllele": "G",
    "alleleString": "C/G",
    "mostSevereConsequence": "missense_variant",
    "synonyms": ["NM_000546.6:c.215C>G"],
    "evidence": ["Frequency","1000Genomes","ESP","ExAC","TOPMed"],
    "mode": "variation",
    "query": "rs1042522",
    "scrapedAt": "2026-05-23T10:00:00.000Z"
}
```

</details>

<details>
<summary><strong>🧫 Sequence fetch by stable ID</strong></summary>

```json
{
    "stableId": "ENSG00000139618",
    "molecule": "dna",
    "sequenceLength": 84981,
    "sequence": "ATGCCT...",
    "mode": "sequence",
    "query": "ENSG00000139618",
    "scrapedAt": "2026-05-23T10:00:00.000Z"
}
```

</details>
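
Because each mode fills a different subset of the fields, records from several runs do not share one set of keys. The sketch below shows one way to write mixed-mode records into a single CSV with a fixed header, leaving cells blank where a field is absent. `records_to_csv` and the `COLUMNS` selection are hypothetical choices for this example; the field names come from the schema table above.

```python
import csv
import io

# A subset of the schema columns; extend with any other fields you need.
COLUMNS = [
    "mode", "query", "stableId", "displayName", "biotype", "chromosome",
    "start", "end", "variantName", "mostSevereConsequence", "sequenceLength",
]


def records_to_csv(records: list[dict]) -> str:
    """Render mixed-mode records as CSV text with one fixed header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=COLUMNS,
        restval="",             # blank cell when a record lacks the field
        extrasaction="ignore",  # drop fields that are not in COLUMNS
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    # Values taken from the sample records above.
    sample = [
        {"mode": "lookupSymbol", "query": "BRCA2", "stableId": "ENSG00000139618",
         "displayName": "BRCA2", "biotype": "protein_coding", "chromosome": "13",
         "start": 32315086, "end": 32400266},
        {"mode": "variation", "query": "rs1042522", "variantName": "rs1042522",
         "mostSevereConsequence": "missense_variant"},
    ]
    print(records_to_csv(sample))
```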

***

### ✨ Why choose this Actor

| | Capability |
|---|---|
| 🧬 | **20+ reference species.** Human, mouse, rat, zebrafish, fly, worm, yeast, arabidopsis, and more. |
| 🔁 | **Five modes in one Actor.** Symbol lookup, ID lookup, region overlap, sequence fetch, and variant resolution. |
| 🆔 | **dbSNP rsID resolution.** Human variants returned with MAF, ancestral allele, consequence, evidence. |
| 🗺️ | **Region-based queries.** Pull all gene features inside any chromosomal interval. |
| 🧪 | **Raw sequence retrieval.** DNA, cDNA, or protein, by Ensembl stable ID. |
| 🚫 | **No authentication.** Works against the public Ensembl reference. No login or API key needed. |
| 🔁 | **Always fresh.** Each run pulls the live reference, reflecting the latest Ensembl release. |

> 📊 The Ensembl reference underpins thousands of life-science publications and GWAS pipelines worldwide.

***

### 📈 How it compares to alternatives

| Approach | Cost | Coverage | Refresh | Setup |
|---|---|---|---|---|
| **⭐ Ensembl Genomics Scraper** *(this Actor)* | $5 free credit, then pay-per-use | **20+ species, 5 modes** | **Live per run** | ⚡ 2 min |
| Hand-written Ensembl REST client | Free + engineering | Same | Build it yourself | 🛠️ Hours |
| Commercial bio-databases | $$$$ | Same + curation | Real-time | ⏳ Procurement |
| Hard-coded gene tables | Free | One snapshot | Manual | 🐢 Tech debt |

Pick this Actor when you want consistent Ensembl records without writing and maintaining a REST client.

***

### 🚀 How to use

1. 📝 **Sign up.** [Create a free account with $5 credit](https://console.apify.com/sign-up?fpr=vmoqkp) (takes 2 minutes).
2. 🌐 **Open the Actor.** Go to the Ensembl Genomics Scraper page on the Apify Store.
3. 🎯 **Set input.** Pick a mode, a species, and a query payload (symbols, stable IDs, region, or rsIDs). Set `maxItems`.
4. 🚀 **Run it.** Click **Start** and let the Actor collect your data.
5. 📥 **Download.** Grab your results in the **Dataset** tab as CSV, Excel, JSON, or XML.

> ⏱️ Total time from signup to downloaded dataset: **3-5 minutes.** No coding required.

***

### 💼 Business use cases

<table>
<tr>
<td width="50%" valign="top">

#### 💊 Pharma & Drug Discovery

- Target identification by gene panel
- Variant impact triage in pipelines
- Comparative genomics across model organisms
- Pre-clinical species selection workflows

</td>
<td width="50%" valign="top">

#### 🧪 Clinical Genomics & Diagnostics

- rsID-to-consequence lookups for GWAS
- Variant interpretation pipelines
- Reference gene annotation for sequencing reports
- Coordinate liftover validation

</td>
</tr>
<tr>
<td width="50%" valign="top">

#### 🌱 Agricultural Genomics

- Crop and livestock breeding gene catalogs
- Trait-associated marker discovery
- Comparative analysis (cow, pig, chicken, sheep)
- Genome-assembly QC

</td>
<td width="50%" valign="top">

#### 🧫 Biotech R\&D

- Primer design from raw DNA sequence
- CRISPR guide design pipelines
- Synthetic biology target sourcing
- Orthology mapping across model species

</td>
</tr>
</table>

***

### 🔌 Automating Ensembl Genomics Scraper

Control the scraper programmatically for scheduled runs and pipeline integrations:

- 🟢 **Node.js.** Install the `apify-client` NPM package.
- 🐍 **Python.** Use the `apify-client` PyPI package.
- 📚 See the [Apify API documentation](https://docs.apify.com/api/v2) for full details.

The [Apify Schedules feature](https://docs.apify.com/platform/schedules) lets you trigger this Actor on any cron interval. Hook a webhook to a Slack channel for alerting when a panel of variants flips consequence in a new Ensembl release.
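
The consequence-flip alert described above comes down to comparing the `variation` records of two runs. The following is a minimal sketch of that comparison, assuming you have already loaded both runs' dataset items as lists of dictionaries. `consequence_changes` is a hypothetical helper, and the changed value in the usage example is invented purely for illustration.

```python
from typing import Optional


def consequence_changes(
    previous: list[dict], current: list[dict]
) -> list[tuple[str, Optional[str], Optional[str]]]:
    """Return (variantName, old, new) for variants whose consequence changed."""
    old_by_name = {
        record["variantName"]: record.get("mostSevereConsequence")
        for record in previous
        if record.get("mode") == "variation" and "variantName" in record
    }
    changes = []
    for record in current:
        name = record.get("variantName")
        # Skip non-variant records and variants absent from the previous run.
        if record.get("mode") != "variation" or name not in old_by_name:
            continue
        old, new = old_by_name[name], record.get("mostSevereConsequence")
        if old != new:
            changes.append((name, old, new))
    return changes


if __name__ == "__main__":
    last_run = [{"mode": "variation", "variantName": "rs1042522",
                 "mostSevereConsequence": "missense_variant"}]
    # Invented change, for illustration only.
    this_run = [{"mode": "variation", "variantName": "rs1042522",
                 "mostSevereConsequence": "synonymous_variant"}]
    for name, old, new in consequence_changes(last_run, this_run):
        print(f"{name}: {old} -> {new}")
```

A webhook handler could post each returned tuple to a chat channel; variants that appear only in the newer run are ignored here and would need separate handling.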

***

### 🌟 Beyond business use cases

Open genome data powers more than commercial R\&D. The same structured records support research, education, civic projects, and personal initiatives.

<table>
<tr>
<td width="50%">

#### 🎓 Research and academia

- Reproducible variant interpretation datasets
- Comparative genomics coursework
- Open-data thesis projects
- Cross-species ortholog studies

</td>
<td width="50%">

#### 🎨 Personal and creative

- Personal-genome interpretation hobby projects
- Citizen-science genealogy and ancestry tools
- Educational visualizations of gene structure
- Bioinformatics learning portfolios

</td>
</tr>
<tr>
<td width="50%">

#### 🤝 Non-profit and civic

- Rare-disease research collectives
- Patient-advocacy variant dashboards
- Open biomedical-data initiatives
- Public-health surveillance pipelines

</td>
<td width="50%">

#### 🧪 Experimentation

- Train variant-effect prediction models
- Prototype gene-annotation AI agents
- Build genomics-aware chatbots
- Test bioinformatics pipelines on real records

</td>
</tr>
</table>

***

### 🤖 Ask an AI assistant about this scraper

Open a ready-to-send prompt about this ParseForge Actor in the AI of your choice:

- 💬 [**ChatGPT**](https://chat.openai.com/?q=How%20do%20I%20use%20the%20Ensembl%20Genomics%20Scraper%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)
- 🧠 [**Claude**](https://claude.ai/new?q=How%20do%20I%20use%20the%20Ensembl%20Genomics%20Scraper%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)
- 🔍 [**Perplexity**](https://perplexity.ai/search?q=How%20do%20I%20use%20the%20Ensembl%20Genomics%20Scraper%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)
- 🅒 [**Copilot**](https://copilot.microsoft.com/?q=How%20do%20I%20use%20the%20Ensembl%20Genomics%20Scraper%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)

***

### ❓ Frequently Asked Questions

#### 🧩 How does it work?

Pick a mode, a species, and a query payload. The Actor reads the public Ensembl reference and emits a clean structured record per gene, region, sequence, or variant.

#### 📏 How accurate is the data?

Ensembl is the de facto open genome browser, curated by EMBL-EBI and the Wellcome Sanger Institute. The reference is updated several times per year. For clinical reporting, always cross-check against the latest release notes.

#### 🔁 How often is the dataset refreshed?

Ensembl publishes new releases several times per year. Every run of this Actor pulls the live reference, so results reflect whichever release is current at run time.

#### 🐾 Which species are supported?

20+ reference species including human, mouse, rat, zebrafish, fruit fly, roundworm, baker's yeast, thale cress, chicken, pig, cow, dog, cat, horse, sheep, rhesus macaque, chimpanzee, western clawed frog, medaka, and mosquito.

#### 🧬 Which variant set is supported?

dbSNP rsIDs for human. Other species are supported for gene lookups, region overlaps, and sequence retrieval.

#### ⏰ Can I schedule regular runs?

Yes. Use Apify Schedules to run this Actor on any cron interval. A weekly run is enough to track inter-release annotation drift.

#### ⚖️ Is this data legal to use?

Ensembl is published as open data under standard academic licenses. Commercial use is permitted; check the source for any attribution preferences.

#### 💼 Can I use this data commercially?

Yes. The Ensembl reference is openly licensed for commercial reuse with attribution to EMBL-EBI.

#### 💳 Do I need a paid Apify plan to use this Actor?

No. The free Apify plan is enough for testing and small queries (10 records per run). A paid plan lifts the limit and unlocks scheduling and higher concurrency.

#### 🔧 What if a stable ID is from an older Ensembl release?

The Actor uses the current Ensembl reference. Deprecated stable IDs return an error record with a clear message; use the Ensembl ID history view to resolve to the current ID.

#### 🆘 What if I need help?

Our support team is here to help. Contact us through the Apify platform or use the Tally form linked below.

***

### 🔌 Integrate with any app

Ensembl Genomics Scraper connects to any cloud service via [Apify integrations](https://apify.com/integrations):

- [**Make**](https://docs.apify.com/platform/integrations/make) - Automate multi-step workflows
- [**Zapier**](https://docs.apify.com/platform/integrations/zapier) - Connect with 5,000+ apps
- [**Slack**](https://docs.apify.com/platform/integrations/slack) - Get run notifications in your channels
- [**Airbyte**](https://docs.apify.com/platform/integrations/airbyte) - Pipe gene and variant records into your warehouse
- [**GitHub**](https://docs.apify.com/platform/integrations/github) - Trigger runs from commits and releases
- [**Google Drive**](https://docs.apify.com/platform/integrations/drive) - Export datasets straight to Sheets

You can also use webhooks to push fresh variant annotations into a Notion knowledge base or alert on gene-panel changes.

***

### 🔗 Recommended Actors

- [**🧪 KEGG Pathways Scraper**](https://apify.com/parseforge/kegg-pathways-scraper) - Biochemical pathways and orthologies
- [**📚 ArXiv Scraper**](https://apify.com/parseforge/arxiv-scraper) - Pre-print research papers
- [**🔬 Figshare Scraper**](https://apify.com/parseforge/figshare-scraper) - Open scientific datasets and supplementary files
- [**🧬 ClinicalTrials.gov Scraper**](https://apify.com/parseforge/clinicaltrials-gov-scraper) - U.S. clinical trial registry
- [**📊 GBIF Biodiversity Scraper**](https://apify.com/parseforge/gbif-biodiversity-scraper) - Global biodiversity occurrence records

> 💡 **Pro Tip:** browse the complete [ParseForge collection](https://apify.com/parseforge) for more open-science scrapers.

***

**🆘 Need Help?** [**Open our contact form**](https://tally.so/r/BzdKgA) to request a new scraper, propose a custom data project, or report an issue.

***

> **⚠️ Disclaimer:** this Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by Ensembl, EMBL-EBI, the Wellcome Sanger Institute, or NCBI/dbSNP. All trademarks mentioned are the property of their respective owners. Only publicly available open genome reference data is collected.

# Actor input Schema

## `maxItems` (type: `integer`):

Free users: Limited to 10 items (preview). Paid users: Optional, max 1,000,000

## `mode` (type: `string`):

What to query: lookup a gene by symbol, lookup by stable ID, list overlapping gene features in a chromosomal region, fetch DNA sequence by ID, or fetch a human variant by rsID.

## `species` (type: `string`):

Reference species (Ensembl convention: genus\_species). Used by lookupSymbol, overlapRegion, sequence modes. Ignored by lookupId and variation modes.

## `symbols` (type: `array`):

Gene symbols for lookupSymbol mode (e.g. BRCA2, TP53, EGFR).

## `stableIds` (type: `array`):

Ensembl stable IDs for lookupId or sequence modes (e.g. ENSG00000139618, ENST00000380152).

## `region` (type: `string`):

Genomic region for overlapRegion mode, format `chromosome:start-end` (e.g. 7:140424943-140624564, X:1000000-2000000). Returns all gene features in that interval.

## `rsids` (type: `array`):

dbSNP rsIDs for variation mode (e.g. rs56116432, rs1042522). Human-only.

## Actor input object example

```json
{
  "maxItems": 10,
  "mode": "lookupSymbol",
  "species": "homo_sapiens",
  "symbols": [
    "BRCA2",
    "TP53",
    "EGFR",
    "MYC",
    "KRAS"
  ]
}
```
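
The field descriptions above pair each mode with one payload field and state which modes use `species`. The sketch below encodes those rules in a small builder so that an input object is checked before a run starts. `build_input`, `PAYLOAD_FIELD`, and `SPECIES_MODES` are hypothetical names for this example; the `maxItems` bounds of 1 and 1,000,000 are taken from the OpenAPI input schema in this document.

```python
from typing import Any

# Mode -> input field that carries the query payload.
PAYLOAD_FIELD = {
    "lookupSymbol": "symbols",
    "lookupId": "stableIds",
    "overlapRegion": "region",
    "sequence": "stableIds",
    "variation": "rsids",
}

# Modes that use the `species` input; lookupId and variation ignore it.
SPECIES_MODES = {"lookupSymbol", "overlapRegion", "sequence"}


def build_input(mode: str, payload: Any, species: str = "homo_sapiens", max_items: int = 10) -> dict:
    """Assemble an input object for one mode (hypothetical helper)."""
    if mode not in PAYLOAD_FIELD:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(PAYLOAD_FIELD)}")
    if not 1 <= max_items <= 1_000_000:
        raise ValueError("maxItems must be between 1 and 1,000,000")
    field = PAYLOAD_FIELD[mode]
    if field == "region":
        # overlapRegion takes a single string, not a list.
        if not isinstance(payload, str) or not payload:
            raise ValueError("overlapRegion needs a non-empty region string")
    elif not isinstance(payload, list) or not payload:
        raise ValueError(f"{mode} needs a non-empty list in {field!r}")
    run_input = {"maxItems": max_items, "mode": mode, field: payload}
    if mode in SPECIES_MODES:
        run_input["species"] = species
    return run_input


if __name__ == "__main__":
    print(build_input("lookupSymbol", ["BRCA2", "TP53"]))
    print(build_input("variation", ["rs1042522"]))
```

Omitting `species` for `lookupId` and `variation` keeps the input honest about what the Actor actually reads in those modes.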

# Actor output Schema

## `overview` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "maxItems": 10,
    "mode": "lookupSymbol",
    "species": "homo_sapiens",
    "symbols": [
        "BRCA2",
        "TP53",
        "EGFR",
        "MYC",
        "KRAS"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("parseforge/ensembl-genomics-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "maxItems": 10,
    "mode": "lookupSymbol",
    "species": "homo_sapiens",
    "symbols": [
        "BRCA2",
        "TP53",
        "EGFR",
        "MYC",
        "KRAS",
    ],
}

# Run the Actor and wait for it to finish
run = client.actor("parseforge/ensembl-genomics-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
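
If you prefer not to install the client library, the same run can be started with the standard library against the `run-sync-get-dataset-items` endpoint listed in the OpenAPI specification in this document. This is a minimal sketch rather than a hardened client: `build_request` and `run_actor` are hypothetical helpers, the 300-second timeout is an arbitrary choice, and the response is assumed to be a JSON array of dataset items.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/parseforge~ensembl-genomics-scraper/run-sync-get-dataset-items"


def build_request(token: str, run_input: dict) -> urllib.request.Request:
    """Build the POST request without sending it."""
    query = urllib.parse.urlencode({"token": token})
    body = json.dumps(run_input).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}{ACTOR_PATH}?{query}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_actor(token: str, run_input: dict, timeout: float = 300.0) -> list:
    """Send the request and decode the items (assumed to be a JSON array)."""
    request = build_request(token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Replace '<YOUR_API_TOKEN>' with your token before running.
    items = run_actor("<YOUR_API_TOKEN>", {"mode": "variation", "rsids": ["rs1042522"]})
    for item in items:
        print(item)
```

Keeping `build_request` separate from `run_actor` makes the URL and body easy to inspect or unit-test without touching the network.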

## CLI example

```bash
echo '{
  "maxItems": 10,
  "mode": "lookupSymbol",
  "species": "homo_sapiens",
  "symbols": [
    "BRCA2",
    "TP53",
    "EGFR",
    "MYC",
    "KRAS"
  ]
}' |
apify call parseforge/ensembl-genomics-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=parseforge/ensembl-genomics-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Ensembl Genomics Scraper (Genes, Variants, Sequences)",
        "description": "Query the Ensembl genome reference for 200+ species. Look up genes by symbol or stable ID, list features in a genomic region, fetch DNA sequence, or resolve human variants (rsIDs). Returns biotype, coordinates, transcript IDs, descriptions, and assembly metadata.",
        "version": "1.0",
        "x-build-id": "fgvGpO7gYy3eHdfMT"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/parseforge~ensembl-genomics-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-parseforge-ensembl-genomics-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/parseforge~ensembl-genomics-scraper/runs": {
            "post": {
                "operationId": "runs-sync-parseforge-ensembl-genomics-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/parseforge~ensembl-genomics-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-parseforge-ensembl-genomics-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "maxItems": {
                        "title": "Max Items",
                        "minimum": 1,
                        "maximum": 1000000,
                        "type": "integer",
                        "description": "Free users: Limited to 10 items (preview). Paid users: Optional, max 1,000,000"
                    },
                    "mode": {
                        "title": "Mode",
                        "enum": [
                            "lookupSymbol",
                            "lookupId",
                            "overlapRegion",
                            "sequence",
                            "variation"
                        ],
                        "type": "string",
                        "description": "What to query: lookup a gene by symbol, lookup by stable ID, list overlapping gene features in a chromosomal region, fetch DNA sequence by ID, or fetch a human variant by rsID."
                    },
                    "species": {
                        "title": "Species",
                        "enum": [
                            "homo_sapiens",
                            "mus_musculus",
                            "rattus_norvegicus",
                            "danio_rerio",
                            "drosophila_melanogaster",
                            "caenorhabditis_elegans",
                            "saccharomyces_cerevisiae",
                            "arabidopsis_thaliana",
                            "gallus_gallus",
                            "sus_scrofa",
                            "bos_taurus",
                            "canis_familiaris",
                            "felis_catus",
                            "equus_caballus",
                            "ovis_aries",
                            "macaca_mulatta",
                            "pan_troglodytes",
                            "xenopus_tropicalis",
                            "oryzias_latipes",
                            "anopheles_gambiae"
                        ],
                        "type": "string",
                        "description": "Reference species (Ensembl convention: genus_species). Used by the lookupSymbol, overlapRegion, and sequence modes; ignored by the lookupId and variation modes."
                    },
                    "symbols": {
                        "title": "Gene Symbols",
                        "type": "array",
                        "description": "Gene symbols for lookupSymbol mode (e.g. BRCA2, TP53, EGFR).",
                        "items": {
                            "type": "string"
                        }
                    },
                    "stableIds": {
                        "title": "Stable IDs",
                        "type": "array",
                        "description": "Ensembl stable IDs for lookupId or sequence modes (e.g. ENSG00000139618, ENST00000380152).",
                        "items": {
                            "type": "string"
                        }
                    },
                    "region": {
                        "title": "Region",
                        "type": "string",
                        "description": "Genomic region for overlapRegion mode, format `chromosome:start-end` (e.g. 7:140424943-140624564, X:1000000-2000000). Returns all gene features in that interval."
                    },
                    "rsids": {
                        "title": "Variant rsIDs",
                        "type": "array",
                        "description": "dbSNP rsIDs for variation mode (e.g. rs56116432, rs1042522). Human-only.",
                        "items": {
                            "type": "string"
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
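
The input schema above declares no `required` fields, so a run can be started with a mode and no matching query field. The sketch below is one way to pre-check an input locally before starting a run. The `mode` and `species` values are copied from the schema enums; the mapping from mode to field is an assumption inferred from the field descriptions, and `validate_input`, `MODE_FIELD`, and the region regular expression are hypothetical helpers, not part of the Actor.

```python
import re

# Values copied from the `mode` and `species` enums in the input schema.
MODES = {"lookupSymbol", "lookupId", "overlapRegion", "sequence", "variation"}
SPECIES = {
    "homo_sapiens", "mus_musculus", "rattus_norvegicus", "danio_rerio",
    "drosophila_melanogaster", "caenorhabditis_elegans",
    "saccharomyces_cerevisiae", "arabidopsis_thaliana", "gallus_gallus",
    "sus_scrofa", "bos_taurus", "canis_familiaris", "felis_catus",
    "equus_caballus", "ovis_aries", "macaca_mulatta", "pan_troglodytes",
    "xenopus_tropicalis", "oryzias_latipes", "anopheles_gambiae",
}

# Assumption: the field each mode reads, inferred from the descriptions.
MODE_FIELD = {
    "lookupSymbol": "symbols",
    "lookupId": "stableIds",
    "sequence": "stableIds",
    "overlapRegion": "region",
    "variation": "rsids",
}

# Assumption: `chromosome:start-end` with a simple chromosome name.
REGION_RE = re.compile(r"^[A-Za-z0-9_.]+:(\d+)-(\d+)$")


def validate_input(run_input: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    mode = run_input.get("mode")
    if mode not in MODES:
        return [f"unknown mode: {mode!r}"]

    errors = []
    field = MODE_FIELD[mode]
    if not run_input.get(field):
        errors.append(f"mode {mode!r} needs a non-empty {field!r}")

    species = run_input.get("species")
    if species is not None and species not in SPECIES:
        errors.append(f"species {species!r} is not in the schema enum")

    region = run_input.get("region")
    if mode == "overlapRegion" and region:
        match = REGION_RE.match(region)
        if match is None:
            errors.append("region must look like chromosome:start-end")
        elif int(match.group(1)) > int(match.group(2)):
            errors.append("region start must not exceed region end")

    max_items = run_input.get("maxItems")
    if max_items is not None:
        # Schema bounds: integer, minimum 1, maximum 1000000.
        if type(max_items) is not int or not 1 <= max_items <= 1_000_000:
            errors.append("maxItems must be an integer from 1 to 1000000")

    return errors


if __name__ == "__main__":
    example = {
        "mode": "overlapRegion",
        "species": "homo_sapiens",
        "region": "7:140424943-140624564",
        "maxItems": 10,
    }
    print(validate_input(example))
```

Returning a list of messages instead of raising on the first problem lets a caller report every issue at once. The Actor itself may accept inputs this check rejects, or reject inputs it accepts.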

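The `runsResponseSchema` in the specification wraps the run object in a top-level `data` key. As a minimal sketch, assuming the response body follows that schema, the fields most often needed afterwards (run ID, status, default dataset ID, total cost) can be pulled out like this. `summarize_run` is a hypothetical helper, and the IDs in the sample are placeholders, not real values.

```python
import json


def summarize_run(response_body: str) -> dict:
    """Extract a few fields from a run response shaped like runsResponseSchema."""
    data = json.loads(response_body)["data"]
    return {
        "runId": data.get("id"),
        "status": data.get("status"),
        # Result items are stored in the run's default dataset.
        "datasetId": data.get("defaultDatasetId"),
        "costUsd": data.get("usageTotalUsd"),
    }


# Sample body: placeholder IDs, other values taken from the schema examples.
sample = json.dumps({
    "data": {
        "id": "RUN_ID_PLACEHOLDER",
        "status": "READY",
        "defaultDatasetId": "DATASET_ID_PLACEHOLDER",
        "usageTotalUsd": 0.00005,
    }
})

if __name__ == "__main__":
    print(summarize_run(sample))
```

Using `.get()` keeps the helper tolerant of fields that are absent from a particular response, since the schema does not mark any of them as required.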