# HPO Human Phenotype Ontology Scraper (`parseforge/hpo-phenotypes-scraper`) Actor

Search the Human Phenotype Ontology by keyword or HP ID and pull back terms with hpoId, name, definition, synonyms, parents, children, plus optional associated diseases and genes. Useful for rare disease research, clinical curation, and genomic variant annotation pipelines.

- **URL**: https://apify.com/parseforge/hpo-phenotypes-scraper.md
- **Developed by:** [ParseForge](https://apify.com/parseforge) (community)
- **Categories:** Education, Automation, Integrations
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $7.50 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price drops as your subscription plan tier rises.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
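
As a rough illustration of the per-result pricing, the sketch below estimates the cost of a run at the listed rate. It assumes the $7.50 per 1,000 results figure applies uniformly; the "from" wording and the plan discounts mentioned above mean your actual rate may differ, and the helper name `estimate_cost_usd` is hypothetical.

```python
from decimal import Decimal

# Assumption: the listed rate of $7.50 per 1,000 results applies uniformly.
# Plan-based discounts and any other billed events are not modelled here.
PRICE_PER_1000_RESULTS = Decimal("7.50")


def estimate_cost_usd(result_count: int,
                      price_per_1000: Decimal = PRICE_PER_1000_RESULTS) -> Decimal:
    """Return an approximate cost in USD for the given number of results."""
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    cost = price_per_1000 * result_count / 1000
    # Round to four decimal places so small runs do not show as zero.
    return cost.quantize(Decimal("0.0001"))


print(estimate_cost_usd(20))    # 0.1500
print(estimate_cost_usd(1000))  # 7.5000
```

Using `Decimal` rather than floats keeps the arithmetic exact for currency values.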

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

![ParseForge Banner](https://github.com/ParseForge/apify-assets/blob/ad35ccc13ddd068b9d6cba33f323962e39aed5b2/banner.jpg?raw=true)

## 🧠 HPO Phenotypes Scraper

> 🚀 **Export Human Phenotype Ontology terms in seconds. HPO ID, name, definition, synonyms, parents, children, associated diseases, and genes.**

> 🕒 **Last updated:** 2026-06-05 · **📊 10 fields** per record · Jackson Lab HPO API · 17,000+ phenotype terms · Real-time

The HPO Phenotypes Scraper turns the public Jackson Laboratory HPO API into a flat dataset of phenotype terms. Search by keyword or HPO ID, and optionally enrich each term with its associated diseases and causal genes.

The Human Phenotype Ontology is the standard vocabulary of phenotypic abnormalities used in clinical genetics worldwide.

| 🎯 Target Audience | 💡 Primary Use Cases |
|---|---|
| 🧬 Clinical geneticists | Map patient phenotypes to candidate genes |
| 🔬 Rare disease researchers | Build phenotype profiles for cohorts |
| 🩺 Diagnostic teams | Enrich HPO term picklists |
| 🤖 Bioinformaticians | Mirror HPO into local pipelines |
| 📊 Pharma R&D | Track phenotypes linked to drug targets |
| 👩‍💻 Developers | Skip the OBO file parsing |

### 📋 What the HPO Phenotypes Scraper does

- Searches the HPO term catalog by keyword or resolves a single HPO ID.
- For each term, returns name, definition, synonyms, parent and child terms.
- Optionally enriches with associated OMIM / Orphanet diseases and causal genes.
- Flattens nested arrays to delimited strings for spreadsheet imports.
- Returns clean error rows on empty searches.
- Exports to your preferred dataset format.

> 💡 **Why it matters:** Most HPO consumers download the entire OBO file, parse it themselves, and then build a search index. This Actor lets you query HPO directly with one click.

### 🎬 Full Demo

_🚧 Coming soon._

### ⚙️ Input

<table>
<tr><th>Field</th><th>Type</th><th>Required</th><th>Description</th></tr>
<tr><td><code>searchTerm</code></td><td>string</td><td>No</td><td>Keyword or HPO ID like HP:0001250.</td></tr>
<tr><td><code>maxItems</code></td><td>integer</td><td>No</td><td>Maximum number of terms to return. Free users are limited to 10; paid users can request up to 1,000,000.</td></tr>
<tr><td><code>includeAssociations</code></td><td>boolean</td><td>No</td><td>Fetch associated diseases and genes per term. Slower; defaults to <code>true</code>.</td></tr>
</table>

**Example 1, search for 'seizure'**
```json
{ "searchTerm": "seizure", "maxItems": 20, "includeAssociations": true }
````

**Example 2, lookup a single HPO ID**

```json
{ "searchTerm": "HP:0001250" }
```
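
Since `searchTerm` accepts either a keyword or an HPO ID, a client may want to tell the two apart before submitting a run. The sketch below is one way to do that, relying on the fact that HPO identifiers are `HP:` followed by seven digits. How the Actor itself distinguishes the two is not documented here, so treat `classify_search_term` as a hypothetical client-side helper.

```python
import re

# HPO identifiers have the form "HP:" plus seven digits, e.g. HP:0001250.
HPO_ID_PATTERN = re.compile(r"HP:\d{7}")


def classify_search_term(term: str) -> str:
    """Return 'hpo_id', 'keyword', or 'empty' for a candidate searchTerm."""
    cleaned = term.strip()
    if not cleaned:
        return "empty"
    # Upper-case first so that "hp:0001250" is also recognised as an ID.
    if HPO_ID_PATTERN.fullmatch(cleaned.upper()):
        return "hpo_id"
    return "keyword"


print(classify_search_term("HP:0001250"))
print(classify_search_term("short stature"))
```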

> ⚠️ **Good to Know:** Disease and gene enrichment adds one extra request per term. Disable it for fast catalog dumps.

### 📊 Output

| Field | Type | Description |
|---|---|---|
| 🆔 `hpoId` | string | HPO term identifier. |
| 🧠 `name` | string | Term name. |
| 📖 `definition` | string | Formal definition. |
| 🔄 `synonyms` | string | Exact and related synonyms. |
| ⬆️ `parents` | string | Parent term IDs. |
| ⬇️ `children` | string | Child term IDs. |
| 🩺 `associatedDiseases` | string | OMIM / Orphanet disease names. |
| 🧬 `genes` | string | Associated gene symbols. |
| 🕒 `scrapedAt` | string | Fetch timestamp. |
| ❌ `error` | string | Error message if any. |

```json
{
  "hpoId": "HP:0001250",
  "name": "Seizure",
  "definition": "A seizure is an intermittent abnormality of nervous system physiology...",
  "synonyms": "Seizures; Epileptic seizure",
  "parents": "HP:0012638",
  "children": "HP:0002197; HP:0002353",
  "associatedDiseases": "Dravet syndrome; Tuberous sclerosis",
  "genes": "SCN1A, TSC1, TSC2",
  "scrapedAt": "2026-06-05T12:00:00.000Z",
  "error": null
}
```
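
Because list-like fields arrive as delimited strings, downstream code often needs to split them back into lists. The following is a minimal sketch based only on the sample record above: it assumes semicolons separate most fields and, because the sample shows commas in `genes`, accepts both delimiters for that one field. The helper names are hypothetical.

```python
import re
from typing import Any, Dict, List, Optional

# Fields shown as semicolon-delimited strings in the sample record.
SEMICOLON_FIELDS = ("synonyms", "parents", "children", "associatedDiseases")


def split_list(value: Optional[str], pattern: str = r";") -> List[str]:
    """Split a delimited string into trimmed, non-empty parts."""
    if not value:
        return []
    return [part.strip() for part in re.split(pattern, value) if part.strip()]


def expand_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the record with delimited fields turned into lists."""
    expanded = dict(record)
    for field in SEMICOLON_FIELDS:
        expanded[field] = split_list(record.get(field))
    # Assumption: gene symbols may be comma- or semicolon-separated,
    # since the sample record uses commas for this field.
    expanded["genes"] = split_list(record.get("genes"), r"[;,]")
    return expanded


sample = {
    "hpoId": "HP:0001250",
    "synonyms": "Seizures; Epileptic seizure",
    "parents": "HP:0012638",
    "children": "HP:0002197; HP:0002353",
    "associatedDiseases": "Dravet syndrome; Tuberous sclerosis",
    "genes": "SCN1A, TSC1, TSC2",
}
print(expand_record(sample)["children"])
print(expand_record(sample)["genes"])
```

Disease names can legitimately contain commas, which is why only `genes` is split on both delimiters.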

### ✨ Why choose this Actor

| | Benefit |
|---|---|
| 🆓 | Public Jackson Lab API, no key required. |
| 🧹 | Flat rows, no OBO parsing. |
| 🎯 | Optional disease and gene enrichment per term. |
| 🔌 | One Actor, the full HPO. |
| 💾 | Push to dataset, instant export. |

### 📈 How it compares to alternatives

| Approach | Setup | Search | Enrichment |
|---|---|---|---|
| Download OBO and parse | Hours | Build your own index | Manual joins |
| HPO API + custom client | 30 min | Yes | Manual |
| **This Actor** | 5 sec | Built-in | Built-in |

### 🚀 How to use

1. Click **Try for free**.
2. Type a phenotype keyword or HPO ID.
3. Toggle disease and gene enrichment.
4. Click **Start**.

### 💼 Business use cases

**🧬 Phenotype-driven diagnosis.** Pull HPO terms with the right disease links for clinical decision support.

**📊 Cohort building.** Build phenotype profiles for rare disease cohorts.

**🤖 Curation tools.** Mirror HPO into your internal annotation UI.

**📈 Drug-target research.** Map phenotypes to target genes.

### 🔌 Automating HPO Phenotypes Scraper

- **Make / Zapier** trigger and push to Sheets or a database.
- **Cron** scheduler via Apify.
- **Webhooks** on run completion.
- **Pipe to BigQuery / Snowflake / Postgres** via integrations.

### 🌟 Beyond business use cases

**🎓 Teaching.** Walk medical students through phenotype hierarchies.

**🧪 Personal research.** Build a phenotype tracker for a rare condition.

**🤝 Open science.** Public phenotype-disease maps.

**🧰 Prototyping.** Add HPO autocomplete to a new tool in minutes.

### 🤖 Ask an AI assistant about this scraper

Paste this README into your assistant and describe your phenotype workflow.

### ❓ Frequently Asked Questions

**❓ Do I need an API key?** No.

**❓ Can I look up a single ID?** Yes, paste it in `searchTerm`.

**❓ How big is HPO?** 17,000+ terms.

**❓ What's the rate limit?** The upstream API is public, so keep request volumes reasonable.

**❓ Are arrays flattened?** Yes, semicolon-joined.

**❓ Can I skip the enrichment?** Yes, toggle off.

**❓ Can I schedule runs?** Yes.

**❓ Is this scraping?** API only.

**❓ Will the schema change?** Core fields stable.

**❓ Download format?** Any format Apify supports.

### 🔌 Integrate with any app

Apify ships native integrations with Make, Zapier, Slack, Discord, Google Drive, Google Sheets, Gmail, Airbyte, Keboola, Telegram, GitHub, and any REST or webhook endpoint.

### 🔗 Recommended Actors

| Actor | What it does |
|---|---|
| [ParseForge ClinVar Variants Scraper](https://apify.com/parseforge/clinvar-variants-scraper) | Variant interpretations. |
| [ParseForge dbSNP Variants Scraper](https://apify.com/parseforge/dbsnp-variants-scraper) | NCBI dbSNP variants. |
| [ParseForge Disease Ontology Scraper](https://apify.com/parseforge/disease-ontology-terms-scraper) | Disease Ontology terms. |
| [ParseForge NIH Reporter Grants Scraper](https://apify.com/parseforge/nih-reporter-grants-scraper) | NIH funded grants. |

> 💡 **Pro Tip:** browse the complete [ParseForge collection](https://apify.com/parseforge) for 900+ production-grade scrapers across business intelligence, real estate, e-commerce, sports, finance, and public records.

***

**Disclaimer:** This Actor scrapes only publicly available data. ParseForge is not affiliated with Jackson Laboratory or HPO. Users are responsible for complying with the target site's terms of service and applicable law. [Create a free account with $5 credit](https://console.apify.com/sign-up?fpr=vmoqkp).

# Actor input Schema

## `searchTerm` (type: `string`):

Phenotype keyword (e.g. 'seizure', 'short stature') or specific HPO ID (e.g. HP:0001250).

## `maxItems` (type: `integer`):

Free users limited to 10 items. Paid users up to 1,000,000.

## `includeAssociations` (type: `boolean`):

Fetch associated diseases and genes per term. Slower.

## Actor input object example

```json
{
  "searchTerm": "seizure",
  "maxItems": 10,
  "includeAssociations": true
}
```
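
To catch invalid input before starting a run, the constraints can be checked client-side. The sketch below is a hypothetical helper (`build_input` is not part of any Apify library); the bounds of 1 and 1,000,000 for `maxItems` and the `true` default for `includeAssociations` are taken from the OpenAPI specification at the end of this document. The free-plan limit of 10 items is not enforced here.

```python
from typing import Any, Dict, Optional

MAX_ITEMS_MIN = 1          # "minimum" in the OpenAPI input schema
MAX_ITEMS_MAX = 1_000_000  # "maximum" in the OpenAPI input schema


def build_input(search_term: Optional[str] = None,
                max_items: Optional[int] = None,
                include_associations: bool = True) -> Dict[str, Any]:
    """Build an Actor input object, validating it against the schema limits."""
    run_input: Dict[str, Any] = {"includeAssociations": bool(include_associations)}
    if search_term is not None:
        if not isinstance(search_term, str):
            raise TypeError("searchTerm must be a string")
        run_input["searchTerm"] = search_term.strip()
    if max_items is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(max_items, bool) or not isinstance(max_items, int):
            raise TypeError("maxItems must be an integer")
        if not MAX_ITEMS_MIN <= max_items <= MAX_ITEMS_MAX:
            raise ValueError("maxItems must be between 1 and 1,000,000")
        run_input["maxItems"] = max_items
    return run_input


print(build_input("seizure", max_items=10))
```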

# Actor output Schema

## `results` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "searchTerm": "seizure",
    "maxItems": 10
};

// Run the Actor and wait for it to finish
const run = await client.actor("parseforge/hpo-phenotypes-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "searchTerm": "seizure",
    "maxItems": 10,
}

# Run the Actor and wait for it to finish
run = client.actor("parseforge/hpo-phenotypes-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
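
The output schema includes an `error` field, so it can be useful to separate failed rows from usable ones after fetching the dataset. This is a minimal sketch that assumes successful rows carry `"error": null` (as in the sample record) and error rows carry a non-empty message; `partition_items` is a hypothetical helper, and the second item below is made-up illustrative data.

```python
from typing import Any, Dict, Iterable, List, Tuple

Item = Dict[str, Any]


def partition_items(items: Iterable[Item]) -> Tuple[List[Item], List[Item]]:
    """Split dataset items into (ok_rows, error_rows) using the `error` field."""
    ok_rows: List[Item] = []
    error_rows: List[Item] = []
    for item in items:
        # A missing, null, or empty `error` value is treated as success.
        if item.get("error"):
            error_rows.append(item)
        else:
            ok_rows.append(item)
    return ok_rows, error_rows


# Illustrative items only; real rows come from the run's dataset.
items = [
    {"hpoId": "HP:0001250", "name": "Seizure", "error": None},
    {"hpoId": None, "name": None, "error": "example error message"},
]
ok_rows, error_rows = partition_items(items)
print(len(ok_rows), len(error_rows))
```

With the Python client example above, the same function can be applied to `client.dataset(run["defaultDatasetId"]).iterate_items()`.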

## CLI example

```bash
echo '{
  "searchTerm": "seizure",
  "maxItems": 10
}' |
apify call parseforge/hpo-phenotypes-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=parseforge/hpo-phenotypes-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "HPO Human Phenotype Ontology Scraper",
        "description": "Search the Human Phenotype Ontology by keyword or HP ID and pull back terms with hpoId, name, definition, synonyms, parents, children, plus optional associated diseases and genes. Useful for rare disease research, clinical curation, and genomic variant annotation pipelines.",
        "version": "0.1",
        "x-build-id": "AB8Yhg0h5d7iz5WA8"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/parseforge~hpo-phenotypes-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-parseforge-hpo-phenotypes-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/parseforge~hpo-phenotypes-scraper/runs": {
            "post": {
                "operationId": "runs-sync-parseforge-hpo-phenotypes-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/parseforge~hpo-phenotypes-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-parseforge-hpo-phenotypes-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "searchTerm": {
                        "title": "Search term or HPO ID",
                        "type": "string",
                        "description": "Phenotype keyword (e.g. 'seizure', 'short stature') or specific HPO ID (e.g. HP:0001250)."
                    },
                    "maxItems": {
                        "title": "Max Items",
                        "minimum": 1,
                        "maximum": 1000000,
                        "type": "integer",
                        "description": "Free users limited to 10 items. Paid users up to 1,000,000."
                    },
                    "includeAssociations": {
                        "title": "Include disease and gene associations",
                        "type": "boolean",
                        "description": "Fetch associated diseases and genes per term. Slower.",
                        "default": true
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
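
For projects that call the REST API directly, the specification above can be turned into a plain HTTP request. The sketch below builds a POST request for the `run-sync-get-dataset-items` path using only the Python standard library; the server URL, path, and `token` query parameter come from the specification, while the function name is hypothetical. The request is only constructed here, not sent.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
RUN_SYNC_PATH = "/acts/parseforge~hpo-phenotypes-scraper/run-sync-get-dataset-items"


def build_run_sync_request(token: str, run_input: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request described in the spec."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{RUN_SYNC_PATH}?{query}",
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_run_sync_request("<YOUR_API_TOKEN>", {"searchTerm": "seizure", "maxItems": 10})
print(request.get_method(), request.full_url)

# Sending requires a real token and network access. This assumes the
# response body is JSON, which the specification above does not state:
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)
```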
