EU CORDIS API — Horizon Europe & H2020 Scraper
Pricing
Pay per event
EU CORDIS API scraper for Horizon Europe and Horizon 2020 grant projects. Exports funding amounts, coordinator, participants, programme, dates, and objectives as structured JSON. Four input modes, 20 fields per row.
Developer
DevilScrapes
We do the dirty work so your dataset stays clean. 😈
$3.05 / 1,000 rows — pay only for results that land. No credit card required to try.
Export structured EU grant project records from the CORDIS database covering Horizon Europe (HORIZON) and Horizon 2020 (H2020) frameworks. Four input modes: direct project-ID lookup, free-text search, programme-code filter, and coordinator-country filter. Every row carries funding amounts, coordinator, participants, programme hierarchy, dates, status, and the project objective.
CORDIS (Community Research and Development Information Service) is the EU's primary source for results of EU-funded research projects since FP1. Horizon Europe (2021–2027, ~€95 billion budget) and Horizon 2020 (2014–2020, ~€80 billion, closed) together expose 57,000+ project records. We turn the awkward CORDIS web search into a clean JSON dataset for grant writers, research institutions, science-policy analysts, and EU innovation consultants. The only existing Apify Actor for CORDIS is marked DEPRECATED — there is zero active competition on the Store.
🎯 What this scrapes
One ResultRow per project. Every row carries the same 20 columns regardless of which input mode you used. Data is published by the European Commission under CC-BY 4.0 (EU Open Data policy).
| Field | Type | Description |
|---|---|---|
| `project_id` | `string` | CORDIS numeric project ID |
| `project_acronym` | `string \| null` | Short acronym (e.g. Photo2Fuel) |
| `project_title` | `string` | Full project title |
| `project_url` | `string` | Public CORDIS project page URL |
| `programme` | `string \| null` | Primary programme code (e.g. HORIZON.2.5) |
| `programme_display_name` | `string \| null` | Programme title (e.g. Climate, Energy and Mobility) |
| `call_id` | `string \| null` | Master call identifier (e.g. HORIZON-CL5-2021-D2-01) |
| `funding_scheme` | `string \| null` | Funding scheme code (e.g. HORIZON-RIA, ERC-COG) |
| `start_date` | `string \| null` | Project start date (YYYY-MM-DD) |
| `end_date` | `string \| null` | Project end date (YYYY-MM-DD) |
| `total_cost_eur` | `number \| null` | Total project cost in EUR |
| `eu_contribution_eur` | `number \| null` | EU contribution (grant amount) in EUR |
| `status` | `string \| null` | SIGNED, CLOSED, or TERMINATED |
| `coordinator_organization` | `string \| null` | Legal name of the coordinator organisation |
| `coordinator_country` | `string \| null` | Coordinator country (ISO 3166-1 alpha-2) |
| `participating_organizations` | `string[]` | Legal names of all participating organisations |
| `participating_countries` | `string[]` | Deduped ISO 3166-1 alpha-2 codes of participants |
| `objective_summary` | `string \| null` | Project objective text (truncated unless `includeFullAbstract=true`) |
| `keywords` | `string[]` | Project keywords parsed from the comma-separated CORDIS string |
| `scraped_at` | `string` | ISO 8601 UTC datetime this row was written |
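The two array fields derived from raw strings (`keywords` and `participating_countries`) can be pictured with a minimal sketch like the one below. The function names are hypothetical, and the sorted ordering of country codes is an assumption made for illustration; the Actor's actual parser may differ.

```python
from __future__ import annotations


def parse_keywords(raw: str | None) -> list[str]:
    """Split a comma-separated keyword string into trimmed, non-empty items."""
    if not raw:
        return []
    return [part.strip() for part in raw.split(",") if part.strip()]


def dedupe_countries(codes: list[str]) -> list[str]:
    """Upper-case and de-duplicate ISO 3166-1 alpha-2 codes.

    Sorting is an assumption of this sketch, chosen so the output is stable.
    """
    return sorted({code.strip().upper() for code in codes if code and code.strip()})
```

For example, `parse_keywords("solar energy, bacteria,archaea")` yields three trimmed keywords, and `dedupe_countries(["es", "DE", "ES"])` collapses to two codes.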
🔥 Features
- Four input modes in one Actor: `projectIds`, `searchQuery`, `programmeFilter`, `countryFilter`. A Pydantic XOR validator enforces exactly one mode before any network call.
- Horizon Europe and Horizon 2020 are both supported via the `framework` switch (`HORIZON`, `H2020`, or `ANY`).
- Coordinator-country filter implemented as a page scan with in-process post-filtering. CORDIS's search API does not expose coordinator country as a query field, so the Actor scans pages and emits only matching rows.
- Single-hit anomaly handled: CORDIS returns `hits.hit` as a dict (not a list) when only one project matches; the parser's `_ensure_list()` normalises every nested array so the same code path handles both shapes.
- Programme primary selection respects `@attributes.uniqueProgrammePart=true`, so multi-programme projects expose the correct top-level programme code, not the first nested one.
- Optional full-abstract mode: `includeFullAbstract=true` deep-fetches the project detail HTML page via parsel and replaces the truncated search-API objective with the full text; charged as `result-row-detailed` ($0.005) instead of `result-row` ($0.003).
- Pydantic v2 input + output models: `ResultRow.status` is validated against the live enum (`SIGNED`, `CLOSED`, `TERMINATED`); `country_filter` is normalised to upper-case at validation.
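The single-hit normalisation described above can be sketched as follows. This is an illustration only: the helper here is a stand-in for the Actor's `_ensure_list()`, and the payload shape (`{"hits": {"hit": ...}}`) is assumed from the `hits.hit` path named in the feature list.

```python
from __future__ import annotations

from typing import Any


def ensure_list(value: Any) -> list[Any]:
    """Return `value` as a list: None -> [], list -> unchanged, anything else -> [value]."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


def extract_hits(payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Read `hits.hit` from a search response, whether it is a dict or a list."""
    hits = payload.get("hits") or {}
    return ensure_list(hits.get("hit"))
```

With this in place, a one-project response and a many-project response go through the same loop downstream.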
What we handle for you
CORDIS's public endpoints can rate-limit and shape traffic in ways that break naive scrapers. We absorb every failure mode before it touches your dataset:
- Browser fingerprint rotation: `curl-cffi` replays real-browser TLS handshakes (Chrome 131 impersonation) so the target sees a genuine browser client, not Python. Profiles rotate across requests on any target that fingerprints clients.
- Residential proxy rotation: Apify Proxy routes requests through fresh exit IPs on every block. A `4xx` or `5xx` response invalidates the current session; we request a new `session_id` and retry automatically.
- Exponential backoff with `Retry-After` honoured: `429` and `503` responses trigger retries with 2 s base, doubling up to 30 s, max 5 attempts. We never hammer the endpoint; partial successes surface with a clear status message rather than silently returning an empty dataset.
- Rate-limit pacing: page scanning for country-filter mode is paced so bursts don't trigger throttling mid-run.
- Clean typed rows: Pydantic-validated output, ISO-8601 timestamps, stable IDs. No raw dicts, no silent nulls.
- Pay only for results that land: PPE pricing means no charge for rows that never emit. Only the small `actor-start` warm-up fee applies if a run returns zero rows.
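The backoff schedule above (2 s base, doubling, capped at 30 s) can be sketched as a pure function. This is not the Actor's code: the function name is hypothetical, `attempt` is assumed to be 1-based, and only the delta-seconds form of `Retry-After` is handled (an HTTP-date value falls back to the computed delay).

```python
from __future__ import annotations


def backoff_delay(
    attempt: int,
    retry_after: str | None = None,
    base: float = 2.0,
    cap: float = 30.0,
) -> float:
    """Seconds to wait before retry number `attempt` (1-based).

    A numeric Retry-After header takes priority; otherwise the delay is
    base * 2^(attempt - 1), capped at `cap`.
    """
    if retry_after is not None:
        try:
            return max(float(retry_after), 0.0)
        except ValueError:
            pass  # HTTP-date form is not parsed in this sketch
    return min(base * 2 ** (attempt - 1), cap)
```

Under these assumptions the computed delays for attempts 1 to 5 are 2, 4, 8, 16, and 30 seconds.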
💡 Use cases
- Grant writer competitive intelligence: pull all `HORIZON.2.5` (Climate, Energy and Mobility) projects from the last 12 months to map who is winning which calls in your topic area.
- Research-institution portfolio reporting: filter by `countryFilter=DE` (or your country) to enumerate every Horizon Europe project coordinated nationally, with funding totals.
- Science-policy analysis: bulk-export H2020 vs HORIZON funding distributions across funding schemes (RIA / IA / CSA / ERC) for a programme retrospective.
- EU innovation consulting: feed a list of `projectIds` (e.g. from a client's reference list) and return clean structured records for proposal due diligence.
- University tech-transfer offices: track CLOSED projects in their field where commercial follow-on opportunities (IP licensing, spinout candidates) may have emerged.
- Journalists & think-tanks: measure the country, programme, and consortium composition across an entire framework programme using the Horizon Europe project data returned per row.
- B2B SaaS prospecting: use `coordinator_organization` + `coordinator_country` to build a list of EU research-project leaders as a prospecting signal for tools that sell into that audience.
⚙️ How to use it
- Open the Actor input form on the Apify Console.
- Pick exactly one input mode:
  - Project IDs: supply a list of `projectIds` (e.g. `["101069357"]`) for direct lookup.
  - Search query: set `searchQuery` to a free-text query (e.g. `"quantum computing"`).
  - Programme filter: set `programmeFilter` to a programme code (e.g. `"HORIZON.2.5"` or `"H2020-EU.1.1."`).
  - Country filter: set `countryFilter` to an ISO 3166-1 alpha-2 code (e.g. `"DE"`, `"ES"`, `"NL"`).
- Pick a `framework`: `HORIZON` (default), `H2020`, or `ANY`. Ignored in `projectIds` mode.
- Set `maxProjects` (1–5000) to cap the dataset size in list modes. Ignored in `projectIds` mode.
- Toggle `includeFullAbstract` on if you need the full untruncated objective text. This costs one extra request per row and switches PPE to `result-row-detailed`.
- Toggle `useProxy` on to route through Apify residential proxies. Recommended for large runs or if you observe `429` responses.
- Click Start. Results stream into the default dataset and are available as JSON, CSV, Excel, or XML.
Single project lookup:
`{"projectIds": ["101069357"]}`
Free-text search, capped at 50 rows:
`{"searchQuery": "quantum computing", "framework": "HORIZON", "maxProjects": 50}`
All German-coordinated HORIZON projects (post-filter):
`{"countryFilter": "DE", "framework": "HORIZON", "maxProjects": 500}`
📥 Input
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `projectIds` | `string[]` | XOR | (none) | Explicit CORDIS project IDs to fetch. |
| `searchQuery` | `string` | XOR | (none) | Free-text search query (1–500 chars). |
| `programmeFilter` | `string` | XOR | (none) | Programme code (e.g. HORIZON.2.5). |
| `countryFilter` | `string` | XOR | (none) | ISO 3166-1 alpha-2 of coordinator country (e.g. DE). |
| `framework` | enum | no | `HORIZON` | `HORIZON`, `H2020`, or `ANY`. |
| `maxProjects` | integer | no | 100 | Max ResultRows emitted (1–5000). |
| `includeFullAbstract` | boolean | no | `false` | Deep-fetch full objective from detail page. |
| `useProxy` | boolean | no | `false` | Route via Apify Proxy residential IPs. |
Exactly one of `projectIds`, `searchQuery`, `programmeFilter`, or `countryFilter` must be provided. Passing none of them, or more than one, raises a validation error before any network call.
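The exactly-one rule can be illustrated with a small standard-library sketch. The Actor itself uses a Pydantic validator; the function below is a hypothetical stand-in that shows the same check, treating `None`, empty strings, and empty lists as "not provided" (an assumption of this sketch).

```python
from __future__ import annotations

MODE_FIELDS = ("projectIds", "searchQuery", "programmeFilter", "countryFilter")


def select_mode(actor_input: dict) -> str:
    """Return the single input mode that was provided, or raise ValueError."""
    provided = [
        name for name in MODE_FIELDS
        if actor_input.get(name) not in (None, "", [])
    ]
    if len(provided) != 1:
        raise ValueError(
            f"Exactly one of {', '.join(MODE_FIELDS)} is required; "
            f"got {len(provided)}: {provided}"
        )
    return provided[0]
```

Running the check before any request means a malformed input fails fast, without consuming page fetches.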
📤 Output
One row per project, pushed to the default dataset and available as JSON, CSV, Excel, or XML.
{"project_id": "101069357","project_acronym": "Photo2Fuel","project_title": "Artificial PHOTOsynthesis to produce FUELs and chemicals","project_url": "https://cordis.europa.eu/project/id/101069357/en","programme": "HORIZON.2.5","programme_display_name": "Climate, Energy and Mobility","call_id": "HORIZON-CL5-2021-D2-01","funding_scheme": "HORIZON-RIA","start_date": "2022-09-01","end_date": "2025-08-31","total_cost_eur": 2493171.25,"eu_contribution_eur": 2493171.0,"status": "SIGNED","coordinator_organization": "IDENER RESEARCH & DEVELOPMENT AIE","coordinator_country": "ES","participating_organizations": ["FUNDACION TECNALIA RESEARCH & INNOVATION", "UPPSALA UNIVERSITET"],"participating_countries": ["DE", "ES", "SE"],"objective_summary": "The Photo2Fuel project will develop a breakthrough technology that converts CO2 into useful fuels and chemicals...","keywords": ["solar energy", "bacteria", "archaea", "solar fuels", "CO2 reduction"],"scraped_at": "2026-05-16T12:00:00.000000+00:00"}
Export formats
- JSON: full fidelity, all 20 fields, newline-delimited
- CSV: flat, one row per project (array fields joined)
- Excel: `.xlsx` via the Apify dataset converter
- XML: structured per-item
All formats are available via the Apify dataset API: GET /datasets/{id}/items?format=csv&clean=true.
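As a rough sketch, the export URL for that endpoint can be built like this. The base URL `https://api.apify.com/v2` is assumed here, the dataset ID in the usage note is a placeholder, and a private dataset may also need an API token, which this sketch omits.

```python
from urllib.parse import urlencode

# Assumed base URL of the Apify API; check the Apify API reference before relying on it.
API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id: str, fmt: str = "csv", clean: bool = True) -> str:
    """Build the dataset items export URL for a given format."""
    query = urlencode({"format": fmt, "clean": "true" if clean else "false"})
    return f"{API_BASE}/datasets/{dataset_id}/items?{query}"
```

Calling `dataset_items_url("YOUR_DATASET_ID", fmt="json")` gives a URL that can be passed to any HTTP client to download the run's rows.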
💰 Pricing
Pay-Per-Event (PPE) — you pay only for what you use:
| Event | Price (USD) | When |
|---|---|---|
| `actor-start` | $0.05 | Once per run, at boot |
| `result-row` | $0.003 | Per project row written when `includeFullAbstract=false` |
| `result-row-detailed` | $0.005 | Per project row written when `includeFullAbstract=true` |
Example costs
| Run | Rows | Mode | Cost |
|---|---|---|---|
| 1 project lookup | 1 | standard | $0.053 |
| 50 search results | 50 | standard | $0.20 |
| 500 programme rows | 500 | standard | $1.55 |
| 1,000 country-filtered rows | 1,000 | standard | $3.05 |
| 1,000 rows, full abstracts | 1,000 | detailed | $5.05 |
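The example costs follow from one start fee plus a per-row charge. A minimal sketch of that arithmetic, using the event prices from the pricing table (the function and constant names are illustrative only):

```python
from decimal import Decimal

ACTOR_START = Decimal("0.05")
ROW_PRICE = {
    "standard": Decimal("0.003"),  # result-row
    "detailed": Decimal("0.005"),  # result-row-detailed
}


def run_cost(rows: int, mode: str = "standard") -> Decimal:
    """Estimated USD cost of one run: start fee plus per-row events."""
    return ACTOR_START + ROW_PRICE[mode] * rows
```

`Decimal` is used instead of `float` so that fractions of a cent add up exactly.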
At scale the per-row charge dominates: ~$3.05 per 1,000 rows in standard mode, ~$5.05 per 1,000 rows in detailed mode. Comparable NIH / NSF grant Actors on Apify run $1–3 per 1,000 records. EU grant records carry more value per row — larger deal sizes (median Horizon Europe grant ~€2–3M vs ~$500k for a typical NIH R01) and richer multi-organisation metadata with consortium members across multiple countries.
🚧 Limitations
- HORIZON and H2020 only. FP7 and earlier framework programmes are not indexed in the current CORDIS search index; this Actor returns zero rows for queries against FP7 or earlier programmes.
- No deliverables, publications, or result documents. These are available on the project detail page but not in the search API; out of scope for this Actor.
- No organisation-level enrichment. Legal address, VAT ID, and org type category are out of scope; use a dedicated organisation enrichment Actor.
- Country filter is post-filter, not server-side. CORDIS's search API does not expose coordinator country as a query field. Country-filter mode scans pages and keeps only rows where `coordinator_country` matches. To collect `maxProjects=500` German projects, the Actor may scan 1,500+ projects total. The emitted row count is always ≤ `maxProjects`.
- Status filter is not a user input. The search API's status field path does not reliably filter server-side, so status is emitted on every output row and you can filter the dataset client-side after the run.
- Page size is fixed at 50. CORDIS's `num=100` parameter returns only 10 results (verified 2026-05-16), so the Actor uses `num=50` everywhere as the effective practical maximum.
- 7-day default storage retention on the Apify FREE tier. Export your dataset immediately after the run or upgrade for longer retention.
- CORDIS data is CC-BY 4.0. Attribution to the European Commission's CORDIS service is required when republishing.
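The country post-filter limitation can be pictured as a scan over result pages that stops once enough matching rows have been emitted. The sketch below is an assumption-laden illustration, not the Actor's implementation: `pages` stands in for whatever yields parsed result pages, and the function name is hypothetical.

```python
from __future__ import annotations

from typing import Iterable, Iterator


def filter_by_coordinator_country(
    pages: Iterable[list[dict]],
    country: str,
    max_projects: int,
) -> Iterator[dict]:
    """Yield rows whose coordinator_country matches, up to max_projects rows."""
    wanted = country.strip().upper()
    if max_projects <= 0:
        return
    emitted = 0
    for page in pages:
        for row in page:
            if row.get("coordinator_country") == wanted:
                yield row
                emitted += 1
                if emitted >= max_projects:
                    return  # stop scanning as soon as the cap is reached
```

Because non-matching rows are scanned but never emitted, the number of pages fetched can be several times larger than the number of rows returned.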
❓ FAQ
What is the EU CORDIS API and what does this Actor do with it?
CORDIS exposes a public JSON search endpoint at https://cordis.europa.eu/search/en?format=json. This Actor wraps that endpoint with four input modes (project ID lookup, free-text search, programme filter, coordinator-country filter), handles pagination, normalises the raw API response into clean typed rows, and streams results into an Apify dataset in JSON, CSV, Excel, or XML. No registration or API key is required.
Does this cover Horizon Europe and Horizon 2020?
Yes — both frameworks are supported via the framework input field (HORIZON, H2020, or ANY). The CORDIS search index covers Horizon Europe (2021–2027) and H2020 (2014–2020). FP7 and earlier frameworks are not indexed and will return zero results.
Why is the objective_summary cut off?
The CORDIS search API returns the objective text truncated to ~2,000 characters in search results. Set includeFullAbstract=true to make the Actor follow up with a per-project GET to the detail page (/project/id/{id}/en) and extract the untruncated objective via a CSS selector. This switches the PPE event from result-row ($0.003) to result-row-detailed ($0.005) per row.
Why is country-filter mode slow for large result sets?
CORDIS's search API does not expose coordinator country as a query parameter — coordinator/country=DE returns zero results (verified 2026-05-16). We implement country filtering by scanning all-framework results and retaining only matching rows. To collect maxProjects=500 German-coordinated projects, the Actor may scan 1,500+ projects. The emitted dataset row count is always ≤ maxProjects.
Can I filter by project status (SIGNED / CLOSED / TERMINATED)?
Not as a direct input — the CORDIS search API status field path does not produce reliable server-side filtered results. Status is emitted on every output row, so you can filter the dataset client-side after the run.
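A client-side status filter over exported rows might look like the sketch below. The function names are hypothetical, and the rows are assumed to be the dicts from a JSON export of the dataset.

```python
from __future__ import annotations

from collections import Counter

VALID_STATUSES = {"SIGNED", "CLOSED", "TERMINATED"}


def filter_by_status(rows: list[dict], wanted: list[str]) -> list[dict]:
    """Keep only rows whose status is one of `wanted` (case-insensitive input)."""
    wanted_set = {status.upper() for status in wanted}
    unknown = wanted_set - VALID_STATUSES
    if unknown:
        raise ValueError(f"Unknown status values: {sorted(unknown)}")
    return [row for row in rows if row.get("status") in wanted_set]


def status_counts(rows: list[dict]) -> Counter:
    """Count rows per status; rows with a null status are counted under None."""
    return Counter(row.get("status") for row in rows)
```

Checking `status_counts(rows)` first is a quick way to see how many rows a filter would keep.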
Are there alternative Horizon Europe grant scrapers?
As of 2026, the only other Apify Actor for CORDIS data is marked DEPRECATED. This Actor is the only actively maintained option on the Store — four input modes, 20 structured fields per row, and PPE pricing so you pay only for rows that land.
Are CORDIS results free to redistribute?
Yes, under CC-BY 4.0 (EU Open Data policy). Attribution to the European Commission's CORDIS service is required when republishing.
How much Horizon Europe project data can I pull per run?
The maxProjects input caps each run at 1–5,000 rows. For programme-filter and search-query modes, results are paginated at 50 per page, so a 5,000-row run requests about 100 API pages. Country-filter mode may scan significantly more pages to collect matching rows (see the Limitations section above).
💬 Your feedback
Found a bug, hit a rate limit, or need a new field on the output row? Open an issue on the Actor's Apify Store page or contact the Devil Scrapes team at apify.com/DevilScrapes. We ship updates within days of validated reports.
Part of the Devil Scrapes Research Intelligence Suite — arXiv Papers Scraper, PubMed Papers Scraper, Hugging Face Hub Scraper. All suite Actors share consistent PPE pricing and ISO 8601 UTC scraped_at timestamps for clean cross-source joins.