EU CORDIS Grants Scraper
Pricing
Pay per event
Scrape EU CORDIS — the European Commission's R&D project database. Get grants, participants, funding amounts, topics, and project IDs as typed dataset rows. We handle pagination, retries, and rate-limit pacing across CORDIS's federated endpoints.
Developer
DevilScrapes
Maintained by Community
EU CORDIS Grants Scraper
We do the dirty work so your dataset stays clean. 😈
$3.05 / 1,000 rows — Export structured EU grant project records from the CORDIS database covering Horizon Europe (HORIZON) and Horizon 2020 (H2020) frameworks. Four input modes: direct project-ID lookup, free-text search, programme-code filter, and coordinator-country filter. Every row carries funding amounts, coordinator, participants, programme hierarchy, dates, status, and the project objective. Public CORDIS search API, no API key, no login, no browser automation.
CORDIS (Community Research and Development Information Service) is the EU's primary source for results of EU-funded research projects since FP1. The Horizon Europe (2021-2027, ~95 billion EUR budget) and Horizon 2020 (2014-2020, ~80 billion EUR closed) frameworks together expose 57,000+ project records. This Actor turns the awkward CORDIS web search into a clean JSON dataset for grant writers, research institutions, science-policy analysts, and EU innovation consultants. The only existing Apify Actor for CORDIS is marked DEPRECATED — there is zero active competition on the Store.
🎯 What this scrapes
One ResultRow per project. Every row carries the same 20 columns regardless of which input mode you used. Data is published by the European Commission under CC-BY 4.0 (EU Open Data policy).
| Field | Type | Description |
|---|---|---|
| `project_id` | `string` | CORDIS numeric project ID |
| `project_acronym` | `string \| null` | Short acronym (e.g. Photo2Fuel) |
| `project_title` | `string` | Full project title |
| `project_url` | `string` | Public CORDIS project page URL |
| `programme` | `string \| null` | Primary programme code (e.g. HORIZON.2.5) |
| `programme_display_name` | `string \| null` | Programme title (e.g. Climate, Energy and Mobility) |
| `call_id` | `string \| null` | Master call identifier (e.g. HORIZON-CL5-2021-D2-01) |
| `funding_scheme` | `string \| null` | Funding scheme code (e.g. HORIZON-RIA, ERC-COG) |
| `start_date` | `string \| null` | Project start date (YYYY-MM-DD) |
| `end_date` | `string \| null` | Project end date (YYYY-MM-DD) |
| `total_cost_eur` | `number \| null` | Total project cost in EUR |
| `eu_contribution_eur` | `number \| null` | EU contribution (grant amount) in EUR |
| `status` | `string \| null` | SIGNED, CLOSED, or TERMINATED |
| `coordinator_organization` | `string \| null` | Legal name of the coordinator organisation |
| `coordinator_country` | `string \| null` | Coordinator country (ISO 3166-1 alpha-2) |
| `participating_organizations` | `string[]` | Legal names of all participating organisations |
| `participating_countries` | `string[]` | Deduped ISO 3166-1 alpha-2 codes of participants |
| `objective_summary` | `string \| null` | Project objective text (truncated unless `includeFullAbstract=true`) |
| `keywords` | `string[]` | Project keywords parsed from the comma-separated CORDIS string |
| `scraped_at` | `string` | ISO 8601 UTC datetime this row was written |
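The two array fields derived from raw CORDIS strings (`keywords` and `participating_countries`) can be illustrated with a minimal sketch. The helper names `parse_keywords` and `dedupe_countries` are hypothetical, and sorting the country codes is an assumption based on the ordering seen in the sample output; the Actor's real parser may differ.

```python
def parse_keywords(raw):
    """Split a comma-separated keyword string into a list of trimmed keywords."""
    if not raw:
        return []
    return [part.strip() for part in raw.split(",") if part.strip()]


def dedupe_countries(codes):
    """Upper-case, dedupe, and sort country codes (sorting is an assumption)."""
    return sorted({code.strip().upper() for code in codes if code and code.strip()})


keywords = parse_keywords("solar energy, bacteria, archaea, solar fuels, CO2 reduction")
countries = dedupe_countries(["ES", "es", "SE", "DE", "SE"])
```

Returning an empty list rather than `null` for a missing keyword string matches the `string[]` type in the table above.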
🔥 Features
- Four input modes in one Actor: `projectIds`, `searchQuery`, `programmeFilter`, `countryFilter`. A Pydantic XOR validator enforces exactly one mode before any network call.
- Horizon Europe and Horizon 2020 are both supported via the `framework` switch (`HORIZON`, `H2020`, or `ANY`).
- Coordinator-country filter implemented as a paged scan with in-process post-filtering: CORDIS's search API does not expose coordinator country as a query field, so the Actor scans result pages and emits only matching rows.
- Single-hit anomaly handled: CORDIS returns `hits.hit` as a dict (not a list) when only one project matches; the parser's `_ensure_list()` normalises every nested array so the same code path handles both shapes.
- Programme primary selection respects `@attributes.uniqueProgrammePart=true`: multi-programme projects expose the correct top-level programme code, not the first nested one.
- Optional full-abstract mode: `includeFullAbstract=true` deep-fetches the project detail HTML page via parsel and replaces the truncated search-API objective with the full text; charged as `result-row-detailed` ($0.005) instead of `result-row` ($0.003).
- Exponential backoff with `Retry-After` honoured for `429` and `503` responses; max 5 attempts.
- `curl-cffi` with Chrome 131 TLS impersonation (ADR-0002 house default): robust against any future CORDIS rate-limit tightening even though the endpoint is unauthenticated today.
- Apify Proxy support via the `BUYPROXIES94952` group (opt-in via `useProxy`).
- Pydantic v2 input + output models: `ResultRow.status` is validated against the live enum (`SIGNED`, `CLOSED`, `TERMINATED`); `country_filter` is normalised to upper-case at validation.
💡 Use cases
- Grant writer competitive intelligence: pull all `HORIZON.2.5` (Climate, Energy and Mobility) projects from the last 12 months to map who is winning which calls in your topic area.
- Research-institution portfolio reporting: filter by `countryFilter=DE` (or your country) to enumerate every Horizon Europe project coordinated nationally, with funding totals.
- Science-policy analysis: bulk-export H2020 vs HORIZON funding distributions across funding schemes (RIA / IA / CSA / ERC) for a programme retrospective.
- EU innovation consulting: feed a list of `projectIds` (e.g. from a client's reference list) and return clean structured records for proposal due diligence.
- University tech-transfer offices: track CLOSED projects in their field where commercial follow-on opportunities (IP licensing, spinout candidates) may have emerged.
- Journalists & think-tanks: measure country and organisation representation among coordinators across an entire framework programme.
⚙️ How to use it
- Open the Actor input form on the Apify Console.
- Pick exactly one input mode:
  - Project IDs: supply a list of `projectIds` (e.g. `["101069357"]`) for direct lookup.
  - Search query: set `searchQuery` to a free-text query (e.g. `"quantum computing"`).
  - Programme filter: set `programmeFilter` to a programme code (e.g. `"HORIZON.2.5"` or `"H2020-EU.1.1."`).
  - Country filter: set `countryFilter` to an ISO 3166-1 alpha-2 code (e.g. `"DE"`, `"ES"`, `"NL"`).
- Pick a `framework`: `HORIZON` (default), `H2020`, or `ANY`. Ignored in `projectIds` mode.
- Set `maxProjects` (1-5000) to cap the dataset size in list modes. Ignored in `projectIds` mode.
- Toggle `includeFullAbstract` on if you need the full untruncated objective text. This costs one extra request per row and switches PPE to `result-row-detailed`.
- Toggle `useProxy` on if CORDIS starts returning `429`. Default is off, since CORDIS does not currently rate-limit datacenter IPs.
- Click Start. Results stream into the default dataset as JSON / CSV / Excel / XML.
Single project lookup

    {"projectIds": ["101069357"]}

Free-text search, capped at 50 rows

    {"searchQuery": "quantum computing", "framework": "HORIZON", "maxProjects": 50}

All German-coordinated HORIZON projects (post-filter)

    {"countryFilter": "DE", "framework": "HORIZON", "maxProjects": 500}
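The same inputs can also be submitted programmatically instead of through the Console form. The sketch below only builds the HTTP request and does not send it. It assumes the standard Apify API v2 run endpoint (`POST /v2/acts/{actorId}/runs`) with a bearer token; the Actor ID and token values are placeholders, not this Actor's real identifiers.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"  # assumed standard Apify API base URL


def build_run_request(actor_id: str, token: str, actor_input: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that would start an Actor run."""
    return urllib.request.Request(
        url=f"{API_BASE}/acts/{actor_id}/runs",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_run_request(
    actor_id="your-username~your-actor-name",  # placeholder Actor ID
    token="YOUR_APIFY_TOKEN",  # placeholder token
    actor_input={"searchQuery": "quantum computing", "framework": "HORIZON", "maxProjects": 50},
)
# Passing `request` to urllib.request.urlopen() would start the run; omitted here.
```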
📥 Input
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `projectIds` | `string[]` | XOR | none | Explicit CORDIS project IDs to fetch. |
| `searchQuery` | `string` | XOR | none | Free-text search query (1-500 chars). |
| `programmeFilter` | `string` | XOR | none | Programme code (e.g. HORIZON.2.5). |
| `countryFilter` | `string` | XOR | none | ISO 3166-1 alpha-2 of coordinator country (e.g. DE). |
| `framework` | `enum` | no | `HORIZON` | `HORIZON`, `H2020`, or `ANY`. |
| `maxProjects` | `integer` | no | `100` | Max ResultRows emitted (1-5000). |
| `includeFullAbstract` | `boolean` | no | `false` | Deep-fetch full objective from detail page. |
| `useProxy` | `boolean` | no | `false` | Route via Apify Proxy (`BUYPROXIES94952`). |
Exactly one of projectIds, searchQuery, programmeFilter, or countryFilter must be provided. Passing zero or two+ raises a validation error before any network call.
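The exactly-one rule can be expressed in a few lines. The Actor itself uses a Pydantic validator; the version below is a plain-Python sketch of the same check, with `select_mode` as a hypothetical helper name and the assumption that empty strings and empty lists count as "not provided".

```python
MODE_FIELDS = ("projectIds", "searchQuery", "programmeFilter", "countryFilter")


def select_mode(actor_input: dict) -> str:
    """Return the single input mode that is set, or raise ValueError."""
    # Assumption: falsy values (None, "", []) are treated as not provided.
    provided = [name for name in MODE_FIELDS if actor_input.get(name)]
    if len(provided) != 1:
        raise ValueError(
            f"Exactly one of {', '.join(MODE_FIELDS)} must be provided; "
            f"got {len(provided)}: {provided}"
        )
    return provided[0]


mode = select_mode({"searchQuery": "quantum computing", "framework": "HORIZON"})
```

Running the check before any request is made keeps an invalid input from consuming any `result-row` charges.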
📤 Output
One row per project, pushed to the default dataset and available as JSON, CSV, Excel, or XML.

    {
      "project_id": "101069357",
      "project_acronym": "Photo2Fuel",
      "project_title": "Artificial PHOTOsynthesis to produce FUELs and chemicals",
      "project_url": "https://cordis.europa.eu/project/id/101069357/en",
      "programme": "HORIZON.2.5",
      "programme_display_name": "Climate, Energy and Mobility",
      "call_id": "HORIZON-CL5-2021-D2-01",
      "funding_scheme": "HORIZON-RIA",
      "start_date": "2022-09-01",
      "end_date": "2025-08-31",
      "total_cost_eur": 2493171.25,
      "eu_contribution_eur": 2493171.0,
      "status": "SIGNED",
      "coordinator_organization": "IDENER RESEARCH & DEVELOPMENT AIE",
      "coordinator_country": "ES",
      "participating_organizations": ["FUNDACION TECNALIA RESEARCH & INNOVATION", "UPPSALA UNIVERSITET"],
      "participating_countries": ["DE", "ES", "SE"],
      "objective_summary": "The Photo2Fuel project will develop a breakthrough technology that converts CO2 into useful fuels and chemicals...",
      "keywords": ["solar energy", "bacteria", "archaea", "solar fuels", "CO2 reduction"],
      "scraped_at": "2026-05-16T12:00:00.000000+00:00"
    }

Export formats
- JSON: full fidelity, all 20 fields, newline-delimited
- CSV: flat, one row per project (array fields joined)
- Excel: `.xlsx` via the Apify dataset converter
- XML: structured per-item
All formats are available via the Apify API: GET /datasets/{id}/items?format=csv&clean=true.
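As a rough illustration, the export URL for a given dataset can be assembled like this. The base URL `https://api.apify.com/v2` is assumed to be the standard Apify API root, `dataset_items_url` is a hypothetical helper, and the dataset ID is a placeholder; non-public datasets would also need an API token, which is left out here.

```python
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"  # assumed standard Apify API base URL


def dataset_items_url(dataset_id: str, fmt: str = "csv", clean: bool = True) -> str:
    """Build the dataset items export URL for the requested format."""
    query = urlencode({"format": fmt, "clean": "true" if clean else "false"})
    return f"{API_BASE}/datasets/{dataset_id}/items?{query}"


csv_url = dataset_items_url("YOUR_DATASET_ID")  # placeholder dataset ID
xlsx_url = dataset_items_url("YOUR_DATASET_ID", fmt="xlsx")
```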
💰 Pricing
Pay-Per-Event (PPE) — you pay only for what you use:
| Event | Price (USD) | When |
|---|---|---|
| `actor-start` | $0.05 | Once per run, at boot |
| `result-row` | $0.003 | Per project row written when `includeFullAbstract=false` |
| `result-row-detailed` | $0.005 | Per project row written when `includeFullAbstract=true` |
Example costs
| Run | Rows | Mode | Cost |
|---|---|---|---|
| 1 project lookup | 1 | standard | $0.053 |
| 50 search results | 50 | standard | $0.20 |
| 500 programme rows | 500 | standard | $1.55 |
| 1,000 country-filtered rows | 1,000 | standard | $3.05 |
| 1,000 rows, full abstracts | 1,000 | detailed | $5.05 |
At scale the per-row charge dominates: ~$3.05 per 1,000 rows in standard mode, ~$5.05 per 1,000 rows in detailed mode. Comparable NIH / NSF grant Actors run $1-3 per 1,000 records — EU grants are higher value per record due to larger deal sizes (median Horizon Europe grant ~2-3M EUR vs ~500k USD for typical NIH R01) and richer multi-organisation metadata.
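The example costs follow directly from the event prices: one `actor-start` charge plus the per-row price times the number of rows. A small estimator, using the prices from the table above (the function name `run_cost` is illustrative), makes it easy to budget a run in advance:

```python
from decimal import Decimal

ACTOR_START = Decimal("0.05")
ROW_PRICE = {
    "standard": Decimal("0.003"),  # result-row
    "detailed": Decimal("0.005"),  # result-row-detailed
}


def run_cost(rows: int, mode: str = "standard") -> Decimal:
    """Estimated USD cost of one run: start fee plus per-row charge."""
    return ACTOR_START + ROW_PRICE[mode] * rows


standard_1k = run_cost(1000)
detailed_1k = run_cost(1000, "detailed")
```

`Decimal` is used instead of floats so that sums such as 0.05 + 0.003 stay exact.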
🚧 Limitations
- Public CORDIS search API only. No authenticated CORDIS portal features, no My CORDIS saved searches, no private or partner-only endpoints.
- HORIZON and H2020 only. FP7 and earlier framework programmes are not indexed in the current CORDIS search; this Actor returns zero rows for FP7-and-earlier queries.
- No deliverables, publications, or result documents. Available on the project detail page but not in the search API; out of scope.
- No organisation-level enrichment. Legal address, VAT ID, org type category — out of scope; use a dedicated organisation enrichment Actor.
- Country filter is post-filter, not server-side. CORDIS's search API does not expose coordinator country as a query field, so country-filter mode may scan many more API pages than `maxProjects` to collect that many matching rows. API call count is uncapped; emitted row count is always ≤ `maxProjects`.
- Status filter is not a user input. Live CORDIS data only emits `SIGNED`, `CLOSED`, or `TERMINATED`; the search API's status field path does not reliably filter, so status is surfaced only on the output row.
- Page size is fixed at 50. CORDIS's `num=100` parameter returns only 10 results (verified 2026-05-16), so the Actor uses `num=50` everywhere as the effective practical maximum.
- 7-day default storage retention on the Apify FREE tier. Export your dataset immediately after the run or upgrade for longer retention.
- CORDIS data is CC-BY 4.0. Attribution to the European Commission's CORDIS service is required when republishing.
❓ FAQ
Do I need an API key?
No. The CORDIS search endpoint at `https://cordis.europa.eu/search/en?format=json` is fully public and unauthenticated. The Actor reads only data that the European Commission already publishes openly.
Why is the objective_summary cut off?
The CORDIS search API returns the objective text truncated to ~2000 characters in search results. Set includeFullAbstract=true to make the Actor follow up with a per-project HTTP GET to the detail page (/project/id/{id}/en) and extract the untruncated objective via a parsel CSS selector. This switches the PPE event from result-row ($0.003) to result-row-detailed ($0.005) per row.
Why is the country filter so slow?
CORDIS's search API does not expose coordinator country as a query field — coordinator/country=DE returns zero results (verified 2026-05-16). The Actor implements country filtering by post-filtering full search results: it scans pages of all-framework projects and keeps only rows where coordinator_country == countryFilter. To collect maxProjects=500 German projects, the Actor may scan 1500+ projects total. The emitted dataset row count is always ≤ maxProjects.
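A minimal sketch of that scan-and-filter loop, assuming pages arrive as lists of already-parsed rows (the function name and the in-memory `pages` data are illustrative, not the Actor's actual code):

```python
from typing import Iterable, Iterator


def filter_by_coordinator_country(
    pages: Iterable[list[dict]], country: str, max_projects: int
) -> Iterator[dict]:
    """Yield rows whose coordinator country matches, stopping at max_projects."""
    if max_projects <= 0:
        return
    wanted = country.upper()
    emitted = 0
    for page in pages:  # each page stands in for one search-API response
        for row in page:
            if row.get("coordinator_country") == wanted:
                yield row
                emitted += 1
                if emitted >= max_projects:
                    return  # stop scanning once the cap is reached


pages = [
    [{"project_id": "1", "coordinator_country": "DE"},
     {"project_id": "2", "coordinator_country": "FR"}],
    [{"project_id": "3", "coordinator_country": "DE"},
     {"project_id": "4", "coordinator_country": "DE"}],
]
matches = list(filter_by_coordinator_country(pages, "de", max_projects=2))
```

Because non-matching rows are discarded rather than emitted, the number of pages scanned depends on how common the target country is in the result set.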
Can I filter by project status (SIGNED / CLOSED / TERMINATED)?
Not as a direct input — CORDIS's search API status field path does not produce reliable filtered results. Status is emitted on every output row, so you can filter the dataset client-side after the run.
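Client-side filtering on the exported rows takes only a few lines. This sketch assumes the rows have already been loaded as a list of dicts (for example from the JSON export); `filter_by_status` is a hypothetical helper name.

```python
def filter_by_status(rows, statuses):
    """Keep only rows whose status is in the allowed set; null statuses are dropped."""
    allowed = {status.upper() for status in statuses}
    return [row for row in rows if row.get("status") in allowed]


rows = [
    {"project_id": "1", "status": "SIGNED"},
    {"project_id": "2", "status": "CLOSED"},
    {"project_id": "3", "status": None},
    {"project_id": "4", "status": "TERMINATED"},
]
closed_or_terminated = filter_by_status(rows, ["closed", "terminated"])
```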
What about FP7 or earlier frameworks?
Out of scope. The current CORDIS search index covers HORIZON (Horizon Europe, 2021-2027) and H2020 (Horizon 2020, 2014-2020) only. The Actor will return zero rows for any FP7-and-earlier query.
Are CORDIS results free to redistribute?
Yes, under CC-BY 4.0 (EU Open Data policy). Attribution to the European Commission's CORDIS service is required.
Related Actors
Part of the Devil Scrapes Research Intelligence Suite:
- arXiv Papers Scraper — preprint metadata across all arXiv categories.
- PubMed Papers Scraper — biomedical literature from the NIH PubMed index.
- SEC EDGAR Filings Scraper — US public-company filings for company funding context.
- USPTO Patents Scraper — US patent metadata for IP landscape work.
- Hugging Face Hub Scraper — model and dataset metadata for AI research overlap.
All suite Actors share consistent PPE pricing ($0.001-$0.005 per row, $0.01-$0.05 per run) and scraped_at ISO 8601 UTC timestamps so cross-source joins work cleanly.
💬 Your feedback
Found a bug, hit a rate limit, or need a new field on the output row? Open an issue on the Actor's Apify Store page or contact the Devil Scrapes team at apify.com/DevilScrapes. We ship updates within days of validated reports.