CRAN R Packages Metadata Scraper
Pricing
from $2.00 / 1,000 results
Pull authoritative metadata for any R package from the official CRAN crandb database. Returns package name, title, version, authors, maintainer, license, dependencies, and publication date. Handy for dependency audits, license compliance checks, and cataloging R tooling.
Developer
ParseForge
Maintained by Community

📦 CRAN R Packages Scraper
🚀 Export R package metadata in seconds. Feed a list of package names and get clean, structured records straight from the official CRAN database.
🕒 Last updated: 2026-06-05 · 📊 18 fields per record · Powered by the CRAN crandb API · 20,000+ packages reachable
Pull authoritative metadata for any package on the Comprehensive R Archive Network (CRAN). This Actor queries the official crandb.r-pkg.org database, so every field comes directly from the package's own DESCRIPTION file. Hand it a list of package names and it returns the title, version, authors, maintainer, license, dependencies, and more for each one.
Coverage spans the entire CRAN registry. Popular packages like ggplot2, dplyr, and data.table resolve instantly, and any other published package name works the same way. Packages that do not exist on CRAN are reported as error records so you always know what resolved and what did not.
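As a rough illustration of the lookup described above, the sketch below fetches one package from crandb and turns a failed lookup into an error record. It is not the Actor's own code: the helper names, the error-record shape, and the assumption that an unknown name surfaces as an HTTP error are all illustrative.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

CRANDB_BASE = "https://crandb.r-pkg.org"  # host named in the text above


def crandb_url(package: str) -> str:
    """Build the lookup URL for one package (names are case sensitive)."""
    return f"{CRANDB_BASE}/{urllib.parse.quote(package, safe='')}"


def error_record(package: str, reason: str) -> dict:
    """Hypothetical error-record shape: the name plus an `error` message."""
    return {"package": package, "error": reason}


def fetch_package(package: str, timeout: float = 10.0) -> dict:
    """Return the raw crandb JSON, or an error record if the lookup fails."""
    try:
        with urllib.request.urlopen(crandb_url(package), timeout=timeout) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as exc:
        # Assumption: an unknown package name comes back as an HTTP error.
        return error_record(package, f"HTTP {exc.code}")
    except urllib.error.URLError as exc:
        return error_record(package, f"network error: {exc.reason}")
```

Calling `fetch_package("ggplot2")` would perform one network request; looping over a list of names and collecting the results gives the batch behavior described above.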
| 🎯 Target Audience | 💡 Primary Use Cases |
|---|---|
| R developers and data scientists | Audit dependencies and licenses across a package set |
| DevOps and security teams | Track versions and maintainers for supply-chain checks |
| Researchers and educators | Build catalogs of R tooling for analysis or teaching |
| Open-source maintainers | Monitor metadata of upstream and downstream packages |
📋 What the CRAN R Packages Scraper does
- Accepts a list of CRAN package names and fetches each one from the official crandb database.
- Returns 18 structured fields per package, including dependencies parsed into readable lists.
- Cleans the raw `DESCRIPTION` text so multi-line authors and descriptions become tidy single-line values.
- Flags any name that is not found on CRAN as an error record instead of failing the run.
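The cleaning and dependency-formatting steps in the list above could look roughly like the sketch below. The function names are hypothetical, and it assumes the raw dependencies arrive as a name-to-constraint mapping in which `"*"` means "no version constraint"; the Actor's actual internals are not documented here.

```python
import re


def collapse_whitespace(text: str) -> str:
    """Join a multi-line DESCRIPTION value into one tidy line."""
    return re.sub(r"\s+", " ", text).strip()


def format_dependencies(deps: dict) -> list:
    """Turn {"name": "constraint"} into ["name (constraint)", ...].

    Assumption: "*" (or an empty value) means "no version constraint".
    """
    out = []
    for name, constraint in deps.items():
        if constraint in ("*", "", None):
            out.append(name)
        else:
            out.append(f"{name} ({constraint})")
    return out
```

For example, `format_dependencies({"cli": ">= 3.6.2", "generics": "*"})` yields entries in the same style as the `imports` lists in the sample records.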
🎬 Full Demo (🚧 Coming soon)
⚙️ Input
| Field | Type | Description |
|---|---|---|
| `packageNames` | array of strings | CRAN package names to look up. Defaults to nine popular packages. |
| `maxItems` | integer | Caps the number of packages processed. Free plan is limited to 10. |
Example: default popular packages
{"packageNames": ["ggplot2", "dplyr", "data.table", "shiny", "Rcpp", "stringr", "tidyr", "jsonlite", "httr"]}
Example: a custom dependency audit
{"packageNames": ["tidymodels", "recipes", "parsnip", "workflows"],"maxItems": 50}
⚠️ Good to Know: Package names on CRAN are case sensitive. Use `Rcpp`, not `rcpp`. Names that do not match a published CRAN package are returned as error records so the rest of the run still completes.
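If you assemble the input programmatically, a small helper along these lines can tidy the name list before a run. This is a sketch with a hypothetical `build_input` function; it uses only the two documented fields and deliberately leaves letter case untouched, since changing case would change which package is looked up.

```python
def build_input(package_names, max_items=None):
    """Build an input object with the two documented fields.

    Names are trimmed and de-duplicated but never re-cased, because
    CRAN package names are case sensitive.
    """
    seen = set()
    names = []
    for raw in package_names:
        name = raw.strip()
        if name and name not in seen:
            seen.add(name)
            names.append(name)
    payload = {"packageNames": names}
    if max_items is not None:
        payload["maxItems"] = int(max_items)
    return payload
```

Note that `Rcpp` and `rcpp` survive as two separate entries; the second would come back as an error record.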
📊 Output
Each package becomes one record with the following fields.
| Field | Description |
|---|---|
| 📦 `package` | Package name |
| 📌 `title` | One-line package title |
| 🔖 `version` | Latest published version |
| 👥 `authorsR` | Structured Authors@R block from the DESCRIPTION |
| ✍️ `author` | Plain-text author credits |
| 👤 `maintainer` | Maintainer name and email |
| 📝 `description` | Full package description |
| ⚖️ `license` | License string |
| 🔗 `url` | Project or homepage URLs |
| 🐞 `bugReports` | Bug tracker URL |
| 📅 `published` | Publication date of the current version |
| 🛠 `needsCompilation` | Whether the package needs compilation |
| 🔢 `releaseCount` | Number of historical releases known to crandb |
| 📥 `depends` | Depends entries as a readable list |
| 📦 `imports` | Imports entries as a readable list |
| 💡 `suggests` | Suggests entries as a readable list |
| 🕒 `scrapedAt` | Timestamp of collection |
| ❌ `error` | Populated only when a name fails to resolve |
Real sample records:
{"package": "ggplot2","title": "Create Elegant Data Visualisations Using the Grammar of Graphics","version": "4.0.3","maintainer": "Thomas Lin Pedersen <thomas.pedersen@posit.co>","license": "MIT + file LICENSE","url": "https://ggplot2.tidyverse.org, https://github.com/tidyverse/ggplot2","published": "2026-04-22 09:10:03 UTC","needsCompilation": "no","imports": ["cli", "grDevices", "grid"]}
{"package": "dplyr","title": "A Grammar of Data Manipulation","version": "1.2.1","maintainer": "Hadley Wickham <hadley@posit.co>","license": "MIT + file LICENSE","url": "https://dplyr.tidyverse.org, https://github.com/tidyverse/dplyr","published": "2026-04-03 07:30:08 UTC","needsCompilation": "yes","imports": ["cli (>= 3.6.2)", "generics", "glue (>= 1.3.2)"]}
{"package": "data.table","title": "Extension of `data.frame`","version": "1.18.4","maintainer": "Tyson Barrett <t.barrett88@gmail.com>","license": "MPL-2.0 | file LICENSE","url": "https://r-datatable.com, https://Rdatatable.gitlab.io/data.table","published": "2026-05-06 05:10:20 UTC","needsCompilation": "yes","imports": ["methods"]}
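For downstream analysis it can help to convert a few string fields into richer types. The sketch below is one possible post-processing step, not part of the Actor: `normalize_record` is a hypothetical helper, and it assumes the comma-separated `url` value, the `YYYY-MM-DD HH:MM:SS UTC` timestamp layout, and the `yes`/`no` flag seen in the samples above.

```python
from datetime import datetime, timezone


def normalize_record(record: dict) -> dict:
    """Return a copy with `url` split, `published` parsed, and the flag as bool."""
    out = dict(record)
    if record.get("url"):
        out["url"] = [u.strip() for u in record["url"].split(",") if u.strip()]
    if record.get("published"):
        # Assumes the "YYYY-MM-DD HH:MM:SS UTC" layout seen in the samples.
        parsed = datetime.strptime(record["published"], "%Y-%m-%d %H:%M:%S UTC")
        out["published"] = parsed.replace(tzinfo=timezone.utc)
    if "needsCompilation" in record:
        out["needsCompilation"] = record["needsCompilation"] == "yes"
    return out
```

The original record is left unmodified, so the raw strings remain available if a value does not match the assumed layout.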
✨ Why choose this Actor
- Authoritative source. Data comes straight from the official CRAN crandb database, not a third-party mirror.
- Clean dependencies. Depends, Imports, and Suggests arrive as readable lists, not raw blobs.
- Resilient runs. Unknown package names become error records, so one bad name never sinks the batch.
- Zero setup. No API key needed. Paste package names and run.
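Because failed names arrive as ordinary records, a consumer usually wants to separate them from the successful ones first. A minimal sketch, assuming a record counts as failed whenever its `error` field is non-empty (the helper name is hypothetical):

```python
def split_results(records):
    """Separate resolved packages from error records.

    Assumption: a record counts as failed when its `error` field is non-empty.
    """
    resolved, failed = [], []
    for record in records:
        (failed if record.get("error") else resolved).append(record)
    return resolved, failed
```

The `failed` list can then be logged or retried with corrected spelling, while `resolved` feeds the rest of the pipeline.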
📈 How it compares to alternatives
| Approach | Setup | Structured fields | Dependency parsing | Batch lookups |
|---|---|---|---|---|
| CRAN R Packages Scraper | None | 18 fields | Yes | Yes |
| Manual DESCRIPTION reading | High | Manual | Manual | No |
| `available.packages()` in R | R environment | Limited | Partial | One mirror snapshot |
🚀 How to use
- Sign up for a free Apify account.
- Open the CRAN R Packages Scraper in the Apify Console.
- Enter the package names you want, or keep the default popular set.
- Click Start and let the Actor fetch each package.
- Browse the results table or pull them through the Apify API.
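For the last step, results can be read from the run's dataset over HTTP. The sketch below uses only the standard library and the Apify dataset-items endpoint; the dataset ID and token are placeholders you would take from your own run, and the helper names are illustrative.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id: str) -> str:
    """Build the URL that lists a dataset's items as JSON."""
    query = urllib.parse.urlencode({"format": "json"})
    return f"{API_BASE}/datasets/{urllib.parse.quote(dataset_id, safe='')}/items?{query}"


def fetch_items(dataset_id: str, token: str, timeout: float = 30.0) -> list:
    """Download all items of one dataset (performs a network request)."""
    request = urllib.request.Request(
        dataset_items_url(dataset_id),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

An official client library would also work; plain HTTP is shown here only to keep the example dependency-free.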
💼 Business use cases
Dependency and supply-chain auditing
| Need | How this helps |
|---|---|
| Map the dependency tree of an internal project | Pull Depends and Imports for every package in scope |
| Spot heavy or risky dependencies | Compare import lists across packages at a glance |
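One way to compare import lists across packages is to count how many packages rely on each dependency. The sketch below assumes `imports` entries in the `name (constraint)` style shown in the sample records; `bare_name` and `shared_imports` are hypothetical helpers.

```python
import re
from collections import Counter


def bare_name(entry: str) -> str:
    """Strip a version constraint: "cli (>= 3.6.2)" -> "cli"."""
    return re.sub(r"\s*\(.*\)\s*$", "", entry).strip()


def shared_imports(records, min_packages=2):
    """Count how many packages import each dependency; keep the shared ones."""
    counts = Counter()
    for record in records:
        # A set avoids double counting if a name is listed twice.
        names = {bare_name(e) for e in record.get("imports") or []}
        counts.update(names)
    return {name: n for name, n in counts.items() if n >= min_packages}
```

Dependencies with a high count are the ones whose updates or licensing changes touch the most packages in scope.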
License compliance
| Need | How this helps |
|---|---|
| Verify licenses before shipping | Read the license field for each package |
| Flag copyleft obligations | Filter on license strings like GPL or MPL |
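Filtering on license strings can be sketched as a simple keyword screen. The keyword list below is an assumption chosen for illustration and only catches licenses whose names contain those tokens, so treat the result as a first-pass shortlist for manual review.

```python
import re

# Assumed keyword list; extend it to match your own compliance policy.
COPYLEFT_PATTERN = re.compile(r"\b(AGPL|LGPL|GPL|MPL)\b", re.IGNORECASE)


def flag_copyleft(records):
    """Return names of packages whose license string mentions a listed family."""
    return [
        r["package"]
        for r in records
        if COPYLEFT_PATTERN.search(r.get("license") or "")
    ]
```

Applied to the sample records above, this would flag `data.table` (`MPL-2.0 | file LICENSE`) and skip the two MIT-licensed packages.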
Maintenance and ownership tracking
| Need | How this helps |
|---|---|
| Know who maintains a dependency | Capture maintainer name and email |
| Track version drift | Compare version and publication date over time |
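Version drift can be tracked by comparing the records from two runs. A minimal sketch, assuming each run's output is a list of records with `package` and `version` fields and that error records should be skipped (the function name is hypothetical):

```python
def version_drift(old_records, new_records):
    """Map package -> (old_version, new_version) where the version changed."""
    old = {r["package"]: r.get("version") for r in old_records if not r.get("error")}
    drift = {}
    for record in new_records:
        if record.get("error"):
            continue
        name = record["package"]
        if name in old and old[name] != record.get("version"):
            drift[name] = (old[name], record.get("version"))
    return drift
```

Packages that appear in only one of the two runs are ignored here; reporting them separately would be a natural extension.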
Research and cataloging
| Need | How this helps |
|---|---|
| Build a catalog of R tools in a domain | Batch-fetch metadata for a curated package list |
| Study authorship patterns | Analyze the Authors@R and author fields |
🔌 Automating CRAN R Packages Scraper
Connect this Actor to the tools you already use:
- Make and Zapier to trigger runs and route results into other apps.
- Slack to post a summary when a run finishes.
- Airbyte to sync results into a warehouse.
- GitHub Actions to refresh metadata on a schedule.
- Google Drive to archive each run's output.
🌟 Beyond business use cases
- Research. Compile metadata for a literature review of R tooling.
- Personal. Track the packages you depend on in side projects.
- Non-profit. Audit open-source licenses for community tools.
- Experimentation. Prototype a package recommender from dependency graphs.
🤖 Ask an AI assistant
Drop the output into ChatGPT, Claude, Perplexity, or Microsoft Copilot and ask it to summarize licenses, cluster packages by dependency overlap, or draft an upgrade plan.
❓ Frequently Asked Questions
Where does the data come from? The official CRAN crandb database at crandb.r-pkg.org, which mirrors each package's DESCRIPTION file.
Do I need an API key? No. The source is keyless and the Actor needs no credentials.
Are package names case sensitive? Yes. CRAN treats Rcpp and rcpp differently, so match the exact published name.
What happens if a package does not exist? That name is returned as an error record and the rest of the run continues.
How many packages can I fetch? Free plans are capped at 10 items per run. Paid plans can fetch many more.
Does it return historical versions? It returns the current published version plus a count of known releases. Per-version history is not expanded.
Are dependencies included? Yes. Depends, Imports, and Suggests are parsed into readable lists.
Is the maintainer email included? Yes, when CRAN publishes it in the DESCRIPTION.
Can I schedule runs? Yes, through the Apify scheduler or any connected automation tool.
Is this affiliated with CRAN or the R Foundation? No. It is an independent tool that reads publicly available CRAN data.
🔌 Integrate with any app
Every run produces structured records you can pull through the Apify API or connect to Make, Zapier, n8n, and more.
🔗 Recommended Actors
- ParseForge collection for more developer and data tooling scrapers.
- Browse related package-registry and metadata Actors on the ParseForge profile.
💡 Pro Tip: browse the complete ParseForge collection.
🆘 Need Help? Open our contact form
⚠️ Disclaimer: independent tool, not affiliated with CRAN or the R Foundation. Only publicly available data collected.