CRAN R Packages Metadata Scraper

Pricing

from $2.00 / 1,000 results

Pull authoritative metadata for any R package from the official CRAN crandb database. Returns package name, title, version, authors, maintainer, license, dependencies, and publication date. Handy for dependency audits, license compliance checks, and cataloging R tooling.

Developer: ParseForge

Maintained by Community

Last modified: 7 days ago

📦 CRAN R Packages Scraper

🚀 Export R package metadata in seconds. Feed a list of package names and get clean, structured records straight from the official CRAN database.

🕒 Last updated: 2026-06-05 · 📊 18 fields per record · Powered by the CRAN crandb API · 20,000+ packages reachable

Pull authoritative metadata for any package on the Comprehensive R Archive Network (CRAN). This Actor queries the official crandb.r-pkg.org database, so every field comes directly from the package's own DESCRIPTION file. Hand it a list of package names and it returns the title, version, authors, maintainer, license, dependencies, and more for each one.

Coverage spans the entire CRAN registry. Popular packages like ggplot2, dplyr, and data.table resolve instantly, and any other published package name works the same way. Packages that do not exist on CRAN are reported as error records so you always know what resolved and what did not.

| 🎯 Target Audience | 💡 Primary Use Cases |
| --- | --- |
| R developers and data scientists | Audit dependencies and licenses across a package set |
| DevOps and security teams | Track versions and maintainers for supply-chain checks |
| Researchers and educators | Build catalogs of R tooling for analysis or teaching |
| Open-source maintainers | Monitor metadata of upstream and downstream packages |

📋 What the CRAN R Packages Scraper does

  • Accepts a list of CRAN package names and fetches each one from the official crandb database.
  • Returns 18 structured fields per package, including dependencies parsed into readable lists.
  • Cleans the raw DESCRIPTION text so multi-line authors and descriptions become tidy single-line values.
  • Flags any name that is not found on CRAN as an error record instead of failing the run.
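The cleanup step in the list above can be pictured with a few lines of Python. This is a minimal sketch, not the Actor's actual code: the helper name `tidy_field` is hypothetical, and it assumes that "tidy single-line values" means collapsing the continuation lines of a DESCRIPTION field into single spaces.

```python
import re


def tidy_field(raw: str) -> str:
    """Collapse a multi-line DESCRIPTION value into one tidy line.

    Hypothetical helper: any run of spaces, tabs, or newlines becomes a
    single space, and leading/trailing whitespace is dropped.
    """
    return re.sub(r"\s+", " ", raw).strip()


# Illustrative input shaped like a wrapped Author field.
raw_author = "Hadley Wickham [aut, cre],\n    Romain Francois [aut],\n\tPosit Software, PBC [cph, fnd]"
print(tidy_field(raw_author))
```

The same helper would apply equally to a wrapped Description field, since both use indented continuation lines.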

🎬 Full Demo (🚧 Coming soon)

⚙️ Input

| Field | Type | Description |
| --- | --- | --- |
| packageNames | array of strings | CRAN package names to look up. Defaults to nine popular packages. |
| maxItems | integer | Caps the number of packages processed. Free plan is limited to 10. |

Example: default popular packages

{
  "packageNames": ["ggplot2", "dplyr", "data.table", "shiny", "Rcpp", "stringr", "tidyr", "jsonlite", "httr"]
}

Example: a custom dependency audit

{
  "packageNames": ["tidymodels", "recipes", "parsnip", "workflows"],
  "maxItems": 50
}

⚠️ Good to Know: Package names on CRAN are case sensitive. Use Rcpp, not rcpp. Names that do not match a published CRAN package are returned as error records so the rest of the run still completes.
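If you generate the input from a script, a small helper can keep the list clean before it reaches the Actor. The sketch below is an assumption about a convenient workflow, not part of the Actor: `build_input` is a hypothetical name, and the only facts it relies on are the two input fields documented above and the case-sensitivity note.

```python
import json


def build_input(names, max_items=None):
    """Build an Actor input dict from a raw list of package names.

    Hypothetical helper: trims whitespace, drops blanks and exact
    duplicates, and keeps the original casing, because "Rcpp" and
    "rcpp" are different lookups.
    """
    seen = set()
    cleaned = []
    for name in names:
        name = name.strip()
        if name and name not in seen:
            seen.add(name)
            cleaned.append(name)
    payload = {"packageNames": cleaned}
    if max_items is not None:
        payload["maxItems"] = max_items
    return payload


print(json.dumps(build_input([" ggplot2", "dplyr", "ggplot2", "Rcpp"], max_items=50), indent=2))
```

The helper deliberately does not lowercase or otherwise normalize names, so a typo such as rcpp still reaches the Actor and comes back as an error record rather than being silently changed.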

📊 Output

Each package becomes one record with the following fields.

| Field | Description |
| --- | --- |
| 📦 package | Package name |
| 📌 title | One-line package title |
| 🔖 version | Latest published version |
| 👥 authorsR | Structured Authors@R block from the DESCRIPTION |
| ✍️ author | Plain-text author credits |
| 👤 maintainer | Maintainer name and email |
| 📝 description | Full package description |
| ⚖️ license | License string |
| 🔗 url | Project or homepage URLs |
| 🐞 bugReports | Bug tracker URL |
| 📅 published | Publication date of the current version |
| 🛠 needsCompilation | Whether the package needs compilation |
| 🔢 releaseCount | Number of historical releases known to crandb |
| 📥 depends | Depends entries as a readable list |
| 📦 imports | Imports entries as a readable list |
| 💡 suggests | Suggests entries as a readable list |
| 🕒 scrapedAt | Timestamp of collection |
| error | Populated only when a name fails to resolve |

Real sample records (abbreviated to a subset of the fields):

{
  "package": "ggplot2",
  "title": "Create Elegant Data Visualisations Using the Grammar of Graphics",
  "version": "4.0.3",
  "maintainer": "Thomas Lin Pedersen <thomas.pedersen@posit.co>",
  "license": "MIT + file LICENSE",
  "url": "https://ggplot2.tidyverse.org, https://github.com/tidyverse/ggplot2",
  "published": "2026-04-22 09:10:03 UTC",
  "needsCompilation": "no",
  "imports": ["cli", "grDevices", "grid"]
}

{
  "package": "dplyr",
  "title": "A Grammar of Data Manipulation",
  "version": "1.2.1",
  "maintainer": "Hadley Wickham <hadley@posit.co>",
  "license": "MIT + file LICENSE",
  "url": "https://dplyr.tidyverse.org, https://github.com/tidyverse/dplyr",
  "published": "2026-04-03 07:30:08 UTC",
  "needsCompilation": "yes",
  "imports": ["cli (>= 3.6.2)", "generics", "glue (>= 1.3.2)"]
}

{
  "package": "data.table",
  "title": "Extension of `data.frame`",
  "version": "1.18.4",
  "maintainer": "Tyson Barrett <t.barrett88@gmail.com>",
  "license": "MPL-2.0 | file LICENSE",
  "url": "https://r-datatable.com, https://Rdatatable.gitlab.io/data.table",
  "published": "2026-05-06 05:10:20 UTC",
  "needsCompilation": "yes",
  "imports": ["methods"]
}
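Dependency entries in the samples above mix bare names ("generics") with versioned ones ("cli (>= 3.6.2)"), and failed lookups arrive as records with the error field set. The following sketch shows one way to post-process both. The function names are hypothetical, and the exact shape of an error record (here: a package name plus an error string) is an assumption, since the samples only show successful lookups.

```python
import re

# Name, then an optional parenthesized version constraint.
DEP_PATTERN = re.compile(r"^\s*([A-Za-z][A-Za-z0-9.]*)\s*(?:\(\s*([^)]+?)\s*\))?\s*$")


def parse_dependency(entry):
    """Split 'cli (>= 3.6.2)' into a name and an optional constraint."""
    match = DEP_PATTERN.match(entry)
    if match is None:
        raise ValueError(f"Unrecognized dependency entry: {entry!r}")
    name, constraint = match.groups()
    return {"name": name, "constraint": constraint}


def split_records(records):
    """Separate resolved records from those with the error field populated."""
    resolved = [r for r in records if not r.get("error")]
    failed = [r for r in records if r.get("error")]
    return resolved, failed


records = [
    {"package": "dplyr", "imports": ["cli (>= 3.6.2)", "generics", "glue (>= 1.3.2)"]},
    # Assumed error-record shape, for illustration only.
    {"package": "rcpp", "error": "not found"},
]
resolved, failed = split_records(records)
for record in resolved:
    print(record["package"], [parse_dependency(e) for e in record["imports"]])
print("unresolved:", [r["package"] for r in failed])
```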

✨ Why choose this Actor

  • Authoritative source. Data comes from the crandb database (crandb.r-pkg.org), which mirrors each package's own DESCRIPTION file as published on CRAN.
  • Clean dependencies. Depends, Imports, and Suggests arrive as readable lists, not raw blobs.
  • Resilient runs. Unknown package names become error records, so one bad name never sinks the batch.
  • Zero setup. No API key needed. Paste package names and run.

📈 How it compares to alternatives

| Approach | Setup | Structured fields | Dependency parsing | Batch lookups |
| --- | --- | --- | --- | --- |
| CRAN R Packages Scraper | None | 18 fields | Yes | Yes |
| Manual DESCRIPTION reading | High | Manual | Manual | No |
| available.packages() in R | R environment | Limited | Partial | One mirror snapshot |

🚀 How to use

  1. Sign up for a free Apify account.
  2. Open the CRAN R Packages Scraper in the Apify Console.
  3. Enter the package names you want, or keep the default popular set.
  4. Click Start and let the Actor fetch each package.
  5. Browse the results table or pull them through the Apify API.
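For the last step, results can also be fetched from a script. The sketch below uses the Apify API's synchronous "run Actor and get dataset items" endpoint with only the Python standard library. The Actor ID `parseforge~cran-r-packages-scraper` is a placeholder assumption (copy the real one from the Actor's API tab in the Console), and the token is read from an `APIFY_TOKEN` environment variable you set yourself.

```python
import json
import os
import urllib.request

API_BASE = "https://api.apify.com/v2"
# Placeholder Actor ID, assumed for illustration; replace with the real one.
ACTOR_ID = "parseforge~cran-r-packages-scraper"


def build_run_request(token, actor_input, actor_id=ACTOR_ID):
    """Prepare a POST that runs the Actor and returns its dataset items."""
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(token, actor_input):
    """Send the request and decode the JSON array of records."""
    request = build_run_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.load(response)


if __name__ == "__main__":
    token = os.environ.get("APIFY_TOKEN")
    if token:  # Only call the API when a token is configured.
        for record in run_actor(token, {"packageNames": ["ggplot2", "Rcpp"]}):
            print(record.get("package"), record.get("version"), record.get("error"))
```

The synchronous endpoint suits short package lists; for large batches, starting a run and reading its dataset afterwards is likely the safer pattern.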

💼 Business use cases

Dependency and supply-chain auditing

| Need | How this helps |
| --- | --- |
| Map the dependency tree of an internal project | Pull Depends and Imports for every package in scope |
| Spot heavy or risky dependencies | Compare import lists across packages at a glance |
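As a rough illustration of comparing import lists, the sketch below counts how many packages in a result set import each dependency. It assumes records shaped like the sample output (an `imports` list of strings, optionally with a version constraint in parentheses); `import_counts` is a hypothetical helper name.

```python
from collections import Counter


def import_counts(records):
    """Count how many records list each package under imports."""
    counts = Counter()
    for record in records:
        for entry in record.get("imports") or []:
            # Drop any "(>= x.y.z)" constraint to get the bare name.
            name = entry.split("(", 1)[0].strip()
            counts[name] += 1
    return counts


records = [
    {"package": "ggplot2", "imports": ["cli", "grDevices", "grid"]},
    {"package": "dplyr", "imports": ["cli (>= 3.6.2)", "generics", "glue (>= 1.3.2)"]},
    {"package": "data.table", "imports": ["methods"]},
]
for name, count in import_counts(records).most_common(3):
    print(f"{name}: imported by {count} package(s)")
```

Dependencies shared by many packages in scope are the ones where a breaking release or an abandoned maintainer has the widest impact.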

License compliance

| Need | How this helps |
| --- | --- |
| Verify licenses before shipping | Read the license field for each package |
| Flag copyleft obligations | Filter on license strings like GPL or MPL |
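Filtering on license strings can be as simple as a substring check. This is a deliberately coarse sketch: `flag_copyleft` is a hypothetical helper, the marker list only covers the two families named above (LGPL and AGPL also match because they contain "GPL"), and a real review would still read the full license text.

```python
# Markers taken from the table above; extend to match your own policy.
COPYLEFT_MARKERS = ("GPL", "MPL")


def flag_copyleft(records, markers=COPYLEFT_MARKERS):
    """Return package names whose license string contains a marker."""
    flagged = []
    for record in records:
        license_text = (record.get("license") or "").upper()
        if any(marker in license_text for marker in markers):
            flagged.append(record["package"])
    return flagged


records = [
    {"package": "ggplot2", "license": "MIT + file LICENSE"},
    {"package": "dplyr", "license": "MIT + file LICENSE"},
    {"package": "data.table", "license": "MPL-2.0 | file LICENSE"},
]
print(flag_copyleft(records))
```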

Maintenance and ownership tracking

| Need | How this helps |
| --- | --- |
| Know who maintains a dependency | Capture maintainer name and email |
| Track version drift | Compare version and publication date over time |
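Version drift becomes visible once you keep the output of two runs. The sketch below compares two snapshots by package name. `version_drift` is a hypothetical helper, and the "previous" versions in the example are made-up values for illustration, not real release data.

```python
def version_drift(old_records, new_records):
    """List (package, old_version, new_version) for packages that changed."""
    old_versions = {r["package"]: r.get("version") for r in old_records}
    changes = []
    for record in new_records:
        name = record["package"]
        previous = old_versions.get(name)
        current = record.get("version")
        # Skip packages that are new to the snapshot or unchanged.
        if previous is not None and previous != current:
            changes.append((name, previous, current))
    return changes


# Illustrative snapshots; the older versions are invented for the example.
last_month = [{"package": "dplyr", "version": "1.2.0"}, {"package": "ggplot2", "version": "4.0.3"}]
today = [{"package": "dplyr", "version": "1.2.1"}, {"package": "ggplot2", "version": "4.0.3"}]
for name, old, new in version_drift(last_month, today):
    print(f"{name}: {old} -> {new}")
```

Pairing this with a scheduled run gives a simple changelog of upstream releases for the packages you track.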

Research and cataloging

| Need | How this helps |
| --- | --- |
| Build a catalog of R tools in a domain | Batch-fetch metadata for a curated package list |
| Study authorship patterns | Analyze the Authors@R and author fields |

🔌 Automating CRAN R Packages Scraper

Connect this Actor to the tools you already use:

  • Make and Zapier to trigger runs and route results into other apps.
  • Slack to post a summary when a run finishes.
  • Airbyte to sync results into a warehouse.
  • GitHub Actions to refresh metadata on a schedule.
  • Google Drive to archive each run's output.

🌟 Beyond business use cases

  • Research. Compile metadata for a literature review of R tooling.
  • Personal. Track the packages you depend on in side projects.
  • Non-profit. Audit open-source licenses for community tools.
  • Experimentation. Prototype a package recommender from dependency graphs.

🤖 Ask an AI assistant

Drop the output into ChatGPT, Claude, Perplexity, or Microsoft Copilot and ask it to summarize licenses, cluster packages by dependency overlap, or draft an upgrade plan.

❓ Frequently Asked Questions

Where does the data come from? The official CRAN crandb database at crandb.r-pkg.org, which mirrors each package's DESCRIPTION file.

Do I need an API key? No. The source is keyless and the Actor needs no credentials.

Are package names case sensitive? Yes. Lookups match the exact published name, so Rcpp resolves and rcpp is returned as an error record.

What happens if a package does not exist? That name is returned as an error record and the rest of the run continues.

How many packages can I fetch? Free plans are capped at 10 items per run. Paid plans can fetch many more.

Does it return historical versions? It returns the current published version plus a count of known releases. Per-version history is not expanded.

Are dependencies included? Yes. Depends, Imports, and Suggests are parsed into readable lists.

Is the maintainer email included? Yes, when CRAN publishes it in the DESCRIPTION.

Can I schedule runs? Yes, through the Apify scheduler or any connected automation tool.

Is this affiliated with CRAN or the R Foundation? No. It is an independent tool that reads publicly available CRAN data.

🔌 Integrate with any app

Every run produces structured records you can pull through the Apify API or connect to Make, Zapier, n8n, and more.

  • ParseForge collection for more developer and data tooling scrapers.
  • Browse related package-registry and metadata Actors on the ParseForge profile.

💡 Pro Tip: browse the complete ParseForge collection.

🆘 Need Help? Open our contact form

⚠️ Disclaimer: independent tool, not affiliated with CRAN or the R Foundation. Only publicly available data collected.