R Documentation Scraper

Pricing

from $3.00 / 1,000 results

Scrape R package metadata from CRAN via the public crandb API. Search packages by keyword or fetch specific packages by name. Each result includes version, title, description, author, maintainer, license, dependencies, and rdocumentation.org links.

Developer

Crawler Bros

Maintained by Community

Scrape R package metadata from RDocumentation.org / CRAN using the public crandb API. No authentication or proxies required.

What it does

  • Search packages by keyword: finds CRAN packages whose names match your query
  • Fetch specific packages by exact CRAN package name
  • Returns rich metadata: version, title, description, author, maintainer, license, dependencies, imports, and canonical RDocumentation links

Input

| Field | Type | Description | Default |
|---|---|---|---|
| mode | string | searchPackages or getByName | searchPackages |
| searchQuery | string | Keyword to search (mode=searchPackages) | ggplot |
| packageNames | string[] | Exact package names (mode=getByName) | [] |
| maxItems | integer | Max records to return (1–500) | 50 |

Example: search packages

{
  "mode": "searchPackages",
  "searchQuery": "ggplot",
  "maxItems": 20
}

Example: get by name

{
  "mode": "getByName",
  "packageNames": ["ggplot2", "dplyr", "tidyr", "data.table"],
  "maxItems": 10
}
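
The input fields above can be checked locally before starting a run. The following is a minimal sketch, not part of the scraper: the helper name `build_run_input` is hypothetical, and whether the Actor itself rejects out-of-range values in the same way is an assumption. It only encodes the constraints stated in the Input table (two modes, `maxItems` between 1 and 500).

```python
from typing import Any, Dict, List, Optional

VALID_MODES = ("searchPackages", "getByName")


def build_run_input(
    mode: str = "searchPackages",
    search_query: str = "ggplot",
    package_names: Optional[List[str]] = None,
    max_items: int = 50,
) -> Dict[str, Any]:
    """Build an input object shaped like the Input table (hypothetical helper)."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    if not 1 <= max_items <= 500:
        raise ValueError("maxItems must be between 1 and 500")

    run_input: Dict[str, Any] = {"mode": mode, "maxItems": max_items}
    if mode == "searchPackages":
        query = search_query.strip()
        if not query:
            raise ValueError("searchQuery must not be empty in searchPackages mode")
        run_input["searchQuery"] = query
    else:
        # Drop blank entries and surrounding whitespace from the name list.
        names = [n.strip() for n in (package_names or []) if n.strip()]
        if not names:
            raise ValueError("packageNames must not be empty in getByName mode")
        run_input["packageNames"] = names
    return run_input
```

Only the fields relevant to the chosen mode are included, so the resulting object mirrors the two examples above.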

Output

Each record contains:

| Field | Type | Description |
|---|---|---|
| packageName | string | CRAN package name |
| version | string | Latest version |
| title | string | Short one-line title |
| description | string | Full description |
| author | string | Package authors |
| maintainer | string | Current maintainer with email |
| license | string | License string (e.g. MIT, GPL-2) |
| type | string | Repository type (CRAN, Bioconductor) |
| depends | string[] | Packages listed in Depends (R excluded) |
| imports | string[] | Packages listed in Imports |
| repositoryUrl | string | First URL from URL field |
| bugReportsUrl | string | Bug tracker URL |
| sourceUrl | string | https://www.rdocumentation.org/packages/{name} |
| publishedAt | string | Publication date if available |
| recordType | string | Always "package" |
| scrapedAt | string | ISO 8601 timestamp of scrape |
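
To illustrate how fields such as `depends` (with R excluded) and `sourceUrl` could be derived, here is a small sketch. It assumes the raw value is a DESCRIPTION-style comma-separated string such as `R (>= 3.5), methods`; the format the scraper actually receives from crandb is not stated here, and the function names are hypothetical.

```python
import re
from typing import List, Tuple


def parse_package_list(field: str, exclude: Tuple[str, ...] = ("R",)) -> List[str]:
    """Split a DESCRIPTION-style dependency string into bare package names."""
    names = []
    for entry in field.split(","):
        # Remove a version constraint such as "(>= 3.5)" and trim whitespace.
        name = re.sub(r"\(.*?\)", "", entry, flags=re.S).strip()
        if name and name not in exclude:
            names.append(name)
    return names


def rdocumentation_url(package_name: str) -> str:
    """Build the link pattern listed for the sourceUrl field."""
    return f"https://www.rdocumentation.org/packages/{package_name}"


depends = parse_package_list("R (>= 3.5), methods,\n    stats (>= 4.0)")
source_url = rdocumentation_url("ggplot2")
```

Passing `exclude=()` keeps the R entry, which can be useful when the minimum R version matters downstream.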

Sample output record

{
  "packageName": "ggplot2",
  "version": "4.0.3",
  "title": "Create Elegant Data Visualisations Using the Grammar of Graphics",
  "description": "A system for 'declaratively' creating graphics, based on The Grammar of Graphics...",
  "author": "Hadley Wickham [aut], Winston Chang [aut], Thomas Lin Pedersen [aut, cre]...",
  "maintainer": "Thomas Lin Pedersen <thomas.pedersen@posit.co>",
  "license": "MIT + file LICENSE",
  "type": "CRAN",
  "imports": ["cli", "grDevices", "grid", "gtable", "isoband", "lifecycle", "rlang", "scales"],
  "repositoryUrl": "https://ggplot2.tidyverse.org",
  "bugReportsUrl": "https://github.com/tidyverse/ggplot2/issues",
  "sourceUrl": "https://www.rdocumentation.org/packages/ggplot2",
  "recordType": "package",
  "scrapedAt": "2026-05-30T10:00:00+00:00"
}
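
A record like the one above can be post-processed with the Python standard library. This sketch is one possible consumer, not part of the scraper: the `summarize` helper and its output keys are hypothetical, and the record is abbreviated to the fields it uses.

```python
import json
from datetime import datetime
from email.utils import parseaddr

# Abbreviated copy of the sample record, limited to the fields used below.
record = json.loads("""
{
  "packageName": "ggplot2",
  "maintainer": "Thomas Lin Pedersen <thomas.pedersen@posit.co>",
  "imports": ["cli", "grDevices", "grid", "gtable", "isoband", "lifecycle", "rlang", "scales"],
  "scrapedAt": "2026-05-30T10:00:00+00:00"
}
""")


def summarize(rec: dict) -> dict:
    """Split the maintainer into name and email and parse the timestamp."""
    name, email = parseaddr(rec.get("maintainer", ""))
    return {
        "package": rec["packageName"],
        "maintainerName": name,
        "maintainerEmail": email,
        # Fields such as "depends" may be absent, so default to an empty list.
        "importCount": len(rec.get("imports", [])),
        "dependsCount": len(rec.get("depends", [])),
        "scrapedAt": datetime.fromisoformat(rec["scrapedAt"]),
    }


summary = summarize(record)
```

Using `.get` with a default reflects that the sample record omits some fields from the Output table, such as `depends` and `publishedAt`.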

FAQs

Q: Does this require an API key or proxy? No. The scraper uses the public crandb.r-pkg.org API, which is free and open.

Q: Does this return ALL versions of a package? No. It returns metadata for the latest published version only.

Q: How does search work? The scraper runs a prefix-range query against the crandb API, which returns packages whose names start with your search term. As a fallback, it also runs a substring match across all CRAN package names.
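
The two matching strategies can be illustrated offline on a sorted list of names. This is a sketch under stated assumptions, not the scraper's code: it makes no crandb request, the function `search_names` is hypothetical, the name list is made up for the example, and using the substring pass to fill remaining slots up to the limit is an assumed policy.

```python
from bisect import bisect_left
from typing import List


def search_names(sorted_names: List[str], query: str, max_items: int = 50) -> List[str]:
    """Prefix-range match first, then substring matches to fill remaining slots.

    sorted_names must be sorted in ascending order.
    """
    # Every name starting with `query` lies in the range [query, query + "\uffff").
    lo = bisect_left(sorted_names, query)
    hi = bisect_left(sorted_names, query + "\uffff")
    matches = sorted_names[lo:hi]

    # Substring pass over all names, skipping the ones already found.
    seen = set(matches)
    for name in sorted_names:
        if len(matches) >= max_items:
            break
        if query in name and name not in seen:
            matches.append(name)
            seen.add(name)
    return matches[:max_items]


# Made-up name list for illustration only.
names = sorted(["dplyr", "ggplot2", "ggplotify", "ggpubr", "examplegglot", "myggplotlib"])
found = search_names(names, "ggplot")
```

The prefix range needs only two binary searches on sorted data, while the substring pass has to scan every name, which is one plausible reason to treat it as the fallback.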

Q: How many packages are on CRAN? As of 2026, CRAN hosts over 20,000 packages.

Q: Can I fetch Bioconductor packages? Currently the scraper covers CRAN only (via crandb). Bioconductor packages may not appear.

Q: What is the rate limit? The crandb API is generous for public usage. The scraper adds small delays between requests to be polite.
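
Adding a pause between requests can be sketched as follows. The delay value, the function name `fetch_politely`, and the injected `fetch` callable are all assumptions for illustration; the scraper's actual delay is not documented here.

```python
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def fetch_politely(
    names: Iterable[str],
    fetch: Callable[[str], T],
    delay_seconds: float = 0.2,  # assumed value, not taken from the scraper
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[T]:
    """Call fetch(name) for each name, pausing between consecutive calls."""
    for index, name in enumerate(names):
        if index > 0:
            # No pause before the first request, only between requests.
            sleep(delay_seconds)
        yield fetch(name)


# Usage with a stand-in fetch function, so no network access is needed.
results = list(fetch_politely(["ggplot2", "dplyr"], lambda n: {"packageName": n}, delay_seconds=0.0))
```

Accepting `sleep` as a parameter lets the pacing be checked in tests without actually waiting.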