CRAN R Packages Scraper - Metadata, Dependencies & Analytics

Developer: Pierrick McD0nald (maintained by Community)

Extract comprehensive metadata from CRAN (Comprehensive R Archive Network) packages. This Actor scrapes package detail pages to collect descriptions, versions, dependencies, reverse dependencies, publication dates, authors, DOIs, vignettes, download links, and more. Perfect for R ecosystem research, dependency analysis, package discovery, and academic data collection.

Use Cases

  • R Ecosystem Research — Analyze the CRAN package landscape, identify trending packages, and study dependency networks across the R statistical computing environment.
  • Dependency Analysis — Map reverse dependencies to understand which packages rely on a given library, useful for security auditing and impact assessment.
  • Package Discovery — Build curated lists of R packages by category, author, or publication date for research or teaching purposes.
  • Academic Data Collection — Collect structured metadata from CRAN for bibliometric analysis, reproducibility studies, and software citation research.

Input

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| packageNames | Array | Yes | List of CRAN package names to scrape (e.g., ggplot2, dplyr). Leave empty to scrape all packages. |
| maxItems | Number | No | Maximum number of packages to scrape (default: 100; 0 for unlimited). |
| includeDetails | Boolean | No | Scrape full detail pages (true) or only list package names (false). |
| proxyConfiguration | Object | No | Proxy configuration. Apify proxy is included by default. |
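
As a rough illustration of how the fields above fit together, the following Python sketch assembles an input object. The helper name `build_run_input` and the `include_details=True` default are assumptions made for this example (the table does not state a default for includeDetails); only the field names and the maxItems default come from the table. proxyConfiguration is omitted so that the Actor's default applies.

```python
from typing import Any, Dict, List, Optional


def build_run_input(
    package_names: Optional[List[str]] = None,
    max_items: int = 100,
    include_details: bool = True,  # assumed default, not stated in the table
) -> Dict[str, Any]:
    """Assemble an Actor input dict using the field names from the Input table."""
    if max_items < 0:
        raise ValueError("maxItems must be 0 (unlimited) or a positive number")
    # Drop blank entries and surrounding whitespace from package names.
    names = [name.strip() for name in (package_names or []) if name.strip()]
    return {
        "packageNames": names,  # an empty list means "scrape all packages"
        "maxItems": max_items,  # 0 means unlimited
        "includeDetails": include_details,
    }


# Two named packages, capped at 10 results.
small_run = build_run_input(["ggplot2", "dplyr"], max_items=10)

# Full crawl: empty packageNames and maxItems set to 0.
full_run = build_run_input([], max_items=0)
```

The resulting dict can be pasted into the Actor's JSON input editor or passed as the run input by whichever client is used to start the run.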

Output

Each item in the output dataset has the following fields (example for ggplot2):

{
  "packageName": "ggplot2",
  "title": "Create Elegant Data Visualisations Using the Grammar of Graphics",
  "description": "A system for declaratively creating graphics...",
  "version": "4.0.3",
  "depends": "R (>= 4.1)",
  "imports": "cli, grDevices, grid, gtable, isoband, lifecycle, rlang, S7, scales, stats, vctrs, withr",
  "suggests": "broom, covr, dplyr, hexbin, Hmisc, hms, knitr, MASS, mgcv, multcomp, munsell, nlme, profvis, quantreg, quarto, ragg, RColorBrewer, roxygen2, rpart, sf, svglite, testthat, tibble, vdiffr, xml2",
  "enhances": "sp",
  "publishedDate": "2026-04-22",
  "author": "Hadley Wickham, Winston Chang, Lionel Henry, Thomas Lin Pedersen, Kohske Takahashi, Claus Wilke, Kara Woo, Hiroaki Yutani, Dewey Dunnington, Teun van den Brand",
  "maintainer": "Thomas Lin Pedersen <thomasp85@gmail.com>",
  "doi": "10.32614/CRAN.package.ggplot2",
  "url": "https://cran.r-project.org/web/packages/ggplot2/index.html",
  "license": "MIT + file LICENSE",
  "needsCompilation": "no",
  "inViews": "ChemPhys, NetworkAnalysis, Phylogenetics, Spatial, TeachingStatistics",
  "cranChecks": "ggplot2 results",
  "referenceManual": "refman/ggplot2.html, ggplot2.pdf",
  "vignettes": "Extending ggplot2, Using ggplot2 in packages, Aesthetic specifications, Introduction to ggplot2, Profiling Performance",
  "materials": "README, NEWS",
  "citation": "ggplot2 citation info",
  "packageSource": "ggplot2_4.0.3.tar.gz",
  "windowsBinaries": "r-devel: ggplot2_4.0.3.zip, r-release: ggplot2_4.0.3.zip, r-oldrel: ggplot2_4.0.3.zip",
  "macosBinaries": "r-release (arm64): ggplot2_4.0.3.tgz, r-oldrel (arm64): ggplot2_4.0.3.tgz, r-release (x86_64): ggplot2_4.0.3.tgz, r-oldrel (x86_64): ggplot2_4.0.3.tgz",
  "oldSources": "https://CRAN.R-project.org/src/contrib/Archive/ggplot2",
  "reverseDepends": "accessrmd, afmToolkit, alakazam, alookr, AmpliconDuo, Anaconda, Anaquin, apisensr, applicable, ausplotsR, bacon, BasketballAnalyzeR, bayesDP, bayesnec, bbnet, bde, bhm, bootnet, bpcp, braidReports, bunching, CalibrationCurves, caret, CellNOptR, ceterisParibus, cfda, changepoint.geo, changeS, CHETAH, ChIPQC, circhelp, cjoint, ClassificationEnsembles, classifierplots, clustEff, ClusteredMutations, clustrd, CNVrd2, CNVScope, coefplot, cogena, cohorttools, colleyRstats, ConconiAnaerobicThresholdTest, ContourFunctions, corkscrew, CoSMoS, CRABS, CrispRVariants, crmPack, Crossover, CRTgeeDR, crumblr, CTxCC, curatedBreastData, cystiSim, cytofan, dae, DaMiRseq, dampack, dartR, dartR.base, dartR.sim, ddecompose, decompTumor2Sig, Deducer, deltaGseg, DendroSync, DepthProc, DEqMS, DHBins, diathor, diffEnrich, diffeR, DiSCos, dittoSeq, dittoViz, dnn, donutsk, dotwhisker, dowser, dpGMM, dreamlet, dslice, dynr, Eagle, echoice2, eeptools, egg, embryogrowth, EnhancedVolcano, EnsCat, EpiCurve, episensr, EQUALCompareImages, EQUALPrognosis, EQUALrepeat, erccdashboard, escheR, eVCGsampler, extraChIPs, FactoClass, factoextra, factorplot, Factoshiny, fbroc, findGSEP, FisherEM, flippant, ForecastingEnsembles, forestmodel, FormulR, freqparcoord, frequency, func2vis, funMoDisco, gam.hp, gapmap, garma, GARS, gcerisk, gde, GenericML, genlogis, GenomicOZone, geomtextpath, geotoolsR, GerminaR, gg4way, ggalign, ggallin, ggalluvial, GGally, gganimate, ggarrow, ggbeeswarm, ggbio, ggbiplot, ggbuildr, ggcharts, ggcorrplot, ggcube, ggcyto, ggdemetra, ggdensity, ggetho, ggExametrika, ggFishPlots, ggfixest, ggfocus, ggforce, ggformula, ggfortify, ggfoundry, gggda, gggenomes, ggghost, gggibbous, gggrid, ggh4x, gghighlight, ggHoriPlot, ggimage, ggincerta, gginnards, ggInterval, ggip, ggkegg, gglm, gglorenz, ggmanh, ggmap, ggmapcn, ggmatplot, ggmcmc, ggmulti, ggnetwork, ggOceanMaps, ggordiplots, ggpackets, ggparty, ggplot2.utils, ggpointless, ggpolar, ggpolypath, ggpp, ggpubr, ggragged, ggrain, ggraph, ggraptR, ggResidpanel, ggROC, ggsdc, ggsignif, ggsom, ggspatial, ggsurvfit, ggtext, ggthemes, ggthreed, ggtibble, ggtidy, ggtree, ggtrendline, ggunify, ggupset, ggvenn, ggVennDiagram, ggvis, ggwordcloud, ggx, ggxtend, ghiblipalettes, Gifi, ggsoccer, ggsolvencyii, ggstance, ggstats, ggstatsplot, ggsteam, ggstream, ggsubplot, ggswissmaps, ggtern, ggtexttable, ggTimeSeries, ggtrend, ggupset, ggvis, ggwordcloud, ggx, ggxtend",
  "reverseImports": "",
  "reverseSuggests": "",
  "reverseEnhances": ""
}
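
Dependency fields such as depends, imports, and suggests arrive as comma-separated strings, sometimes with a version constraint in parentheses. For dependency analysis it is usually easier to work with structured pairs. The following is a minimal sketch of one way to split them; `parse_dependency_field` is a hypothetical helper, not part of the Actor, and it assumes the string format shown in the sample above.

```python
import re
from typing import List, Optional, Tuple

# A package name, optionally followed by a parenthesised version constraint,
# e.g. "R (>= 4.1)" or "rlang".
_DEPENDENCY = re.compile(r"^\s*([A-Za-z][A-Za-z0-9.]*)\s*(?:\(([^)]*)\))?\s*$")


def parse_dependency_field(value: str) -> List[Tuple[str, Optional[str]]]:
    """Split a comma-separated dependency string into (name, constraint) pairs."""
    dependencies = []
    for part in value.split(","):
        if not part.strip():
            continue  # skip empty fragments, including a fully empty field
        match = _DEPENDENCY.match(part)
        if match is None:
            # Unrecognised shape: keep the raw text rather than dropping it.
            dependencies.append((part.strip(), None))
            continue
        name, constraint = match.group(1), match.group(2)
        dependencies.append((name, constraint.strip() if constraint else None))
    return dependencies


print(parse_dependency_field("R (>= 4.1)"))
print(parse_dependency_field("cli, grDevices, S7"))
```

An empty field yields an empty list, so the same helper can be applied to every dependency and reverse-dependency field without special-casing.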

Pricing

Pay per event: $0.001 per package extracted.
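
To get a feel for what a run might cost, the per-package price can be multiplied by the number of packages. The sketch below covers only the per-package event stated above; whether other platform charges apply is not described here, so treat the result as a lower-bound estimate. The function name `estimate_cost_usd` is made up for this example.

```python
from decimal import Decimal

# Price stated in the Pricing section: $0.001 per package extracted.
PRICE_PER_PACKAGE_USD = Decimal("0.001")


def estimate_cost_usd(package_count: int) -> Decimal:
    """Estimate the per-package charge for a run, in US dollars."""
    if package_count < 0:
        raise ValueError("package_count must not be negative")
    # Decimal arithmetic avoids binary floating-point rounding noise.
    return PRICE_PER_PACKAGE_USD * package_count


print(estimate_cost_usd(100))     # default maxItems
print(estimate_cost_usd(20_000))  # roughly the whole of CRAN
```

Under these assumptions, a default run of 100 packages comes to about $0.10 and a full crawl of roughly 20,000 packages to about $20.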

Limitations

  • Scraping all ~20,000 CRAN packages may take significant time and compute. Use maxItems to limit scope.
  • Package pages are static HTML; no JavaScript rendering required.
  • CRAN's rate limits are generous, but respect them by using the built-in proxy configuration.
  • Some packages lack certain fields (e.g., no DOI, no vignettes); these are returned as empty strings.
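
Because missing fields come back as empty strings, downstream code may want to distinguish "absent" from "present" explicitly. One possible approach, sketched here with a hypothetical `normalize_record` helper, is to map empty strings to None before further processing.

```python
from typing import Any, Dict


def normalize_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a dataset item with empty-string fields set to None."""
    return {
        key: None if isinstance(value, str) and value.strip() == "" else value
        for key, value in record.items()
    }


item = {"packageName": "ggplot2", "doi": "", "reverseImports": ""}
print(normalize_record(item))
```

The original item is left unchanged, so the raw dataset output stays available if it is needed later.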

FAQ

Q: Can I scrape all CRAN packages at once?
A: Yes. Leave packageNames empty and set maxItems to 0 to fetch all ~20,000 packages. Consider using a higher compute tier for large runs.

Q: How do I find package names?
A: Package names are the exact CRAN identifiers (e.g., ggplot2, not ggplot 2). You can look them up on CRAN, or run the Actor with an empty packageNames list to discover them.

Q: What is the difference between includeDetails true and false?
A: When includeDetails is true, the Actor visits each package's detail page and extracts the full metadata. When it is false, the Actor extracts only the name and title from the package list page, which is much faster but returns less data.

Changelog

  • v1.0.0 — Initial release. Scrape CRAN package metadata including dependencies, reverse dependencies, authors, DOIs, vignettes, and download links.