CRAN R Packages Scraper - Metadata, Dependencies & Analytics
Pricing
Pay per usage
Extract comprehensive metadata from CRAN (Comprehensive R Archive Network) packages including descriptions, versions, dependencies, reverse dependencies, publication dates, authors, DOIs, vignettes, and download links. Perfect for R ecosystem research, dependency analysis, and package discovery.
Developer
Pierrick McD0nald
Maintained by Community
Actor stats
Bookmarked: 0
Total users: 2
Monthly active users: 1
Last modified: 2 days ago
CRAN R Packages Scraper — Metadata, Dependencies & Analytics
Extract comprehensive metadata from CRAN (Comprehensive R Archive Network) packages. This Actor scrapes package detail pages to collect descriptions, versions, dependencies, reverse dependencies, publication dates, authors, DOIs, vignettes, download links, and more. Perfect for R ecosystem research, dependency analysis, package discovery, and academic data collection.
Use Cases
- R Ecosystem Research — Analyze the CRAN package landscape, identify trending packages, and study dependency networks across the R statistical computing environment.
- Dependency Analysis — Map reverse dependencies to understand which packages rely on a given library, useful for security auditing and impact assessment.
- Package Discovery — Build curated lists of R packages by category, author, or publication date for research or teaching purposes.
- Academic Data Collection — Collect structured metadata from CRAN for bibliometric analysis, reproducibility studies, and software citation research.
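The dependency-analysis use case can be sketched in a few lines of Python. This is a minimal illustration, assuming records shaped like the Actor's output (a `packageName` string and a comma-separated `imports` string). The helper name `reverse_imports` and the sample records are hypothetical and not part of the Actor.

```python
from collections import defaultdict


def reverse_imports(records):
    """Map each imported package to the sorted list of packages that import it.

    Assumes each record has a "packageName" string and an "imports" string of
    comma-separated names. Illustrative helper only, not part of the Actor.
    """
    dependents = defaultdict(set)
    for record in records:
        for entry in record.get("imports", "").split(","):
            # Drop an optional version constraint such as "(>= 1.0)".
            name = entry.split("(")[0].strip()
            if name:  # skip empty entries produced by missing fields
                dependents[name].add(record["packageName"])
    return {pkg: sorted(users) for pkg, users in dependents.items()}


# Hypothetical sample records, trimmed to the two fields used here.
sample = [
    {"packageName": "pkgA", "imports": "rlang (>= 1.0), cli"},
    {"packageName": "pkgB", "imports": "rlang"},
    {"packageName": "pkgC", "imports": ""},
]
print(reverse_imports(sample))
```

Inverting `imports` locally like this is useful when you only scraped a subset of packages and want the reverse edges within that subset.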
Input
| Field | Type | Required | Description |
|---|---|---|---|
| `packageNames` | Array | Yes | List of CRAN package names to scrape (e.g., `ggplot2`, `dplyr`). Leave empty to scrape all packages. |
| `maxItems` | Number | No | Maximum packages to scrape (default: 100, 0 for unlimited). |
| `includeDetails` | Boolean | No | Scrape full detail pages (`true`) or just list names (`false`). |
| `proxyConfiguration` | Object | No | Proxy configuration. Apify proxy included by default. |
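As a rough sketch of how these fields fit together, the Python helper below assembles a run input using the field names from the table. The function name `build_input` is hypothetical, and the `include_details=True` default is an assumption, since the table does not state one. `proxyConfiguration` is left out because its object shape is not documented here.

```python
import json


def build_input(package_names=None, max_items=100, include_details=True):
    """Build a run-input dict using the field names from the input table.

    The max_items default (100) follows the table; the include_details
    default is an assumption, because the table does not state one.
    """
    if max_items < 0:
        raise ValueError("max_items must be 0 (unlimited) or a positive number")
    return {
        "packageNames": list(package_names or []),  # empty list = all packages
        "maxItems": max_items,
        "includeDetails": include_details,
    }


# Scrape two named packages with full details.
print(json.dumps(build_input(["ggplot2", "dplyr"], max_items=2), indent=2))
```

For a full crawl, `build_input(max_items=0)` yields an empty `packageNames` list with no item limit, matching the FAQ's description.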
Output
The Actor outputs a dataset with the following fields:
{
  "packageName": "ggplot2",
  "title": "Create Elegant Data Visualisations Using the Grammar of Graphics",
  "description": "A system for declaratively creating graphics...",
  "version": "4.0.3",
  "depends": "R (>= 4.1)",
  "imports": "cli, grDevices, grid, gtable, isoband, lifecycle, rlang, S7, scales, stats, vctrs, withr",
  "suggests": "broom, covr, dplyr, hexbin, Hmisc, hms, knitr, MASS, mgcv, multcomp, munsell, nlme, profvis, quantreg, quarto, ragg, RColorBrewer, roxygen2, rpart, sf, svglite, testthat, tibble, vdiffr, xml2",
  "enhances": "sp",
  "publishedDate": "2026-04-22",
  "author": "Hadley Wickham, Winston Chang, Lionel Henry, Thomas Lin Pedersen, Kohske Takahashi, Claus Wilke, Kara Woo, Hiroaki Yutani, Dewey Dunnington, Teun van den Brand",
  "maintainer": "Thomas Lin Pedersen <thomasp85@gmail.com>",
  "doi": "10.32614/CRAN.package.ggplot2",
  "url": "https://cran.r-project.org/web/packages/ggplot2/index.html",
  "license": "MIT + file LICENSE",
  "needsCompilation": "no",
  "inViews": "ChemPhys, NetworkAnalysis, Phylogenetics, Spatial, TeachingStatistics",
  "cranChecks": "ggplot2 results",
  "referenceManual": "refman/ggplot2.html, ggplot2.pdf",
  "vignettes": "Extending ggplot2, Using ggplot2 in packages, Aesthetic specifications, Introduction to ggplot2, Profiling Performance",
  "materials": "README, NEWS",
  "citation": "ggplot2 citation info",
  "packageSource": "ggplot2_4.0.3.tar.gz",
  "windowsBinaries": "r-devel: ggplot2_4.0.3.zip, r-release: ggplot2_4.0.3.zip, r-oldrel: ggplot2_4.0.3.zip",
  "macosBinaries": "r-release (arm64): ggplot2_4.0.3.tgz, r-oldrel (arm64): ggplot2_4.0.3.tgz, r-release (x86_64): ggplot2_4.0.3.tgz, r-oldrel (x86_64): ggplot2_4.0.3.tgz",
  "oldSources": "https://CRAN.R-project.org/src/contrib/Archive/ggplot2",
  "reverseDepends": "accessrmd, afmToolkit, alakazam, alookr, AmpliconDuo, Anaconda, Anaquin, apisensr, applicable, ausplotsR, bacon, BasketballAnalyzeR, bayesDP, bayesnec, bbnet, bde, bhm, bootnet, bpcp, braidReports, bunching, CalibrationCurves, caret, CellNOptR, ceterisParibus, cfda, changepoint.geo, changeS, CHETAH, ChIPQC, circhelp, cjoint, ClassificationEnsembles, classifierplots, clustEff, ClusteredMutations, clustrd, CNVrd2, CNVScope, coefplot, cogena, cohorttools, colleyRstats, ConconiAnaerobicThresholdTest, ContourFunctions, corkscrew, CoSMoS, CRABS, CrispRVariants, crmPack, Crossover, CRTgeeDR, crumblr, CTxCC, curatedBreastData, cystiSim, cytofan, dae, DaMiRseq, dampack, dartR, dartR.base, dartR.sim, ddecompose, decompTumor2Sig, Deducer, deltaGseg, DendroSync, DepthProc, DEqMS, DHBins, diathor, diffEnrich, diffeR, DiSCos, dittoSeq, dittoViz, dnn, donutsk, dotwhisker, dowser, dpGMM, dreamlet, dslice, dynr, Eagle, echoice2, eeptools, egg, embryogrowth, EnhancedVolcano, EnsCat, EpiCurve, episensr, EQUALCompareImages, EQUALPrognosis, EQUALrepeat, erccdashboard, escheR, eVCGsampler, extraChIPs, FactoClass, factoextra, factorplot, Factoshiny, fbroc, findGSEP, FisherEM, flippant, ForecastingEnsembles, forestmodel, FormulR, freqparcoord, frequency, func2vis, funMoDisco, gam.hp, gapmap, garma, GARS, gcerisk, gde, GenericML, genlogis, GenomicOZone, geomtextpath, geotoolsR, GerminaR, gg4way, ggalign, ggallin, ggalluvial, GGally, gganimate, ggarrow, ggbeeswarm, ggbio, ggbiplot, ggbuildr, ggcharts, ggcorrplot, ggcube, ggcyto, ggdemetra, ggdensity, ggetho, ggExametrika, ggFishPlots, ggfixest, ggfocus, ggforce, ggformula, ggfortify, ggfoundry, gggda, gggenomes, ggghost, gggibbous, gggrid, ggh4x, gghighlight, ggHoriPlot, ggimage, ggincerta, gginnards, ggInterval, ggip, ggkegg, gglm, gglorenz, ggmanh, ggmap, ggmapcn, ggmatplot, ggmcmc, ggmulti, ggnetwork, ggOceanMaps, ggordiplots, ggpackets, ggparty, ggplot2.utils, ggpointless, ggpolar, ggpolypath, ggpp, ggpubr, ggragged, ggrain, ggraph, ggraptR, ggResidpanel, ggROC, ggsdc, ggsignif, ggsom, ggspatial, ggsurvfit, ggtext, ggthemes, ggthreed, ggtibble, ggtidy, ggtree, ggtrendline, ggunify, ggupset, ggvenn, ggVennDiagram, ggvis, ggwordcloud, ggx, ggxtend, ghiblipalettes, Gifi, ggsoccer, ggsolvencyii, ggstance, ggstats, ggstatsplot, ggsteam, ggstream, ggsubplot, ggswissmaps, ggtern, ggtexttable, ggTimeSeries, ggtrend, ggupset, ggvis, ggwordcloud, ggx, ggxtend",
  "reverseImports": "",
  "reverseSuggests": "",
  "reverseEnhances": ""
}
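Dependency fields such as `depends` and `imports` arrive as comma-separated strings, optionally with a version constraint in parentheses. The Python sketch below shows one way to split them into structured pairs. The helper name `parse_dependencies` and the regular expression are assumptions made for illustration, based only on the sample record above.

```python
import re

# Matches "name" optionally followed by "(constraint)", e.g. "R (>= 4.1)".
_DEP_PATTERN = re.compile(r"^\s*([A-Za-z0-9.]+)\s*(?:\(([^)]*)\))?\s*$")


def parse_dependencies(field):
    """Split a comma-separated dependency string into (name, constraint) pairs.

    The constraint is None when no version requirement is given.
    """
    pairs = []
    for item in field.split(","):
        if not item.strip():
            continue  # empty field or stray trailing comma
        match = _DEP_PATTERN.match(item)
        if match is None:
            raise ValueError(f"Unrecognised dependency entry: {item!r}")
        name, constraint = match.group(1), match.group(2)
        pairs.append((name, constraint.strip() if constraint else None))
    return pairs


print(parse_dependencies("R (>= 4.1)"))
print(parse_dependencies("cli, grDevices, grid"))
```

Raising on unrecognised entries, rather than skipping them, makes it obvious when a field's format differs from the sample shown here.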
Pricing
Pay per event: $0.001 per package extracted.
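To budget a run, the per-package price can be multiplied out. The Python sketch below does this with `Decimal` to avoid floating-point rounding. It covers only the per-package event price stated above, and `estimate_cost` is a hypothetical helper name.

```python
from decimal import Decimal

PRICE_PER_PACKAGE = Decimal("0.001")  # USD, taken from the pricing line above


def estimate_cost(package_count):
    """Return the estimated charge in USD for extracting package_count packages."""
    if package_count < 0:
        raise ValueError("package_count must be non-negative")
    return PRICE_PER_PACKAGE * package_count


print(estimate_cost(100))     # the default maxItems
print(estimate_cost(20_000))  # roughly a full CRAN crawl
```

At this rate the default of 100 packages comes to $0.10, and a crawl of about 20,000 packages to about $20.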
Limitations
- Scraping all ~20,000 CRAN packages may take significant time and compute. Use `maxItems` to limit scope.
- Package pages are static HTML; no JavaScript rendering required.
- CRAN rate limits are generous, but respect them by using the built-in proxy configuration.
- Some packages lack certain fields (e.g., no DOI, no vignettes); these are returned as empty strings.
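Because missing fields come back as empty strings, downstream code may want to convert them to explicit nulls before analysis. A minimal Python sketch follows, assuming records are plain dictionaries like the sample output. `normalize_record` and the example record are hypothetical.

```python
def normalize_record(record):
    """Return a copy of record with empty or whitespace-only strings set to None."""
    return {
        key: (None if isinstance(value, str) and not value.strip() else value)
        for key, value in record.items()
    }


# Hypothetical record for a package without a DOI or vignettes.
raw = {"packageName": "examplepkg", "doi": "", "vignettes": "", "license": "GPL-3"}
clean = normalize_record(raw)
missing = [key for key, value in clean.items() if value is None]
print(missing)
```

This makes it easy to tell "field absent on CRAN" apart from a genuine value when loading the dataset into a dataframe or database.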
FAQ
Q: Can I scrape all CRAN packages at once?
A: Yes. Leave packageNames empty and set maxItems to 0. This will fetch all ~20,000 packages. Consider using a higher compute tier for large runs.
Q: How do I find package names?
A: Package names are the exact CRAN identifiers (e.g., ggplot2, not ggplot 2). You can find them on CRAN or by running the Actor with an empty packageNames list to discover them.
Q: What is the difference between includeDetails true and false?
A: When includeDetails is true, the Actor visits each package's detail page and extracts full metadata. When false, it only extracts name and title from the package list page (much faster but less data).
Changelog
- v1.0.0 — Initial release. Scrape CRAN package metadata including dependencies, reverse dependencies, authors, DOIs, vignettes, and download links.