Bioconductor Scraper

Pricing

from $3.00 / 1,000 results

Scrape Bioconductor R package metadata - title, description, biocViews categories, author, maintainer, license, dependencies, and more. Search by keyword or biocViews category, or fetch specific packages by name.

Developer

Crawler Bros

Maintained by Community

Scrape Bioconductor R package metadata. Search packages by keyword, filter by biocViews category (e.g. "RNASeq", "Sequencing", "Microarray"), or fetch specific packages by name. Each record includes version, title, description, author, maintainer, license, dependencies, and more.

Features

  • Search packages by keyword across name, title, and description
  • Filter by biocViews category (e.g. "RNASeq", "GenomicData", "Sequencing")
  • Fetch specific packages by exact name
  • Extracts: name, version, title, description, biocViews, author, maintainer, license, depends, imports, suggests, URL, bug report link
  • Uses the official Bioconductor JSON API; no authentication required
  • Covers the Bioconductor release software repository (2,200+ packages); annotation and experiment data packages are not included

Input

| Field | Type | Description | Default |
| --- | --- | --- | --- |
| mode | String (select) | searchPackages or getByName | searchPackages |
| searchQuery | String | Keyword search in name/title/description | RNA-seq |
| packageNames | Array of strings | Exact package names for getByName mode | [] |
| biocViews | String | Filter by biocViews category | (none) |
| maxItems | Integer (1–1000) | Maximum number of records to return | 10 |
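
The defaults and limits in the table can be checked on the client side before starting a run. The following is a minimal sketch, not the actor's own validation: `build_input` is a hypothetical helper, and the rule that getByName needs at least one name is an assumption.

```python
from typing import Any, Dict

# Defaults and limits copied from the input table above.
DEFAULTS: Dict[str, Any] = {
    "mode": "searchPackages",
    "searchQuery": "RNA-seq",
    "packageNames": [],
    "biocViews": None,  # "(none)" in the table
    "maxItems": 10,
}
VALID_MODES = {"searchPackages", "getByName"}


def build_input(**overrides: Any) -> Dict[str, Any]:
    """Merge overrides into the documented defaults and check basic limits.

    Hypothetical helper: the actor performs its own validation, which may differ.
    """
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown input fields: {sorted(unknown)}")

    run_input = {**DEFAULTS, **overrides}
    # Copy the list so callers never share the module-level default.
    run_input["packageNames"] = list(run_input["packageNames"])

    if run_input["mode"] not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")

    max_items = run_input["maxItems"]
    if isinstance(max_items, bool) or not isinstance(max_items, int):
        raise ValueError("maxItems must be an integer")
    if not 1 <= max_items <= 1000:
        raise ValueError("maxItems must be between 1 and 1000")

    # Assumption: getByName is only meaningful with at least one name.
    if run_input["mode"] == "getByName" and not run_input["packageNames"]:
        raise ValueError("getByName mode needs at least one package name")

    # Drop unset optional fields so the payload stays minimal.
    return {key: value for key, value in run_input.items() if value is not None}
```

Calling `build_input(biocViews="Sequencing", maxItems=50)` yields a dictionary that can be serialized to JSON and used as the run input.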

Mode: searchPackages

Search Bioconductor packages whose name, title, or description contains the query string (case-insensitive). Optionally combine with a biocViews filter.

Example input:

{
  "mode": "searchPackages",
  "searchQuery": "RNA-seq",
  "maxItems": 20
}

With biocViews filter:

{
  "mode": "searchPackages",
  "biocViews": "Sequencing",
  "maxItems": 50
}
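
The matching rule described for this mode can be sketched in a few lines. This illustrates the documented behavior, not the actor's source: `matches` and `search_packages` are hypothetical names, records are assumed to use the output field names, and the biocViews comparison is assumed to be an exact string match.

```python
from itertools import islice
from typing import Any, Dict, Iterable, List, Optional

SEARCH_FIELDS = ("packageName", "title", "description")


def matches(record: Dict[str, Any], query: str = "",
            bioc_view: Optional[str] = None) -> bool:
    """Return True if the record passes the keyword and biocViews filters."""
    if query:
        needle = query.lower()
        # Case-insensitive substring match against each searchable field.
        if not any(needle in str(record.get(field, "")).lower()
                   for field in SEARCH_FIELDS):
            return False
    # Assumption: the category must match one biocViews tag exactly.
    if bioc_view is not None and bioc_view not in record.get("biocViews", []):
        return False
    return True


def search_packages(records: Iterable[Dict[str, Any]], query: str = "",
                    bioc_view: Optional[str] = None,
                    max_items: int = 10) -> List[Dict[str, Any]]:
    """Return at most max_items records that pass both filters."""
    hits = (record for record in records if matches(record, query, bioc_view))
    return list(islice(hits, max_items))
```

Both filters are combined with a logical AND, so a record must contain the keyword and carry the requested category to be returned.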

Mode: getByName

Fetch specific packages by their full Bioconductor package name. Matching is case-insensitive, so "deseq2" finds DESeq2, but partial names are not matched.

Example input:

{
  "mode": "getByName",
  "packageNames": ["DESeq2", "edgeR", "limma"],
  "maxItems": 10
}
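
A case-insensitive lookup by full name can be sketched with a dictionary keyed on lowercased names. `get_by_name` is a hypothetical helper for illustration; how the actor handles names it cannot find is not documented, so skipping them silently is an assumption.

```python
from typing import Any, Dict, Iterable, List


def get_by_name(records: Iterable[Dict[str, Any]], package_names: List[str],
                max_items: int = 10) -> List[Dict[str, Any]]:
    """Return records whose packageName equals a requested name, ignoring case."""
    # Index once so each requested name is a constant-time lookup.
    index = {record["packageName"].lower(): record for record in records}
    found = []
    for name in package_names:
        record = index.get(name.lower())
        if record is not None:  # assumption: unknown names are skipped
            found.append(record)
    return found[:max_items]
```

Results come back in the order the names were requested, which keeps the output predictable when the list is long.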

Output

Each record contains:

| Field | Type | Description |
| --- | --- | --- |
| packageName | String | Bioconductor package name |
| version | String | Current release version |
| title | String | Short package title |
| description | String | Full package description |
| biocViews | Array | Bioconductor category tags |
| author | String | Package author(s) with roles |
| maintainer | String | Current maintainer with email |
| license | String | License type |
| depends | Array | Hard dependencies |
| imports | Array | Imported packages |
| suggests | Array | Optional suggested packages |
| vignettes | Array | Vignette builder tools |
| url | String | Project URL |
| bugReports | String | Bug tracker URL |
| sourceUrl | String | Bioconductor package page URL |
| recordType | String | Always "package" |
| scrapedAt | String | ISO 8601 timestamp |

Sample Output Record

{
  "packageName": "DESeq2",
  "version": "1.42.0",
  "title": "Differential gene expression analysis based on the negative binomial distribution",
  "description": "Estimate variance-mean dependence in count data from RNA-Seq experiments, and test for differential expression based on a model using the negative binomial distribution.",
  "biocViews": ["RNASeq", "Sequencing", "DifferentialExpression", "GeneExpression", "Transcriptomics"],
  "author": "Michael Love [aut, cre], Simon Anders [aut], Wolfgang Huber [aut]",
  "maintainer": "Michael Love <michaelisaiahlove@gmail.com>",
  "license": "LGPL (>= 3)",
  "depends": ["R", "S4Vectors", "IRanges", "GenomicRanges", "SummarizedExperiment"],
  "imports": ["BiocGenerics", "Biobase", "locfit", "ggplot2"],
  "suggests": ["knitr", "rmarkdown", "testthat"],
  "vignettes": ["knitr"],
  "url": "https://github.com/thelovelab/DESeq2",
  "bugReports": "https://github.com/thelovelab/DESeq2/issues",
  "sourceUrl": "https://bioconductor.org/packages/release/bioc/html/DESeq2.html",
  "recordType": "package",
  "scrapedAt": "2026-05-30T12:00:00+00:00"
}

Data Source

Data is sourced from the official Bioconductor packages JSON API. No authentication is required.

biocViews Categories

Common biocViews categories you can use as filters:

  • RNASeq — RNA sequencing analysis
  • Sequencing — Next-generation sequencing
  • Microarray — Microarray data analysis
  • GenomicData — Genomic data representation
  • Infrastructure — Core infrastructure packages
  • DifferentialExpression — Differential expression analysis
  • Proteomics — Proteomics data
  • Metabolomics — Metabolomics data
  • SingleCell — Single-cell RNA sequencing
  • Epigenetics — Epigenetic data analysis
  • VariantAnnotation — Genetic variant annotation

FAQs

Q: How many Bioconductor packages are available? A: The Bioconductor release repository contains over 2,200 software packages.

Q: What Bioconductor version does this use? A: The actor scrapes the current release, which is Bioconductor 3.18 at the time of writing. Data comes from bioconductor.org/packages/release/bioc/.

Q: Can I search across multiple biocViews categories? A: The biocViews filter accepts one category at a time. Use searchQuery for broader keyword searches.
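
One way to cover several categories is to run the actor once per category and merge the results afterwards. The sketch below shows only the merge step, assuming each run's records have already been loaded as lists of dictionaries; `merge_by_package` is a hypothetical helper, not part of the actor.

```python
from typing import Any, Dict, Iterable, List


def merge_by_package(*result_sets: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Combine several result sets, keeping the first record seen per package."""
    merged: Dict[str, Dict[str, Any]] = {}
    for records in result_sets:
        for record in records:
            # setdefault keeps the earlier record when a package repeats.
            merged.setdefault(record["packageName"], record)
    # Dictionaries preserve insertion order, so first-seen order is kept.
    return list(merged.values())
```

A package tagged with both categories, such as one carrying "RNASeq" and "Sequencing", then appears only once in the combined list.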

Q: Are annotation and experiment packages included? A: The actor focuses on the main software package repository. Annotation and experiment data packages are not included.

Q: What does "depends" vs "imports" vs "suggests" mean? A: These are the standard R package dependency levels. depends are packages attached alongside the package when it is loaded; imports are packages whose functions are used internally without being attached; suggests are optional extras needed only for examples, tests, or vignettes.

Q: Why are version constraints removed from dependency lists? A: The actor strips version constraints (e.g. R (>= 4.0) becomes R) to provide clean package name lists. The full constraint information is available on the package description page.
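
The stripping step can be illustrated with a short regular expression. This is a sketch of the idea under the assumption that the raw field is a comma-separated string in the usual R DESCRIPTION style; `strip_constraints` is a hypothetical name, and the actor's actual parsing may differ.

```python
import re
from typing import List

# Matches a parenthesized version constraint such as " (>= 4.0)".
_CONSTRAINT = re.compile(r"\s*\([^)]*\)")


def strip_constraints(field: str) -> List[str]:
    """Split a comma-separated dependency field into bare package names."""
    names = []
    for entry in field.split(","):
        # Remove the constraint, then surrounding whitespace and newlines.
        name = _CONSTRAINT.sub("", entry).strip()
        if name:
            names.append(name)
    return names
```

For example, `strip_constraints("R (>= 4.0), S4Vectors (>= 0.23.18), IRanges")` returns `["R", "S4Vectors", "IRanges"]`.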