CPAN Module Scraper

Pricing

from $3.00 / 1,000 results

Scrape CPAN (Comprehensive Perl Archive Network) via the MetaCPAN API. Search modules and releases, or fetch modules by exact name. Returns version, author, license, repository, and download info.

Developer

Crawler Bros

Maintained by Community

Actor stats

4 days ago

Last modified

Scrape module and release data from CPAN (Comprehensive Perl Archive Network) via the MetaCPAN public API. No authentication required.

What can it do?

  • Search modules by keyword, returning the latest indexed version of each matching module
  • Search releases by keyword, filtered to the latest stable release of each distribution
  • Fetch specific modules by exact name (e.g. Moose, DBI, Catalyst)
  • Return rich metadata for each record: version, author, license, repository URL, bug tracker, and download info

Input

| Field | Type | Description |
| --- | --- | --- |
| mode | enum: searchModules, searchReleases, getByName | Operation mode. Default: searchModules |
| searchQuery | string | Search query for modules or releases. Default: Moose |
| moduleNames | string[] | Exact module names for getByName mode (e.g. Moose, DBI) |
| maxItems | integer (1–500) | Maximum records to return. Default: 50 |

Mode descriptions

| Mode | Description |
| --- | --- |
| searchModules | Full-text search across the CPAN module index, latest versions only |
| searchReleases | Search releases (distributions) by name, latest stable only |
| getByName | Fetch one or more modules by their exact CPAN name |
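
The field constraints and defaults above can be checked before a run is started. The following is a minimal sketch, not part of the Actor itself: the helper name `build_input` is hypothetical, and omitting `searchQuery` in getByName mode is an assumption based on the example inputs.

```python
from typing import Any, Dict, List, Optional

# The three documented operation modes.
MODES = ("searchModules", "searchReleases", "getByName")


def build_input(
    mode: str = "searchModules",
    search_query: str = "Moose",
    module_names: Optional[List[str]] = None,
    max_items: int = 50,
) -> Dict[str, Any]:
    """Build an input object using the documented fields and defaults."""
    if mode not in MODES:
        raise ValueError(f"mode must be one of {MODES}, got {mode!r}")
    if not 1 <= max_items <= 500:
        raise ValueError("maxItems must be between 1 and 500")

    if mode == "getByName":
        if not module_names:
            raise ValueError("getByName needs at least one module name")
        # Assumption: searchQuery is not used in this mode, so it is left out.
        return {"mode": mode, "moduleNames": list(module_names), "maxItems": max_items}

    return {"mode": mode, "searchQuery": search_query, "maxItems": max_items}
```

Calling `build_input()` with no arguments yields the documented defaults, and an out-of-range `max_items` fails locally instead of at run time.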

Example input — Search modules

{
  "mode": "searchModules",
  "searchQuery": "Moose",
  "maxItems": 50
}

Example input — Search releases

{
  "mode": "searchReleases",
  "searchQuery": "Catalyst",
  "maxItems": 25
}

Example input — Fetch by name

{
  "mode": "getByName",
  "moduleNames": ["Moose", "DBI", "Catalyst"]
}
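
An input like the ones above can also be submitted from code. The sketch below assumes the `apify-client` Python package and its `ApifyClient.actor(...).call(run_input=...)` and `ApifyClient.dataset(...).iterate_items()` methods. The Actor ID is a hypothetical placeholder, since this page does not state it, and the token is assumed to be in an `APIFY_TOKEN` environment variable.

```python
from typing import Any, Dict, List

# Hypothetical placeholder: copy the real Actor ID from its Apify Store page.
ACTOR_ID = "<username>/<actor-name>"


def run_scraper(client: Any, run_input: Dict[str, Any], actor_id: str = ACTOR_ID) -> List[Dict[str, Any]]:
    """Start a run, wait for it to finish, and return the dataset items."""
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return run details")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


def main() -> None:
    """Example entry point; requires `pip install apify-client`."""
    import os

    from apify_client import ApifyClient

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    run_input = {"mode": "getByName", "moduleNames": ["Moose", "DBI", "Catalyst"]}
    for record in run_scraper(client, run_input):
        print(record.get("moduleName"), record.get("version"))
```

Passing the client in as an argument keeps `run_scraper` easy to exercise with a stub object, so the surrounding logic can be tested without starting a paid run.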

Output

Each record in the dataset represents one CPAN module or release.

| Field | Type | Description |
| --- | --- | --- |
| moduleName | string | Main module name (.pm suffix removed) |
| distributionName | string | CPAN distribution name (e.g. Moose, DBIx-Class) |
| description | string | Module abstract / short description |
| version | string | Module version |
| author | string | CPAN author ID (PAUSE ID) |
| license | string[] | License identifiers (release mode only) |
| releaseDate | string | ISO timestamp of this release |
| status | string | Index status: latest, cpan, or backpan |
| mainModule | string | Primary module name of the distribution (release mode) |
| repositoryUrl | string | Source code repository URL |
| homepage | string | Project homepage URL |
| bugTracker | string | Bug tracker URL |
| downloadUrl | string | Direct download URL of the .tar.gz archive |
| sourceUrl | string | MetaCPAN distribution page URL |
| recordType | string | Always "module" |
| scrapedAt | string | ISO timestamp when record was scraped |
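
For downstream code, the fields above can be mapped onto a typed structure. This is a rough sketch covering a subset of the fields. The class name `CpanRecord` is hypothetical, and it assumes that fields which do not apply to a mode (such as license outside release mode) may be missing or null.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class CpanRecord:
    module_name: str
    distribution_name: str
    version: str  # kept as a string: float("2.4000") would lose the trailing zeros
    author: str
    status: Optional[str] = None
    # Documented as populated in release mode only, so default to an empty list.
    license: List[str] = field(default_factory=list)
    repository_url: Optional[str] = None
    download_url: Optional[str] = None

    @classmethod
    def from_item(cls, item: Dict[str, Any]) -> "CpanRecord":
        """Map one dataset item (camelCase keys) onto the dataclass."""
        return cls(
            module_name=item.get("moduleName") or "",
            distribution_name=item.get("distributionName") or "",
            version=item.get("version") or "",
            author=item.get("author") or "",
            status=item.get("status"),
            license=list(item.get("license") or []),
            repository_url=item.get("repositoryUrl"),
            download_url=item.get("downloadUrl"),
        )
```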

Example output record

{
  "moduleName": "Moose",
  "distributionName": "Moose",
  "description": "A postmodern object system for Perl 5",
  "version": "2.4000",
  "author": "ETHER",
  "license": ["perl_5"],
  "releaseDate": "2025-07-04T21:24:15",
  "status": "latest",
  "mainModule": "Moose",
  "repositoryUrl": "https://github.com/moose/Moose",
  "homepage": "http://moose.perl.org/",
  "bugTracker": "https://rt.cpan.org/Dist/Display.html?Name=Moose",
  "downloadUrl": "https://cpan.metacpan.org/authors/id/E/ET/ETHER/Moose-2.4000.tar.gz",
  "sourceUrl": "https://metacpan.org/dist/Moose",
  "recordType": "module",
  "scrapedAt": "2025-05-30T10:00:00+00:00"
}
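
In the example record, releaseDate has no UTC offset while scrapedAt carries one, so the two values cannot be compared directly as Python datetimes. The sketch below normalizes both. The helper name `parse_timestamp` is hypothetical, and treating offset-less values as UTC is an assumption that this page does not confirm.

```python
from datetime import datetime, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO timestamp from a record into an aware UTC datetime."""
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are expressed in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)
```

With both fields normalized, an expression such as `parse_timestamp(record["scrapedAt"]) - parse_timestamp(record["releaseDate"])` gives the time between the release and the scrape without raising a naive/aware comparison error.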

Frequently Asked Questions

Does this require an API key? No. The MetaCPAN API is fully public and does not require authentication.

What is the difference between searchModules and searchReleases? searchModules searches at the level of individual Perl modules (.pm files). searchReleases searches at the distribution level; a distribution is a tarball that may contain many modules. For most use cases, searchReleases gives cleaner results with one record per package.
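
To see the module/distribution relationship in searchModules output, the records can be grouped by distributionName. A minimal sketch, with a hypothetical helper name and the assumption that every record carries a moduleName:

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def modules_by_distribution(records: Iterable[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Group module names under the distribution that ships them."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for record in records:
        # Records without a distribution name are collected under a marker key.
        dist = record.get("distributionName") or "(unknown)"
        grouped[dist].append(record["moduleName"])
    return {dist: sorted(names) for dist, names in grouped.items()}
```

Each key of the result then corresponds to one package, which is roughly the view that searchReleases provides directly.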

Why does a search for "Moose" return modules from other distributions? The search index covers every indexed file, not only the files in the Moose distribution. A module can be bundled inside another distribution, for example when a distribution vendors Moose as a dependency, so the same module name can appear under more than one distributionName.

What does status: backpan mean? BackPAN is the historical archive of CPAN. A record with status backpan belongs to a package that has been removed from the active CPAN index but is still available for download.
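
If archived packages should be handled separately, records can be split on the status field. This is a small sketch with a hypothetical helper name; it treats any status other than backpan as still active.

```python
from typing import Any, Dict, Iterable, List, Tuple

Record = Dict[str, Any]


def split_by_backpan(records: Iterable[Record]) -> Tuple[List[Record], List[Record]]:
    """Return (active, archived), where archived holds status == 'backpan'."""
    active: List[Record] = []
    archived: List[Record] = []
    for record in records:
        if record.get("status") == "backpan":
            archived.append(record)
        else:
            active.append(record)
    return active, archived
```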

How fresh is the data? Each run fetches live data from the MetaCPAN API. MetaCPAN typically indexes new uploads within minutes of their arrival on CPAN.
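
To spot-check a record against the live index, MetaCPAN can be queried directly. The sketch below assumes the public `https://fastapi.metacpan.org/v1/module/<name>` endpoint. Whether the Actor uses this exact endpoint is not stated on this page, and both helper names are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL of the public MetaCPAN API.
API_BASE = "https://fastapi.metacpan.org/v1"


def module_url(name: str) -> str:
    """Build the lookup URL for one module, keeping '::' separators intact."""
    return f"{API_BASE}/module/{quote(name, safe=':')}"


def fetch_module(name: str, timeout: float = 10.0) -> dict:
    """Fetch the current index entry for a module (performs a network call)."""
    with urlopen(module_url(name), timeout=timeout) as response:
        return json.load(response)
```

Comparing the version in the response from `fetch_module("Moose")` with the version field of a scraped record is one way to check how current a dataset is.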

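Can I fetch a specific version of a module? The getByName mode always returns the latest indexed version. searchReleases results can be filtered by version, but because that mode is limited to the latest stable release of each distribution, older versions may not appear in the results.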
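
Filtering a result set by version could look like the sketch below. The helper name is hypothetical, and whether a given version is present at all depends on what the run returned.

```python
from typing import Any, Dict, Iterable, List


def with_version(records: Iterable[Dict[str, Any]], distribution: str, version: str) -> List[Dict[str, Any]]:
    """Keep records of one distribution whose version string matches exactly."""
    # Versions are compared as strings: "2.4000" and "2.4" are different values.
    return [
        record
        for record in records
        if record.get("distributionName") == distribution and record.get("version") == version
    ]
```

An empty list from `with_version(records, "Moose", "2.4000")` means that version was not part of the run's output.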