Maven Central Scraper — Java Package & Artifact Extractor

Pricing

Pay per usage

Search and extract Java package metadata from Maven Central Repository. Get artifact details, versions, timestamps, and dependency info via the public Solr API.

Developer

Pierrick McD0nald

Maintained by Community

Last modified: 5 days ago

Extract Java package metadata from the Maven Central Repository via its public Solr API. Search by keyword, group ID, or artifact ID to retrieve artifact details, versions, timestamps, packaging info, and tags. Ideal for dependency analysis, security auditing, and Java ecosystem research.
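
To illustrate the kind of request involved, here is a minimal sketch that builds a Solr query URL from a keyword, group ID, or artifact ID. The endpoint URL and the `q`, `rows`, `wt`, and `core=gav` parameters follow Sonatype's public REST API guide; they are assumptions here, since this page does not say exactly which requests the Actor sends. The function name `build_search_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint, taken from Sonatype's public search API guide,
# not from this Actor's documentation.
SOLR_ENDPOINT = "https://search.maven.org/solrsearch/select"


def build_search_url(keyword=None, group_id=None, artifact_id=None,
                     rows=20, all_versions=False):
    """Build a Maven Central search URL (illustrative only)."""
    clauses = []
    if keyword:
        clauses.append(keyword)
    if group_id:
        clauses.append(f'g:"{group_id}"')      # g: filters on groupId
    if artifact_id:
        clauses.append(f'a:"{artifact_id}"')   # a: filters on artifactId
    if not clauses:
        raise ValueError("provide a keyword, group_id, or artifact_id")

    params = {"q": " AND ".join(clauses), "rows": rows, "wt": "json"}
    if all_versions:
        # core=gav asks the index for one document per published version.
        params["core"] = "gav"
    return f"{SOLR_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url(group_id="com.google.code.gson", artifact_id="gson"))
```

The sketch only constructs the URL; fetching and paging through results is left out.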

Use Cases

  • Dependency Research — Find artifacts by group ID or artifact ID for build tool configuration
  • Security Auditing — Extract version counts and publication dates to identify outdated dependencies
  • Ecosystem Analysis — Batch-search Maven Central to map Java library trends and popularity
  • License Compliance — Gather artifact metadata for open-source compliance reporting

Input

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| searchQueries | Array | No | Search terms to query Maven Central (e.g., ["gson", "junit"]). Each term returns up to maxResults artifacts. |
| groupId | String | No | Exact Maven groupId to filter by (e.g., com.google.code.gson). |
| artifactId | String | No | Exact Maven artifactId to filter by (e.g., gson). |
| includeAllVersions | Boolean | No | If true, fetches all published versions for each artifact. Default: false. |
| maxResults | Integer | No | Maximum artifacts per query (1–1000). Default: 100. |
| proxyConfiguration | Object | No | Proxy settings for requests. Built-in proxy support included by default. |
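
As a rough sketch of how these fields fit together, the helper below assembles an input object and checks the documented 1–1000 range for maxResults. The function name `build_run_input` is hypothetical, and proxyConfiguration is omitted because its structure is not described here.

```python
import json


def build_run_input(search_queries=None, group_id=None, artifact_id=None,
                    include_all_versions=False, max_results=100):
    """Assemble an Actor input dict from the documented fields (illustrative)."""
    if not 1 <= max_results <= 1000:
        raise ValueError("maxResults must be between 1 and 1000")

    run_input = {
        "includeAllVersions": include_all_versions,
        "maxResults": max_results,
    }
    # All search fields are optional, so only include the ones provided.
    if search_queries:
        run_input["searchQueries"] = list(search_queries)
    if group_id:
        run_input["groupId"] = group_id
    if artifact_id:
        run_input["artifactId"] = artifact_id
    return run_input


if __name__ == "__main__":
    print(json.dumps(build_run_input(["gson", "junit"], max_results=50), indent=2))
```

The resulting dict can be pasted into the Actor's input editor as JSON or passed to an API client as the run input.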

Output

The Actor outputs a dataset with the following fields:

{
  "groupId": "com.google.code.gson",
  "artifactId": "gson",
  "latestVersion": "2.14.0",
  "version": "2.14.0",
  "packaging": "jar",
  "repositoryId": "central",
  "versionCount": 44,
  "timestamp": 1776970907000,
  "publishedAt": "2026-05-10T12:21:47.000Z",
  "tags": ["library", "json", "gson"],
  "searchQuery": "gson"
}

When includeAllVersions is enabled, each version becomes a separate row with its own version and publishedAt fields.
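
Because each version arrives as its own row, downstream code usually needs to regroup rows by artifact. The sketch below shows one way to do that, assuming rows are dicts with the groupId, artifactId, version, and publishedAt fields shown above. The function name and the sample rows (`com.example:demo`) are hypothetical and exist only for illustration.

```python
from collections import defaultdict


def versions_by_artifact(rows):
    """Map 'groupId:artifactId' to its versions, oldest first."""
    grouped = defaultdict(list)
    for row in rows:
        key = f'{row["groupId"]}:{row["artifactId"]}'
        grouped[key].append((row["publishedAt"], row["version"]))

    # publishedAt values are ISO 8601 UTC strings in one fixed format,
    # so sorting them as plain strings orders them chronologically.
    return {key: [version for _, version in sorted(pairs)]
            for key, pairs in grouped.items()}


if __name__ == "__main__":
    # Hypothetical rows, not real Maven Central data.
    sample_rows = [
        {"groupId": "com.example", "artifactId": "demo",
         "version": "1.1.0", "publishedAt": "2021-06-01T00:00:00.000Z"},
        {"groupId": "com.example", "artifactId": "demo",
         "version": "1.0.0", "publishedAt": "2020-01-15T00:00:00.000Z"},
    ]
    print(versions_by_artifact(sample_rows))
```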

Pricing

Pay per event: $0.002 per artifact scraped.

Charging is applied after each batch of results. The Actor respects spending limits and stops gracefully when the limit is reached.
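
For budgeting, the arithmetic can be sketched as follows. It assumes every dataset row counts as one charged artifact; this page does not confirm whether that holds when includeAllVersions is enabled, so treat the result as an estimate. The function names are hypothetical.

```python
from decimal import Decimal

PRICE_PER_ARTIFACT = Decimal("0.002")  # USD per artifact, from the pricing above


def estimate_cost(artifact_count: int) -> Decimal:
    """Estimated charge in USD for a given number of scraped artifacts."""
    return PRICE_PER_ARTIFACT * artifact_count


def max_artifacts_for_budget(budget_usd: Decimal) -> int:
    """Largest number of artifacts that fits within a spending limit."""
    return int(budget_usd // PRICE_PER_ARTIFACT)


if __name__ == "__main__":
    # Three queries at the default maxResults of 100 yield at most 300 artifacts.
    print(estimate_cost(3 * 100))
    print(max_artifacts_for_budget(Decimal("1.00")))
```

Under that assumption, three queries at the default maxResults of 100 cost at most 300 × $0.002 = $0.60, and a $1.00 spending limit covers up to 500 artifacts.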

Limitations

  • The Maven Central API has no documented rate limits, but excessive request volume may trigger throttling. Keep maxResults no higher than your use case needs.
  • The API returns only artifacts published to Maven Central. Private repositories or other hosts (JCenter, GitHub Packages) are not covered.
  • includeAllVersions generates additional API calls. Large artifact histories may increase run time.

FAQ

Q: Do I need a Maven Central API key?
A: No. This Actor uses the public Solr search endpoint, which requires no authentication.

Q: Can I search by partial artifact name?
A: Yes. Use searchQueries with partial names. The Actor queries Maven Central's search index, which supports partial matching.

Q: What is the difference between latestVersion and version?
A: latestVersion is the most recent version known for the artifact. version is the specific version for that row. When includeAllVersions is disabled, the two are identical.

Q: How do I find all artifacts in a group?
A: Set groupId to the exact group (e.g., org.apache.commons) and leave searchQueries empty.

Changelog

  • v1.0.0 — Initial release. Search by query, groupId, or artifactId. Optional full version history.