GitLab Scraper: Projects & Repository Data
Pricing
from $20.00 / 1,000 results
Extract GitLab project data: stars, forks, issues, merge requests, contributor stats. An alternative to GitHub Stats, Sourcegraph and OpenHub for dev analytics, OSS intelligence and engineering dashboards. Pay per project, no GitLab token needed.
Rating: 0.0 (0)
Developer: Stephan Corbeil
Maintained by Community
Actor stats
- Bookmarked: 0
- Total users: 3
- Monthly active users: 1
- Last modified: 2 days ago
GitLab Scraper: Projects, Merge Requests & Issues vs GitLab API
A pay-per-result GitLab scraper that extracts full project metadata, stars, forks, commit cadence, merge-request stats, issue counts, and CI/CD pipeline history from public GitLab.com and self-hosted GitLab instances. Built for OSS scouts, dev-tool competitive intel, and procurement researchers as a no-token-rotation alternative to the official GitLab REST API (10 req/sec authenticated), the GraphQL API (point budget), Sourcegraph Cloud ($299-1000+/mo), and SaaS aggregators such as LinearB ($25-79/user/mo) and Velocity by Code Climate ($499+/mo).
Why GitLab Scraper Beats the GitLab API, Sourcegraph & LinearB
| Feature | NexGenData GitLab Scraper | GitLab official API | Sourcegraph Cloud | LinearB |
|---|---|---|---|---|
| Cost | $0.002 / project, pay-per-result | Free + rate-limited | $299-1000+ / month | $25-79 / user / month |
| Rate limit | None for end user | 10 req/sec authenticated | Plan-dependent | Plan-dependent |
| Self-hosted GitLab support | Yes (set instance URL) | Yes (with PAT) | Yes (with PAT) | Yes (with PAT) |
| Auth | Apify token + optional GitLab PAT | GitLab PAT required for most ops | Account + PAT | Account + PAT |
| Bulk export | Direct dataset → JSON/CSV/Excel | Per-call REST + pagination | UI + limited API | API |
| Cross-org / cross-instance scan | Yes | No (instance-scoped) | Yes | Yes |
| Free trial | Free Apify credits | Free for low volume | 30-day trial | 30-day trial |
OSS scouts, devtool marketers, and procurement researchers pick this actor instead of building their own GitLab PAT-rotation rig because, above roughly 2,000 projects, the official API's per-second cap forces multi-hour scans. It is a drop-in alternative to Sourcegraph Cloud for the "I just need GitLab project metadata for my BI tool" case: the bare data without the UI overhead.
What You Get Per Project
Each dataset item is a flat JSON record:
- id, namespace, path, web_url, description
- stars, forks, open_issues, closed_issues, open_mrs, merged_mrs
- primary_language, language_breakdown
- topics, license, default_branch, visibility
- created_at, last_activity_at, last_release_at
- commits_last_90d, unique_committers_last_90d
- top_committers: array of {username, name, commits}
- merge_request_stats: average lead time, throughput, review latency
- pipeline_success_rate_90d, last_pipeline_status
- readme_text, archived, mirror, forked_from
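As a rough illustration of consuming one of these records, the sketch below maps a dataset item onto a small typed view. The field names are taken from the list above; the Python types, the defaults, and the `ProjectSummary` class itself are assumptions made for the example, not part of the actor.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ProjectSummary:
    """Hypothetical typed view over a subset of the documented fields.

    Field names come from the record description; types are assumed.
    """
    path: str
    web_url: str
    stars: int = 0
    forks: int = 0
    open_issues: int = 0
    commits_last_90d: int = 0
    primary_language: Optional[str] = None
    topics: list[str] = field(default_factory=list)
    archived: bool = False

    @classmethod
    def from_item(cls, item: dict[str, Any]) -> "ProjectSummary":
        # path and web_url are treated as required; everything else
        # falls back to a default when missing or null.
        return cls(
            path=item["path"],
            web_url=item["web_url"],
            stars=int(item.get("stars") or 0),
            forks=int(item.get("forks") or 0),
            open_issues=int(item.get("open_issues") or 0),
            commits_last_90d=int(item.get("commits_last_90d") or 0),
            primary_language=item.get("primary_language"),
            topics=list(item.get("topics") or []),
            archived=bool(item.get("archived", False)),
        )
```

Reading optional fields through `.get()` keeps the loader tolerant of records where a counter is absent, which seems a safer default than indexing every key directly.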
Use Cases
- OSS scouts: find growing GitLab-hosted projects that don't show up on GitHub trending
- Devtool marketers: discover teams running self-hosted GitLab for outbound targeting
- DevOps procurement: benchmark a target company's CI velocity before pitching them
- Migration consultancies: quantify the scope of a GitLab-to-GitHub migration RFP
- Open-source maintainers: track fork activity and downstream contributions across instances
- Investor diligence: verify "X projects on GitLab" claims with raw data
- Internal-platform teams: audit your own org's GitLab health across thousands of repos
Quick Start
```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")

# Start the actor and wait for the run to finish.
run = client.actor("nexgendata/gitlab-scraper").call(run_input={
    "instance": "https://gitlab.com",
    "search": "kubernetes",
    "minStars": 10,
    "maxResults": 1000,
    "includeMergeRequestStats": True,
})

# Read the results from the run's default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item["path"], item["stars"], item["commits_last_90d"])
```
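For BI-tool workflows, the items iterated in the Quick Start can also be written out locally. The following is a minimal sketch using only the standard library; the `items_to_csv` helper and the chosen column set are illustrative assumptions, and the items are expected to be plain dicts shaped like the records described earlier.

```python
import csv
import io
from typing import Any, Iterable, Sequence

# Columns picked for illustration; other scalar fields would work the same way.
COLUMNS = ("path", "web_url", "stars", "forks", "commits_last_90d")


def items_to_csv(items: Iterable[dict[str, Any]],
                 columns: Sequence[str] = COLUMNS) -> str:
    """Render dataset items as CSV text, keeping only `columns`."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(columns),
        extrasaction="ignore",   # drop fields that are not in `columns`
        lineterminator="\n",
    )
    writer.writeheader()
    for item in items:
        # Keys missing from an item become empty cells (DictWriter's restval).
        writer.writerow(item)
    return buffer.getvalue()
```

Nested fields such as top_committers or merge_request_stats are left out here on purpose, since they would need flattening before they fit a single CSV cell.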
Pricing
Pay-per-event: no PAT-rotation rig required, no monthly minimum.
- Actor Start: $0.0001
- Per project: $0.002
A 1,000-project search costs about $2. An equivalent rate-respecting scan against the GitLab API takes 2+ hours of PAT juggling.
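The arithmetic behind that estimate can be sketched as follows. The two prices are the ones listed above; the `estimate_run_cost` helper is hypothetical, and it assumes a flat Actor Start charge per run.

```python
ACTOR_START_USD = 0.0001  # "Actor Start" price listed above
PER_PROJECT_USD = 0.002   # "Per project" price listed above


def estimate_run_cost(projects: int, runs: int = 1) -> float:
    """Estimate the cost in USD of scraping `projects` projects over `runs` runs.

    Assumes a flat start charge per run, which may not hold for every
    memory configuration.
    """
    if projects < 0 or runs < 0:
        raise ValueError("projects and runs must be non-negative")
    return round(runs * ACTOR_START_USD + projects * PER_PROJECT_USD, 4)


# One run returning 1,000 projects: 0.0001 + 1000 * 0.002
print(estimate_run_cost(1000))
```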
Related NexGenData Actors
| Use case | Actor |
|---|---|
| GitHub repos + stars + contributors | GitHub Scraper |
| Daily / weekly trending repos | GitHub Trending Scraper |
| Deep stargazer-history analytics | GitHub Repo Stats |
| Docker Hub image pull counts | Docker Hub Scraper |
| Dev.to articles & dev audience | Dev.to Scraper |
| Developer tools MCP server | Developer Tools MCP Server |
| Company tech-stack detector | Company Tech Stack Detector |
| StackOverflow Q&A trends | StackOverflow Questions |
FAQ
Q: Does this work with self-hosted GitLab?
Yes. Set instance: "https://gitlab.yourcompany.com" and supply a PAT in gitlabToken. The actor uses the same API surface across self-hosted and gitlab.com.
Q: Does this need a GitLab PAT?
Only for self-hosted instances, private projects, or higher request volumes. Public gitlab.com searches work without one.
Q: How does this compare to the GitLab GraphQL API?
GraphQL gives you per-call efficiency but ties you to a point budget. This actor flattens the data, runs the multi-call dance for you, and returns one row per project.
Q: Can I get pipeline history?
Yes. Setting includePipelineHistory: true adds the last 90 days of pipeline runs per project.
Q: Schema stability?
Field names are versioned per actor release. We track GitLab major API versions.
Q: What about merge-request review-latency metrics?
merge_request_stats includes avg_lead_time_hours, avg_review_latency_hours, and throughput_per_week. These are computed from the public MR events stream.
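The input options mentioned in these answers can be combined into a single run input. The sketch below is one way to assemble it: the keys (instance, search, maxResults, gitlabToken, includePipelineHistory, includeMergeRequestStats) are the ones named on this page, while the `build_run_input` helper and its defaults are assumptions for illustration.

```python
from typing import Any, Optional


def build_run_input(
    instance: str = "https://gitlab.com",
    search: str = "",
    gitlab_token: Optional[str] = None,
    include_pipeline_history: bool = False,
    include_merge_request_stats: bool = True,
    max_results: int = 1000,
) -> dict[str, Any]:
    """Assemble a run input dict from the options described in the FAQ."""
    run_input: dict[str, Any] = {
        "instance": instance,
        "search": search,
        "maxResults": max_results,
        "includePipelineHistory": include_pipeline_history,
        "includeMergeRequestStats": include_merge_request_stats,
    }
    # Only send a PAT when one is supplied; public gitlab.com searches
    # are described as working without it.
    if gitlab_token:
        run_input["gitlabToken"] = gitlab_token
    return run_input


# Self-hosted example with pipeline history enabled (placeholder token).
self_hosted = build_run_input(
    instance="https://gitlab.yourcompany.com",
    search="kubernetes",
    gitlab_token="YOUR_GITLAB_PAT",
    include_pipeline_history=True,
)
```

The resulting dict can be passed as run_input to the same call shown in the Quick Start.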
About NexGenData
NexGenData publishes 260+ buyer-intent actors covering SEC filings, YC alumni, lead generation, competitive intelligence, stock fundamentals across 30+ exchanges, and more. All pay-per-result. Browse the full catalog at https://apify.com/nexgendata?fpr=2ayu9b
How NexGenData Pricing Works
Every NexGenData actor uses pay-per-event pricing: you only pay for results that actually land in your dataset. No monthly minimum, no seat fees, no surprise overage bills.
- Actor Start: a single-event charge each time you spin the actor up (scaled to memory size)
- Result / item: charged per item written to the default dataset
- No charge for retries, internal proxy rotation, or failed sub-requests; those are absorbed by the platform
Apify Platform Bonus
New to Apify? Sign up with the NexGenData referral link: you get free platform credits on signup (enough for several thousand free results) and you help fund the maintenance of this actor fleet.
Integration Surface
Every actor in the NexGenData catalog can be triggered from:
- Apify console: point-and-click run
- Apify API: REST + webhooks
- Apify Python / JS SDKs: programmatic batch
- Zapier, Make.com, n8n: official integrations
- MCP: many actors are exposed as MCP tools for Claude / ChatGPT / Cursor agents
- Schedules: built-in cron for daily / weekly / monthly runs
- Webhooks: POST results to any HTTPS endpoint on dataset write
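For the REST route in the list above, a run can be started without the SDK. This is a hedged sketch that only builds the HTTP request against the Apify API v2 "run actor" endpoint using the standard library. The `build_run_request` helper is hypothetical, and the exact request options should be checked against the Apify API reference before relying on it.

```python
import json
import urllib.request
from typing import Any

APIFY_API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str,
                      run_input: dict[str, Any]) -> urllib.request.Request:
    """Build (but do not send) a POST request that starts an actor run."""
    # In API paths, "username/actor-name" is written as "username~actor-name".
    url = f"{APIFY_API_BASE}/acts/{actor_id.replace('/', '~')}/runs"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_run_request(
    "nexgendata/gitlab-scraper",
    "YOUR_APIFY_TOKEN",
    {"instance": "https://gitlab.com", "search": "kubernetes", "maxResults": 100},
)
# Sending it would be: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it easy to inspect the URL and payload before spending credits on a run.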
Support
NexGenData maintains 260+ Apify actors and ships updates regularly. Bug reports via the Apify console issues tab get a response within 24 hours. Roadmap requests are welcome; high-demand features ship in the next version.
Home: thenextgennexus.com
Full catalog: apify.com/nexgendata