GitLab Public Projects Scraper | Stars, Forks, Topics
Pricing
Pay per usage
Harvest records from multiple GitLab sources in a single run and get a unified, normalized result set. Pull names, identifiers, dates, descriptions, status flags, and source links per record. Suited to research, lead generation, and intelligence pipelines.
Developer
ParseForge
Maintained by Community
🦊 GitLab Public Projects Scraper
🚀 Pull public GitLab projects with stars, forks, topics, and owners in seconds. Built on the official GitLab REST API.
🕒 Last updated: 2026-05-27 · 📊 25 fields per record · All public gitlab.com projects · Search, sort, and topic filters
The GitLab Public Projects Scraper queries the official gitlab.com/api/v4/projects endpoint and returns one normalized record per public project. Useful for tracking open-source DevOps tooling, discovering self-hosted alternatives to GitHub repos, monitoring topic communities (kubernetes, gitops, AI), or building competitive intelligence on the open-source ecosystem.
Coverage: every public project on gitlab.com. Filters by search query, topic, and sort order (stars, last activity, created date). Up to 1,000,000 records per run on the paid plan.
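As a rough illustration of the kind of request described above, the sketch below builds a query URL for the public projects endpoint using the GitLab REST API's `search`, `topic`, `order_by`, `sort`, `visibility`, and `per_page` query parameters. The helper name `build_projects_url` and its defaults are assumptions for illustration; how the Actor assembles its requests internally is not documented here.

```python
from urllib.parse import urlencode

GITLAB_PROJECTS_API = "https://gitlab.com/api/v4/projects"


def build_projects_url(search=None, topic=None, order_by="star_count",
                       sort="desc", per_page=100):
    """Hypothetical helper: build a query URL for public gitlab.com projects."""
    params = {
        "visibility": "public",
        "order_by": order_by,
        "sort": sort,
        "per_page": per_page,  # GitLab caps page size at 100
    }
    # Only include optional filters when the caller supplied them.
    if search:
        params["search"] = search
    if topic:
        params["topic"] = topic
    return f"{GITLAB_PROJECTS_API}?{urlencode(params)}"


react_url = build_projects_url(search="react")
k8s_url = build_projects_url(topic="kubernetes", order_by="last_activity_at")
```

The URL can be fetched with any HTTP client; no token is needed for public projects.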
| 🎯 Target Audience | 💡 Primary Use Cases |
|---|---|
| Developer relations | Map ecosystem of related OSS projects |
| Security teams | Find packages by topic or maintainer |
| Recruiters | Spot top GitLab contributors |
| Researchers | Study OSS contribution patterns |
📋 What the GitLab Public Projects Scraper does
- Queries the public GitLab REST API directly
- Returns 25 normalized fields per project (name, path, stars, forks, topics, license, owner...)
- Supports search, sort, topic, and ascending/descending order
- Writes results to an Apify dataset with multiple table outputs
- Auto-limits to 10 items on the free plan; up to 1,000,000 on paid
💡 Why it matters: GitLab hosts millions of projects but no public search UI exposes them at scale. This Actor turns that catalog into a queryable dataset.
🎬 Full Demo (🚧 Coming soon)
⚙️ Input
| Field | Type | Description |
|---|---|---|
| search | string | Search term, e.g. react |
| maxItems | integer | Cap on rows (free: 10) |
| orderBy | enum | star_count, last_activity_at, created_at, etc. |
| sort | enum | asc / desc |
| topic | string | Filter by topic, e.g. kubernetes |
{ "search": "react", "maxItems": 100, "orderBy": "star_count", "sort": "desc" }
{ "topic": "kubernetes", "orderBy": "last_activity_at", "maxItems": 200 }
⚠️ Good to Know: Without a token, GitLab's API rate-limits to ~10 requests per second per IP. The Actor paces requests automatically.
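The Actor's own pacing logic is not documented here. For anyone calling the API directly, a minimal sketch of one common approach is to enforce a fixed minimum interval between requests. The class name `Pacer` and the 5 requests-per-second default are hypothetical choices, not values taken from the Actor.

```python
import time


class Pacer:
    """Hypothetical helper: enforce a minimum interval between calls."""

    def __init__(self, max_per_second=5.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock    # injectable so the pacer can be tested without waiting
        self._sleep = sleep
        self._last = None      # time of the previous call, None before the first one

    def wait(self):
        """Block until at least min_interval has passed since the previous call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `pacer.wait()` before each HTTP request keeps the request rate at or below the configured ceiling.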
📊 Output
| Field | Type | Description |
|---|---|---|
| 🖼 imageUrl | string | Project avatar URL |
| 📌 name | string | Project name |
| 📌 nameWithNamespace | string | Full namespaced name |
| 🔗 url | string | gitlab.com web URL |
| 🆔 id | integer | GitLab project ID |
| 📁 path | string | URL slug |
| 📁 pathWithNamespace | string | Full path with namespace |
| 📝 description | string | Project description |
| 🌿 defaultBranch | string | Default branch name |
| 👁 visibility | string | public / internal |
| ⭐ starCount | number | Stars |
| 🍴 forksCount | number | Forks |
| 🐛 openIssuesCount | number | Open issues |
| 🏷 topics | array | Topics list |
| 🏷 tagList | array | Tags |
| 📜 license | string | License name |
| 🕒 createdAt | string | Created ISO timestamp |
| 🕒 lastActivityAt | string | Last activity ISO timestamp |
| 🔗 readmeUrl | string | README URL |
| 🔑 sshUrl | string | SSH clone URL |
| 🔗 httpUrl | string | HTTPS clone URL |
| 👤 owner | string | Namespace name |
| 👤 ownerPath | string | Namespace path |
| 👤 ownerKind | string | user / group |
| 🕒 scrapedAt | string | ISO timestamp |
| ❌ error | string or null | Error message if extraction failed |
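To show how such a record could be derived from a raw API response, here is a minimal sketch of a normalizer. The mapping from GitLab's snake_case keys (`web_url`, `ssh_url_to_repo`, `namespace`, and so on) to the field names in the table is inferred from the names and is an assumption, as is the function name `normalize_project`; the Actor's real mapping may differ.

```python
from datetime import datetime, timezone


def normalize_project(raw):
    """Map a raw GitLab project dict to camelCase output fields (assumed mapping)."""
    namespace = raw.get("namespace") or {}
    license_info = raw.get("license") or {}  # may be absent from list responses
    return {
        "imageUrl": raw.get("avatar_url"),
        "name": raw.get("name"),
        "nameWithNamespace": raw.get("name_with_namespace"),
        "url": raw.get("web_url"),
        "id": raw.get("id"),
        "path": raw.get("path"),
        "pathWithNamespace": raw.get("path_with_namespace"),
        "description": raw.get("description"),
        "defaultBranch": raw.get("default_branch"),
        "visibility": raw.get("visibility"),
        "starCount": raw.get("star_count", 0),
        "forksCount": raw.get("forks_count", 0),
        "openIssuesCount": raw.get("open_issues_count"),
        "topics": raw.get("topics") or [],
        "tagList": raw.get("tag_list") or [],
        "license": license_info.get("name"),
        "createdAt": raw.get("created_at"),
        "lastActivityAt": raw.get("last_activity_at"),
        "readmeUrl": raw.get("readme_url"),
        "sshUrl": raw.get("ssh_url_to_repo"),
        "httpUrl": raw.get("http_url_to_repo"),
        "owner": namespace.get("name"),
        "ownerPath": namespace.get("full_path") or namespace.get("path"),
        "ownerKind": namespace.get("kind"),
        "scrapedAt": datetime.now(timezone.utc).isoformat(),
        "error": None,  # no extraction error in this sketch
    }
```

Missing keys become `None` or an empty list rather than raising, so a sparse project still yields a complete record.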
✨ Why choose this Actor
- Direct GitLab REST API, no third-party caching
- Real-time data, never stale
- Pay-per-result pricing; only charged for what you keep
- Works with Make, Zapier, n8n, Airbyte, GitHub Actions, Google Sheets
📈 How it compares to alternatives
| Approach | Cost | Maintenance | Coverage |
|---|---|---|---|
| GitLab UI search | Free | Manual | One page at a time |
| Self-built API client | Dev cost | High | Custom |
| This actor | Pay per result | None | Full search + filters |
🚀 How to use
- Create a free Apify account with $5 credit
- Open the GitLab Public Projects Scraper actor page
- Set search, orderBy, optionally topic, and maxItems
- Click Start and wait for the run to finish
- Read the results from the dataset's table outputs
💼 Business use cases
Developer relations
| Need | How this Actor helps |
|---|---|
| Ecosystem mapping | Pull all projects by topic |
| Influencer tracking | Sort by stars to find top owners |
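One way to act on the star-sorted output is to total stars per owner once the dataset has been exported. The sketch below assumes the records are already loaded as a list of dicts with the `ownerPath`, `starCount`, and `error` fields from the output table; the function name `top_owners_by_stars` is hypothetical.

```python
from collections import Counter


def top_owners_by_stars(records, limit=5):
    """Sum starCount per ownerPath and return the highest totals first."""
    totals = Counter()
    for rec in records:
        if rec.get("error"):
            continue  # skip rows where extraction failed
        owner = rec.get("ownerPath") or "unknown"
        totals[owner] += rec.get("starCount") or 0
    return totals.most_common(limit)
```

The result is a list of `(ownerPath, total_stars)` tuples, ready to print or write to a sheet.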
Security
| Need | How this Actor helps |
|---|---|
| Supply chain audit | Scan public projects in your topic |
| License compliance | Filter results on the license output field |
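The input table has no license option, so this kind of check would run on the exported records. A minimal sketch, assuming each record carries the `license` name string from the output table; the helper names and the example license names in the usage note are illustrative only.

```python
def filter_by_license(records, allowed):
    """Keep records whose license name matches one of the allowed names (case-insensitive)."""
    allowed_lower = {name.lower() for name in allowed}
    return [r for r in records if (r.get("license") or "").lower() in allowed_lower]


def missing_license(records):
    """Return records with no license name, which usually need manual review."""
    return [r for r in records if not r.get("license")]
```

For example, `filter_by_license(records, ["MIT License", "Apache License 2.0"])` keeps only projects reporting one of those two names.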
Recruiting
| Need | How this Actor helps |
|---|---|
| Find OSS contributors | Sort by activity, drill into owner |
| Skill mapping | Filter by topic, including language topics such as python or rust |
Research
| Need | How this Actor helps |
|---|---|
| OSS trends | Bulk pull over time |
| Comparative analysis | GitLab vs GitHub trends |
🔌 Automating GitLab Public Projects Scraper
Run on a schedule, forward results to Make, Zapier, n8n, Slack, Airbyte, GitHub Actions, or Google Drive. Push new high-star projects into a Slack channel daily.
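The daily Slack push could be sketched as follows. This only builds the JSON body that a Slack incoming webhook accepts (an object with a `text` key); sending it, the webhook URL, and the `min_stars` threshold are left to the reader. The function name `build_slack_payload` and its defaults are hypothetical.

```python
import json


def build_slack_payload(records, min_stars=100, max_lines=10):
    """Build a Slack incoming-webhook JSON body listing high-star projects."""
    picked = [r for r in records if (r.get("starCount") or 0) >= min_stars]
    picked.sort(key=lambda r: r.get("starCount") or 0, reverse=True)
    # Slack renders <url|label> as a clickable link.
    lines = [
        f"<{r.get('url')}|{r.get('nameWithNamespace')}> ({r.get('starCount')} stars)"
        for r in picked[:max_lines]
    ]
    if lines:
        text = "High-star GitLab projects:\n" + "\n".join(lines)
    else:
        text = "No projects above the star threshold."
    return json.dumps({"text": text})
```

POST the returned string to the webhook URL with a `Content-Type: application/json` header.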
🌟 Beyond business use cases
Research
Build a longitudinal study of OSS topic growth over months/years.
Personal
Discover obscure self-hosted alternatives to your favourite SaaS tools.
Non-profit
Track open civic-tech projects by topic (e.g. civic-tech, accessibility).
Experimentation
Build a "GitLab radar" Slack bot for newly active projects in your space.
❓ Frequently Asked Questions
Q: Does it need a GitLab token? No. Public endpoints work anonymously.
Q: Can I scrape private projects? No - only visibility: public is returned.
Q: How fresh is the data? Live - every request hits gitlab.com.
Q: Can I filter by language? Use the topic field - many projects tag languages (e.g. python, rust).
Q: What's the rate limit? ~10 RPS anonymous; the Actor paces well under it.
Q: Can I scrape self-hosted GitLab? This Actor targets gitlab.com specifically. Self-hosted instances would need a forked Actor.
Q: Does it include archived projects? Yes - check archived field if needed.
Q: How do I find a topic slug? Browse gitlab.com/explore/projects/topics.
Q: Can I sort by forks? No. orderBy supports id, name, path, created_at, updated_at, last_activity_at, and star_count; the upstream API does not support sorting by forks.
Q: Does it return README contents? No - just readmeUrl. Fetch separately if needed.
🔌 Integrate with any app
Slack, Discord, Sheets, Airtable, BigQuery, S3, Snowflake, and 100+ more via Apify webhooks.
🔗 Recommended Actors
| Actor | What it does |
|---|---|
| GitHub Trending Scraper | Daily trending repos |
| Hacker News Scraper | Top tech stories |
| npm Packages Scraper | npm metadata |
| Mastodon Trends Scraper | Fediverse trends |
💡 Pro Tip: browse the complete ParseForge collection.
🆘 Need Help? Open our contact form
⚠️ Disclaimer: independent tool, not affiliated with GitLab Inc. Only publicly available data is collected.