GitHub Pull Request Scraper API
Pricing: $4.99/month + usage
Extract GitHub pull requests with commits, reviews, comments & merge data. Monitor PR velocity, track code review metrics, analyze team productivity. Export to JSON/CSV for DevOps analytics, CI/CD automation & reporting. No API token needed. Fast Playwright scraper for developers & managers.
Developer: Brennan Crawford
GitHub Pull Request Scraper & API
Extract pull requests, reviews, comments, commits, and merge status from any GitHub repository. Monitor PR activity, analyze code review metrics, and track contribution statistics with this fast, production-ready Playwright scraper.
Key Features
- Comprehensive PR Data: Extract titles, states, authors, timestamps, labels, and more
- Detailed Statistics: Get commits count, changed files, additions/deletions, and review metrics
- Flexible Filtering: Filter by PR state (all, open, or closed) and limit the number of results
- Fast & Reliable: Built with Playwright for stable, production-ready scraping
- Export Ready: Output to JSON or CSV for analytics, dashboards, and integrations
- No Authentication Required: Scrape public repositories without GitHub API tokens
Use Cases
- DevOps Analytics: Track PR velocity, review times, and team productivity
- Code Review Monitoring: Monitor PR activity and review patterns
- Open Source Insights: Analyze contribution patterns in OSS projects
- Team Metrics: Generate reports on code review efficiency
- Competitive Intelligence: Track development activity in competitor repositories
- Research & Analysis: Study PR trends and collaboration patterns
What Gets Scraped
Each pull request includes:
- Basic Info: Title, number, state (Open/Closed/Merged), URL
- Author Details: Username and profile URL
- Timestamps: Created, updated, merged, and closed dates
- Statistics: Comments count, commits count, changed files
- Code Changes: Lines added and deleted
- Metadata: Labels, reviewers, base/head branches
- Repository: Owner and repo name
Input Configuration
{
  "repositoryUrl": "https://github.com/microsoft/vscode",
  "state": "all",
  "maxPRs": 50,
  "includeDetails": true
}
Input Parameters
- repositoryUrl (required): Full GitHub repository URL
- state (optional): Filter PRs by state: all, open, or closed (default: all)
- maxPRs (optional): Maximum number of PRs to scrape, 1-500 (default: 50)
- includeDetails (optional): Include detailed stats like commits and file changes (default: true)
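The parameter rules above can be checked locally before a run is started. The following is a minimal sketch, not the actor's own validation: build_input is a hypothetical helper, and only the field names, allowed values, ranges, and defaults are taken from the list above.

```python
from urllib.parse import urlparse

# Allowed values for "state", as documented in the parameter list.
ALLOWED_STATES = {"all", "open", "closed"}


def build_input(repository_url, state="all", max_prs=50, include_details=True):
    """Return an input dict for the scraper, raising ValueError on bad values.

    Hypothetical helper: it mirrors the documented parameters but is not
    part of the actor itself.
    """
    parsed = urlparse(repository_url)
    # Expect a URL shaped like https://github.com/<owner>/<repo>
    path_parts = [part for part in parsed.path.split("/") if part]
    if parsed.netloc.lower() != "github.com" or len(path_parts) < 2:
        raise ValueError(f"not a GitHub repository URL: {repository_url!r}")
    if state not in ALLOWED_STATES:
        raise ValueError(f"state must be one of {sorted(ALLOWED_STATES)}")
    if not isinstance(max_prs, int) or not 1 <= max_prs <= 500:
        raise ValueError("maxPRs must be an integer between 1 and 500")
    return {
        "repositoryUrl": repository_url,
        "state": state,
        "maxPRs": max_prs,
        "includeDetails": bool(include_details),
    }
```

Calling build_input("https://github.com/microsoft/vscode") with no other arguments yields the same values as the example configuration above.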
Output Format
{
  "title": "Add support for TypeScript 5.0",
  "number": 12345,
  "state": "Merged",
  "author": "username",
  "author_url": "https://github.com/username",
  "created_at": "2024-01-15T10:30:00Z",
  "merged_at": "2024-01-20T14:45:00Z",
  "comments_count": 15,
  "commits_count": 8,
  "changed_files": 23,
  "additions": 456,
  "deletions": 123,
  "labels": ["enhancement", "typescript"],
  "reviewers": ["reviewer1", "reviewer2"],
  "url": "https://github.com/microsoft/vscode/pull/12345",
  "repository": "microsoft/vscode",
  "base_branch": "main",
  "head_branch": "feature/ts5-support"
}
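As an illustration of how a record like this could feed a review metric, the sketch below computes time-to-merge from created_at and merged_at. The helper names are hypothetical, and it assumes that PRs which were never merged have a missing or empty merged_at; the source does not say how that case is represented.

```python
from datetime import datetime, timedelta
from typing import Optional


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2024-01-15T10:30:00Z."""
    # Swap the trailing "Z" for an explicit offset so this also works on
    # Python versions older than 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def time_to_merge(pr: dict) -> Optional[timedelta]:
    """Return how long the PR was open before merging, or None if unmerged.

    Assumption: an unmerged PR has no merged_at value (missing, None, or "").
    """
    merged_at = pr.get("merged_at")
    if not merged_at:
        return None
    return parse_timestamp(merged_at) - parse_timestamp(pr["created_at"])


# Timestamps copied from the sample record above.
sample = {
    "created_at": "2024-01-15T10:30:00Z",
    "merged_at": "2024-01-20T14:45:00Z",
}
hours_open = time_to_merge(sample).total_seconds() / 3600
print(f"Merged after {hours_open:.2f} hours")  # Merged after 124.25 hours
```

Averaging this value over all merged PRs in a run gives a simple PR velocity figure.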
How to Use
- Create a free Apify account at apify.com
- Search for "GitHub Pull Request Scraper" in Apify Store
- Configure input: Add repository URL and optional filters
- Click Start and wait for results
- Export data: Download as JSON, CSV, or integrate via API
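If results are downloaded as JSON and a CSV is needed later, list fields such as labels and reviewers have to be flattened first. The sketch below is one way to do that with the standard library; items_to_csv is a hypothetical helper, the ";" separator is an arbitrary choice, and the platform's own CSV export may lay out columns differently.

```python
import csv
import io

# Fields that hold lists in the documented output format.
LIST_FIELDS = ("labels", "reviewers")


def items_to_csv(items, list_separator=";"):
    """Serialize scraped PR dicts to CSV text, joining list fields."""
    if not items:
        return ""
    # Collect column names in first-seen order across all items, so
    # records with extra or missing keys still line up.
    fieldnames = []
    for item in items:
        for key in item:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for item in items:
        row = dict(item)  # copy so the caller's data is not modified
        for field in LIST_FIELDS:
            if isinstance(row.get(field), list):
                row[field] = list_separator.join(row[field])
        writer.writerow(row)  # missing keys are written as empty cells
    return buffer.getvalue()
```

The returned string can be written to a file or passed to a spreadsheet or dashboard import.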
Performance
- Scrapes 50 PRs in ~2-3 minutes
- Handles repositories with thousands of PRs
- Optimized for speed with Playwright
- Automatic retry on network errors
Technical Details
- Runtime: Python 3.11 with Playwright
- Browser: Chromium (headless)
- Rate Limiting: Respectful scraping with delays
- Error Handling: Robust error recovery and logging
Pro Tips
- Set includeDetails: false for faster scraping if you don't need commit/file stats
- Use state: "open" to monitor active PRs only
- Increase maxPRs for comprehensive historical analysis
- Schedule regular runs to track PR trends over time
Integration Examples
Slack Notifications
Monitor new PRs and send alerts to your team channel
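A rough sketch of the Slack case, assuming a Slack incoming webhook has already been set up for the channel: the webhook URL and both function names are placeholders, and the PR fields used are those from the Output Format section. Slack incoming webhooks accept a JSON body with a text field.

```python
import json
import urllib.request


def build_slack_payload(pr: dict) -> dict:
    """Build a plain-text Slack message for one scraped PR record."""
    text = (
        f"New PR #{pr['number']} in {pr['repository']}: "
        f"{pr['title']} (by {pr['author']})\n{pr['url']}"
    )
    return {"text": text}


def post_to_slack(webhook_url: str, payload: dict) -> int:
    """POST the payload to a Slack incoming webhook; return the HTTP status.

    webhook_url is a placeholder for your own webhook address.
    """
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

To alert only on new PRs, a scheduled job would also need to remember which PR numbers it has already reported; that bookkeeping is left out here.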
Analytics Dashboards
Feed PR data into Tableau, PowerBI, or custom dashboards
CI/CD Pipelines
Trigger workflows based on PR activity
Research Projects
Analyze OSS development patterns and collaboration
Privacy & Compliance
- Only scrapes publicly available data
- No authentication or API tokens required
- Respects GitHub's robots.txt
- Compliant with GitHub's Terms of Service for public data
Support
Need help? Have questions?
License
This actor is available under the Apache 2.0 license.
Built with ❤️ using Apify and Playwright
Keywords: GitHub scraper, pull request scraper, PR analytics, code review metrics, GitHub API alternative, DevOps tools, repository analytics, open source insights, contribution tracking, GitHub data extraction
