Github Pull Request Scraper Api
Under maintenance

Pricing

$4.99/month + usage

Extract GitHub pull requests with commits, reviews, comments & merge data. Monitor PR velocity, track code review metrics, analyze team productivity. Export to JSON/CSV for DevOps analytics, CI/CD automation & reporting. No API token needed. Fast Playwright scraper for developers & managers.

Rating

0.0

(0)

Developer

Brennan Crawford

Maintained by Community

Actor stats

Bookmarked: 0
Total users: 1
Monthly active users: 1
Last modified: 4 days ago

GitHub Pull Request Scraper & API

Extract pull requests, reviews, comments, commits, and merge status from any GitHub repository. Monitor PR activity, analyze code review metrics, and track contribution statistics with this fast, production-ready Playwright scraper.

🚀 Key Features

  • Comprehensive PR Data: Extract titles, states, authors, timestamps, labels, and more
  • Detailed Statistics: Get commits count, changed files, additions/deletions, and review metrics
  • Flexible Filtering: Filter by PR state (all, open, or closed) and cap the number of results with maxPRs
  • Fast & Reliable: Built with Playwright for stable, production-ready scraping
  • Export Ready: Output to JSON or CSV for analytics, dashboards, and integrations
  • No Authentication Required: Scrape public repositories without GitHub API tokens

📊 Use Cases

  • DevOps Analytics: Track PR velocity, review times, and team productivity
  • Code Review Monitoring: Monitor PR activity and review patterns
  • Open Source Insights: Analyze contribution patterns in OSS projects
  • Team Metrics: Generate reports on code review efficiency
  • Competitive Intelligence: Track development activity in competitor repositories
  • Research & Analysis: Study PR trends and collaboration patterns

🎯 What Gets Scraped

Each pull request includes:

  • Basic Info: Title, number, state (Open/Closed/Merged), URL
  • Author Details: Username and profile URL
  • Timestamps: Created, updated, merged, and closed dates
  • Statistics: Comments count, commits count, changed files
  • Code Changes: Lines added and deleted
  • Metadata: Labels, reviewers, base/head branches
  • Repository: Owner and repo name

📝 Input Configuration

{
  "repositoryUrl": "https://github.com/microsoft/vscode",
  "state": "all",
  "maxPRs": 50,
  "includeDetails": true
}

Input Parameters

  • repositoryUrl (required): Full GitHub repository URL
  • state (optional): Filter PRs by state - all, open, or closed (default: all)
  • maxPRs (optional): Maximum number of PRs to scrape, 1-500 (default: 50)
  • includeDetails (optional): Include detailed stats like commits and file changes (default: true)
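The parameter rules above can be checked before a run is started. The following is a minimal sketch, not part of the actor itself: `build_input` is a hypothetical helper, and the check that the URL has exactly the form `https://github.com/<owner>/<repo>` is an assumption about what the actor accepts.

```python
from urllib.parse import urlparse

ALLOWED_STATES = {"all", "open", "closed"}


def build_input(repository_url, state="all", max_prs=50, include_details=True):
    """Return an input dict that respects the documented parameter limits."""
    parsed = urlparse(repository_url)
    # Assumption: the actor expects https://github.com/<owner>/<repo>.
    parts = [p for p in parsed.path.split("/") if p]
    if parsed.netloc != "github.com" or len(parts) != 2:
        raise ValueError(f"not a GitHub repository URL: {repository_url!r}")
    if state not in ALLOWED_STATES:
        raise ValueError(f"state must be one of {sorted(ALLOWED_STATES)}")
    if not 1 <= max_prs <= 500:
        raise ValueError("maxPRs must be between 1 and 500")
    return {
        "repositoryUrl": repository_url,
        "state": state,
        "maxPRs": max_prs,
        "includeDetails": include_details,
    }
```

Calling `build_input("https://github.com/microsoft/vscode", state="open", max_prs=100)` returns a dict in the same shape as the example input, with the defaults filled in for anything not passed.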

📤 Output Format

{
  "title": "Add support for TypeScript 5.0",
  "number": 12345,
  "state": "Merged",
  "author": "username",
  "author_url": "https://github.com/username",
  "created_at": "2024-01-15T10:30:00Z",
  "merged_at": "2024-01-20T14:45:00Z",
  "comments_count": 15,
  "commits_count": 8,
  "changed_files": 23,
  "additions": 456,
  "deletions": 123,
  "labels": ["enhancement", "typescript"],
  "reviewers": ["reviewer1", "reviewer2"],
  "url": "https://github.com/microsoft/vscode/pull/12345",
  "repository": "microsoft/vscode",
  "base_branch": "main",
  "head_branch": "feature/ts5-support"
}
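Records in this shape are enough to compute simple PR velocity figures such as time to merge. The sketch below uses only the `created_at` and `merged_at` fields shown above. The helper names are hypothetical, and it assumes that unmerged PRs have `merged_at` missing or null, which the sample record does not confirm.

```python
from datetime import datetime
from statistics import median


def parse_ts(value):
    """Parse an ISO 8601 timestamp such as '2024-01-15T10:30:00Z'."""
    # Replacing the trailing 'Z' keeps this working on Python versions before 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def hours_to_merge(pr):
    """Hours between creation and merge, or None if the PR was not merged."""
    # Assumption: unmerged PRs carry no merged_at value.
    if not pr.get("merged_at"):
        return None
    delta = parse_ts(pr["merged_at"]) - parse_ts(pr["created_at"])
    return delta.total_seconds() / 3600


def median_hours_to_merge(prs):
    """Median time to merge across all merged PRs, or None if there are none."""
    durations = [h for h in map(hours_to_merge, prs) if h is not None]
    return median(durations) if durations else None
```

For the sample record above, `hours_to_merge` gives 124.25 hours (five days, four hours, and fifteen minutes).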

🔧 How to Use

  1. Create a free Apify account at apify.com
  2. Search for "GitHub Pull Request Scraper" in Apify Store
  3. Configure input: Add repository URL and optional filters
  4. Click Start and wait for results
  5. Export data: Download as JSON, CSV, or integrate via API
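For the "integrate via API" option in step 5, one possible approach is to call the Apify REST API directly. This is a sketch under assumptions: it uses the platform's generic `run-sync-get-dataset-items` endpoint, and the actor ID and API token are placeholders you must supply, since this page does not state the actor's ID. The function names are hypothetical.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id, token, run_input):
    """Build a POST request that runs an actor and returns its dataset items."""
    # actor_id is a placeholder such as "<username>~<actor-name>"; look it up in Apify Console.
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def fetch_pull_requests(actor_id, token, run_input, timeout=300):
    """Run the actor synchronously and return the scraped PRs as a list of dicts."""
    request = build_run_request(actor_id, token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call would look like `fetch_pull_requests(actor_id, token, {"repositoryUrl": "https://github.com/microsoft/vscode", "maxPRs": 50})`. Synchronous runs are time-limited by the platform, so runs with a large maxPRs may need the asynchronous run endpoints or the official client library instead.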

⚡ Performance

  • Scrapes 50 PRs in ~2-3 minutes
  • Handles repositories with thousands of PRs
  • Optimized for speed with Playwright
  • Automatic retry on network errors

🛠️ Technical Details

  • Runtime: Python 3.11 with Playwright
  • Browser: Chromium (headless)
  • Rate Limiting: Respectful scraping with delays
  • Error Handling: Robust error recovery and logging

💡 Pro Tips

  • Set includeDetails: false for faster scraping if you don't need commit/file stats
  • Use state: "open" to monitor active PRs only
  • Increase maxPRs for comprehensive historical analysis
  • Schedule regular runs to track PR trends over time

📊 Integration Examples

Slack Notifications

Monitor new PRs and send alerts to your team channel
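A minimal sketch of such an alert, assuming a Slack incoming webhook and a set of PR numbers that earlier runs have already reported. The `seen_numbers` bookkeeping and both function names are hypothetical; the actor does not provide them.

```python
import json
import urllib.request


def build_slack_payload(prs, seen_numbers):
    """Format PRs not yet reported as a Slack message payload, or None if there are none."""
    new_prs = [pr for pr in prs if pr["number"] not in seen_numbers]
    if not new_prs:
        return None
    lines = [f"{len(new_prs)} new pull request(s):"]
    for pr in new_prs:
        # <url|label> is Slack's link syntax.
        lines.append(f"- <{pr['url']}|#{pr['number']} {pr['title']}> by {pr['author']}")
    return {"text": "\n".join(lines)}


def post_to_slack(webhook_url, payload):
    """Send the payload to a Slack incoming webhook and return the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status
```

After a successful post, add the reported PR numbers to the stored set so that the next scheduled run only announces PRs it has not seen before.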

Analytics Dashboards

Feed PR data into Tableau, PowerBI, or custom dashboards
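BI tools generally want flat rows, while the `labels` and `reviewers` fields are lists. The platform's own CSV export may already be sufficient; the sketch below shows one way to flatten records yourself. The column selection, the `;` separator, and the `to_csv` name are illustrative choices, not something the actor prescribes.

```python
import csv
import io

# Columns taken from the documented output format; adjust to what your dashboard needs.
FIELDS = [
    "repository", "number", "title", "state", "author",
    "created_at", "merged_at", "comments_count", "commits_count",
    "changed_files", "additions", "deletions", "labels", "reviewers", "url",
]


def to_csv(prs):
    """Serialize PR records to CSV text, joining list fields with ';'."""
    buffer = io.StringIO()
    # extrasaction="ignore" drops fields not listed in FIELDS; missing ones become "".
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for pr in prs:
        row = dict(pr)
        for key in ("labels", "reviewers"):
            row[key] = ";".join(row.get(key) or [])
        writer.writerow(row)
    return buffer.getvalue()
```

Write the returned string to a file with `newline=""` to avoid doubled line endings on Windows, then point the dashboard at that file.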

CI/CD Pipelines

Trigger workflows based on PR activity

Research Projects

Analyze OSS development patterns and collaboration

🔒 Privacy & Compliance

  • Only scrapes publicly available data
  • No authentication or API tokens required
  • Respects GitHub's robots.txt
  • Compliant with GitHub's Terms of Service for public data

🆘 Support

Need help? Have questions?

📜 License

This actor is available under the Apache 2.0 license.


Built with ❤️ using Apify and Playwright

Keywords: GitHub scraper, pull request scraper, PR analytics, code review metrics, GitHub API alternative, DevOps tools, repository analytics, open source insights, contribution tracking, GitHub data extraction