# GitHub Repository Search & Scraper (`scrapeworks/github-repo-search`) Actor

Search GitHub repositories by keyword, language, topic, stars, and date. Clean structured JSON with stars, forks, license, topics, owner, and activity dates. Optional token for high rate limits.

- **URL**: https://apify.com/scrapeworks/github-repo-search.md
- **Developed by:** [Nicolas van Arkens](https://apify.com/scrapeworks) (community)
- **Categories:** Automation, Other, Open source
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $1.00 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## GitHub Repository Search & Scraper 🐙

Search **GitHub repositories** at scale and get clean, structured JSON: stars, forks, language, topics, license, owner, and activity dates. Powered by the official GitHub REST API, so the data is accurate and the Actor is fast and reliable.

Filter by keyword, language, topic, minimum stars, owner, and date, then sort by stars, forks, or recent activity. Perfect for **developer-tool market research, competitor and dependency tracking, finding trending or actively-maintained projects, lead generation, and building datasets** for analysis or AI.

### What you can do

- 🔎 **Search** repositories by free-text keywords across names, descriptions, and READMEs
- 🧰 **Filter** by language, topic(s), minimum stars, owner/org, creation date, and last-push date
- 📈 **Find actively-maintained projects** with the "pushed after" filter, or **new/trending ones** with "created after"
- 🚫 **Exclude** forks and archived repos for a clean signal
- ↕️ **Sort** by stars, forks, recently updated, or best-match relevance
- 🧹 **Clean output** — one tidy record per repo, ready for CSV/Excel/JSON or the API

### Example use cases

- **Competitive & market research:** list every Python web-scraping library over 50 stars, sorted by activity.
- **Dependency / ecosystem tracking:** monitor all repos under an org and their last-push dates.
- **Find trending projects:** repos created in the last 90 days, sorted by stars.
- **Lead generation:** discover maintainers and organizations active in a topic.
- **Datasets for AI/analysis:** pull thousands of repos with structured metadata.

### Input

| Field | Description |
|-------|-------------|
| **Search query** | Free-text keywords. |
| **Language** | Restrict to a language (e.g. `python`). |
| **Topic(s)** | One or more GitHub topics, comma-separated. |
| **User / organization** | Restrict to one owner. |
| **Minimum stars** | Star threshold. |
| **Pushed after / Created after** | Date filters (YYYY-MM-DD). |
| **Exclude forks / archived** | Clean up results. |
| **Sort / Order** | Stars, forks, updated, or best-match. |
| **Maximum repositories** | Up to 1000 per search (GitHub's cap). |
| **GitHub token** (optional) | Strongly recommended: raises the search rate limit from ~10 to ~30 requests/minute. A token with **no scopes** (public data only) is enough. |

### Output

```json
{
  "fullName": "scrapy/scrapy",
  "name": "scrapy",
  "description": "Scrapy, a fast high-level web crawling & scraping framework for Python.",
  "url": "https://github.com/scrapy/scrapy",
  "homepage": "https://scrapy.org",
  "owner": "scrapy",
  "ownerType": "Organization",
  "stars": 61942,
  "forks": 11585,
  "watchers": 61942,
  "openIssues": 625,
  "language": "Python",
  "topics": ["crawler", "scraping", "web-scraping"],
  "license": "BSD-3-Clause",
  "isFork": false,
  "isArchived": false,
  "createdAt": "2010-02-22T02:01:14Z",
  "updatedAt": "2026-05-28T13:32:56Z",
  "pushedAt": "2026-05-20T08:27:24Z"
}
````

Export to JSON, CSV, or Excel, or pull via the Apify API. Connect to Google Sheets, Slack, Zapier, or Make.
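
If you pull the JSON records through the API and want to build the CSV yourself, the main thing to handle is the `topics` list, which has to be flattened into a single cell. The following is a minimal sketch, assuming records shaped like the sample above; the `records_to_csv` helper and the `;` separator are illustrative choices, not part of the Actor.

```python
import csv
import io

# Column order follows the sample output record shown above.
COLUMNS = [
    "fullName", "name", "description", "url", "homepage", "owner", "ownerType",
    "stars", "forks", "watchers", "openIssues", "language", "topics", "license",
    "isFork", "isArchived", "createdAt", "updatedAt", "pushedAt",
]


def records_to_csv(records):
    """Serialize repository records (dicts) to CSV text. Hypothetical helper."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=COLUMNS, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for record in records:
        row = dict(record)
        # `topics` is a list in the JSON output; join it so it fits one CSV cell.
        row["topics"] = ";".join(record.get("topics") or [])
        writer.writerow(row)
    return buffer.getvalue()
```

Fields missing from a record are written as empty cells, so partially filled records still produce aligned rows.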

### About the GitHub token

The Actor works without a token, but GitHub limits unauthenticated search to roughly 10 requests/minute. Adding a free **personal access token** (classic, no scopes needed for public data) raises this to ~30 search requests/minute, with 5000 requests/hour for the rest of the REST API, so a token is recommended for anything beyond a quick test. Your token is sent only to GitHub and is never stored.
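
To get a feel for what these limits mean in practice, the arithmetic can be sketched as below. It assumes the Actor fetches results in pages of 100, which is the largest page GitHub's search endpoint returns; the Actor's real paging strategy is not documented here, so treat the numbers as a lower bound.

```python
import math

# Assumption: one search request returns at most 100 repositories.
PAGE_SIZE = 100


def min_requests(max_items, page_size=PAGE_SIZE):
    """Smallest number of search requests needed to fetch `max_items` results."""
    return math.ceil(max_items / page_size)


def min_minutes(max_items, requests_per_minute):
    """Lower bound on run time (in minutes) imposed by the search rate limit."""
    return min_requests(max_items) / requests_per_minute


if __name__ == "__main__":
    for label, limit in (("no token", 10), ("with token", 30)):
        print(f"{label}: 1000 results need >= {min_minutes(1000, limit):.2f} min")
```

Under these assumptions a single full 1000-result search needs at least 10 requests, which is about a minute's worth of unauthenticated quota.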

### Notes

- Accesses only publicly available GitHub data via the official REST API. Independent tool, not affiliated with GitHub.
- GitHub's Search API caps any single query at 1000 returned results; narrow with filters or date windows for larger coverage.
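
One way to work around the 1000-result cap is to run several narrower searches (for example, one per language or topic) and merge the datasets afterwards. The sketch below only covers the merge step; `merge_unique` is a hypothetical helper that assumes each record carries the `fullName` field shown in the sample output.

```python
def merge_unique(*result_sets):
    """Merge lists of repository records, keeping the first record per `fullName`."""
    merged = {}
    for records in result_sets:
        for record in records:
            # setdefault keeps the earliest record and ignores later duplicates.
            merged.setdefault(record["fullName"], record)
    return list(merged.values())


if __name__ == "__main__":
    python_repos = [{"fullName": "scrapy/scrapy", "stars": 61942}]
    crawler_repos = [
        {"fullName": "scrapy/scrapy", "stars": 61942},
        {"fullName": "apify/crawlee", "stars": 0},  # placeholder value
    ]
    print([r["fullName"] for r in merge_unique(python_repos, crawler_repos)])
```

Records keep the order in which they were first seen, so the result of the first search stays at the top.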

# Actor input Schema

## `query` (type: `string`):

Free-text keywords to search repository names, descriptions, and READMEs. Combine with the filters below. Leave empty if you only want to filter by language/topic/user.

## `language` (type: `string`):

Restrict to a programming language, e.g. 'python', 'rust', 'typescript'.

## `topic` (type: `string`):

One or more GitHub topics, comma-separated (e.g. 'cli,automation'). Repos must have all listed topics.

## `user` (type: `string`):

Restrict to repositories owned by this user/org login (e.g. 'apify').

## `minStars` (type: `integer`):

Only return repositories with at least this many stars.

## `pushedAfter` (type: `string`):

Only repos with a commit pushed on/after this date (YYYY-MM-DD). Great for finding actively-maintained projects.

## `createdAfter` (type: `string`):

Only repos created on/after this date (YYYY-MM-DD). Useful for finding new/trending projects.
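
Both date filters take an absolute `YYYY-MM-DD` string, so a rolling window such as "the last 90 days" has to be computed before each run. A minimal sketch, where `days_ago` is a hypothetical helper and not part of the Actor:

```python
from datetime import date, timedelta


def days_ago(days, today=None):
    """Return the date `days` before `today` as a YYYY-MM-DD string."""
    today = today or date.today()
    return (today - timedelta(days=days)).isoformat()


if __name__ == "__main__":
    # Input for "repos created in the last 90 days, sorted by stars".
    run_input = {"createdAfter": days_ago(90), "sort": "stars", "order": "desc"}
    print(run_input)
```

Passing `today` explicitly keeps the helper deterministic, which is handy in scheduled jobs and tests.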

## `excludeForks` (type: `boolean`):

Skip forked repositories.

## `excludeArchived` (type: `boolean`):

Skip archived (read-only) repositories.

## `sort` (type: `string`):

How to sort results. Stars/Forks rank by popularity; Recently updated surfaces active repos; Best match uses GitHub's relevance ranking.

## `order` (type: `string`):

Sort direction: descending (highest first) or ascending (lowest first).

## `maxItems` (type: `integer`):

How many repositories to return (GitHub caps any single search at 1000 results).

## `githubToken` (type: `string`):

A GitHub personal access token. Optional but strongly recommended: raises the rate limit from ~10 to ~30 search requests/min and 5000 req/hr. A classic token with NO scopes (public data only) is enough. Your token is used only to call GitHub and is never stored.
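
For readers familiar with GitHub's search syntax, the filters above correspond naturally to search qualifiers. The sketch below shows one plausible mapping; it is an assumption for illustration, not the Actor's actual implementation, and fork handling is left out. `build_search_query` is a hypothetical name.

```python
def build_search_query(query="", language=None, topic=None, user=None,
                       min_stars=None, pushed_after=None, created_after=None,
                       exclude_archived=False):
    """Combine filter values into a GitHub search string (illustrative only)."""
    parts = [query.strip()] if query.strip() else []
    if language:
        parts.append(f"language:{language}")
    if topic:
        # Each comma-separated topic becomes its own qualifier, so all must match.
        parts.extend(f"topic:{t.strip()}" for t in topic.split(",") if t.strip())
    if user:
        parts.append(f"user:{user}")
    if min_stars is not None:
        parts.append(f"stars:>={min_stars}")
    if pushed_after:
        parts.append(f"pushed:>={pushed_after}")
    if created_after:
        parts.append(f"created:>={created_after}")
    if exclude_archived:
        parts.append("archived:false")
    return " ".join(parts)


if __name__ == "__main__":
    print(build_search_query("web scraping", language="python",
                             topic="cli,automation", min_stars=50))
```

Seeing the filters as one combined query string also explains why the 1000-result cap applies to the whole filter combination rather than to each filter separately.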

## Actor input object example

```json
{
  "query": "web scraping",
  "excludeForks": false,
  "excludeArchived": false,
  "sort": "stars",
  "order": "desc",
  "maxItems": 100
}
```
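
When inputs are assembled in code, it can help to check them against the schema constraints before starting a paid run. The sketch below is a minimal local validator, assuming the limits listed in the OpenAPI specification in this document (`maxItems` between 1 and 1000, the `sort` and `order` enums); `build_run_input` is a hypothetical helper, not part of the Apify client.

```python
from datetime import date

SORT_VALUES = {"stars", "forks", "updated", "best-match"}
ORDER_VALUES = {"desc", "asc"}
DATE_FIELDS = ("pushedAfter", "createdAfter")


def build_run_input(query="", *, sort="stars", order="desc", max_items=100, **filters):
    """Build an Actor input dict, raising ValueError on out-of-range values."""
    if sort not in SORT_VALUES:
        raise ValueError(f"sort must be one of {sorted(SORT_VALUES)}")
    if order not in ORDER_VALUES:
        raise ValueError(f"order must be one of {sorted(ORDER_VALUES)}")
    if not 1 <= max_items <= 1000:
        raise ValueError("maxItems must be between 1 and 1000")
    for key in DATE_FIELDS:
        if key in filters:
            # Raises ValueError if the string is not a valid calendar date.
            date.fromisoformat(filters[key])
    run_input = {"sort": sort, "order": order, "maxItems": max_items, **filters}
    if query:
        run_input["query"] = query
    return run_input


if __name__ == "__main__":
    print(build_run_input("web scraping", language="python", minStars=50))
```

Extra keyword arguments such as `language` or `minStars` are passed through unchanged, so the field names must match the schema exactly.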

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "query": "web scraping"
};

// Run the Actor and wait for it to finish
const run = await client.actor("scrapeworks/github-repo-search").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "query": "web scraping" }

# Run the Actor and wait for it to finish
run = client.actor("scrapeworks/github-repo-search").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "query": "web scraping"
}' |
apify call scrapeworks/github-repo-search --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=scrapeworks/github-repo-search",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "GitHub Repository Search & Scraper",
        "description": "Search GitHub repositories by keyword, language, topic, stars, and date. Clean structured JSON with stars, forks, license, topics, owner, and activity dates. Optional token for high rate limits.",
        "version": "0.1",
        "x-build-id": "TRPeUgAYaS6z2EcaR"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/scrapeworks~github-repo-search/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-scrapeworks-github-repo-search",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/scrapeworks~github-repo-search/runs": {
            "post": {
                "operationId": "runs-sync-scrapeworks-github-repo-search",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/scrapeworks~github-repo-search/run-sync": {
            "post": {
                "operationId": "run-sync-scrapeworks-github-repo-search",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "query": {
                        "title": "Search query",
                        "type": "string",
                        "description": "Free-text keywords to search repository names, descriptions, and READMEs. Combine with the filters below. Leave empty if you only want to filter by language/topic/user."
                    },
                    "language": {
                        "title": "Language",
                        "type": "string",
                        "description": "Restrict to a programming language, e.g. 'python', 'rust', 'typescript'."
                    },
                    "topic": {
                        "title": "Topic(s)",
                        "type": "string",
                        "description": "One or more GitHub topics, comma-separated (e.g. 'cli,automation'). Repos must have all listed topics."
                    },
                    "user": {
                        "title": "User or organization",
                        "type": "string",
                        "description": "Restrict to repositories owned by this user/org login (e.g. 'apify')."
                    },
                    "minStars": {
                        "title": "Minimum stars",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Only return repositories with at least this many stars."
                    },
                    "pushedAfter": {
                        "title": "Pushed after",
                        "type": "string",
                        "description": "Only repos with a commit pushed on/after this date (YYYY-MM-DD). Great for finding actively-maintained projects."
                    },
                    "createdAfter": {
                        "title": "Created after",
                        "type": "string",
                        "description": "Only repos created on/after this date (YYYY-MM-DD). Useful for finding new/trending projects."
                    },
                    "excludeForks": {
                        "title": "Exclude forks",
                        "type": "boolean",
                        "description": "Skip forked repositories.",
                        "default": false
                    },
                    "excludeArchived": {
                        "title": "Exclude archived",
                        "type": "boolean",
                        "description": "Skip archived (read-only) repositories.",
                        "default": false
                    },
                    "sort": {
                        "title": "Sort by",
                        "enum": [
                            "stars",
                            "forks",
                            "updated",
                            "best-match"
                        ],
                        "type": "string",
                        "description": "How to sort results. Stars/Forks rank by popularity; Recently updated surfaces active repos; Best match uses GitHub's relevance ranking.",
                        "default": "stars"
                    },
                    "order": {
                        "title": "Order",
                        "enum": [
                            "desc",
                            "asc"
                        ],
                        "type": "string",
                        "description": "Sort direction: descending (highest first) or ascending (lowest first).",
                        "default": "desc"
                    },
                    "maxItems": {
                        "title": "Maximum repositories",
                        "minimum": 1,
                        "maximum": 1000,
                        "type": "integer",
                        "description": "How many repositories to return (GitHub caps any single search at 1000 results).",
                        "default": 100
                    },
                    "githubToken": {
                        "title": "GitHub token (optional)",
                        "type": "string",
                        "description": "A GitHub personal access token. Optional but strongly recommended: raises the rate limit from ~10 to ~30 search requests/min and 5000 req/hr. A classic token with NO scopes (public data only) is enough. Your token is used only to call GitHub and is never stored."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
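
For projects that cannot use a client library, the `run-sync-get-dataset-items` path from the specification above can be called directly. The following is a minimal sketch using only the Python standard library; it assumes the endpoint responds with the dataset items as JSON, and `build_request` and `run_search` are illustrative names.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/scrapeworks~github-repo-search/run-sync-get-dataset-items"


def build_request(api_token, run_input):
    """Create the POST request; the Apify token goes in the `token` query parameter."""
    url = f"{API_BASE}{ACTOR_PATH}?{urllib.parse.urlencode({'token': api_token})}"
    body = json.dumps(run_input).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": "application/json"},
    )


def run_search(api_token, run_input, timeout=300):
    """Run the Actor synchronously and return the parsed response body."""
    request = build_request(api_token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Nothing is sent here; this only shows the URL that would be called.
    print(build_request("<YOUR_API_TOKEN>", {"query": "web scraping"}).full_url)
```

Because the call blocks until the run finishes, the `timeout` value (an arbitrary 300 seconds here) should be chosen with the expected run time in mind.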
