
GithubScraper
Under maintenance
Pricing
$20.00/month + usage

Automatically scrapes and downloads Markdown documentation from GitHub repositories, for easy AI finetuning.
Monthly users
1
Last modified
a year ago
GitHub Markdown Documentation Downloader
This actor aggregates .md and .mdx files containing Markdown documentation from specified GitHub repositories. It walks the repository's file structure and downloads the matching files, which can then be used for training or fine-tuning models.
Features
- Downloads .md and .mdx files from GitHub repositories.
- Uses KeyValueStore to maintain coherence across concurrent executions.
- Ensures documentation coherence by avoiding downloads from commits and other branches.
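The branch and commit restriction in the last feature can be pictured as a URL filter. The following is a minimal sketch, assuming the actor only follows links that share the start URL's owner, repository, branch, and directory prefix. The helper name `is_in_scope` is hypothetical, and the sketch assumes branch names without slashes; the actor's real filtering logic is not documented here.

```python
from urllib.parse import urlparse


def is_in_scope(start_url: str, candidate_url: str) -> bool:
    """Return True if candidate_url is a tree/blob page under start_url.

    Hypothetical helper for illustration only.
    Assumes URLs shaped like /<owner>/<repo>/tree/<branch>[/<path>...].
    """
    start_parsed = urlparse(start_url)
    cand_parsed = urlparse(candidate_url)
    start = start_parsed.path.strip("/").split("/")
    cand = cand_parsed.path.strip("/").split("/")

    if len(start) < 4 or start[2] != "tree":
        raise ValueError("start_url must look like /<owner>/<repo>/tree/<branch>")
    if cand_parsed.netloc != start_parsed.netloc:
        return False  # different host
    if len(cand) < 4 or cand[2] not in ("tree", "blob"):
        return False  # e.g. /commit/, /commits/, /pull/, /issues/

    # Same owner and repo, then same branch and directory prefix.
    return cand[:2] == start[:2] and cand[3:len(start)] == start[3:]
```

With the start URL from the example input, a link to a file on `master` would pass, while a `/commit/...` link or a `/tree/<other-branch>/...` link would be rejected.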
Usage
Set startUrl to the root of the docs folder in the GitHub repository, then run the actor.
Input Parameters
- startUrl: The starting URL of the GitHub repository's documentation directory.
- globPattern: Glob pattern to match files within the repository. Defaults to '**/*.{md,mdx}'.
- maxConcurrency: The maximum number of requests processed concurrently. Default is 1000.
- maxRequestsPerMinute: The maximum number of requests made per minute. Default is 600.
- minConcurrency: The minimum number of concurrent requests during execution. Default is 5.
- desiredConcurrency: The initially desired number of concurrent requests. Default is 15.
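To make the default globPattern concrete, here is a rough sketch of how a pattern such as '**/*.{md,mdx}' might be matched against repository paths. Which glob library the actor uses is not stated, so this approximates common minimatch-style semantics (`**/` matches zero or more directories, `*` stays within one path segment, `{a,b}` lists alternatives). The function names are illustrative, and nested braces are not handled.

```python
import re
from itertools import product


def expand_braces(pattern: str) -> list:
    """Expand one level of {a,b} alternatives (no nesting)."""
    parts = re.split(r"\{([^{}]*)\}", pattern)
    # Odd indices hold the comma-separated alternatives.
    choices = [p.split(",") if i % 2 else [p] for i, p in enumerate(parts)]
    return ["".join(combo) for combo in product(*choices)]


def glob_to_regex(glob: str) -> "re.Pattern":
    """Translate a simple glob into a compiled regular expression."""
    out = []
    i = 0
    while i < len(glob):
        if glob.startswith("**/", i):
            out.append(r"(?:[^/]+/)*")  # zero or more directories
            i += 3
        elif glob[i] == "*":
            out.append(r"[^/]*")  # anything within one path segment
            i += 1
        elif glob[i] == "?":
            out.append(r"[^/]")  # one character within a segment
            i += 1
        else:
            out.append(re.escape(glob[i]))
            i += 1
    return re.compile("".join(out))


def matches(path: str, pattern: str) -> bool:
    """Return True if path matches any brace-expanded form of pattern."""
    return any(glob_to_regex(g).fullmatch(path) for g in expand_braces(pattern))
```

Under these assumptions, `matches("docs/guide/intro.mdx", "**/*.{md,mdx}")` is true and `matches("src/index.ts", "**/*.{md,mdx}")` is false.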
Output
The actor outputs each Markdown file's content into the default dataset. Each entry contains the file name and content.
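As an illustration of using this output for fine-tuning, the sketch below converts dataset entries into JSON Lines. The exact field names are not given in this description, so `fileName` and `content` are assumptions and are exposed as parameters; the output keys `source` and `text` are likewise arbitrary choices.

```python
import json


def to_jsonl(entries, name_key="fileName", content_key="content"):
    """Convert dataset entries into a JSON Lines string, one document per line.

    name_key and content_key are assumed field names; adjust them to match
    the actual dataset items.
    """
    lines = []
    for entry in entries:
        text = (entry.get(content_key) or "").strip()
        if not text:
            continue  # skip empty files
        record = {"source": entry.get(name_key, ""), "text": text}
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines)


# Usage with two made-up entries (sample data, not real actor output).
sample = [
    {"fileName": "intro.md", "content": "# Intro\nHello."},
    {"fileName": "empty.mdx", "content": "   "},
]
jsonl = to_jsonl(sample)
```

Empty files are dropped here so that they do not end up as blank training examples.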
Example Input
```json
{
  "startUrl": "https://github.com/apify/apify-docs/tree/master",
  "globPattern": "**/*.mdx",
  "crawlerOptions": {
    "maxConcurrency": 10
  }
}
```
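The example sets only some fields, so the remaining ones presumably fall back to the defaults listed under Input Parameters. A minimal sketch of that merge follows, assuming the concurrency settings live under `crawlerOptions` (as the example suggests) and that `startUrl` is required; both points are assumptions, and `resolve_input` is a hypothetical name rather than part of the actor.

```python
# Defaults taken from the Input Parameters list; nesting under
# "crawlerOptions" is an assumption based on the example input.
DEFAULTS = {
    "globPattern": "**/*.{md,mdx}",
    "crawlerOptions": {
        "maxConcurrency": 1000,
        "maxRequestsPerMinute": 600,
        "minConcurrency": 5,
        "desiredConcurrency": 15,
    },
}


def resolve_input(user_input: dict) -> dict:
    """Fill in documented defaults for any fields the user left out."""
    if not user_input.get("startUrl"):
        raise ValueError("startUrl is required")  # assumed to be mandatory
    resolved = {**DEFAULTS, **user_input}
    # Merge the nested options separately so partial overrides keep the rest.
    resolved["crawlerOptions"] = {
        **DEFAULTS["crawlerOptions"],
        **user_input.get("crawlerOptions", {}),
    }
    return resolved


example = {
    "startUrl": "https://github.com/apify/apify-docs/tree/master",
    "globPattern": "**/*.mdx",
    "crawlerOptions": {"maxConcurrency": 10},
}
resolved = resolve_input(example)
```

Under these assumptions, the example would run with `maxConcurrency` set to 10 while `maxRequestsPerMinute` stays at 600.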
Support
For support, contact info@fornace.it.
Pricing
Pricing model
Rental
To use this Actor, you pay a monthly rental fee to the developer. The rent is deducted from your prepaid usage every month after the free trial period. You also pay for Apify platform usage.
Free trial
10 minutes
Price
$20.00