Pricing

from $0.02 / 1,000 results

Skill Curator Scraper

Skill Curator Scraper collects AI skills from SkillsMP and GitHub. It extracts each skill's name, description, stars, license, and URLs, then calculates a quality score. It outputs structured JSON for discovering MCP tools, AI skills, and developer resources.

Developer

Data Pilot

Last modified

11 days ago

📋 Table of Contents

Features
Sources
How It Works
Input
Output
Quality Scoring
Technical Stack
Use Cases
Changelog

🔥 Features

Multi-Source Skill Discovery – Queries SkillsMP and GitHub in parallel and aggregates the results.
SkillsMP Scraping – Parses SkillsMP HTML pages to collect skill cards.
GitHub Repository Search – GitHub API integration for discovering skill-related repositories and resources.
Quality Scoring – Scores each skill from 0 to 100 based on stars, license, description length, and GitHub URL availability.
Duplicate Detection – Automatic deduplication of skills from multiple sources.
Star-Based Ranking – Prioritizes popular repositories and well-maintained projects.
License Information – Extracts and includes SPDX license identifiers for legal compliance.
Author Attribution – Captures author/owner information from repositories.
Keyword-Based Search – Supports multiple keywords for comprehensive skill discovery.
Bulk Keyword Processing – Processes a list of keywords in a single run.
Rate Limiting – Waits 1 second between keywords to respect API rate limits.
Proxy Support – Apify residential proxy support for reliable access.
Real-Time Dataset Push – Pushes results to Apify Dataset with metadata.
Timestamp Recording – Records discovery timestamp for audit trails.
Error Handling – Graceful error handling with detailed logging.
Asyncio-Friendly – Non-blocking async/await architecture.

🌍 Sources

1. SkillsMP

Platform: Skill marketplace and curator platform
Search Type: HTML scraping
Content: Skill cards, descriptions, skill URLs
Data Extracted: Name, description, skill URL
URL Format: https://skillsmp.com/skills/{name}
Coverage: Broad skill marketplace
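
The URL format above can be illustrated with a small helper. This is a minimal sketch, not the Actor's code: the function name `skillsmp_url` and the slug rule (lowercase, non-alphanumeric runs collapsed to hyphens) are assumptions inferred from the example URLs in this README.

```python
import re

SKILLSMP_BASE = "https://skillsmp.com/skills/"


def skillsmp_url(name: str) -> str:
    """Build a SkillsMP skill URL from a display name (assumed slug rule)."""
    # Lowercase the name, then collapse each run of characters outside
    # a-z / 0-9 into a single hyphen and trim hyphens at the ends.
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    return SKILLSMP_BASE + slug


print(skillsmp_url("Advanced React Patterns"))
# prints https://skillsmp.com/skills/advanced-react-patterns
```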

2. GitHub Repositories

Platform: GitHub version control and open source
Search Type: REST API (JSON)
Content: Repositories, code projects, implementations
Data Extracted: Name, description, stars, license, author, GitHub URL
API Endpoint: https://api.github.com/search/repositories
Search Query: Keywords + "mcp" + "skill"
Sorting: By stars (most popular first)
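
A search against this endpoint might be assembled as below. This is a sketch under assumptions: the helper names are hypothetical, and joining the keyword with "mcp" and "skill" by spaces is a guess at how the query is composed. The `q`, `sort`, `order`, and `per_page` parameters and the response fields (`stargazers_count`, `license.spdx_id`, `owner.login`, `html_url`) are part of the public GitHub REST API.

```python
from urllib.parse import urlencode

GITHUB_SEARCH_URL = "https://api.github.com/search/repositories"


def build_github_search_url(keyword: str, limit: int = 20) -> str:
    """Return a repository search URL sorted by stars, most popular first."""
    params = {
        "q": f"{keyword} mcp skill",  # assumed way of combining the terms
        "sort": "stars",
        "order": "desc",
        "per_page": limit,  # GitHub caps this at 100
    }
    return f"{GITHUB_SEARCH_URL}?{urlencode(params)}"


def extract_repo_fields(item: dict) -> dict:
    """Pick the fields this README lists from one search result item."""
    license_info = item.get("license") or {}  # GitHub returns null when unlicensed
    return {
        "name": item.get("name", ""),
        "description": item.get("description") or "",
        "stars": item.get("stargazers_count", 0),
        "license": license_info.get("spdx_id") or "",
        "author": (item.get("owner") or {}).get("login", ""),
        "githubUrl": item.get("html_url", ""),
    }


print(build_github_search_url("React", 25))
```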

⚙️ How It Works

The Skill Curator Scraper takes skill keywords as input and searches SkillsMP and GitHub in parallel. It scrapes SkillsMP for skill cards and queries the GitHub API for repositories. Each skill receives a quality score based on stars, license, description length, and whether a GitHub URL is available. Results are deduplicated and pushed to the Apify Dataset.

Key Processing Steps:

Input Parsing – Accept skill keywords from Actor input
Proxy Setup – Configure Apify residential proxy if available
Parallel Source Queries – Launch SkillsMP scraping and GitHub API search
SkillsMP Scraping – HTML parse skill cards from SkillsMP
GitHub API Search – Query GitHub with keyword filters
Data Extraction – Extract name, description, stars, license, author
Quality Scoring – Calculate quality score for each skill
Deduplication – Remove duplicate entries from multiple sources
Result Compilation – Aggregate findings from all sources
Dataset Push – Push to Apify Dataset with metadata
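
The steps above can be sketched roughly as follows. This is an illustration only, not the Actor's source: `discover`, `dedupe`, and the two stub fetchers are hypothetical names, and the deduplication key (GitHub URL, else skill URL, else name) is an assumption. It shows blocking fetchers running concurrently through asyncio's default executor, a pause between keywords, and first-seen-wins deduplication.

```python
import asyncio


def dedupe(records: list) -> list:
    """Keep the first record seen for each identity key."""
    seen = set()
    unique = []
    for rec in records:
        # Assumed identity: GitHub URL if present, else skill URL, else name.
        key = (rec.get("githubUrl") or rec.get("skillUrl") or rec.get("name", "")).lower()
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


async def discover(keywords, sources, delay=1.0):
    """Query every source for every keyword, then deduplicate."""
    loop = asyncio.get_running_loop()
    collected = []
    for index, keyword in enumerate(keywords):
        if index:
            await asyncio.sleep(delay)  # pause between keywords
        # Each source is a blocking callable; run them side by side in threads.
        batches = await asyncio.gather(
            *(loop.run_in_executor(None, fetch, keyword) for fetch in sources)
        )
        for batch in batches:
            collected.extend(batch)
    return dedupe(collected)


# Stub sources standing in for the real SkillsMP scraper and GitHub search.
def fake_skillsmp(keyword):
    return [{"name": "Demo Skill", "githubUrl": "",
             "skillUrl": "https://skillsmp.com/skills/demo-skill"}]


def fake_github(keyword):
    return [{"name": "demo", "githubUrl": "https://github.com/example/demo",
             "skillUrl": ""}]


results = asyncio.run(discover(["React", "Python"], [fake_skillsmp, fake_github], delay=0))
print(len(results))  # prints 2: each stub returns the same record for both keywords
```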

Key Benefits:

Discover Skill Curator resources from multiple trusted sources
Find popular and well-maintained skill implementations
Compare skills across SkillsMP and GitHub
Identify high-quality learning resources
Build comprehensive skill inventories
Research skill implementations and examples

📥 Input

The Actor accepts the following input parameters:

Field	Type	Default	Description
`keywords`	array	required	Skill keywords to search (e.g., ["React", "Python", "DevOps"])
`limit_per_keyword`	integer	`20`	Maximum skills per keyword (1-100)
`proxyConfiguration`	object	`{"useApifyProxy": true}`	Proxy configuration settings

Example Input:

{
  "keywords": ["React", "Python", "DevOps", "GraphQL", "Kubernetes"],
  "limit_per_keyword": 25,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}

Single Keyword Example:

{
  "keywords": ["Machine Learning"],
  "limit_per_keyword": 30
}
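
Input like the examples above could be normalized before a run, as in this sketch. The function name `parse_input` is hypothetical, and clamping an out-of-range `limit_per_keyword` (rather than rejecting it) is an assumption; the defaults and the 1-100 range come from the input table.

```python
def parse_input(raw: dict) -> dict:
    """Validate Actor input and fill in the documented defaults."""
    keywords = [
        k.strip()
        for k in (raw.get("keywords") or [])
        if isinstance(k, str) and k.strip()
    ]
    if not keywords:
        raise ValueError("'keywords' must contain at least one non-empty string")

    # Default is 20; clamp into the documented 1-100 range (assumed behavior).
    limit = max(1, min(100, int(raw.get("limit_per_keyword", 20))))

    proxy = raw.get("proxyConfiguration") or {"useApifyProxy": True}
    return {
        "keywords": keywords,
        "limit_per_keyword": limit,
        "proxyConfiguration": proxy,
    }


print(parse_input({"keywords": ["Machine Learning"], "limit_per_keyword": 30}))
```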

📤 Output

The Actor pushes Skill Curator records with the following structure:

Field	Type	Description
`name`	string	Skill or repository name
`description`	string	Skill/repo description (max 300 chars)
`author`	string	Repository owner/author name
`repoMetadata.stars`	integer	GitHub stars count
`repoMetadata.license`	string	SPDX license identifier
`githubUrl`	string	Direct GitHub repository URL
`skillUrl`	string	SkillsMP or project skill URL
`qualityScore.overall`	integer	Quality score (0-100)
`keyword`	string	Search keyword used
`detected_at`	string	ISO 8601 discovery timestamp

Example Output Record (GitHub):

{
  "name": "react-query",
  "author": "tannerlinsley",
  "description": "Powerful asynchronous state management, server-state utilities and data fetching with TS/JS, React Query, Solid Query, Svelte Query and Vue Query.",
  "repoMetadata.stars": 42000,
  "repoMetadata.license": "MIT",
  "githubUrl": "https://github.com/tannerlinsley/react-query",
  "skillUrl": "https://skillsmp.com/skills/react-query",
  "qualityScore.overall": 75,
  "keyword": "React",
  "detected_at": "2025-02-14T12:00:00Z"
}

Example Output Record (SkillsMP):

{
  "name": "Advanced React Patterns",
  "description": "Learn advanced React patterns including render props, custom hooks, and compound components for building scalable applications.",
  "repoMetadata.stars": 0,
  "githubUrl": "",
  "skillUrl": "https://skillsmp.com/skills/advanced-react-patterns",
  "qualityScore.overall": 15,
  "keyword": "React",
  "detected_at": "2025-02-14T12:00:00Z"
}
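
A record with the fields listed above might be assembled like this. It is a sketch, not the Actor's code: `build_record` and its parameters are hypothetical, and emitting every key even when a value is empty is an assumption. The 300-character description limit and the ISO 8601 timestamp come from the output table.

```python
from datetime import datetime, timezone


def build_record(name, description, keyword, *, author="", stars=0,
                 license_id="", github_url="", skill_url="", score=0, now=None):
    """Return one dataset record using the flat key names from this README."""
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return {
        "name": name,
        "author": author,
        "description": (description or "")[:300],  # documented 300-char cap
        "repoMetadata.stars": stars,
        "repoMetadata.license": license_id,
        "githubUrl": github_url,
        "skillUrl": skill_url,
        "qualityScore.overall": score,
        "keyword": keyword,
        "detected_at": moment.strftime("%Y-%m-%dT%H:%M:%SZ"),  # UTC, ISO 8601
    }


record = build_record(
    "Advanced React Patterns",
    "Learn advanced React patterns.",
    "React",
    skill_url="https://skillsmp.com/skills/advanced-react-patterns",
    now=datetime(2025, 2, 14, 12, 0, tzinfo=timezone.utc),
)
print(record["detected_at"])  # prints 2025-02-14T12:00:00Z
```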

🎯 Quality Scoring

The Skill Curator Scraper ranks skills with an additive quality score. Each factor below contributes a fixed number of points.

Scoring Criteria

Factor	Points	Threshold
GitHub Stars	40	≥100 stars = 40, ≥50 = 30, ≥10 = 15
License	10	Has SPDX license
Description	15	Description > 100 characters
GitHub URL	10	Repository URL available
Total	75	Maximum attainable from the factors above (reported on the 0-100 scale)

Scoring Examples

High-Quality Repository (100+ stars + license + good description):
- Stars (100+): 40 points
- License: 10 points
- Description (>100 chars): 15 points
- GitHub URL: 10 points
- Total: 75/100 (Good)

Popular Repository (1000+ stars + license + excellent description):
- Stars (100+): 40 points (capped at 40)
- License: 10 points
- Description: 15 points
- GitHub URL: 10 points
- Total: 75/100

SkillsMP Card (No GitHub, good description):
- Description: 15 points
- Total: 15/100 (Limited)
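
The scoring criteria can be written out as a short function. This is a sketch that follows the table and worked examples in this section; the function name and signature are hypothetical, and the Actor may weigh things differently in practice.

```python
def quality_score(stars: int, license_id: str, description: str, github_url: str) -> int:
    """Add up the points from the scoring criteria table."""
    # Stars: tiered, with 40 points as the ceiling.
    if stars >= 100:
        score = 40
    elif stars >= 50:
        score = 30
    elif stars >= 10:
        score = 15
    else:
        score = 0

    if license_id:                # has an SPDX license identifier
        score += 10
    if len(description) > 100:    # strictly more than 100 characters
        score += 15
    if github_url:                # repository URL available
        score += 10
    return score


long_text = "x" * 101
print(quality_score(1000, "MIT", long_text, "https://github.com/example/repo"))  # prints 75
print(quality_score(0, "", long_text, ""))  # prints 15
```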

Score Interpretation

90-100: Excellent (Popular, well-maintained, licensed)
70-89: Good (Solid project with community support)
50-69: Fair (Emerging projects, niche tools)
30-49: Basic (Limited info or new projects)
0-29: Limited (Minimal metadata, research phase)
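
The bands above map to labels with a simple lookup. A minimal sketch, with a hypothetical function name:

```python
def interpret_score(score: int) -> str:
    """Return the interpretation label for a 0-100 quality score."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    # Lower bound of each band, checked from highest to lowest.
    bands = [(90, "Excellent"), (70, "Good"), (50, "Fair"), (30, "Basic"), (0, "Limited")]
    for floor, label in bands:
        if score >= floor:
            return label


print(interpret_score(75))  # prints Good
print(interpret_score(15))  # prints Limited
```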

🧰 Technical Stack

HTTP Requests: requests library with asyncio executor
HTML Parsing: BeautifulSoup4 for SkillsMP scraping
APIs: GitHub REST API v3 (JSON)
Async: asyncio for concurrent requests
Pattern Matching: Python regex for text cleaning
Proxy: Apify Proxy with residential support
Logging: Apify Actor logging system
Platform: Apify Actor serverless environment
Timeout: 20 seconds per request

🎯 Use Cases

Skill Inventory Building – Create comprehensive Skill Curator inventories
Learning Resource Curation – Discover quality learning materials
Technology Research – Research popular skill implementations
Competitive Analysis – Compare skills across platforms
Professional Development – Find resources for skill enhancement
Project Reference – Discover implementation examples
Technology Stack Planning – Evaluate skill options
Team Skill Assessment – Identify skill gaps and opportunities
Startup Research – Discover emerging skills and tools
Education Planning – Build curriculum with curated resources
Vendor Evaluation – Assess skill availability and quality
Open Source Discovery – Find high-quality open source projects
Technology Benchmarking – Compare skills across metrics
Knowledge Management – Build skill knowledge bases
Job Market Analysis – Research in-demand skills

Limit Configuration

Balanced (20 per keyword):

{
  "limit_per_keyword": 20
}

Comprehensive (50 per keyword):

{
  "limit_per_keyword": 50
}

📦 Changelog

Initial Release:

Multi-source skill discovery (SkillsMP + GitHub)
SkillsMP HTML scraping for skill cards
GitHub API integration for repositories
Quality scoring algorithm (0-100 scale)
Star-based popularity ranking
License information extraction
Duplicate detection across sources
Author/owner attribution
Bulk keyword processing
Keyword-based search capability
Rate limiting (1 second between keywords)
Apify proxy support
Asyncio executor for non-blocking requests
Real-time Dataset push
ISO 8601 timestamp recording
Error handling and logging

Disclaimer: Skill Curator Scraper is provided as-is for skill discovery purposes. Users are responsible for ensuring compliance with platform ToS and laws. Always respect original authors and licenses.

🎉 Get Started Today

Deploy now for skill discovery!

Use for:

📚 Learning Resource Curation
🔍 Skill Research
💡 Technology Intelligence
📋 Skill Inventory
🎯 Professional Development

Perfect for:

Learning Platforms
Career Coaches
Educators
Researchers
Product Managers

Your complete Apify-powered skill discovery solution! 🚀✨

🎓 Skill Discovery Excellence

This Actor is optimized for Skill Curator discovery with:

✅ Multi-source aggregation
✅ Intelligent quality scoring
✅ GitHub API integration
✅ SkillsMP scraping
✅ Duplicate detection
✅ Real-time Dataset integration
✅ Error recovery
✅ Production-ready code

Discover and curate skills effortlessly! 💎🚀
