Guru.com Scraper
Unlock Guru.com data instantly! Scrape detailed user profiles and job listings with ease. Perfect for recruitment, lead generation, and market analysis. Get essential data like freelancer skills, rates, and active projects to automate your workflow efficiently.
Pricing: Pay per usage
Developer: Shahid Irfan (Maintained by Community)
Guru Jobs Scraper
This Apify actor scrapes freelance job listings from Guru.com using Crawlee's PlaywrightCrawler to bypass bot protection and extract comprehensive job information.
Features
- Bot Protection Bypass: Uses PlaywrightCrawler with residential proxies to bypass Guru.com's Incapsula protection
- Advanced Filtering: Filter by skill, budget range, job type (Fixed/Hourly), location, and keywords
- Comprehensive Data Extraction: Extracts job details, budget information, employer data, and categories
- Guru-Specific Fields: Captures budget ranges, quote counts, deadlines, featured status, and employer spending
- Smart Pagination: Handles Guru.com's pagination system automatically
- Structured Data: Prefers JSON-LD structured data where available, falls back to robust HTML parsing
- Rate Limiting: Implements proper delays and proxy rotation to avoid detection
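The "structured data first, HTML fallback" approach mentioned above can be sketched as a standalone helper. This is an illustrative sketch, not the actor's actual code: the function name is made up, and real pages may embed JSON-LD in forms this regex does not cover.

```javascript
// Sketch of the "prefer JSON-LD" strategy: scan the raw HTML for
// <script type="application/ld+json"> blocks, parse each one, and return the
// first schema.org JobPosting entity. Returning null signals the caller to
// fall back to selector-based HTML parsing. Helper name is hypothetical.
function extractJobPostingJsonLd(html) {
  const scriptRe = /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  let match;
  while ((match = scriptRe.exec(html)) !== null) {
    try {
      const data = JSON.parse(match[1]);
      // A page may embed a single object or an array of entities.
      const entities = Array.isArray(data) ? data : [data];
      const job = entities.find((e) => e && e['@type'] === 'JobPosting');
      if (job) return job;
    } catch {
      // Malformed JSON-LD: skip it and keep scanning; HTML parsing is the fallback.
    }
  }
  return null;
}
```

Structured data, when present, is preferred because it survives cosmetic markup changes that break CSS selectors.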
Input
The actor accepts the following input fields (all optional unless noted):
Basic Filters
- `keyword` (string) — Job title or skill to search for (e.g., "Web Developer", "Data Scientist").
- `location` (string) — Location filter for job postings.
- `skill` (string) — Specific skill to filter by (e.g., "web-development", "javascript", "wordpress").
Advanced Filters
- `budget_min` (integer) — Minimum budget amount filter (in USD).
- `budget_max` (integer) — Maximum budget amount filter (in USD).
- `job_type` (enum) — Filter by job type: "fixed", "hourly", or "both". Default: "both".
Configuration
- `startUrl` / `url` / `startUrls` — Specific Guru.com job URL(s) to start from. If provided, these override the other filters.
- `results_wanted` (integer) — Maximum number of jobs to collect. Default: 100.
- `max_pages` (integer) — Safety cap on the number of listing pages to visit. Default: 20.
- `collectDetails` (boolean) — If true, visits each job detail page for full information. Default: true.
- `proxyConfiguration` — Proxy settings (residential proxies recommended for Guru.com).
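As a hedged sketch of how the defaults listed above could be applied before crawling begins, the following helper normalizes a raw input object. The field names and defaults come from this README; the normalization code itself is illustrative, not the actor's internals.

```javascript
// Apply the documented defaults and coerce invalid values back to safe ones.
// Hypothetical helper: the actor's real input handling may differ.
function normalizeInput(input = {}) {
  const jobTypes = ['fixed', 'hourly', 'both'];
  return {
    keyword: input.keyword ?? null,
    location: input.location ?? null,
    skill: input.skill ?? null,
    budget_min: Number.isInteger(input.budget_min) ? input.budget_min : null,
    budget_max: Number.isInteger(input.budget_max) ? input.budget_max : null,
    job_type: jobTypes.includes(input.job_type) ? input.job_type : 'both', // default "both"
    results_wanted: input.results_wanted ?? 100, // default 100
    max_pages: input.max_pages ?? 20,            // default 20
    collectDetails: input.collectDetails ?? true, // default true
  };
}
```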
Output
Each job saved to the dataset follows this structure:
```json
{
  "title": "Web Developer Needed",
  "budget": "$1k-$2.5k",
  "job_type": "Fixed Price",
  "location": "United States",
  "deadline": "Send before Feb 23, 2026",
  "quote_count": 3,
  "is_featured": false,
  "categories": {
    "main": "Programming & Development",
    "sub": "Web Development & Design",
    "skills": ["JavaScript", "HTML", "CSS"]
  },
  "employer": {
    "name": "John Doe",
    "totalSpent": "$2,768",
    "feedbackPercentage": 100
  },
  "date_posted": "Posted 1 hr ago",
  "description_html": "<p>Job description here...</p>",
  "description_text": "Plain text version of description",
  "url": "https://www.guru.com/d/jobs/12345/",
  "scraped_at": "2026-01-25T12:00:00.000Z"
}
```
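Note that `budget` is a display string such as "$1k-$2.5k" rather than a numeric pair. If a downstream consumer needs numeric bounds, one way to derive them is sketched below; the helper name and the range-string format handled are assumptions based on the sample output above.

```javascript
// Parse a Guru-style budget string like "$1k-$2.5k" into numeric USD bounds.
// Returns null when the string does not look like a range (e.g. hourly jobs).
// Hypothetical post-processing helper, not part of the actor's output.
function parseBudgetRange(budget) {
  if (typeof budget !== 'string') return null;
  const m = budget.match(/\$?([\d.]+)(k?)\s*-\s*\$?([\d.]+)(k?)/i);
  if (!m) return null;
  const toNumber = (value, suffix) =>
    parseFloat(value) * (suffix.toLowerCase() === 'k' ? 1000 : 1);
  return { min: toNumber(m[1], m[2]), max: toNumber(m[3], m[4]) };
}
```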
Usage Examples
Basic Search
```json
{
  "keyword": "react developer",
  "results_wanted": 50
}
```
Advanced Filtering
```json
{
  "skill": "web-development",
  "budget_min": 1000,
  "budget_max": 5000,
  "job_type": "hourly",
  "location": "United States",
  "results_wanted": 25
}
```
Specific Skill Page
```json
{
  "skill": "javascript",
  "collectDetails": true,
  "results_wanted": 100
}
```
Technical Notes
Bot Protection
- Guru.com uses Incapsula bot protection, which requires browser automation
- Residential proxies are strongly recommended
- Low concurrency (3) is used to avoid detection
Performance
- PlaywrightCrawler is slower than CheerioCrawler but necessary for Guru.com
- Extended wait times ensure dynamic content loads properly
- Memory usage is optimized for large datasets
Supported URL Patterns
- Main jobs page: https://www.guru.com/d/jobs/
- Skill-based pages: https://www.guru.com/m/find/freelance-jobs/[skill]/
- Category pages: https://www.guru.com/d/jobs/c/[category]/
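The three patterns above can be told apart mechanically, which is useful when validating `startUrls` input. The sketch below is illustrative: the function name and the labels it returns are made up, and the ordering matters because category URLs also begin with the main jobs path.

```javascript
// Classify a URL against the supported Guru.com patterns. The more specific
// category check must run before the generic /d/jobs/ check, since
// /d/jobs/c/... is a prefix match for both. Hypothetical helper.
function classifyGuruUrl(url) {
  if (/^https:\/\/www\.guru\.com\/m\/find\/freelance-jobs\/[^/]+\/?$/.test(url)) return 'skill';
  if (/^https:\/\/www\.guru\.com\/d\/jobs\/c\/[^/]+\/?$/.test(url)) return 'category';
  if (/^https:\/\/www\.guru\.com\/d\/jobs\/?/.test(url)) return 'jobs';
  return 'unknown';
}
```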
Error Handling
- Robust retry strategies with exponential backoff
- Graceful degradation when data extraction fails
- Comprehensive logging for debugging
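Crawlee handles retries internally, but the exponential-backoff schedule mentioned above boils down to simple arithmetic. The helper below only illustrates that schedule; the base delay and cap are assumed values, not the actor's actual settings.

```javascript
// Delay before retry attempt N (0-based): base * 2^N, capped at maxMs.
// Produces 1s, 2s, 4s, 8s, ... up to the 30s ceiling. Illustrative values.
function backoffDelayMs(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}
```

Production retry logic usually adds random jitter to these delays so that failed requests do not all retry in lockstep.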
Dependencies
- `playwright`: Browser automation for bot protection bypass
- `crawlee`: Web scraping framework
- `apify`: Apify platform integration
- `cheerio`: HTML parsing and data extraction
Important Notes
- Always use residential proxies for best results
- Set reasonable `results_wanted` and `max_pages` values to avoid rate limits
- The scraper may take longer due to browser automation requirements
- Guru.com's markup may change, requiring selector updates in `src/main.js`
Troubleshooting
Common Issues
- Bot Detection: Ensure residential proxies are configured
- Slow Performance: Lower the `max_pages` and `results_wanted` values
- Empty Results: Check whether Guru.com has changed its HTML structure
- Memory Issues: Reduce concurrency and increase system resources