Guru.com Scraper

Pricing

Pay per usage

Unlock Guru.com data instantly! Scrape detailed user profiles and job listings with ease. Perfect for recruitment, lead generation, and market analysis. Get essential data like freelancer skills, rates, and active projects to automate your workflow efficiently.

Developer

Shahid Irfan

Maintained by Community

Last modified: 11 days ago

Guru Jobs Scraper

This Apify actor scrapes freelance job listings from Guru.com using Crawlee's PlaywrightCrawler to bypass bot protection and extract comprehensive job information.

Features

  • Bot Protection Bypass: Uses PlaywrightCrawler with residential proxies to bypass Guru.com's Incapsula protection
  • Advanced Filtering: Filter by skill, budget range, job type (Fixed/Hourly), location, and keywords
  • Comprehensive Data Extraction: Extracts job details, budget information, employer data, and categories
  • Guru-Specific Fields: Captures budget ranges, quote counts, deadlines, featured status, and employer spending
  • Smart Pagination: Handles Guru.com's pagination system automatically
  • Structured Data: Prefers JSON-LD structured data where available, falls back to robust HTML parsing
  • Rate Limiting: Implements proper delays and proxy rotation to avoid detection
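
The "JSON-LD first, HTML fallback" idea from the Structured Data feature can be sketched as follows. This is a minimal illustration, not the actor's actual code: the function names are hypothetical, it works on a raw HTML string with regular expressions instead of cheerio, and it assumes that a job page exposes a schema.org JobPosting object, which has not been verified for Guru.com.

```javascript
// Sketch only: the actor's real extraction logic (src/main.js) is not shown in this README.
// Returns the first schema.org JobPosting found in JSON-LD <script> tags, or null.
function extractJobPostingJsonLd(html) {
    const scriptPattern = /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
    for (const match of html.matchAll(scriptPattern)) {
        let parsed;
        try {
            parsed = JSON.parse(match[1]);
        } catch {
            continue; // Malformed JSON-LD: skip this block and try the next one.
        }
        // A block may hold a single object or an array of objects.
        const candidates = Array.isArray(parsed) ? parsed : [parsed];
        const posting = candidates.find((item) => item && item['@type'] === 'JobPosting');
        if (posting) return posting;
    }
    return null;
}

// Hypothetical fallback: use the first <h1> when no usable JSON-LD is present.
function extractTitle(html) {
    const posting = extractJobPostingJsonLd(html);
    if (posting && typeof posting.title === 'string') return posting.title.trim();
    const heading = html.match(/<h1[^>]*>([\s\S]*?)<\/h1>/i);
    return heading ? heading[1].replace(/<[^>]+>/g, '').trim() : null;
}
```

Trying the structured data first keeps the extraction independent of layout changes; the HTML path is only exercised when the JSON-LD block is missing or cannot be parsed.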

Input

The actor accepts the following input fields, all of which are optional:

Basic Filters

  • keyword (string) — Job title or skill to search for (e.g., "Web Developer", "Data Scientist").
  • location (string) — Location filter for job postings.
  • skill (string) — Specific skill to filter by (e.g., "web-development", "javascript", "wordpress").

Advanced Filters

  • budget_min (integer) — Minimum budget amount filter (in USD).
  • budget_max (integer) — Maximum budget amount filter (in USD).
  • job_type (enum) — Filter by job type: "fixed", "hourly", or "both". Default: "both".

Configuration

  • startUrl / url / startUrls — Specific Guru.com job URL(s) to start from. If provided, these override other filters.
  • results_wanted (integer) — Maximum number of jobs to collect. Default: 100.
  • max_pages (integer) — Safety cap on number of listing pages to visit. Default: 20.
  • collectDetails (boolean) — If true, visits each job detail page for full information. Default: true.
  • proxyConfiguration — Proxy settings (residential proxies recommended for Guru.com).
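
To make the documented defaults concrete, here is a small sketch of how a caller might normalize an input object before starting a run. It is an assumption-laden illustration: `normalizeInput` is a hypothetical helper, the budget range check is an added assumption rather than documented behaviour, and the actor itself may validate input differently.

```javascript
// Illustrative sketch, not the actor's own code. Defaults mirror the field list above.
const JOB_TYPES = ['fixed', 'hourly', 'both'];

function normalizeInput(input = {}) {
    const jobType = input.job_type ?? 'both';
    if (!JOB_TYPES.includes(jobType)) {
        throw new Error(`job_type must be one of: ${JOB_TYPES.join(', ')}`);
    }
    // Assumption: a minimum above the maximum is treated as a caller mistake.
    if (input.budget_min != null && input.budget_max != null
        && input.budget_min > input.budget_max) {
        throw new Error('budget_min must not exceed budget_max');
    }
    // startUrl, url and startUrls are merged; entries may be strings or { url } objects.
    const startUrls = [input.startUrl, input.url, ...(input.startUrls ?? [])]
        .filter(Boolean)
        .map((entry) => (typeof entry === 'string' ? entry : entry.url));
    return {
        ...input,
        job_type: jobType,
        results_wanted: input.results_wanted ?? 100,
        max_pages: input.max_pages ?? 20,
        collectDetails: input.collectDetails ?? true,
        startUrls,
    };
}
```

For example, `normalizeInput({ keyword: 'react developer', results_wanted: 50 })` keeps the keyword and the explicit limit while filling in `job_type: 'both'`, `max_pages: 20`, and `collectDetails: true`.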

Output

Each job saved to the dataset follows this structure:

{
    "title": "Web Developer Needed",
    "budget": "$1k-$2.5k",
    "job_type": "Fixed Price",
    "location": "United States",
    "deadline": "Send before Feb 23, 2026",
    "quote_count": 3,
    "is_featured": false,
    "categories": {
        "main": "Programming & Development",
        "sub": "Web Development & Design",
        "skills": ["JavaScript", "HTML", "CSS"]
    },
    "employer": {
        "name": "John Doe",
        "totalSpent": "$2,768",
        "feedbackPercentage": 100
    },
    "date_posted": "Posted 1 hr ago",
    "description_html": "<p>Job description here...</p>",
    "description_text": "Plain text version of description",
    "url": "https://www.guru.com/d/jobs/12345/",
    "scraped_at": "2026-01-25T12:00:00.000Z"
}
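
Because `budget` and `totalSpent` arrive as display strings, downstream code usually needs numbers. The sketch below shows one way to convert them. It assumes only the two shapes visible in the sample above ("$1k-$2.5k" and "$2,768"); the helper names are hypothetical, and any other format Guru.com may use simply yields `null`.

```javascript
// Assumes amounts shaped like "$1k", "$2.5k" or "$2,768"; anything else returns null.
function parseAmount(text) {
    const match = /^\$?([\d,]+(?:\.\d+)?)(k)?$/i.exec(text.trim());
    if (!match) return null;
    const value = Number(match[1].replace(/,/g, ''));
    // Round after scaling so values such as 1.1k do not pick up floating-point noise.
    return match[2] ? Math.round(value * 1000) : value;
}

// Splits a "min-max" budget string into numeric bounds in USD.
function parseBudgetRange(budget) {
    const parts = budget.split('-').map(parseAmount);
    if (parts.length !== 2 || parts.includes(null)) return null;
    return { min: parts[0], max: parts[1] };
}
```

With the sample record, `parseBudgetRange("$1k-$2.5k")` gives `{ min: 1000, max: 2500 }` and `parseAmount("$2,768")` gives `2768`.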

Usage Examples

Basic Search

{
    "keyword": "react developer",
    "results_wanted": 50
}

Advanced Filtering

{
    "skill": "web-development",
    "budget_min": 1000,
    "budget_max": 5000,
    "job_type": "hourly",
    "location": "United States",
    "results_wanted": 25
}

Specific Skill Page

{
    "skill": "javascript",
    "collectDetails": true,
    "results_wanted": 100
}
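
Any of the inputs above can also be sent programmatically. The sketch below builds a request for the Apify API's synchronous "run actor and return dataset items" endpoint. The actor ID is a placeholder, not this actor's real ID, and `buildRunRequest` is a hypothetical helper; copy the actual ID from the actor's page and supply your own API token.

```javascript
// Assumption: ACTOR_ID is a placeholder; replace it with the ID shown on the actor's page.
const ACTOR_ID = 'username~guru-com-scraper';

// Builds the URL and fetch options for a synchronous run that returns dataset items.
function buildRunRequest(actorId, token, input) {
    return {
        url: `https://api.apify.com/v2/acts/${encodeURIComponent(actorId)}/run-sync-get-dataset-items`,
        options: {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
                Authorization: `Bearer ${token}`,
            },
            body: JSON.stringify(input),
        },
    };
}

// Usage (needs network access and a real token, so it is left commented out):
// const { url, options } = buildRunRequest(ACTOR_ID, process.env.APIFY_TOKEN, {
//     keyword: 'react developer',
//     results_wanted: 50,
// });
// const jobs = await (await fetch(url, options)).json();
```

A synchronous run suits small values of `results_wanted`; since browser-based scraping is slow, larger jobs may be better started asynchronously and read from the dataset afterwards.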

Technical Notes

Bot Protection

  • Guru.com uses Incapsula bot protection requiring browser automation
  • Residential proxies are strongly recommended
  • Concurrency is kept low (3 concurrent pages) to reduce the chance of detection

Performance

  • PlaywrightCrawler is slower than CheerioCrawler but necessary for Guru.com
  • Extended wait times ensure dynamic content loads properly
  • Memory usage is optimized for large datasets

Supported URL Patterns

  • Main jobs page: https://www.guru.com/d/jobs/
  • Skill-based pages: https://www.guru.com/m/find/freelance-jobs/[skill]/
  • Category pages: https://www.guru.com/d/jobs/c/[category]/
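
As an illustration of the three patterns above, the following sketch builds a listing URL from a skill and classifies an arbitrary URL. Both helpers are hypothetical and only encode the patterns listed in this section; how the actor itself maps input to URLs is not documented here.

```javascript
const BASE_URL = 'https://www.guru.com';

// Returns the skill-based listing URL, or the main jobs page when no skill is given.
function buildStartUrl({ skill } = {}) {
    if (skill) return `${BASE_URL}/m/find/freelance-jobs/${encodeURIComponent(skill)}/`;
    return `${BASE_URL}/d/jobs/`;
}

// Labels a URL as 'main', 'skill', 'category', or 'other' (anything not listed above).
function classifyUrl(url) {
    const { hostname, pathname } = new URL(url);
    if (hostname !== 'www.guru.com') return 'other';
    if (/^\/m\/find\/freelance-jobs\/[^/]+\/?$/.test(pathname)) return 'skill';
    if (/^\/d\/jobs\/c\/[^/]+\/?$/.test(pathname)) return 'category';
    if (/^\/d\/jobs\/?$/.test(pathname)) return 'main';
    return 'other';
}
```

A check like `classifyUrl` can be useful before passing `startUrls`, since URLs outside these patterns may not paginate the way the scraper expects.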

Error Handling

  • Robust retry strategies with exponential backoff
  • Graceful degradation when data extraction fails
  • Comprehensive logging for debugging
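
Retrying with exponential backoff can be sketched generically as below. This is not the actor's implementation: the helper names, the 1-second base delay, and the 30-second cap are illustrative assumptions, and inside Crawlee the per-request retry count is normally governed by the crawler's own `maxRequestRetries` option.

```javascript
// Delay doubles with each attempt (attempt 0 waits baseMs) and is capped at maxMs.
function backoffDelayMs(attempt, baseMs = 1000, maxMs = 30000) {
    return Math.min(maxMs, baseMs * 2 ** attempt);
}

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Runs task(attempt) until it succeeds or `retries` extra attempts have failed.
async function withRetry(task, { retries = 3, baseMs = 1000, sleep = defaultSleep } = {}) {
    for (let attempt = 0; ; attempt++) {
        try {
            return await task(attempt);
        } catch (error) {
            if (attempt >= retries) throw error; // Out of retries: surface the last error.
            await sleep(backoffDelayMs(attempt, baseMs));
        }
    }
}
```

The `sleep` parameter is injectable so the waiting can be replaced in tests; in production the default timer-based sleep applies.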

Dependencies

  • playwright - Browser automation for bot protection bypass
  • crawlee - Web scraping framework
  • apify - Apify platform integration
  • cheerio - HTML parsing and data extraction

Important Notes

  • Always use residential proxies for best results
  • Set reasonable results_wanted and max_pages to avoid rate limits
  • The scraper may take longer due to browser automation requirements
  • Guru.com's markup may change, requiring selector updates in src/main.js

Troubleshooting

Common Issues

  1. Bot Detection: Ensure residential proxies are configured
  2. Slow Performance: Lower max_pages and results_wanted values
  3. Empty Results: Check whether Guru.com has changed its HTML structure; the selectors in src/main.js may need updating
  4. Memory Issues: Reduce concurrency and increase system resources