CV-Library Jobs Scraper

Pricing

from $3.99 / 1,000 results

Extract job listings from CV-Library, one of the UK's leading independent job boards.

Developer

Jobs Scraper

Maintained by Community

Last modified: 3 days ago

Overview

Search CV-Library, one of the UK's largest independent job boards, for job listings across the United Kingdom. This actor extracts position details, recruiter information, and compensation data from CV-Library's listings, covering permanent, temporary, and contract roles.

Features

  • Coverage of an independent UK job board with listings across a broad range of industries
  • Recruiter and agency information extraction
  • Permanent, temporary, and contract role filtering
  • Distance-based search radius support
  • Proxy rotation with automatic fallback (residential → datacenter)
  • CAPTCHA detection and session rotation
  • Automatic retry on failures with exponential backoff
  • Deduplication of results by application URL
  • Dataset validation with auto-fix capability
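
As a rough illustration of the URL-based deduplication listed above, the sketch below keeps the first listing seen for each applyUrl. The helper names (normaliseUrl, dedupeByApplyUrl) and the normalisation rules (dropping fragments and trailing slashes) are assumptions made for this example, not the actor's confirmed implementation.

```javascript
// Hypothetical helper: reduce an application URL to a comparable key.
// Assumption: fragments and trailing slashes never distinguish two jobs.
function normaliseUrl(rawUrl) {
  if (typeof rawUrl !== 'string' || rawUrl.trim() === '') return null;
  try {
    const url = new URL(rawUrl.trim());
    url.hash = '';
    url.pathname = url.pathname.replace(/\/+$/, '') || '/';
    return url.toString();
  } catch {
    // Not a parseable absolute URL: fall back to the trimmed string.
    return rawUrl.trim();
  }
}

// Hypothetical helper: keep the first listing seen for each application URL.
// Listings without an applyUrl cannot be compared, so they are kept as-is.
function dedupeByApplyUrl(listings) {
  const seen = new Set();
  const unique = [];
  for (const listing of listings) {
    const key = normaliseUrl(listing.applyUrl);
    if (key !== null) {
      if (seen.has(key)) continue;
      seen.add(key);
    }
    unique.push(listing);
  }
  return unique;
}
```

Keeping the first occurrence preserves the order in which results were scraped, which matters when the run is sorted by relevance or date.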

Supported Inputs

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| keyword | string | "software engineer" | Search terms for job discovery |
| location | string | "London" | Geographic filter for results |
| country | string | "GB" | Country code for proxy routing |
| maxItems | integer | 50 | Upper limit on extracted listings |
| proxyEnabled | boolean | true | Toggle proxy rotation on/off |
| sortBy | string | "relevance" | Result ordering (relevance/date/salary) |
| jobType | string | "" | Employment type filter |
| experienceLevel | string | "" | Seniority level filter |
| datePosted | string | "" | Recency filter (24h/3d/7d/14d/30d) |
| remoteOnly | boolean | false | Restrict to remote positions only |
| includeCompanyDetails | boolean | true | Fetch extra company information |
| includeSalary | boolean | true | Include compensation data |
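
To show how these fields and defaults fit together, here is a minimal sketch that merges a partial input with the documented defaults and checks the enumerated options. The function name normaliseInput and the decision to throw on invalid values are assumptions for this example; the cap of 1000 comes from the Limitations section. The actor's real input handling may differ.

```javascript
// Defaults copied from the Supported Inputs table.
const INPUT_DEFAULTS = Object.freeze({
  keyword: 'software engineer',
  location: 'London',
  country: 'GB',
  maxItems: 50,
  proxyEnabled: true,
  sortBy: 'relevance',
  jobType: '',
  experienceLevel: '',
  datePosted: '',
  remoteOnly: false,
  includeCompanyDetails: true,
  includeSalary: true,
});

const SORT_OPTIONS = ['relevance', 'date', 'salary'];
const DATE_POSTED_OPTIONS = ['', '24h', '3d', '7d', '14d', '30d'];
const MAX_ITEMS_CAP = 1000; // cap stated under Limitations

// Hypothetical helper: fill in defaults and reject values outside the documented options.
function normaliseInput(rawInput = {}) {
  const input = { ...INPUT_DEFAULTS, ...rawInput };
  if (!SORT_OPTIONS.includes(input.sortBy)) {
    throw new Error(`sortBy must be one of: ${SORT_OPTIONS.join(', ')}`);
  }
  if (!DATE_POSTED_OPTIONS.includes(input.datePosted)) {
    throw new Error('datePosted must be empty or one of: 24h, 3d, 7d, 14d, 30d');
  }
  if (!Number.isInteger(input.maxItems) || input.maxItems < 1) {
    throw new Error('maxItems must be a positive integer');
  }
  input.maxItems = Math.min(input.maxItems, MAX_ITEMS_CAP);
  return input;
}
```

For example, normaliseInput({ keyword: 'data analyst', maxItems: 5000 }) would keep the keyword, fall back to "London" for location, and clamp maxItems to 1000.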

Output Format

Each scraped listing produces a JSON object with these fields:

{
  "jobTitle": "Senior Software Engineer",
  "companyName": "Example Corp",
  "location": "London",
  "salary": "$120,000 - $160,000",
  "jobType": "Full-time",
  "experienceLevel": "Senior",
  "postedDate": "2 days ago",
  "applyUrl": "https://www.cv-library.co.uk/job/12345",
  "companyUrl": "https://www.cv-library.co.uk/company/example",
  "description": "We are looking for a skilled engineer...",
  "requirements": ["JavaScript", "Node.js", "React"],
  "benefits": ["Health Insurance", "Remote Work"],
  "sourcePortal": "CV-Library",
  "country": "GB",
  "scrapedAt": "2025-01-15T10:30:00.000Z"
}

Proxy Handling

The actor employs a multi-tier proxy strategy to maximize successful data extraction.

  1. Apify Residential Proxy (country-targeted) — First choice for CV-Library
  2. Apify Residential Proxy (any region) — Fallback if country proxy unavailable
  3. Apify Datacenter Proxy — Secondary fallback for cost efficiency
  4. Direct Connection — Last resort when all proxies fail

Proxies auto-rotate on each request. Blocked sessions are discarded and replaced automatically.
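
The tier order above can be sketched as a simple fallback loop. This is an illustration only: buildProxyTiers, fetchWithProxyFallback, the kind labels, and the injected attemptFetch callback are hypothetical names for this example, and the actor's actual proxy configuration code is not shown in this README.

```javascript
// Tier order follows the numbered list above.
// The `kind` labels are hypothetical names used only in this sketch.
function buildProxyTiers(countryCode = 'GB') {
  return [
    { kind: 'residential', countryCode },       // 1. country-targeted residential
    { kind: 'residential', countryCode: null }, // 2. residential, any region
    { kind: 'datacenter', countryCode: null },  // 3. datacenter
    { kind: 'direct', countryCode: null },      // 4. no proxy
  ];
}

// Try each tier in order and return the first successful result.
// `attemptFetch(url, tier)` is supplied by the caller and should throw when the tier fails.
async function fetchWithProxyFallback(url, attemptFetch, tiers = buildProxyTiers()) {
  const errors = [];
  for (const tier of tiers) {
    try {
      const result = await attemptFetch(url, tier);
      return { tier, result };
    } catch (err) {
      errors.push(`${tier.kind}: ${err.message}`);
    }
  }
  throw new Error(`All proxy tiers failed for ${url} (${errors.join('; ')})`);
}
```

Passing the fetch step in as a callback keeps the tier logic independent of whichever HTTP client or browser automation library performs the request.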

Retry Logic

Failed requests are retried up to 5 times with automatic session rotation.

  • Maximum 5 retries per request
  • Fresh browser session on each retry
  • Automatic proxy rotation between attempts
  • Blocked status codes (401, 403, 429) trigger session refresh
  • Configurable request timeout (120 seconds)
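
A minimal sketch of the retry behaviour described above, combined with the exponential backoff mentioned under Features. The 1-second base delay, the 30-second cap, and the names requestWithRetry, backoffDelayMs, and rotateSession are assumptions for this example; the README does not state the actor's actual delay values.

```javascript
const BLOCKED_STATUS_CODES = new Set([401, 403, 429]);
const MAX_RETRIES = 5;

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Exponential backoff: base * 2^attempt, limited by a cap (both values are assumed).
function backoffDelayMs(attempt, baseMs = 1000, capMs = 30000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Run `doRequest` once, then retry up to `maxRetries` times.
// A blocked status code or a thrown error counts as a failure; before each
// retry the session is rotated and the loop waits for the backoff delay.
async function requestWithRetry(doRequest, options = {}) {
  const {
    maxRetries = MAX_RETRIES,
    rotateSession = async () => {},
    sleep = defaultSleep,
  } = options;
  let lastError;
  for (let attempt = 0; attempt <= maxRetries; attempt += 1) {
    try {
      const response = await doRequest(attempt);
      if (!BLOCKED_STATUS_CODES.has(response.status)) return response;
      lastError = new Error(`Blocked with HTTP ${response.status}`);
    } catch (err) {
      lastError = err;
    }
    if (attempt === maxRetries) break;
    await rotateSession();
    await sleep(backoffDelayMs(attempt));
  }
  throw lastError;
}
```

With 5 retries, a request is attempted at most 6 times in total: the initial attempt plus 5 retries.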

Anti-block Handling

The actor incorporates multiple stealth techniques to minimize detection.

  • navigator.webdriver property masked
  • Human-like delays between page interactions (2–5 seconds)
  • Browser language and plugin fingerprints normalised
  • Session pool with automatic rotation on blocks
  • CAPTCHA detection with graceful retry
  • Rate limit detection (HTTP 429) with backoff
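
Two of the techniques above lend themselves to a short sketch: randomised pauses in the 2 to 5 second range and a basic CAPTCHA check on the returned HTML. The function names and the marker strings are assumptions for this example; the actor's real detection rules are not documented here.

```javascript
// Pick a delay between minMs and maxMs (inclusive).
// `random` is injectable so the function can be tested deterministically.
function randomDelayMs(minMs = 2000, maxMs = 5000, random = Math.random) {
  return Math.floor(minMs + random() * (maxMs - minMs + 1));
}

// Wait for a randomised, human-like interval before the next page interaction.
function humanPause(minMs = 2000, maxMs = 5000) {
  const delay = randomDelayMs(minMs, maxMs);
  return new Promise((resolve) => setTimeout(() => resolve(delay), delay));
}

// Illustrative markers only; a real challenge page may use different wording.
const CAPTCHA_MARKERS = ['captcha', 'verify you are human', 'unusual traffic'];

// Heuristic check: does the page HTML look like a CAPTCHA challenge?
function looksLikeCaptcha(html) {
  const text = String(html).toLowerCase();
  return CAPTCHA_MARKERS.some((marker) => text.includes(marker));
}
```

A caller might await humanPause() between page interactions and, when looksLikeCaptcha(html) returns true, discard the session and retry instead of parsing the page.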

Sample Input

{
  "keyword": "data analyst",
  "location": "London",
  "maxItems": 25,
  "proxyEnabled": true,
  "sortBy": "date",
  "remoteOnly": false
}

Sample Output

{
  "jobTitle": "Data Analyst",
  "companyName": "TechCorp International",
  "location": "London",
  "salary": "Competitive",
  "jobType": "Full-time",
  "experienceLevel": "Mid-level",
  "postedDate": "1 day ago",
  "applyUrl": "https://www.cv-library.co.uk/job/example-123",
  "companyUrl": "",
  "description": "Seeking a detail-oriented data analyst to join our growing team...",
  "requirements": ["SQL", "Python", "Tableau"],
  "benefits": ["Health Insurance", "Flexible Hours"],
  "sourcePortal": "CV-Library",
  "country": "GB",
  "scrapedAt": "2025-01-15T14:22:00.000Z"
}
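
For readers who want to check records like the one above in their own pipeline, the sketch below validates a listing and applies a few trivial fixes. It is not the bundled dataset-validator.js, whose rules are not shown in this README; the required-field list and the fixes (trimming strings, defaulting list fields to empty arrays) are assumptions for this example.

```javascript
// Assumed rules for this sketch; the actor's own validator may check different fields.
const REQUIRED_STRING_FIELDS = ['jobTitle', 'applyUrl', 'sourcePortal', 'scrapedAt'];
const LIST_FIELDS = ['requirements', 'benefits'];

// Return a cleaned copy of the listing plus the fixes applied and any remaining errors.
function validateListing(listing) {
  const item = { ...listing };
  const fixes = [];
  const errors = [];

  // Auto-fix: strip stray whitespace from string values.
  for (const [key, value] of Object.entries(item)) {
    if (typeof value === 'string' && value !== value.trim()) {
      item[key] = value.trim();
      fixes.push(`trimmed ${key}`);
    }
  }

  // Auto-fix: list fields should always be arrays.
  for (const field of LIST_FIELDS) {
    if (!Array.isArray(item[field])) {
      item[field] = [];
      fixes.push(`defaulted ${field} to []`);
    }
  }

  // Errors: required fields that cannot be repaired automatically.
  for (const field of REQUIRED_STRING_FIELDS) {
    if (typeof item[field] !== 'string' || item[field] === '') {
      errors.push(`missing ${field}`);
    }
  }
  if (typeof item.scrapedAt === 'string' && item.scrapedAt !== ''
      && Number.isNaN(Date.parse(item.scrapedAt))) {
    errors.push('scrapedAt is not a valid date');
  }

  return { valid: errors.length === 0, item, fixes, errors };
}
```

Returning a copy alongside separate fixes and errors lists makes it easy to log what was repaired without mutating the original dataset item.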

Usage

Local Development

# Install dependencies
npm install
# Set Apify token (required for proxy)
export APIFY_TOKEN=your_token_here
# Run the actor
npm start
# Validate scraped data
node dataset-validator.js

Apify Platform

# Login to Apify
apify login
# Push actor to platform
apify push
# Run from Apify Console or API

Deployment

  1. Ensure all dependencies are installed: npm install
  2. Authenticate with Apify: apify login
  3. Deploy the actor: apify push
  4. Configure input in the Apify Console
  5. Schedule runs or trigger via API / webhooks
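
Triggering a run via the API (step 5) could look roughly like the sketch below, which uses the Apify API v2 "run actor" endpoint and the fetch function built into Node.js 18 and later. The actor ID shown in the usage note is a placeholder, since this README does not state the published ID, and the helper names buildRunRequest and startRun are hypothetical.

```javascript
// Build the HTTP request for starting an actor run.
// `actorId` is expected in the form "username~actor-name" or as an actor ID.
function buildRunRequest(actorId, token, input) {
  return {
    url: `https://api.apify.com/v2/acts/${encodeURIComponent(actorId)}/runs`,
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${token}`,
      },
      body: JSON.stringify(input), // the JSON body is passed to the actor as its input
    },
  };
}

// Start a run and return the run object reported by the API.
async function startRun(actorId, token, input) {
  const { url, options } = buildRunRequest(actorId, token, input);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Apify API responded with HTTP ${response.status}`);
  }
  const { data } = await response.json();
  return data; // includes fields such as id and defaultDatasetId
}
```

A call such as startRun('your-username~your-actor-name', process.env.APIFY_TOKEN, { keyword: 'data analyst', location: 'London', maxItems: 25 }) would start a run with the sample input; replace the placeholder actor ID with the real one from the Apify Console.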

Limitations

  • Results depend on the portal's current HTML structure; layout changes may require selector updates
  • Some job details (salary, benefits) may not be available for all listings
  • Rate limiting by the portal may reduce throughput during high-volume scrapes
  • CAPTCHA challenges may interrupt scraping on heavily protected pages
  • CV-Library may modify their anti-bot measures, requiring periodic updates
  • Maximum items per run is capped at 1000 to prevent excessive resource usage
  • Proxy costs apply when using Apify residential or datacenter proxies