Remote Jobs Intelligence Scraper
Pricing
from $1.80 / 1,000 job-results
Scrape public remote job listings from remote-first sources (Remotive, Remote OK, We Work Remotely) and turn them into clean, CSV-ready hiring-intelligence data - no login, cookies, or residential proxy.
Developer
Delowar Munna
Maintained by Community

Collect public remote job listings from remote-first sources — Remotive, Remote OK, and We Work Remotely — and turn them into one clean, flat, CSV-ready schema enriched with lightweight remote-work intelligence: remote scope, location/country/timezone restrictions, salary availability, detected skills, and a transparent hiring-signal score. Built for recruiters, staffing agencies, sales teams, and remote-work market researchers.
No login, no cookies, no residential proxy, no paid APIs. The actor reads each source's public API/feed over plain HTTP, so it stays fast and cost-predictable. You pay one flat event per unique job row that passes your filters.
✨ Why this scraper
- Remote-first, not generic — only remote-job sources, normalized into a single schema with remote-specific intelligence fields.
- Multiple sources, one schema — Remotive + Remote OK JSON APIs and We Work Remotely RSS, deduplicated across sources.
- 31 flat fields — job identity, company, remote scope, salary, skills, posting age, and hiring signal. No nested objects; drops straight into Sheets/Excel/CRMs.
- Pay-Per-Event: one flat job-result event per saved unique job. Duplicates and filtered rows are never charged.
- Partial-failure safe: if one source is down, the others still return results.
- Transparent hiring-signal score — rule-based (no AI), explained below.
🚀 Quick start — sample inputs
Example 1 — keyword search across all three sources
{
  "keywords": ["software engineer", "python"],
  "keywordMatchMode": "any",
  "sources": ["remotive", "remoteok", "weworkremotely"],
  "remoteScope": "any",
  "salaryRequired": false,
  "postedWithinDays": 30,
  "includeDescription": true,
  "includeDetectedSkills": true,
  "maxResults": 500,
  "deduplicate": true,
  "proxyConfiguration": { "useApifyProxy": true }
}
Keyword matching is word/token based: "software engineer" also matches "Backend Engineer" or "Staff Engineer". Use keywordMatchMode: "all" to require every keyword.
Example 2 — worldwide remote, salary required, all three sources + direct URL
{
  "keywords": ["designer"],
  "sources": ["remotive", "remoteok", "weworkremotely"],
  "sourceUrls": ["https://remotive.com/remote-jobs/design"],
  "locationKeywords": ["Worldwide", "Europe"],
  "remoteScope": "worldwide",
  "salaryRequired": true,
  "postedWithinDays": 14,
  "maxResults": 500,
  "proxyConfiguration": { "useApifyProxy": true }
}
Leave keywords, sourceUrls, and the filters empty to simply pull the most recent jobs from the selected sources. If both sources and sourceUrls are given, the actor runs both and deduplicates across the whole run.
Attribution: Remote OK requires that data consumers credit Remote OK as the source. If you republish Remote OK rows, link back to the original source_job_url.
📦 Output
The dataset has one view: Remote jobs & intelligence — a 31-column flat table.

Output fields (31)
job_id, source, source_job_url, canonical_job_url, job_title, company_name, company_website, company_logo_url, source_category, employment_type, seniority, remote_scope, location_restriction, country_restrictions, timezone_restrictions, salary_available, salary_min, salary_max, salary_currency, salary_period, posted_at, posted_age_days, description, application_url, detected_skills, matched_keywords, hiring_signal_score, reason_tags, input_keyword, input_source_url, scraped_at.
Sample record — Remote jobs & intelligence
(Real row from a sample run; the description is truncated here for readability.)
{
  "job_id": "2090910",
  "source": "remotive",
  "source_job_url": "https://remotive.com/remote-jobs/software-development/staff-software-engineer-product-belo-horizonte-2090910",
  "canonical_job_url": "https://remotive.com/remote-jobs/software-development/staff-software-engineer-product-belo-horizonte-2090910",
  "job_title": "Staff Software Engineer, Product (Belo Horizonte)",
  "company_name": "LawnStarter",
  "company_website": null,
  "company_logo_url": "https://remotive.com/job/2090910/logo",
  "source_category": "software development",
  "employment_type": "full-time",
  "seniority": "lead",
  "remote_scope": "country_restricted",
  "location_restriction": "Brazil",
  "country_restrictions": "Brazil",
  "timezone_restrictions": null,
  "salary_available": true,
  "salary_min": 80000,
  "salary_max": 100000,
  "salary_currency": "USD",
  "salary_period": "unknown",
  "posted_at": "2026-06-02T07:53:42.000Z",
  "posted_age_days": 4,
  "description": "This is a remote role for candidates located in Belo Horizonte, Brazil. About LawnStarter — LawnStarter is the nation's leading on-demand marketplace for lawn care and outdoor services...",
  "application_url": "https://remotive.com/remote-jobs/software-development/staff-software-engineer-product-belo-horizonte-2090910",
  "detected_skills": "typescript,php,laravel,react,rest,aws,ai,ux,machine learning",
  "matched_keywords": "software engineer",
  "hiring_signal_score": 98,
  "reason_tags": "recent_posting,salary_visible,location_restriction_clear,company_present,apply_url_present,skills_detected,keyword_match",
  "input_keyword": "software engineer",
  "input_source_url": null,
  "scraped_at": "2026-06-07T05:47:42.247Z"
}
🎯 Hiring-signal score
Transparent rule-based score (0–100) computed from extracted fields — no AI, no external enrichment.
| Signal | Points |
|---|---|
| Base (any valid remote job row) | +20 |
| Posted within the last 7 days | +15 |
| Posted within the last 30 days (if not 7-day) | +10 |
| Salary visible | +15 |
| Remote scope worldwide | +10 |
| Remote scope clearly country/region restricted | +8 |
| Company name present | +10 |
| Application URL present | +10 |
| Detected skills present | +10 |
| Matched a keyword/category filter | +10 |
Score is capped at 100. Bands: high (80–100) · medium (50–79) · low (1–49) · unknown (0).
reason_tags is a comma-separated list explaining the score — e.g. recent_posting, salary_visible, worldwide_remote, location_restriction_clear, skills_detected, keyword_match, company_present, apply_url_present, stale_posting, missing_posted_date.
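The scoring table can be reproduced from an output row. The following is a minimal sketch rather than the actor's actual code: the function names are hypothetical, and it assumes "within the last N days" means posted_age_days <= N and that both country and region restrictions earn the +8. Applied to the fields of the sample record, it gives the same 98 that the record reports.

```python
def hiring_signal_score(row: dict) -> int:
    """Rule-based score following the points table (sketch, not the actor's code)."""
    score = 20  # base for any valid remote job row

    age = row.get("posted_age_days")
    if age is not None:
        if age <= 7:        # assumption: boundary is inclusive
            score += 15
        elif age <= 30:
            score += 10

    if row.get("salary_available"):
        score += 15

    scope = row.get("remote_scope")
    if scope == "worldwide":
        score += 10
    elif scope in ("country_restricted", "region_restricted"):
        score += 8

    # +10 each for company, application URL, detected skills, keyword match
    for field in ("company_name", "application_url", "detected_skills", "matched_keywords"):
        if row.get(field):
            score += 10

    return min(score, 100)  # capped at 100


def score_band(score: int) -> str:
    """Map a score to the documented bands."""
    if score >= 80:
        return "high"
    if score >= 50:
        return "medium"
    if score >= 1:
        return "low"
    return "unknown"


# Relevant fields from the sample record in the Output section
sample = {
    "posted_age_days": 4,
    "salary_available": True,
    "remote_scope": "country_restricted",
    "company_name": "LawnStarter",
    "application_url": "https://remotive.com/remote-jobs/software-development/staff-software-engineer-product-belo-horizonte-2090910",
    "detected_skills": "typescript,php,laravel,react,rest,aws,ai,ux,machine learning",
    "matched_keywords": "software engineer",
}
print(hiring_signal_score(sample), score_band(hiring_signal_score(sample)))
```

The breakdown for the sample row is 20 + 15 + 15 + 8 + 10 + 10 + 10 + 10 = 98.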
💰 Pricing
Pay-Per-Event. One flat event per saved row (final per-event price is configured on the Apify console):
| Event | Charged when |
|---|---|
| job-result | Once per unique job row that passed all filters and was successfully written to the dataset. |
So your bill is simply results_saved × price_per_event. The actor honors the user-configured per-run spending cap (Apify eventChargeLimitReached): it caps how many results it collects up-front to what the limit can pay for, and stops cleanly the moment the cap is reached during charging.
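As a worked example of that arithmetic, the sketch below assumes the listed entry price of $1.80 per 1,000 job-results; the real per-event price is whatever is configured on the Apify console. The helper names are hypothetical, and Decimal is used only to avoid floating-point rounding on money.

```python
from decimal import Decimal

# Assumption: the listed "from" price; check the console for the actual value.
PRICE_PER_EVENT = Decimal("1.80") / Decimal(1000)


def run_cost(results_saved: int, price_per_event: Decimal = PRICE_PER_EVENT) -> Decimal:
    """Bill for a run: results_saved x price_per_event."""
    return results_saved * price_per_event


def max_affordable_results(spending_cap: Decimal, price_per_event: Decimal = PRICE_PER_EVENT) -> int:
    """Largest number of job-result events a per-run spending cap can pay for."""
    return int(spending_cap // price_per_event)


print(run_cost(136))                              # 136 saved rows
print(max_affordable_results(Decimal("0.50")))    # rows a $0.50 cap can cover
```

At that price, a run that saves 136 rows costs $0.2448, and a $0.50 cap pays for 277 rows.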
Not charged:
- Duplicates (deduplicated by source + job_id, canonical URL, and title+company keys).
- Rows filtered out by keyword / category / company / location / remote-scope / salary / date filters.
- Invalid rows (missing title, company, source, or any URL).
- Failed or blocked requests.
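The three deduplication keys named above could be combined as in the sketch below. This is an illustration under assumptions, not the actor's implementation: the normalization (lower-casing, collapsing whitespace, trimming a trailing slash) and the rule that a row is a duplicate when any one key has been seen are guesses, and the function names are hypothetical.

```python
def _norm(text: str) -> str:
    """Lower-case and collapse whitespace (assumed normalization)."""
    return " ".join(str(text).lower().split())


def dedup_keys(row: dict) -> list:
    """Build the keys a row can be identified by."""
    keys = []
    if row.get("source") and row.get("job_id"):
        keys.append(("id", row["source"], str(row["job_id"])))
    if row.get("canonical_job_url"):
        keys.append(("url", row["canonical_job_url"].rstrip("/").lower()))
    if row.get("job_title") and row.get("company_name"):
        keys.append(("title_company", _norm(row["job_title"]), _norm(row["company_name"])))
    return keys


def deduplicate(rows: list) -> list:
    """Keep the first row for each job; drop rows sharing any key with an earlier row."""
    seen = set()
    unique = []
    for row in rows:
        keys = dedup_keys(row)
        if any(key in seen for key in keys):
            continue  # duplicate: not saved, so not charged
        seen.update(keys)
        unique.append(row)
    return unique


rows = [
    {"source": "remotive", "job_id": "1", "canonical_job_url": "https://a.example/jobs/1",
     "job_title": "Data Engineer", "company_name": "Acme"},
    {"source": "remoteok", "job_id": "77", "canonical_job_url": "https://b.example/jobs/77",
     "job_title": "data engineer ", "company_name": "ACME"},  # same job on another source
    {"source": "remotive", "job_id": "2", "canonical_job_url": "https://a.example/jobs/2",
     "job_title": "Designer", "company_name": "Acme"},
]
print([r["job_id"] for r in deduplicate(rows)])
```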
🚦 Proxy policy
Use Apify Datacenter proxy or no proxy for normal runs — both work reliably for these public APIs/feeds at this actor's conservative concurrency.
Apify Residential proxy is not supported. The actor will fail at startup if proxyConfiguration.apifyProxyGroups includes RESIDENTIAL. Reason: in pay-per-event actors, residential bandwidth (~/GB) is billed to the developer, not the run user, so a single bandwidth-heavy run could exceed the per-result event revenue.
If you genuinely need residential routing, supply your own residential provider via the proxy editor's Custom proxy URLs field — that traffic goes through your provider, not Apify, and is unaffected:
http://user:pass@proxy.iproyal.com:12321
http://user:pass@proxy.brightdata.com:22225
http://user:pass@proxy.oxylabs.io:7777
📊 Run summary
After each run, a RUN_SUMMARY entry is written to the key-value store:
{
  "inputs_total": 12,
  "sources_requested": ["remotive", "remoteok", "weworkremotely"],
  "successful_sources": ["remotive", "remoteok", "weworkremotely"],
  "failed_sources": [],
  "successful_inputs": 12,
  "failed_inputs": 0,
  "raw_results_found": 362,
  "results_saved": 136,
  "duplicates_removed": 26,
  "filtered_out": 200,
  "charged_events": 136,
  "blocked_requests": 0,
  "retry_count": 0,
  "source_counts": { "remotive": 22, "remoteok": 38, "weworkremotely": 76 },
  "runtime_seconds": 6,
  "scraped_at": "2026-06-07T05:47:42.247Z"
}
inputs_total is 12 because We Work Remotely fans out across its ~10 category RSS feeds (plus one Remotive and one Remote OK request). Leaving keywords empty pushes filtered_out toward 0 and returns far more rows.
charged_events equals the number of successfully saved unique rows.
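A RUN_SUMMARY can be sanity-checked after a run. In the sketch below, only the first check (charged_events equals results_saved) is stated by this README. The other two are relationships that hold in the sample summary and are assumed here to hold in general. The function name is hypothetical.

```python
def check_run_summary(summary: dict) -> list:
    """Return a list of human-readable inconsistencies (empty list means consistent)."""
    problems = []

    # Documented: every saved unique row is charged exactly once.
    if summary["charged_events"] != summary["results_saved"]:
        problems.append("charged_events != results_saved")

    # Assumption (true for the sample): per-source counts add up to saved rows.
    if sum(summary["source_counts"].values()) != summary["results_saved"]:
        problems.append("source_counts do not sum to results_saved")

    # Assumption (true for the sample): every raw row is saved, deduplicated, or filtered.
    accounted = summary["results_saved"] + summary["duplicates_removed"] + summary["filtered_out"]
    if accounted != summary["raw_results_found"]:
        problems.append("raw_results_found is not fully accounted for")

    return problems


# Counts taken from the sample RUN_SUMMARY
sample_summary = {
    "raw_results_found": 362,
    "results_saved": 136,
    "duplicates_removed": 26,
    "filtered_out": 200,
    "charged_events": 136,
    "source_counts": {"remotive": 22, "remoteok": 38, "weworkremotely": 76},
}
print(check_run_summary(sample_summary))
```

For the sample, 136 + 26 + 200 = 362 and 22 + 38 + 76 = 136, so no problems are reported.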
⚙️ Filters
All filters apply after extraction and normalization, and before any dataset push or charge.
| Filter | Effect |
|---|---|
| keywords + keywordMatchMode | Match title/company/category/tags/description. any = at least one; all = every keyword. |
| categories | Keep only jobs in a matching source category. |
| companies | Keep only jobs from matching company names. |
| locationKeywords | Keep only jobs whose location/region text matches. |
| remoteScope | any / worldwide / country_restricted / region_restricted / unknown. |
| salaryRequired | Keep only jobs with a visible salary. |
| postedWithinDays | Keep only jobs posted within N days (0 disables; missing date is dropped when N > 0). |
| deduplicate | Drop duplicate jobs across sources and inputs (recommended ON). |
Missing values behave conservatively: when a filter is set and the relevant field is missing, the row is filtered out.
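The sketch below shows how three of these filters might behave together: keywords with keywordMatchMode, salaryRequired, and postedWithinDays. It is not the actor's code. In particular, treating a keyword as matched when any one of its tokens appears in the row is an assumption, chosen because it is consistent with "software engineer" matching "Backend Engineer". The function names are hypothetical, and tags are left out because they are not part of the output row.

```python
import re


def tokens(text) -> set:
    """Lower-case alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", str(text or "").lower()))


def matched_keywords(row: dict, keywords) -> list:
    """Keywords that hit the row's title/company/category/description."""
    fields = ("job_title", "company_name", "source_category", "description")
    row_tokens = tokens(" ".join(str(row.get(f) or "") for f in fields))
    # Assumption: one shared token is enough for a keyword to match.
    return [kw for kw in keywords if tokens(kw) & row_tokens]


def passes_filters(row: dict, keywords=(), keyword_match_mode="any",
                   salary_required=False, posted_within_days=0) -> bool:
    """Apply the filters; a missing field fails any filter that needs it."""
    if keywords:
        hits = matched_keywords(row, keywords)
        if keyword_match_mode == "all" and len(hits) < len(keywords):
            return False
        if keyword_match_mode != "all" and not hits:
            return False
    if salary_required and not row.get("salary_available"):
        return False
    if posted_within_days > 0:
        age = row.get("posted_age_days")
        if age is None or age > posted_within_days:  # missing date is dropped
            return False
    return True


row = {"job_title": "Backend Engineer", "company_name": "Acme",
       "salary_available": False, "posted_age_days": 3}
print(passes_filters(row, ["software engineer", "python"], "any"))  # any keyword is enough
print(passes_filters(row, ["software engineer", "python"], "all"))  # "python" is absent
```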
🚧 Limitations (V1)
- Public sources only: Remotive public API, Remote OK public JSON feed, We Work Remotely public RSS. No login, cookies, or member-only content.
- Salary parsing is best-effort and only set when numeric compensation is visible; "competitive salary" is not treated as available.
- Remote fields are derived from the visible location/candidate text — they do not infer legal work eligibility beyond what's stated.
- detected_skills is a curated keyword dictionary match (not AI).
- No recruiter/contact extraction, email enrichment, company-website crawling, logo downloading, or AI scoring.
- maxResults caps saved unique rows across the whole run (not per source).
❓ FAQ
Do I need an account or API key?
No. All three sources are read through their public, unauthenticated APIs/feeds.
Why are some fields empty?
Sources expose different fields, and the actor never invents values. company_website is not published by any of the three sources, so it is always empty. company_logo_url comes from Remotive (Remote OK currently returns blank logos on its public feed; We Work Remotely RSS has none). Salary is well populated from Remotive, sparse on Remote OK, and absent from WWR RSS. country_restrictions/timezone_restrictions are derived only from visible location text, so "Worldwide"/"Anywhere" jobs correctly stay empty. input_source_url is set only when you use sourceUrls. Missing values are null / false / unknown consistently.
How is remote_scope derived?
From the visible location/candidate-required-location text: Worldwide/Anywhere → worldwide; a single country → country_restricted; a multi-country region (Europe, APAC, …) → region_restricted; otherwise unknown.
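That mapping can be sketched as a small lookup. The word lists below are hypothetical placeholders, since the actor's real dictionaries are not published here, and the exact-match rule is a simplification.

```python
# Hypothetical, deliberately tiny vocabularies for illustration only.
WORLDWIDE_TERMS = {"worldwide", "anywhere"}
REGION_TERMS = {"europe", "apac"}
COUNTRY_TERMS = {"brazil", "germany", "canada"}


def derive_remote_scope(location_text) -> str:
    """Classify visible location text into the documented remote_scope values."""
    if not location_text:
        return "unknown"
    text = str(location_text).strip().lower()
    if text in WORLDWIDE_TERMS:
        return "worldwide"
    if text in COUNTRY_TERMS:
        return "country_restricted"
    if text in REGION_TERMS:
        return "region_restricted"
    return "unknown"  # anything not recognized stays unknown rather than being guessed


for sample_text in ("Worldwide", "Brazil", "Europe", None):
    print(sample_text, "->", derive_remote_scope(sample_text))
```

Falling back to unknown instead of guessing matches the README's statement that the actor never invents values.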
Can I paste a source URL?
Yes — put supported URLs (remotive.com / remoteok.com / weworkremotely.com) in sourceUrls. Unsupported URLs are logged as failed inputs and skipped without failing the run.
Can I export to CSV?
Yes — every field is flat. Use Apify's CSV / Excel export, or call the dataset API with format=csv.
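For the dataset API route, the export URL can be built as in the sketch below. It uses the Apify API v2 dataset items endpoint with format=csv. The dataset ID shown is a placeholder, the helper name is hypothetical, and the token is only needed when the dataset is not publicly readable.

```python
from urllib.parse import urlencode


def dataset_csv_url(dataset_id: str, token: str = None, fields=None) -> str:
    """Build a dataset-items URL that returns CSV."""
    params = {"format": "csv"}
    if fields:
        # Optional: restrict the export to selected columns
        params["fields"] = ",".join(fields)
    if token:
        params["token"] = token
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{urlencode(params)}"


# "abc123" is a placeholder dataset ID, not a real one.
print(dataset_csv_url("abc123"))
print(dataset_csv_url("abc123", fields=["job_title", "company_name", "hiring_signal_score"]))
```

The resulting URL can be opened in a browser, fetched with any HTTP client, or pasted into a spreadsheet import dialog.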
🛠️ Technical notes
- Stack: Node.js 22 · Apify SDK 3 · Crawlee HttpCrawler · Cheerio (RSS/HTML parsing). No browser.
- Sources: Remotive …/api/remote-jobs (JSON), Remote OK …/api (JSON), We Work Remotely category .rss feeds (XML).
- Concurrency: min=1, max=5 (conservative; tune after real runs).
- Memory: 1 GB min · 2 GB default · 4 GB max.
- Proxy: Apify Proxy (Datacenter) by default; no-proxy and custom proxy URLs accepted; Apify Residential rejected at startup.