Himalayas Job Scraper
Meet the Himalayas Job Scraper, a lightweight Actor designed to extract remote job listings from Himalayas.app efficiently. It is fast, reliable, and easy to use. To ensure uninterrupted performance and avoid IP bans, using residential proxies is highly recommended.

Pricing: Pay per usage

Rating: 0.0 (0 reviews)

Developer: Shahid Irfan (Maintained by Community)

Actor stats: 0 bookmarked · 14 total users · 4 monthly active users · last modified 3 days ago

Himalayas Job Scraper

Extract remote job listings from Himalayas into clean, analysis-ready datasets. Collect job titles, companies, locations, compensation details, work type, requirements, and application deadlines at scale. This scraper is built for recruiting teams, talent intelligence workflows, and job market research.

Features

  • Flexible search inputs — Run by keyword and location, or a single startUrl.
  • Rich job coverage — Collect both listing-level and detailed job metadata in one dataset.
  • Automatic pagination — Traverse result pages up to your configured limits.
  • Null-clean output — Empty fields are removed automatically for cleaner downstream processing.
  • Duplicate protection — Canonical URL deduplication avoids repeated job records.
  • Integration-ready data — Export clean records for dashboards, CRMs, ATS tools, and automations.
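The null-clean output and duplicate-protection behaviors above can be sketched as two small post-processing steps. This is an illustrative Python sketch of the idea, not the actor's actual implementation:

```python
def clean_record(record: dict) -> dict:
    """Drop empty fields (None, "", [], {}) so downstream consumers
    only see keys that carry a value, mirroring the null-clean output."""
    return {k: v for k, v in record.items() if v not in (None, "", [], {})}

def dedupe_by_url(records: list) -> list:
    """Keep only the first record seen for each canonical job URL."""
    seen, unique = set(), []
    for record in records:
        url = record.get("url")
        if url in seen:
            continue
        seen.add(url)
        unique.append(record)
    return unique

# Two listings for the same job, one with empty salary fields:
jobs = [
    {"title": "Data Engineer", "salary": None,
     "url": "https://himalayas.app/companies/a/jobs/x"},
    {"title": "Data Engineer", "salary": "",
     "url": "https://himalayas.app/companies/a/jobs/x"},
]
deduped = [clean_record(j) for j in dedupe_by_url(jobs)]
```

After these steps, `deduped` holds a single record with the empty `salary` key removed.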

Use Cases

Talent Intelligence

Track hiring patterns across companies, markets, and job families. Build searchable datasets to identify demand trends by role, region, and employment type.

Recruitment Sourcing

Collect high-quality openings for outreach pipelines. Filter by relevant keywords and locations to prioritize matching opportunities faster.

Salary and Market Analysis

Monitor salary visibility, work arrangements, and role requirements across listings. Use historical runs to evaluate market movement over time.

Job Board Monitoring

Run scheduled collections to detect newly posted opportunities and deadlines. Power alerting workflows for teams that need timely updates.


Input Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| startUrl | String | No | "https://himalayas.app/jobs" | Start from one Himalayas jobs/search URL. |
| keyword | String | No | "software engineer" | Search keyword when URL inputs are not provided. |
| location | String | No | "United States" | Optional location filter for keyword-based runs. |
| collectDetails | Boolean | No | true | Include extended job information when available. |
| results_wanted | Integer | No | 20 | Maximum number of jobs to save. |
| max_pages | Integer | No | 5 | Maximum pages to scan per seed URL. |
| proxyConfiguration | Object | No | Apify Proxy preset | Proxy settings for run reliability. |
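Since every parameter is optional, the defaults in the table can be expressed as a small merge helper. The `build_input` function below is a hypothetical convenience for readers, not part of the actor itself:

```python
# Documented defaults from the input parameters table.
DEFAULTS = {
    "startUrl": "https://himalayas.app/jobs",
    "keyword": "software engineer",
    "location": "United States",
    "collectDetails": True,
    "results_wanted": 20,
    "max_pages": 5,
}

def build_input(**overrides) -> dict:
    """Merge user-supplied overrides onto the documented defaults."""
    return {**DEFAULTS, **overrides}

run_input = build_input(keyword="python engineer", results_wanted=100)
```

Here `run_input` keeps `max_pages` at its default of 5 while overriding the keyword and result limit.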

Output Data

Each dataset item may contain the following fields (fields with no value are omitted):

| Field | Type | Description |
|---|---|---|
| job_id | String | Job identifier when available. |
| title | String | Job title. |
| company | String | Hiring company name. |
| company_slug | String | Company slug from listing URL. |
| job_slug | String | Job slug from listing URL. |
| company_logo | String | Company logo URL. |
| company_website | String | Company profile or website URL. |
| location | String | Consolidated location text. |
| applicant_location_requirements | Array | Explicit applicant location requirements. |
| date_posted | String | Job posting date or relative post time. |
| date_modified | String | Last updated date when available. |
| apply_before | String | Application deadline when provided. |
| description_html | String | Rich job description HTML. |
| description_text | String | Plain-text description. |
| salary | String | Human-readable salary summary. |
| salary_min | Number | Minimum salary value. |
| salary_max | Number | Maximum salary value. |
| salary_currency | String | Salary currency code. |
| salary_unit | String | Salary period unit. |
| job_type | String | Primary employment type. |
| employment_types | Array | Full list of employment types. |
| industry | String | Industry label. |
| occupational_category | String | Occupational category. |
| work_hours | String | Work schedule or hour details. |
| direct_apply | Boolean | Indicates direct application support. |
| remote | Boolean | Whether the role is marked remote. |
| job_location_type | String | Location type label. |
| responsibilities_text | String | Parsed responsibility text. |
| qualifications_text | String | Parsed qualification text. |
| skills_text | String | Parsed skills text. |
| experience_requirements_text | String | Parsed experience requirements. |
| education_requirements_text | String | Parsed education requirements. |
| incentives_text | String | Parsed incentives or compensation extras. |
| raw_job_posting | Object | Full structured job object for advanced use. |
| url | String | Canonical job URL. |
| source | String | Data source label. |
| scraped_at | String | ISO timestamp of extraction. |
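The structured salary fields correspond to the human-readable `salary` summary. A consumer could recover them from the summary alone; this sketch assumes the `"120000-160000 USD / YEAR"` format seen in the sample output, and real listings may vary:

```python
import re

def parse_salary(summary: str) -> dict:
    """Parse a summary like '120000-160000 USD / YEAR' into the
    structured fields salary_min, salary_max, salary_currency, salary_unit.
    Returns an empty dict when the summary does not match the assumed format."""
    match = re.fullmatch(r"(\d+)-(\d+)\s+([A-Z]{3})\s*/\s*(\w+)", summary.strip())
    if not match:
        return {}
    low, high, currency, unit = match.groups()
    return {
        "salary_min": int(low),
        "salary_max": int(high),
        "salary_currency": currency,
        "salary_unit": unit,
    }

parsed = parse_salary("120000-160000 USD / YEAR")
```

In practice you rarely need this, since the actor already emits the structured fields when the posting provides them.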

Usage Examples

Basic Keyword Search

Collect software engineering jobs with default detail collection:

```json
{
  "keyword": "software engineer",
  "location": "United States",
  "results_wanted": 20
}
```

URL-Based Collection

Start from a specific jobs URL:

```json
{
  "startUrl": "https://himalayas.app/jobs?q=data+engineer",
  "results_wanted": 50,
  "max_pages": 5,
  "collectDetails": true
}
```

Targeted Keyword Run

Use a focused keyword and location combination:

```json
{
  "keyword": "python engineer",
  "location": "Canada",
  "results_wanted": 100,
  "max_pages": 8
}
```

Lightweight Fast Run

Skip extended details for faster scans:

```json
{
  "keyword": "customer success",
  "location": "Remote",
  "collectDetails": false,
  "results_wanted": 30,
  "max_pages": 3
}
```

Sample Output

```json
{
  "job_id": "7654321",
  "title": "Senior Data Engineer",
  "company": "Example Labs",
  "company_slug": "example-labs",
  "job_slug": "senior-data-engineer",
  "company_logo": "https://cdn-images.himalayas.app/example-logo",
  "company_website": "https://example.com",
  "location": "United States, Canada",
  "applicant_location_requirements": ["United States", "Canada"],
  "date_posted": "2026-04-01T10:22:00.000Z",
  "apply_before": "2026-05-01T10:22:00.000Z",
  "description_text": "We are looking for a senior data engineer to build scalable data systems...",
  "salary": "120000-160000 USD / YEAR",
  "salary_min": 120000,
  "salary_max": 160000,
  "salary_currency": "USD",
  "salary_unit": "YEAR",
  "job_type": "FULL_TIME",
  "employment_types": ["FULL_TIME"],
  "industry": "Software",
  "remote": true,
  "skills_text": "Python, SQL, ETL, Cloud",
  "url": "https://himalayas.app/companies/example-labs/jobs/senior-data-engineer",
  "source": "himalayas.app",
  "scraped_at": "2026-04-08T08:35:12.220Z"
}
```
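A dataset export made of records like the one above is easy to filter downstream. This illustrative sketch keeps only remote roles whose disclosed minimum salary meets a floor; the field names match the output table, but the records here are invented examples:

```python
# Invented sample records shaped like the actor's output.
jobs = [
    {"title": "Senior Data Engineer", "salary_min": 120000, "remote": True},
    {"title": "Support Specialist", "salary_min": 55000, "remote": True},
    {"title": "Contract Analyst", "remote": False},  # no salary disclosed
]

def filter_jobs(records, min_salary=100000, remote_only=True):
    """Keep remote roles whose disclosed minimum salary meets the floor.
    Records without salary_min never match, since empty fields are
    omitted from the output rather than set to zero or null."""
    return [
        r for r in records
        if r.get("salary_min", 0) >= min_salary
        and (not remote_only or r.get("remote", False))
    ]

matches = filter_jobs(jobs)
```

Only the first record survives the filter; the second is under the floor and the third has no `salary_min` at all.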

Tips for Best Results

Start Small, Then Scale

  • Use results_wanted around 20-50 for quick validation.
  • Increase limits only after confirming your filters produce relevant jobs.

Use Specific Search Intent

  • Prefer focused queries like "backend engineer" over broad terms like "engineer".
  • Combine keyword and location when you need tighter targeting.

Balance Depth and Speed

  • Keep collectDetails set to true when you need enriched fields.
  • Set collectDetails to false for faster monitoring runs.

Keep Data Clean

  • Canonical URL deduplication is always applied internally to avoid repeated records.
  • Run periodic exports to keep downstream datasets current.

Improve Stability on Large Runs

  • Use proxy configuration when collecting many pages.
  • Increase max_pages gradually to avoid unnecessary retries.
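Proxy settings go in the `proxyConfiguration` input object. The shape below follows the standard Apify Proxy configuration; the `RESIDENTIAL` group is an Apify Proxy option that is assumed to be enabled on your plan:

```python
# Example run input with a proxy configuration for a larger collection.
run_input = {
    "keyword": "software engineer",
    "max_pages": 10,
    "proxyConfiguration": {
        "useApifyProxy": True,
        # Residential IPs reduce the chance of blocking on long runs,
        # but require residential-proxy access on your Apify plan.
        "apifyProxyGroups": ["RESIDENTIAL"],
    },
}
```

Omitting `apifyProxyGroups` falls back to the default Apify Proxy preset noted in the input parameters table.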

Integrations

Connect your collected data with:

  • Google Sheets — Build collaborative tracking sheets.
  • Airtable — Create searchable recruiting databases.
  • Slack — Send job alerts to hiring channels.
  • Zapier — Trigger no-code automations.
  • Make — Orchestrate multi-step workflows.
  • Webhooks — Deliver records to your own systems.

Export Formats

  • JSON — Structured integration and application use.
  • CSV — Spreadsheet analysis and reporting.
  • Excel — Business-friendly data sharing.
  • XML — Legacy or enterprise integration flows.

Frequently Asked Questions

Can I run with only a URL and no keyword?

Yes. You can provide startUrl without a keyword.

Can I collect jobs from multiple searches in one run?

Run the actor multiple times with different startUrl or keyword/location inputs, then combine datasets downstream.

Why are some fields missing from some jobs?

Different postings expose different metadata. The actor saves all available fields and omits empty ones.

How can I make runs faster?

Lower results_wanted and max_pages, or set collectDetails to false.

How can I avoid duplicate records?

Deduplication is automatic using canonical job URLs.

Is this suitable for scheduled monitoring?

Yes. It works well for periodic runs that track new postings and market movement.


Support

For issues or enhancement requests, use the actor support channel in the Apify Console.

This actor is intended for legitimate data collection and analysis workflows. Users are responsible for ensuring compliance with applicable laws, website terms, and internal data-governance policies.