TrovoLavoro Job Scraper
Pricing: $15.00/month + usage
Developer: Fredrick Otieno (Maintained by Community)
Indeed Scrapper & Crawler
A combined web crawler and scraper for extracting job listings from Indeed.com with a Web UI.
Quick Start (Web UI)
The easiest way to use this scraper is with the built-in web interface:

# Install dependencies
pip install -r requirements.txt

# Run the web server
python app.py
Then open your browser to: http://127.0.0.1:5000
The web UI lets you:
- Configure search parameters visually
- Watch the scraping progress in real-time
- View results in a sortable table
- Export to CSV or JSON with one click
Features
Web UI
- Beautiful, responsive interface
- Real-time progress tracking
- Live status updates
- One-click export to CSV/JSON
- Configure all options without editing code
Crawler Component
- Discovers job URLs across multiple search result pages
- Follows pagination automatically
- Manages URL queue to avoid duplicates
- Configurable search keywords and locations
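The crawl loop above (follow pagination, queue unique job URLs) can be sketched as follows. This is an illustrative outline, not the project's actual code; `fetch` is a stand-in for whatever function returns a page's job URLs and its next-page link:

```python
from collections import deque

def crawl_search_pages(fetch, start_url, max_pages):
    """Walk search result pages, collecting unique job URLs.

    `fetch(url)` is assumed to return (job_urls, next_page_url),
    with next_page_url set to None on the last page.
    """
    seen = set()      # job URLs already queued, to avoid duplicates
    queue = deque()   # unique job URLs in discovery order
    url, pages = start_url, 0
    while url and pages < max_pages:
        job_urls, url = fetch(url)   # follow pagination automatically
        for job in job_urls:
            if job not in seen:      # duplicate management
                seen.add(job)
                queue.append(job)
        pages += 1
    return list(queue)
```

Keeping a `seen` set alongside the queue is what prevents the same posting, which often appears on several result pages, from being scraped twice.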
Scraper Component
- Extracts structured data from job postings:
- Job Title
- Company Name
- Company Domain (auto-generated)
- Job Location
- Description (with keyword filtering)
- Job Post URL
- Date Posted
- Employment Type (full-time, contract, etc.)
- Salary
- Status (active/closed)
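The fields above map naturally onto a small record type. A minimal sketch (the class and field names are illustrative, not the scraper's real internals):

```python
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class JobPosting:
    title: str
    company: str
    company_domain: str   # auto-generated from the company name
    location: str
    description: str
    url: str
    date_posted: str
    employment_type: str  # e.g. "full-time", "contract"
    salary: Optional[str] # often absent from postings
    status: str           # "active" or "closed"

job = JobPosting(
    title="Python Developer",
    company="Acme Corp",
    company_domain="acmecorp.com",
    location="Remote",
    description="Django and FastAPI experience required.",
    url="https://www.indeed.com/viewjob?jk=example",
    date_posted="2024-01-01",
    employment_type="full-time",
    salary=None,
    status="active",
)
```

`asdict(job)` then gives a plain dict ready to be written as a CSV row or JSON object.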
Installation
$ pip install -r requirements.txt
Usage
Option 1: Edit config.py
- Edit config.py with your settings:

SEARCH_KEYWORDS = "python developer"
SEARCH_LOCATION = "Remote"
DESCRIPTION_KEYWORDS = ["django", "fastapi"]
MAX_PAGES = 10
MAX_JOBS = 100
- Run the scraper:

$ python indeed_scrapper.py
Option 2: Use as a module
from indeed_scrapper import IndeedCrawlerScraper

scraper = IndeedCrawlerScraper(
    search_keywords="data scientist",
    search_location="New York, NY",
    description_keywords=["python", "sql", "machine learning"],
    max_pages=5,
    max_jobs=50,
    headless=False,
)
scraper.run(output_format="csv")
Output Files
indeed_jobs.csv - Job data in CSV format
indeed_jobs.json - Job data in JSON format (if enabled)
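For downstream analysis, the CSV output can be read back with the standard library alone. A small sketch, assuming the column headers match the field names listed above:

```python
import csv

def load_jobs(path="indeed_jobs.csv"):
    """Read scraped jobs back into a list of dicts, one per row."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))
```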
Configuration Options
| Setting | Description |
|---|---|
| SEARCH_KEYWORDS | Job title/keywords to search |
| SEARCH_LOCATION | Location (city, state, or "Remote") |
| DESCRIPTION_KEYWORDS | Filter jobs by description keywords |
| MAX_PAGES | Number of search pages to crawl |
| MAX_JOBS | Maximum jobs to scrape |
| HEADLESS | Run browser in background (True/False) |
| OUTPUT_FORMAT | "csv", "json", or "both" |
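Putting every setting from the table together, a complete config.py might look like this (values are examples, not defaults shipped with the project):

```python
# config.py -- one entry per setting in the table above
SEARCH_KEYWORDS = "python developer"          # job title/keywords to search
SEARCH_LOCATION = "Remote"                    # city, state, or "Remote"
DESCRIPTION_KEYWORDS = ["django", "fastapi"]  # keep only jobs mentioning these
MAX_PAGES = 10                                # search pages to crawl
MAX_JOBS = 100                                # stop after this many jobs
HEADLESS = True                               # run the browser in the background
OUTPUT_FORMAT = "both"                        # "csv", "json", or "both"
```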
Notes
- The scraper uses Selenium to handle JavaScript-rendered content
- Delays are built in to avoid being blocked
- Company domains are auto-generated from company names
- Status is marked "active" if the job page is accessible
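The domain auto-generation note above could be implemented as a simple name-cleaning heuristic. This is a hypothetical helper, not the scraper's actual logic, and the guessed domain may not match the company's real website:

```python
import re

def company_to_domain(name, tld=".com"):
    """Guess a domain from a company name: lowercase, drop common
    corporate suffixes, strip punctuation and spaces, append a TLD.
    Purely heuristic -- verify before relying on the result."""
    cleaned = name.lower()
    cleaned = re.sub(r"\b(inc|llc|ltd|corp|co)\b\.?", "", cleaned)
    cleaned = re.sub(r"[^a-z0-9]", "", cleaned)
    return cleaned + tld
```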