Job Listings Aggregator Pro
Pricing
$10.00 / 1,000 results
Job Listings Aggregator – Find Jobs Fast! Search 8+ top job boards (LinkedIn, Indeed, RemoteOK, Dice, more) in one click. Get Python, tech & remote roles with smart deduplication, keyword filters & instant results. Supercharge your job hunt with this powerful, all-in-one Python scraper!
Job Listings Aggregator
A powerful Python application that crawls and aggregates job listings from multiple public job boards, providing a unified interface to search, filter, and manage job opportunities.
Features
Core Features
- Multi-source scraping: Fetch jobs from RemoteOK and We Work Remotely, with an architecture that is easily extensible to other job boards
- Intelligent data processing: Automatic normalization and deduplication of job listings
- Flexible storage: Support for both JSON files and SQLite database storage
- Advanced filtering: Filter by keywords, location, job type, company, and more
- Export capabilities: Export results to CSV or JSON formats
- CLI interface: Easy-to-use command-line interface
Advanced Features
- Automated scheduling: Set up daily automated scraping
- Data normalization: Consistent formatting across different job sources
- Duplicate detection: Smart deduplication based on job similarity
- Comprehensive logging: Detailed logging for monitoring and debugging
- Modular architecture: Easy to extend with new job boards
Installation
- Clone the repository:
git clone <repository-url>
cd job-listings-aggregator
- Install dependencies:
$pip install -r requirements.txt
- Install Playwright browsers (optional, for JS-heavy sites):
$playwright install
Configuration
The application can be configured by modifying config.py:
# Storage settings
STORAGE_TYPE = 'json'  # or 'sqlite'

# Scraping settings
MAX_JOBS_PER_BOARD = 100
REQUEST_DELAY = 1  # seconds between requests

# Scheduler settings
SCHEDULE_TIME = '09:00'  # Daily scraping time
ENABLE_SCHEDULER = True
Usage
Command Line Interface
1. Scrape Jobs
# Scrape all jobs from enabled job boards
python main.py scrape

# Scrape with specific keywords
python main.py scrape -k python -k developer -k remote

# Limit jobs per board
python main.py scrape -m 50
2. Search Jobs
# Search by keyword
python main.py search -k python

# Search for remote jobs
python main.py search --remote

# Search by location
python main.py search -l "San Francisco"

# Complex search with multiple filters
python main.py search -k "data scientist" -l remote --company google

# Export search results
python main.py search -k python --export results.csv
python main.py search --remote --export jobs.json
3. View Statistics
# Show comprehensive statistics
python main.py stats
4. Manage Data
# Clear all saved jobs
python main.py clear
5. Scheduler Management
# Start automated daily scraping
python main.py schedule --start

# Stop the scheduler
python main.py schedule --stop

# Check scheduler status
python main.py schedule --status

# Run scraping immediately
python main.py schedule --run-now
Python API
You can also use the application programmatically:
from main import JobAggregator

# Initialize the aggregator
app = JobAggregator()

# Scrape jobs with keywords
jobs = app.scrape_all_jobs(keywords=['python', 'remote'])

# Process and save jobs
processed_jobs = app.process_jobs(jobs)
app.save_jobs(processed_jobs)

# Search saved jobs
remote_jobs = app.search_jobs(remote_only=True, keyword='developer')

# Print job details
for job in remote_jobs[:5]:
    print(f"{job.title} at {job.company}")
    print(f"Location: {job.location}")
    print(f"Link: {job.application_link}")
    print("---")
Architecture
Project Structure
job-listings-aggregator/
├── main.py                       # CLI interface and main application
├── config.py                     # Configuration settings
├── requirements.txt              # Python dependencies
├── models/
│   ├── __init__.py
│   └── job_listing.py            # Job data model
├── scrapers/
│   ├── __init__.py
│   ├── base_scraper.py           # Abstract scraper base class
│   ├── remoteok_scraper.py
│   └── weworkremotely_scraper.py
├── storage/
│   ├── __init__.py
│   ├── base_storage.py           # Abstract storage interface
│   ├── json_storage.py           # JSON file storage
│   └── sqlite_storage.py         # SQLite database storage
├── utils/
│   ├── __init__.py
│   ├── normalizer.py             # Data normalization utilities
│   └── deduplicator.py           # Duplicate detection
├── filters/
│   ├── __init__.py
│   └── job_filters.py            # Advanced filtering
├── scheduler/
│   ├── __init__.py
│   └── job_scheduler.py          # Automated scheduling
└── data/                         # Generated data directory
    ├── job_listings.json
    └── job_listings.db
Data Model
Each job listing contains:
- title: Job title
- company: Company name
- location: Job location
- job_type: Employment type (Full-time, Remote, etc.)
- date_posted: When the job was posted
- description: Job description snippet
- application_link: URL to apply
- source: Which job board it came from
- id: Unique identifier
- scraped_at: When it was scraped
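As a rough illustration, the model in models/job_listing.py could be expressed as a dataclass like the sketch below. The field names follow the list above; the actual implementation (types, defaults, methods) may differ.

# Illustrative sketch of the job data model; field names match the list above,
# but the real models/job_listing.py may be implemented differently.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class JobListing:
    title: str              # Job title
    company: str             # Company name
    location: str            # Job location
    job_type: str            # Employment type (Full-time, Remote, etc.)
    date_posted: str         # When the job was posted
    description: str         # Job description snippet
    application_link: str    # URL to apply
    source: str              # Which job board it came from
    id: str                  # Unique identifier
    scraped_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )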
Adding New Job Boards
To add a new job board, create a new scraper class:
# scrapers/newboard_scraper.py
from .base_scraper import BaseScraper
from models.job_listing import JobListing

class NewBoardScraper(BaseScraper):
    def __init__(self):
        super().__init__('newboard')

    def scrape_jobs(self, keywords=None, max_jobs=None):
        # Implement scraping logic
        jobs = []
        # ... scraping code ...
        return jobs

    def parse_job_element(self, job_element):
        # Implement parsing logic for individual job elements
        # Return a JobListing object
        pass
Then update config.py and the main application to include the new scraper.
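How the new scraper is wired in depends on the actual contents of config.py and main.py; a purely hypothetical registration might look like the following (JOB_BOARDS and the assembly code are illustrative assumptions, not the project's real names).

# Hypothetical wiring only -- JOB_BOARDS and the assembly logic below are
# illustrative, not the actual config.py / main.py contents.

# config.py
JOB_BOARDS = {
    'remoteok': True,
    'weworkremotely': True,
    'newboard': True,   # enable the new board
}

# main.py (wherever the scrapers are assembled)
from config import JOB_BOARDS
from scrapers.newboard_scraper import NewBoardScraper

scrapers = []
if JOB_BOARDS.get('newboard'):
    scrapers.append(NewBoardScraper())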
Filtering Options
The application supports various filtering criteria:
- keyword: Match in title, description, or job type
- keywords: Multiple keywords (match any or all)
- location: Match job location
- remote_only: Filter for remote jobs only
- source: Filter by job board source
- company: Filter by company name
- job_type: Filter by employment type
- date_range: Jobs posted within N days
- exclude_keywords: Exclude jobs with certain keywords
- min_description_length: Minimum description length
- custom_filter: Apply custom filter function
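These filters can also be combined through the Python API. In the sketch below, keyword and remote_only are taken from the API example above, while the remaining keyword arguments simply mirror the filter names in this list; the exact signature of search_jobs may differ.

from main import JobAggregator

app = JobAggregator()

# Combine several filters in one search; the extra keyword arguments follow
# the filter names listed above and are assumptions about the signature.
jobs = app.search_jobs(
    keyword='python',
    remote_only=True,
    exclude_keywords=['senior', 'lead'],
    date_range=7,  # posted within the last 7 days
    custom_filter=lambda job: 'django' in job.description.lower(),
)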
Storage Options
JSON Storage (Default)
- Human-readable format
- Easy to inspect and edit
- Good for smaller datasets
- Portable across systems
SQLite Storage
- Better performance for large datasets
- SQL query capabilities
- ACID compliance
- Indexing for faster searches
Switch between storage types by setting STORAGE_TYPE in config.py.
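For example, to use SQLite instead of the default JSON storage:

# config.py
STORAGE_TYPE = 'sqlite'  # default is 'json'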
Logging
The application provides comprehensive logging:
- Console output for user feedback
- File logging to job_aggregator.log
- Different log levels for components
- Structured logging for monitoring
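A minimal sketch of how console plus file logging can be set up with the standard library; the project's actual logging configuration may differ.

import logging

# Console handler for user feedback plus a file handler writing to
# job_aggregator.log; this is an illustrative setup, not the project's exact one.
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s %(name)s %(levelname)s: %(message)s',
    handlers=[
        logging.StreamHandler(),
        logging.FileHandler('job_aggregator.log'),
    ],
)

logger = logging.getLogger('job_aggregator')
logger.info('Scrape started')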
Best Practices
Respectful Scraping
- Built-in delays between requests
- Retry logic with exponential backoff
- User-Agent headers
- Respect robots.txt (implement if needed)
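A sketch of what the built-in delay and retry behaviour can look like, using requests and the REQUEST_DELAY setting from config.py; the helper name and exact retry policy are assumptions.

import time
import requests

from config import REQUEST_DELAY  # seconds between requests (see config.py)

HEADERS = {'User-Agent': 'job-listings-aggregator/1.0'}

def polite_get(url, max_retries=3):
    """Hypothetical helper: fetch a URL with a fixed delay and exponential backoff."""
    for attempt in range(max_retries):
        try:
            time.sleep(REQUEST_DELAY)  # be polite between requests
            response = requests.get(url, headers=HEADERS, timeout=30)
            response.raise_for_status()
            return response
        except requests.RequestException:
            if attempt == max_retries - 1:
                raise
            time.sleep(2 ** attempt)  # exponential backoff: 1s, 2s, 4s, ...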
Data Quality
- Automatic data normalization
- Duplicate detection and removal
- Input validation and cleaning
- Error handling and logging
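As an illustration, duplicate detection can be as simple as keying on the normalized title and company; the actual similarity logic in utils/deduplicator.py may be more sophisticated.

# Illustrative deduplication sketch; the real utils/deduplicator.py may use
# a more elaborate similarity measure.
def deduplicate(jobs):
    seen = set()
    unique = []
    for job in jobs:
        key = (job.title.strip().lower(), job.company.strip().lower())
        if key not in seen:
            seen.add(key)
            unique.append(job)
    return unique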
Performance
- Configurable limits on jobs per board
- Efficient storage options
- Background scheduling
- Memory-conscious processing
Troubleshooting
Common Issues
- No jobs found: Check if job boards have changed their HTML structure
- Connection errors: Verify internet connection and site availability
- Import errors: Ensure all dependencies are installed
- Permission errors: Check file/directory permissions for data storage
Debug Mode
Run with verbose logging:
$python main.py -v scrape
Logs
Check the log file for detailed error information:
$tail -f job_aggregator.log
Contributing
- Fork the repository
- Create a feature branch
- Add new scrapers, filters, or storage backends
- Include tests for new functionality
- Submit a pull request
License
This project is open source. Please use responsibly and respect job board terms of service.
Disclaimer
This tool is for educational and personal use. Always check and comply with the terms of service of the job boards you're scraping. Be respectful with request rates and consider using official APIs when available.