OLX Brasil Carros Data Scraper

Pricing

$24.00/month + usage

Developed by

Israel Oriente

Maintained by Community

🚗💨 Scrape OLX Brazil car listings with precision! Advanced filters: state, brand, price, mileage, year, color, FIPE. 🛡️ Anti-bot tech, unlimited pagination. 📊 Extract complete vehicle data + high-res photos. 🇧🇷 180+ brands, all states!

5.0 (1)

Last modified

5 days ago

OLX Brazil Car Scraper 🚗

Built with: Apify Actor, TypeScript, Crawlee

Professional web scraper for extracting vehicle listings from OLX Brazil (olx.com.br), with advanced filtering and anti-bot evasion.

Extract comprehensive vehicle data from Brazil's largest classified ads platform with precision filtering by state, brand, price range, mileage, year, color, and FIPE table comparison.


🎯 What You Can Do With This Actor

Extract Vehicle Data

  • Complete vehicle information: title, price, FIPE price, brand, model, year, mileage, fuel type, color, multiple high-resolution photos, and detailed descriptions
  • Automatic pagination: Navigates through listing pages automatically and collects up to 300 ads per run (the ads_limit maximum)
  • Smart duplicate detection: Ensures unique results without repetition

Advanced Filtering Options

  • Geographic targeting: Filter by any Brazilian state (all 27 states supported)
  • Brand filtering: Select from 180+ car brands available on OLX
  • Price range: Set minimum and maximum price limits (ps/pe inputs)
  • Mileage range: Filter vehicles by kilometer range (mileage_from/mileage_to inputs, sent to OLX as the ms/me URL parameters)
  • Year range: Target specific production years (year_from/year_to inputs, sent to OLX as the rs/re URL parameters)
  • Color selection: Multi-select from 10 color options (Preto, Branco, Prata, Cinza, Azul, Vermelho, Verde, Amarelo, Laranja, Outra)
  • FIPE comparison: Filter only vehicles priced below FIPE table value
  • Free-text search: Combine with any search query for specific models or features

Anti-Bot Protection

  • Puppeteer Real Browser: Utilizes puppeteer-real-browser to bypass anti-bot detection systems
  • Human-like behavior: Implements realistic scrolling, mouse movements, and navigation patterns
  • Rotating sessions: Automatic session management for long-running scrapes

High Performance

  • Optimized concurrency: 5 parallel browsers with intelligent resource management
  • Aggressive resource blocking: Blocks images, fonts, stylesheets, and trackers for 3x faster execution
  • Memory efficient: Smart pagination and data streaming to Dataset storage
  • Throughput: ~25-28 ads per minute on average

📋 Input Configuration

Required Fields

  • state (select): Brazilian state code (UF) - Choose from AC, AL, AP, AM, BA, CE, DF, ES, GO, MA, MT, MS, MG, PA, PB, PR, PE, PI, RJ, RN, RS, RO, RR, SC, SP, SE, TO

Optional Filters

  • brand (select): Car brand - 180+ options including Toyota, Volkswagen, Chevrolet, Fiat, Ford, Honda, Hyundai, Nissan, etc.
  • ps (integer): Minimum price in BRL (e.g., 30000)
  • pe (integer): Maximum price in BRL (e.g., 150000)
  • mileage_from (integer): Minimum mileage in kilometers (e.g., 0)
  • mileage_to (integer): Maximum mileage in kilometers (e.g., 50000)
  • year_from (integer): Minimum production year (e.g., 2018)
  • year_to (integer): Maximum production year (e.g., 2024)
  • colors (multiselect): Vehicle colors - Select one or multiple: Preto, Branco, Prata, Cinza, Azul, Vermelho, Verde, Amarelo, Laranja, Outra
  • fpdll (boolean): Filter vehicles priced below FIPE table value
  • search (string): Free-text search query (e.g., "polo tsi", "corolla xei")
  • ads_limit (integer): Maximum number of ads to scrape (1-300, default: 10)
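
The field list above can be sketched as a TypeScript type with a small validator. This is only an illustration: the names ScraperInput and validateInput are hypothetical, and the Actor's real input schema may apply different checks. The state code is compared case-insensitively here because the examples in this README use lowercase values such as "sp".

```typescript
// Hypothetical input shape mirroring the fields documented above.
interface ScraperInput {
  state: string;          // required UF code, e.g. "SP"
  brand?: string;
  ps?: number;            // minimum price in BRL
  pe?: number;            // maximum price in BRL
  mileage_from?: number;  // kilometers
  mileage_to?: number;    // kilometers
  year_from?: number;
  year_to?: number;
  colors?: string[];
  fpdll?: boolean;        // only ads priced below FIPE
  search?: string;
  ads_limit?: number;     // 1-300, default 10
}

const UF_CODES = ["AC", "AL", "AP", "AM", "BA", "CE", "DF", "ES", "GO", "MA", "MT", "MS", "MG", "PA",
  "PB", "PR", "PE", "PI", "RJ", "RN", "RS", "RO", "RR", "SC", "SP", "SE", "TO"];
const COLORS = ["Preto", "Branco", "Prata", "Cinza", "Azul", "Vermelho", "Verde", "Amarelo", "Laranja", "Outra"];

// Records an error when both bounds are set and the minimum exceeds the maximum.
function checkRange(name: string, lo: number | undefined, hi: number | undefined, errors: string[]): void {
  if (lo !== undefined && hi !== undefined && lo > hi) {
    errors.push(name + ": minimum " + lo + " exceeds maximum " + hi);
  }
}

// Returns a list of human-readable problems; an empty list means the input looks valid.
function validateInput(input: ScraperInput): string[] {
  const errors: string[] = [];
  if (UF_CODES.indexOf(input.state.toUpperCase()) === -1) {
    errors.push("state: unknown UF code " + input.state);
  }
  checkRange("price", input.ps, input.pe, errors);
  checkRange("mileage", input.mileage_from, input.mileage_to, errors);
  checkRange("year", input.year_from, input.year_to, errors);
  for (const color of input.colors || []) {
    if (COLORS.indexOf(color) === -1) errors.push("colors: unsupported color " + color);
  }
  const limit = input.ads_limit === undefined ? 10 : input.ads_limit;
  if (Math.floor(limit) !== limit || limit < 1 || limit > 300) {
    errors.push("ads_limit: must be an integer between 1 and 300");
  }
  return errors;
}
```

Running a check like this before starting a run catches inverted ranges (for example ps greater than pe), which would otherwise show up only as an empty result set.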

🚀 Quick Start

Example 1: Basic Search - São Paulo State

{
  "state": "sp",
  "ads_limit": 50
}

Example 2: Filtered Search - Used Toyota in Rio de Janeiro

{
  "state": "rj",
  "brand": "Toyota",
  "year_from": 2015,
  "year_to": 2023,
  "mileage_to": 80000,
  "ads_limit": 100
}

Example 3: Advanced Search - Affordable Black/Silver Cars Below FIPE

{
  "state": "sp",
  "ps": 30000,
  "pe": 80000,
  "colors": ["Preto", "Prata"],
  "fpdll": true,
  "mileage_to": 100000,
  "ads_limit": 200
}

Example 4: Free-Text Search - Honda Civic Touring in Minas Gerais

{
  "state": "mg",
  "search": "civic touring",
  "brand": "Honda",
  "year_from": 2019,
  "ads_limit": 50
}

📊 Output Data Format

Each scraped vehicle returns a structured JSON object:

{
  "url": "https://pb.olx.com.br/paraiba/autos-e-pecas/carros-vans-e-utilitarios/honda-civic-touring-2020-1234567890",
  "title": "Honda Civic Touring 2.0 16V Flex Aut. 2020",
  "price": 115000,
  "fipe_price": 120500,
  "brand": "Honda",
  "model": "Civic Touring",
  "year": "2020",
  "mileage": "45000",
  "fuel": "Flex",
  "color": "Preto",
  "photos": [
    "https://img.olx.com.br/images/12/123456789012345678901234567890.jpg",
    "https://img.olx.com.br/images/12/123456789012345678901234567891.jpg"
  ],
  "description": "Honda Civic Touring 2020, única dona, todas as revisões em concessionária, IPVA 2024 pago, aceito troca..."
}

Data Fields Explanation

  • url: Direct link to the vehicle listing
  • title: Vehicle title/headline from the ad
  • price: Advertised price in BRL (number format)
  • fipe_price: FIPE table reference price in BRL (when available)
  • brand: Vehicle manufacturer/brand
  • model: Vehicle model name
  • year: Production year (or model year if specified)
  • mileage: Odometer reading in kilometers
  • fuel: Fuel type (Flex, Gasolina, Diesel, Elétrico, Híbrido, etc.)
  • color: Vehicle color
  • photos: Array of high-resolution image URLs (only img.olx.com.br canonical URLs)
  • description: Full advertisement description text
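
For downstream processing, the record described above can be given a TypeScript type. The sketch below also computes how far an ad is priced below its FIPE reference. The names VehicleAd and fipeDiscountPercent are hypothetical, and treating fipe_price as optional or null is an assumption based on the "when available" note. Note that year and mileage arrive as strings in the sample output.

```typescript
// Hypothetical type mirroring the sample output record.
interface VehicleAd {
  url: string;
  title: string;
  price: number;               // BRL
  fipe_price?: number | null;  // BRL, assumed absent or null when OLX shows no FIPE value
  brand: string;
  model: string;
  year: string;                // string in the sample output
  mileage: string;             // string in the sample output, kilometers
  fuel: string;
  color: string;
  photos: string[];
  description: string;
}

// Percentage by which the asking price sits below the FIPE reference.
// Positive means cheaper than FIPE; null means no usable FIPE value.
function fipeDiscountPercent(ad: Pick<VehicleAd, "price" | "fipe_price">): number | null {
  if (ad.fipe_price === undefined || ad.fipe_price === null || ad.fipe_price <= 0) {
    return null;
  }
  return ((ad.fipe_price - ad.price) / ad.fipe_price) * 100;
}

// Values taken from the sample record: 115000 asking price, 120500 FIPE.
const sampleDiscount = fipeDiscountPercent({ price: 115000, fipe_price: 120500 });
```

For the sample record the asking price is 5,500 BRL under FIPE, which works out to roughly 4.6%.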

🔧 How It Works

1. URL Construction

The Actor builds optimized OLX search URLs based on your input filters:

https://www.olx.com.br/autos-e-pecas/carros-vans-e-utilitarios/[brand]/estado-[state]?sf=1&ps=[min_price]&pe=[max_price]&ms=[min_km]&me=[max_km]&rs=[min_year]&re=[max_year]&cac=[color_code]&fpdll=2&q=[search]
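
The URL template above could be assembled along the following lines. This is a sketch under several assumptions, not the Actor's actual code: the function and field names are hypothetical, the brand is assumed to be passed as a ready-made URL slug, and the colorCode value is passed through verbatim because the mapping from color names to cac codes is not documented here.

```typescript
// Hypothetical filter object; each field maps to one part of the URL template.
interface SearchFilters {
  state: string;        // UF code, lowercased into "estado-xx"
  brandSlug?: string;   // assumption: brand already converted to its URL slug
  minPrice?: number;    // ps
  maxPrice?: number;    // pe
  minKm?: number;       // ms
  maxKm?: number;       // me
  minYear?: number;     // rs
  maxYear?: number;     // re
  colorCode?: string;   // cac, value format not documented in this README
  belowFipe?: boolean;  // adds fpdll=2 as in the template
  query?: string;       // q
}

const BASE_URL = "https://www.olx.com.br/autos-e-pecas/carros-vans-e-utilitarios";

function buildSearchUrl(f: SearchFilters): string {
  const segments: string[] = [BASE_URL];
  if (f.brandSlug) segments.push(encodeURIComponent(f.brandSlug));
  segments.push("estado-" + f.state.toLowerCase());

  // Parameters are appended in the same order as the template; unset filters are skipped.
  const params: Array<[string, string]> = [["sf", "1"]];
  const add = (key: string, value: number | string | undefined): void => {
    if (value !== undefined && value !== "") params.push([key, String(value)]);
  };
  add("ps", f.minPrice);
  add("pe", f.maxPrice);
  add("ms", f.minKm);
  add("me", f.maxKm);
  add("rs", f.minYear);
  add("re", f.maxYear);
  add("cac", f.colorCode);
  if (f.belowFipe) params.push(["fpdll", "2"]);
  add("q", f.query);

  const query = params.map(([key, value]) => key + "=" + encodeURIComponent(value)).join("&");
  return segments.join("/") + "?" + query;
}
```

With this sketch, buildSearchUrl({ state: "rj", brandSlug: "toyota", minYear: 2015, maxYear: 2023, maxKm: 80000 }) yields a URL ending in /toyota/estado-rj?sf=1&me=80000&rs=2015&re=2023.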

2. Smart Pagination

  • Automatically detects and clicks through pagination elements (.listing-pagination, #listing-pagination)
  • Supports multiple pagination strategies: "next" buttons, page numbers, URL parameter manipulation
  • Stops when reaching the ads limit or when no new ads are found (intelligent end-detection)
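
The stop conditions above (ads limit reached, or a page that contributes no new ads) can be illustrated with a small loop. In this sketch getPage is a hypothetical synchronous stand-in for loading one listing page and returning its ad URLs; the real Actor drives a browser asynchronously, and the maxPages guard is an added assumption.

```typescript
// Collects unique ad URLs page by page until the limit is reached
// or a page yields nothing new (end detection).
function collectAdUrls(
  getPage: (page: number) => string[],  // hypothetical page loader
  adsLimit: number,
  maxPages: number = 100                // safety guard, an assumption
): string[] {
  const seen = new Set<string>();
  const collected: string[] = [];

  for (let page = 1; page <= maxPages && collected.length < adsLimit; page++) {
    let newOnPage = 0;
    for (const url of getPage(page)) {
      if (collected.length >= adsLimit) break;  // limit reached mid-page
      if (seen.has(url)) continue;              // duplicate across pages
      seen.add(url);
      collected.push(url);
      newOnPage++;
    }
    if (newOnPage === 0) break;  // no new ads on this page: stop paginating
  }
  return collected;
}

// Example with in-memory pages: page 3 only repeats an ad that was already seen.
const demoPages: string[][] = [["a", "b"], ["b", "c"], ["c"], ["d"]];
const demoResult = collectAdUrls((page) => demoPages[page - 1] || [], 10);
```

In the example the loop stops at page 3 and never requests page 4, so demoResult contains "a", "b", and "c".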

3. Data Extraction

  • Price parsing: Extracts prices from multiple sources (meta tags, JSON-LD, visible elements, #price-box-container)
  • FIPE detection: Identifies FIPE reference prices when available
  • Photo normalization: Filters and canonicalizes image URLs to ensure quality
  • Description expansion: Automatically clicks "Ver descrição completa" to reveal full text
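
Two of the steps above lend themselves to a short illustration: turning a Brazilian-formatted price string into a number, and keeping only canonical, unique photo URLs. The helpers below are hypothetical sketches. In particular, treating "canonical" as "hosted on img.olx.com.br with any query string removed" is an assumption, not the Actor's documented rule.

```typescript
// Parses prices such as "R$ 115.000" or "R$ 1.250,50".
// In Brazilian formatting "." separates thousands and "," marks decimals.
function parsePriceBRL(text: string): number | null {
  const cleaned = text.replace(/[^\d.,]/g, "");
  if (cleaned === "") return null;  // no digits at all, e.g. "Sob consulta"
  const normalized = cleaned.replace(/\./g, "").replace(",", ".");
  const value = Number(normalized);
  return isNaN(value) ? null : value;
}

// Keeps only img.olx.com.br URLs, strips query strings and fragments,
// and removes duplicates while preserving the original order.
function normalizePhotos(urls: string[]): string[] {
  const seen = new Set<string>();
  const result: string[] = [];
  for (const raw of urls) {
    const url = raw.trim().replace(/[?#].*$/, "");
    if (!/^https:\/\/img\.olx\.com\.br\//.test(url)) continue;
    if (seen.has(url)) continue;
    seen.add(url);
    result.push(url);
  }
  return result;
}
```

Returning null instead of 0 for unparseable prices keeps "price not stated" distinguishable from a genuine zero in later analysis.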

4. Anti-Detection

  • Uses puppeteer-real-browser with Turnstile bypass
  • Implements human-like scrolling patterns (20 rounds, 150ms delays)
  • Blocks unnecessary resources (95% faster page loads)
  • Custom User-Agent rotation

5. Quality Assurance

  • Deduplicates images while preserving order
  • Validates and cleans all extracted data
  • Handles missing fields gracefully
  • Retry logic for failed requests (3 attempts with exponential backoff)
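
The retry policy above (3 attempts with exponential backoff) can be sketched as follows. The base delay of 1000 ms and the doubling factor are assumptions, since this README does not state them, and the sleep function is injected by the caller so the example does not depend on any particular runtime.

```typescript
// Waits between attempts: with 3 attempts there are 2 waits (for example 1000 ms, then 2000 ms).
function backoffDelaysMs(maxAttempts: number, baseMs: number, factor: number = 2): number[] {
  const delays: number[] = [];
  for (let retry = 0; retry < maxAttempts - 1; retry++) {
    delays.push(baseMs * Math.pow(factor, retry));
  }
  return delays;
}

// Runs task up to maxAttempts times, sleeping with exponentially growing delays
// between failures. Rethrows the last error if every attempt fails.
async function retryWithBackoff<T>(
  task: () => Promise<T>,
  sleep: (ms: number) => Promise<void>,  // supplied by the caller
  maxAttempts: number = 3,
  baseMs: number = 1000                  // assumed base delay
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
      // No wait after the final failed attempt.
      if (attempt < maxAttempts - 1) await sleep(baseMs * Math.pow(2, attempt));
    }
  }
  throw lastError;
}
```

Under these assumptions a request that fails twice and succeeds on the third attempt waits 1 second and then 2 seconds before succeeding.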

🎨 Use Cases

1. Market Research & Price Analysis

  • Track vehicle pricing trends across different regions
  • Compare asking prices vs. FIPE table values
  • Analyze price variations by brand, model, and year
  • Identify underpriced vehicles for investment opportunities

2. Inventory Management for Dealerships

  • Monitor competitor listings in real-time
  • Build comprehensive vehicle databases for comparison
  • Track market availability by filters (brand, year, price range)
  • Export data to CRM or inventory management systems

3. Data Analytics & Business Intelligence

  • Build datasets for machine learning models (price prediction, demand forecasting)
  • Generate market reports and dashboards
  • Analyze seasonal trends and regional preferences
  • Color popularity analysis by region

4. Consumer Tools

  • Create price comparison tools for car buyers
  • Build alerts for specific vehicle criteria
  • Generate automated reports for dream car searches
  • Track historical pricing for negotiation insights

5. Integration with Other Services

  • Feed data to Zapier, Make.com, or n8n workflows
  • Export to Google Sheets for analysis
  • Send to Airtable for collaborative databases
  • Integrate with WhatsApp/Telegram bots for instant alerts

⚙️ Advanced Configuration

Resource Optimization

The Actor is pre-configured for maximum performance:

  • Concurrency: 5 parallel browser instances
  • Request rate: 120 requests/minute
  • Navigation timeout: 25 seconds
  • Request timeout: 40 seconds (listing), 75 seconds (detail)
  • Browser pool: 3 pages per browser, 100 page reuse limit

Memory Recommendations

  • Small runs (≤50 ads): 2048 MB
  • Medium runs (51-150 ads): 4096 MB
  • Large runs (151-300 ads): 8192 MB
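
When runs are started programmatically, the memory tiers above can be encoded in a small helper. The function name is hypothetical, and the tiers are read here as upper bounds, so exactly 50 or 150 ads falls into the smaller tier; that boundary handling is an assumption.

```typescript
// Maps an ads_limit value to the memory recommendation from this README.
function recommendedMemoryMb(adsLimit: number): number {
  if (adsLimit <= 50) return 2048;   // small runs
  if (adsLimit <= 150) return 4096;  // medium runs
  return 8192;                       // large runs, up to the 300-ad maximum
}
```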

Local Development

# Install dependencies
npm install
# Build TypeScript
npm run build
# Run locally with custom input
APIFY_INPUT='{"state":"sp","brand":"Toyota","ads_limit":10}' npm run start:dev
# Or use input file
APIFY_INPUT_FILE=storage/key_value_stores/default/INPUT.json npm run start:dev

📈 Performance Metrics

Throughput

  • Average: ~25-28 ads per minute
  • Best case: ~30+ ads per minute (with optimal memory allocation)
  • Time per ad: ~2.5 seconds average (listing + detail extraction)
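
The throughput figures above give a rough way to estimate run duration: divide the number of ads by the ads-per-minute rate. The helper below is a hypothetical sketch; actual run times also depend on memory allocation and on how OLX responds.

```typescript
// Estimated run time in minutes for a given ads_limit and throughput.
function estimateRunMinutes(adsLimit: number, adsPerMinute: number): number {
  if (adsPerMinute <= 0) throw new Error("adsPerMinute must be positive");
  return adsLimit / adsPerMinute;
}

// A full 300-ad run at the average rates quoted above.
const slowestMinutes = estimateRunMinutes(300, 25);  // 12 minutes
const fastestMinutes = estimateRunMinutes(300, 28);  // about 10.7 minutes
```

So a maximum-size run of 300 ads should take somewhere around 11 to 12 minutes at the average throughput.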

Success Rate

  • 95%+ successful data extraction on first attempt
  • 99%+ with retry logic enabled
  • Graceful handling of missing/incomplete data

Resource Usage

  • CPU: Moderate (0-40% average with 5 concurrent browsers)
  • Memory: Scales with ads_limit (see recommendations above)
  • Network: Optimized with aggressive resource blocking (images, fonts, stylesheets blocked)

🔒 Privacy & Compliance

  • No authentication required: Scrapes only publicly available data
  • Respects robots.txt: Implements rate limiting and polite crawling
  • No personal data: Extracts only vehicle listing information
  • GDPR/LGPD friendly: No user tracking or personal data collection

🛠️ Troubleshooting

Issue: Actor times out or runs slowly

Solution: Increase memory allocation to 4096-8192 MB for runs with 100+ ads

Issue: No results returned

Solution: Verify your filter combination - some combinations may have zero listings on OLX

Issue: Pagination stops early

Solution: OLX may limit results for certain searches - try narrowing filters or reducing ads_limit

Issue: Missing photos or descriptions

Solution: Some listings have incomplete data. This is expected behavior: the Actor extracts every field that is available.


🔗 Integration Examples

Google Sheets

  1. Run the Actor
  2. Use Apify's Google Sheets integration
  3. Automatically export results to your spreadsheet
  4. Set up scheduled runs for continuous monitoring

Zapier/Make

  1. Create a Zap/Scenario triggered by Actor completion
  2. Process extracted data (filter, transform)
  3. Send to Slack, email, Airtable, or 1000+ apps

API Access

# Start Actor run via API
curl -X POST https://api.apify.com/v2/acts/YOUR_USERNAME~olx-cars-scraper/runs \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"state":"sp","brand":"Toyota","ads_limit":100}'

# Get the dataset items of the most recent run
curl https://api.apify.com/v2/acts/YOUR_USERNAME~olx-cars-scraper/runs/last/dataset/items \
  -H "Authorization: Bearer YOUR_API_TOKEN"

🏷️ Keywords

OLX Brazil scraper, web scraping, car listings, vehicle data extraction, Brazilian automobiles, FIPE price, used cars Brazil, TypeScript scraper, Puppeteer crawler, Apify Actor, anti-bot detection, classified ads scraper, automotive data, market research, price comparison, dealership tools, carros usados, veículos Brasil, OLX extractor, real estate scraper alternative


📝 Version History

v1.0.0 (Current)

  • ✅ Advanced filtering: price range, colors, FIPE comparison
  • ✅ Multi-state support (all 27 Brazilian states)
  • ✅ 180+ car brands supported
  • ✅ Automatic pagination with smart end-detection (up to 300 ads per run)
  • ✅ Anti-bot detection with puppeteer-real-browser
  • ✅ High-performance extraction (~28 ads/minute)
  • ✅ Complete data extraction (photos, FIPE, descriptions)
  • ✅ Memory-optimized for large datasets

🤝 Contributing

Found a bug or have a feature request? Please open an issue or contact support through the Apify platform.


📄 License

This Actor is provided as-is for data extraction from publicly available sources. Users are responsible for compliance with OLX's terms of service and applicable data protection laws.


Built with ❤️ using Apify and Crawlee

Last updated: October 2025