Kayak Flights Scraper

Pricing

$5.00 / 1,000 results

Flight scraper for Kayak's API that extracts comprehensive flight data, including airlines, segments, pricing, and baggage policies. Supports one-way, round-trip, and multi-leg routes with real-time pricing from multiple providers. Returns structured JSON with airport details and flight schedules.

Developer

axly

Maintained by Community

Kayak Flights Scraper Actor

An Apify actor that scrapes comprehensive flight search results from Kayak's undocumented API. This actor supports one-way, round-trip, and multi-leg flights with detailed flight information including segments, airlines, and pricing.

Features

  • Multi-leg Support: Search for one-way, round-trip, and complex multi-leg itineraries
  • Comprehensive Data: Extracts detailed flight information including airlines, segments, baggage policies, and booking details
  • Flexible Search: Support for all passenger types (ADT, CHD, INF, etc.) and cabin classes
  • Single Request: Makes one efficient API call that returns all available flight results
  • Structured Output: Returns clean, structured flight data with nested leg and segment information
  • Error Handling: Robust error handling with automatic retries and session management
  • Airport Details: Includes airport codes, names, cities, and station information
  • Real-time Pricing: Extracts current pricing and provider information

Input Schema

The actor accepts input in the following JSON format:

{
  "legs": [
    {
      "origin": {
        "locationType": "airports",
        "airports": ["JFK"]
      },
      "destination": {
        "locationType": "airports",
        "airports": ["LAX"]
      },
      "date": "2025-12-01",
      "flex": "0"
    },
    {
      "origin": {
        "locationType": "airports",
        "airports": ["LAX"]
      },
      "destination": {
        "locationType": "airports",
        "airports": ["JFK"]
      },
      "date": "2025-12-08",
      "flex": "0"
    }
  ],
  "passengers": ["ADT", "ADT"],
  "cabin_class": "economy",
  "max_results": 10000
}

Required Parameters

  • legs (array): Array of flight legs defining the itinerary

    • origin: Departure location specification
      • locationType: Location type (currently supports "airports")
      • airports: Array of IATA airport codes (e.g., ["JFK", "LGA"])
    • destination: Arrival location specification (same format as origin)
    • date (string): Departure date in YYYY-MM-DD format
    • flex (string): Date flexibility (± days, usually "0" for exact dates)
  • passengers (array): Array of passenger type codes

    • "ADT": Adult (12+ years)
    • "CHD": Child (2-11 years)
    • "INF": Infant (<2 years)
    • "SNR": Senior (65+ years)
    • "YTH": Youth/Student
    • "STD": Student
  • cabin_class (string): Travel class preference

    • "economy": Standard economy
    • "premium-economy": Premium economy
    • "business": Business class
    • "first": First class

Optional Parameters

  • max_results (integer): Maximum number of flight results to return
    • Range: 1-10000
    • Default: Unlimited (returns all available flights)
    • Omit this field to get all available results
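The parameter rules above can be checked locally before a run is started. The following is a minimal sketch, not the actor's own validation: `validate_input` and `_check_location` are hypothetical helper names, and the three-uppercase-letter test for airport codes is an assumption based on the usual IATA form.

```python
from datetime import datetime

# Allowed values copied from the parameter lists above.
PASSENGER_TYPES = {"ADT", "CHD", "INF", "SNR", "YTH", "STD"}
CABIN_CLASSES = {"economy", "premium-economy", "business", "first"}


def _check_location(loc, path, errors):
    """Check one origin/destination object and record any problem."""
    if not isinstance(loc, dict) or loc.get("locationType") != "airports":
        errors.append(f"{path}.locationType must be 'airports'")
        return
    codes = loc.get("airports") or []
    # Assumption: codes are three uppercase letters, the usual IATA form.
    if not codes or not all(
        isinstance(c, str) and len(c) == 3 and c.isalpha() and c.isupper()
        for c in codes
    ):
        errors.append(f"{path}.airports must list 3-letter IATA codes")


def validate_input(run_input):
    """Return a list of problems found in run_input (empty if it looks valid)."""
    errors = []
    legs = run_input.get("legs") or []
    if not legs:
        errors.append("legs must contain at least one leg")
    for i, leg in enumerate(legs):
        _check_location(leg.get("origin"), f"legs[{i}].origin", errors)
        _check_location(leg.get("destination"), f"legs[{i}].destination", errors)
        try:
            datetime.strptime(leg.get("date"), "%Y-%m-%d")
        except (TypeError, ValueError):
            errors.append(f"legs[{i}].date must be a real date in YYYY-MM-DD format")
        if not str(leg.get("flex", "")).isdigit():
            errors.append(f"legs[{i}].flex must be a string of digits such as '0'")
    passengers = run_input.get("passengers") or []
    unknown = [p for p in passengers if p not in PASSENGER_TYPES]
    if not passengers or unknown:
        errors.append("passengers must be a non-empty list of known type codes")
    if run_input.get("cabin_class") not in CABIN_CLASSES:
        errors.append("cabin_class must be one of: " + ", ".join(sorted(CABIN_CLASSES)))
    max_results = run_input.get("max_results")
    if max_results is not None and not (
        isinstance(max_results, int) and 1 <= max_results <= 10000
    ):
        errors.append("max_results must be an integer from 1 to 10000 when present")
    return errors
```

Returning a list of messages, rather than raising on the first problem, lets a caller report every issue in one pass.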

Output Format

The actor outputs comprehensive flight data with detailed leg and segment information:

{
  "search_id": "abc123def456",
  "result_id": "result_123456",
  "price": 49999,
  "total_price": 49999,
  "currency_code": "USD",
  "duration_minutes": 360,
  "is_info_price": false,
  "checked_bags_count": 1,
  "carry_on_bags_count": 1,
  "is_carry_on_prohibited": false,
  "cheapest_provider_name": "American Airlines",
  "cheapest_provider_booking_id": "AA123456",
  "save_for_later_enabled": true,
  "number_of_providers": 5,
  "legs": [
    {
      "id": "leg_123",
      "duration_minutes": 360,
      "stops_count": 0,
      "has_missing_segments": false,
      "origin": {
        "code": "JFK",
        "name": "John F. Kennedy International Airport",
        "city": "New York",
        "city_code": "NYC"
      },
      "destination": {
        "code": "LAX",
        "name": "Los Angeles International Airport",
        "city": "Los Angeles",
        "city_code": "LAX"
      },
      "departure_time": "2025-12-01T08:30:00",
      "arrival_time": "2025-12-01T11:30:00",
      "airlines": [
        {
          "code": "AA",
          "name": "American Airlines",
          "logo_url": "https://..."
        }
      ],
      "segments": [
        {
          "id": "segment_456",
          "flight_number": "AA123",
          "airline": {
            "code": "AA",
            "name": "American Airlines",
            "logo_url": "https://..."
          },
          "origin": {
            "code": "JFK",
            "name": "John F. Kennedy International Airport",
            "city": "New York"
          },
          "destination": {
            "code": "LAX",
            "name": "Los Angeles International Airport",
            "city": "Los Angeles"
          },
          "departure_time": "2025-12-01T08:30:00",
          "arrival_time": "2025-12-01T11:30:00",
          "duration_minutes": 360,
          "equipment_type": "Boeing 737",
          "transport_type": "CommercialFlight"
        }
      ]
    }
  ],
  "search_timestamp": "2025-11-09T12:00:00",
  "last_updated": "2025-11-09T12:00:00"
}
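To show how the nested leg data might be consumed, here is a small sketch that turns one result into readable lines and picks the lowest-priced result from a list. The helper names (`format_duration`, `summarize_result`, `cheapest`) are hypothetical. The `price` value is compared as-is because its unit is not stated here.

```python
def format_duration(minutes):
    """Render a duration in minutes as, for example, '6h 00m'."""
    hours, mins = divmod(minutes, 60)
    return f"{hours}h {mins:02d}m"


def summarize_result(result):
    """Return one summary line per leg of a single flight result."""
    lines = []
    for leg in result.get("legs", []):
        carriers = ", ".join(a["code"] for a in leg.get("airlines", []))
        stops = leg.get("stops_count", 0)
        stops_text = "nonstop" if stops == 0 else f"{stops} stop(s)"
        lines.append(
            f"{leg['origin']['code']} -> {leg['destination']['code']} | "
            f"{carriers} | {stops_text} | {format_duration(leg['duration_minutes'])}"
        )
    return lines


def cheapest(results):
    """Return the result with the lowest raw price value, or None if empty."""
    return min(results, key=lambda r: r["price"], default=None)
```

Applied to the sample result above, `summarize_result` yields a single line for the JFK to LAX leg.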

Location Types

Currently, the actor supports only airport-based location specifications:

Airports Location

{
"locationType": "airports",
"airports": ["JFK", "LGA", "EWR"]
}

Usage Examples

Round-trip Economy

{
  "legs": [
    {
      "origin": {"locationType": "airports", "airports": ["JFK"]},
      "destination": {"locationType": "airports", "airports": ["LAX"]},
      "date": "2025-12-15",
      "flex": "0"
    },
    {
      "origin": {"locationType": "airports", "airports": ["LAX"]},
      "destination": {"locationType": "airports", "airports": ["JFK"]},
      "date": "2025-12-22",
      "flex": "0"
    }
  ],
  "passengers": ["ADT", "ADT"],
  "cabin_class": "economy"
}

One-way Business Flight with Flexibility

{
  "legs": [
    {
      "origin": {"locationType": "airports", "airports": ["SFO"]},
      "destination": {"locationType": "airports", "airports": ["ORD"]},
      "date": "2025-11-15",
      "flex": "3"
    }
  ],
  "passengers": ["ADT"],
  "cabin_class": "business",
  "max_results": 25
}

Multi-city Premium Economy

{
  "legs": [
    {
      "origin": {"locationType": "airports", "airports": ["JFK"]},
      "destination": {"locationType": "airports", "airports": ["CDG"]},
      "date": "2025-12-01",
      "flex": "2"
    },
    {
      "origin": {"locationType": "airports", "airports": ["CDG"]},
      "destination": {"locationType": "airports", "airports": ["FCO"]},
      "date": "2025-12-10",
      "flex": "1"
    },
    {
      "origin": {"locationType": "airports", "airports": ["FCO"]},
      "destination": {"locationType": "airports", "airports": ["JFK"]},
      "date": "2025-12-20",
      "flex": "2"
    }
  ],
  "passengers": ["ADT", "CHD"],
  "cabin_class": "premium-economy"
}
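Inputs like the examples above repeat the same leg structure, so they can be generated from a list of stops. This is a sketch with hypothetical helper names (`airports`, `build_legs`). It assumes one airport per stop and a single `flex` value for all legs, which is narrower than what the input format allows.

```python
def airports(*codes):
    """Build an airport-based location object from one or more IATA codes."""
    return {"locationType": "airports", "airports": list(codes)}


def build_legs(stops, dates, flex="0"):
    """Build the legs array for an itinerary visiting `stops` in order.

    stops: airport codes in travel order, e.g. ["JFK", "LAX", "JFK"].
    dates: one YYYY-MM-DD departure date per leg (len(stops) - 1 entries).
    """
    if len(stops) < 2 or len(dates) != len(stops) - 1:
        raise ValueError("need at least two stops and exactly one date per leg")
    return [
        {
            "origin": airports(origin),
            "destination": airports(destination),
            "date": day,
            "flex": flex,
        }
        for origin, destination, day in zip(stops, stops[1:], dates)
    ]


# Round trip JFK -> LAX -> JFK for two adults in economy.
round_trip_input = {
    "legs": build_legs(["JFK", "LAX", "JFK"], ["2025-12-15", "2025-12-22"]),
    "passengers": ["ADT", "ADT"],
    "cabin_class": "economy",
}
```

A one-way search is the same call with two stops and a single date.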

Usage

  1. Deploy to Apify: Push this actor to your Apify account
  2. Configure Input: Use the input schema above or the provided test files
  3. Run Actor: Execute with your flight search parameters
  4. Retrieve Results: Download comprehensive flight data from the dataset
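Steps 3 and 4 can also be driven from Python. The sketch below assumes the `apify-client` package and a client object that behaves like its `ApifyClient`: `actor(...).call(run_input=...)` returns the finished run as a dict containing `defaultDatasetId`, and `dataset(...).iterate_items()` yields the stored results. `run_search` is a hypothetical helper, and the actor ID and token are placeholders to fill in.

```python
def run_search(client, actor_id, run_input):
    """Start the actor, wait for it to finish, and return its dataset items."""
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("actor run did not return a result")
    # Assumption: the run is a dict that exposes "defaultDatasetId".
    dataset = client.dataset(run["defaultDatasetId"])
    return list(dataset.iterate_items())


# Example wiring (needs `pip install apify-client` and network access):
#
#   from apify_client import ApifyClient
#   client = ApifyClient("<YOUR_APIFY_TOKEN>")
#   flights = run_search(client, "<username>/<actor-name>", run_input)
#   print(len(flights), "results")
```

Passing the client in as an argument keeps the helper easy to exercise with a stand-in object during testing.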

Dependencies

  • apify>=1.0.0 - Apify SDK
  • httpx>=0.25.0 - Async HTTP client
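  • dataclasses>=0.6 - Data structure support (backport for Python 3.6; part of the standard library since Python 3.7)
  • typing>=3.7.4 - Type hints support (backport; part of the standard library since Python 3.5)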

Development

Project Structure

kayak-flights-scraper/
├── src/
│ ├── __init__.py
│ ├── main.py # Main actor implementation
│ └── kayak.py # Kayak API client
├── extract_flights.py # Flight data parser (reference)
├── input_schema.json # Input validation schema
├── output_schema.json # Output documentation schema
├── requirements.txt # Python dependencies
├── README.md # This documentation
├── gen_icon.py # Icon generation script
├── icon.png # Actor icon
├── input.json # Default test input
├── input_oneway.json # One-way test input
├── input_multicity.json # Multi-city test input
└── README_test_inputs.md # Test input documentation

Key Components

  • KayakFlightParser: Comprehensive parser for Kayak's API response format
  • FlightResult: Structured data model for flight results
  • FlightPollResponse: Complete response data structure
  • Single-request architecture: Efficient API usage with no polling

Error Handling

The actor includes robust error handling:

  • Network Errors: Automatic retries for connection issues
  • Blocking and Rate Limiting: Graceful handling of 403 (Forbidden) and 429 (Too Many Requests) responses
  • Server Errors: Retry logic for 5xx status codes
  • Data Validation: Comprehensive input validation
  • Parsing Errors: Graceful handling of malformed API responses
  • Session Management: Automatic session establishment and renewal
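The retry behavior listed above can be pictured as a loop with exponential backoff. This is a generic sketch, not the actor's code: the function names, the retry count, the delay values, and the choice to treat 403 as retryable are all assumptions made for illustration. `send` stands in for whatever performs the HTTP request.

```python
import time

# Assumption: blocking (403), rate limiting (429), and 5xx errors are retried.
RETRYABLE_STATUS = {403, 429} | set(range(500, 600))


def is_retryable(status_code):
    """Return True if a response with this status should be retried."""
    return status_code in RETRYABLE_STATUS


def backoff_delay(attempt, base=1.0, cap=30.0):
    """Exponential backoff: base, 2*base, 4*base, ... capped at `cap` seconds."""
    return min(cap, base * 2 ** attempt)


def fetch_with_retries(send, max_attempts=4, sleep=time.sleep):
    """Call send() until it returns a non-retryable status.

    send: callable returning (status_code, body); may raise ConnectionError.
    sleep: injectable so the waiting can be skipped in tests.
    """
    for attempt in range(max_attempts):
        try:
            status, body = send()
        except ConnectionError:
            status, body = None, None  # treat network failure as retryable
        if status is not None and not is_retryable(status):
            return status, body
        if attempt < max_attempts - 1:
            sleep(backoff_delay(attempt))
    raise RuntimeError(f"giving up after {max_attempts} attempts")
```

In a real client, a 403 would typically also trigger a fresh session before the next attempt, which this sketch leaves out.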

Logging

Detailed logging throughout execution:

  • API session establishment
  • Request/response logging
  • Data extraction progress
  • Error conditions and recovery
  • Performance metrics and statistics

Performance & Reliability

  • Single API Call: Efficient single-request architecture
  • Async Processing: Non-blocking I/O operations
  • Memory Efficient: Streams large result sets
  • Session Reuse: Maintains API sessions for reliability
  • Comprehensive Parsing: Handles complex nested flight data structures

Data Quality

  • Complete Flight Details: Airlines, segments, baggage, pricing
  • Airport Information: Codes, names, cities, station types
  • Real-time Data: Current pricing and availability
  • Multiple Providers: Coverage across booking providers
  • Structured Format: Consistent, queryable JSON output
