Kayak Flights Scraper

Pricing

$5.00 / 1,000 results

Flight scraper for Kayak's API extracting comprehensive flight data including airlines, segments, pricing, and baggage policies. Supports one-way, round-trip, and multi-leg routes with real-time pricing from multiple providers. Returns structured JSON with airport details and flight schedules.

Rating: 1.0 (1)

Developer: axly (Maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 4
  • Monthly active users: 1
  • Issues response: 7.9 hours
  • Last modified: 25 days ago

Kayak Flights Scraper Actor

An Apify actor that scrapes comprehensive flight search results from Kayak's undocumented API. This actor supports one-way, round-trip, and multi-leg flights with detailed flight information including segments, airlines, and pricing.

Features

  • Multi-leg Support: Search for one-way, round-trip, and complex multi-leg itineraries
  • Comprehensive Data: Extracts detailed flight information including airlines, segments, baggage policies, and booking details
  • Flexible Search: Support for all passenger types (ADT, CHD, INF, etc.) and cabin classes
  • Single Request: Makes one efficient API call that returns all available flight results
  • Structured Output: Returns clean, structured flight data with nested leg and segment information
  • Error Handling: Robust error handling with automatic retries and session management
  • Airport Details: Includes airport codes, names, cities, and station information
  • Real-time Pricing: Extracts current pricing and provider information

Input Schema

The actor accepts input in the following JSON format:

{
  "legs": [
    {
      "origin": {
        "locationType": "airports",
        "airports": ["JFK"]
      },
      "destination": {
        "locationType": "airports",
        "airports": ["LAX"]
      },
      "date": "2025-12-01",
      "flex": "0"
    },
    {
      "origin": {
        "locationType": "airports",
        "airports": ["LAX"]
      },
      "destination": {
        "locationType": "airports",
        "airports": ["JFK"]
      },
      "date": "2025-12-08",
      "flex": "0"
    }
  ],
  "passengers": ["ADT", "ADT"],
  "cabin_class": "economy",
  "max_results": 10000
}

Required Parameters

  • legs (array): Array of flight legs defining the itinerary

    • origin: Departure location specification
      • locationType: Location type (currently supports "airports")
      • airports: Array of IATA airport codes (e.g., ["JFK", "LGA"])
    • destination: Arrival location specification (same format as origin)
    • date (string): Departure date in YYYY-MM-DD format
    • flex (string): Date flexibility (± days, usually "0" for exact dates)
  • passengers (array): Array of passenger type codes

    • "ADT": Adult (12+ years)
    • "CHD": Child (2-11 years)
    • "INF": Infant (<2 years)
    • "SNR": Senior (65+ years)
    • "YTH": Youth/Student
    • "STD": Student
  • cabin_class (string): Travel class preference

    • "economy": Standard economy
    • "premium-economy": Premium economy
    • "business": Business class
    • "first": First class

Optional Parameters

  • max_results (integer): Maximum number of flight results to return
    • Range: 1-10000
    • Default: Unlimited (returns all available flights)
    • Omit this field to get all available results
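
The parameter rules above can be checked locally before starting a run. The following is a minimal sketch, not the actor's own validation code: the function name `validate_input` is hypothetical, and the three-letter uppercase check for airport codes is an assumption about what counts as a valid IATA code.

```python
from datetime import date

# Allowed values as listed in this README.
PASSENGER_TYPES = {"ADT", "CHD", "INF", "SNR", "YTH", "STD"}
CABIN_CLASSES = {"economy", "premium-economy", "business", "first"}


def validate_input(actor_input: dict) -> list:
    """Return a list of problems; an empty list means the input looks valid."""
    errors = []
    legs = actor_input.get("legs")
    if not isinstance(legs, list) or not legs:
        errors.append("legs must be a non-empty array")
        legs = []
    for i, leg in enumerate(legs):
        for end in ("origin", "destination"):
            loc = leg.get(end) or {}
            if loc.get("locationType") != "airports":
                errors.append(f"legs[{i}].{end}.locationType must be 'airports'")
            codes = loc.get("airports") or []
            # Assumption: IATA airport codes are three uppercase letters.
            if not codes or not all(
                isinstance(c, str) and len(c) == 3 and c.isalpha() and c.isupper()
                for c in codes
            ):
                errors.append(f"legs[{i}].{end}.airports must list IATA codes")
        try:
            date.fromisoformat(leg.get("date", ""))
        except (TypeError, ValueError):
            errors.append(f"legs[{i}].date must be in YYYY-MM-DD format")
        if not str(leg.get("flex", "")).isdigit():
            errors.append(f"legs[{i}].flex must be a string of digits, e.g. '0'")
    passengers = actor_input.get("passengers")
    if not isinstance(passengers, list) or not passengers:
        errors.append("passengers must be a non-empty array")
    elif not set(passengers) <= PASSENGER_TYPES:
        errors.append("passengers contains an unknown passenger type code")
    if actor_input.get("cabin_class") not in CABIN_CLASSES:
        errors.append("cabin_class must be one of: " + ", ".join(sorted(CABIN_CLASSES)))
    max_results = actor_input.get("max_results")
    if max_results is not None and not (
        isinstance(max_results, int) and 1 <= max_results <= 10000
    ):
        errors.append("max_results must be an integer between 1 and 10000")
    return errors
```

Returning a list of messages instead of raising on the first problem lets a caller report every issue in one pass.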

Output Format

The actor outputs comprehensive flight data with detailed leg and segment information:

{
  "search_id": "abc123def456",
  "result_id": "result_123456",
  "price": 49999,
  "total_price": 49999,
  "currency_code": "USD",
  "duration_minutes": 360,
  "is_info_price": false,
  "checked_bags_count": 1,
  "carry_on_bags_count": 1,
  "is_carry_on_prohibited": false,
  "cheapest_provider_name": "American Airlines",
  "cheapest_provider_booking_id": "AA123456",
  "save_for_later_enabled": true,
  "number_of_providers": 5,
  "legs": [
    {
      "id": "leg_123",
      "duration_minutes": 360,
      "stops_count": 0,
      "has_missing_segments": false,
      "origin": {
        "code": "JFK",
        "name": "John F. Kennedy International Airport",
        "city": "New York",
        "city_code": "NYC"
      },
      "destination": {
        "code": "LAX",
        "name": "Los Angeles International Airport",
        "city": "Los Angeles",
        "city_code": "LAX"
      },
      "departure_time": "2025-12-01T08:30:00",
      "arrival_time": "2025-12-01T11:30:00",
      "airlines": [
        {
          "code": "AA",
          "name": "American Airlines",
          "logo_url": "https://..."
        }
      ],
      "segments": [
        {
          "id": "segment_456",
          "flight_number": "AA123",
          "airline": {
            "code": "AA",
            "name": "American Airlines",
            "logo_url": "https://..."
          },
          "origin": {
            "code": "JFK",
            "name": "John F. Kennedy International Airport",
            "city": "New York"
          },
          "destination": {
            "code": "LAX",
            "name": "Los Angeles International Airport",
            "city": "Los Angeles"
          },
          "departure_time": "2025-12-01T08:30:00",
          "arrival_time": "2025-12-01T11:30:00",
          "duration_minutes": 360,
          "equipment_type": "Boeing 737",
          "transport_type": "CommercialFlight"
        }
      ]
    }
  ],
  "search_timestamp": "2025-11-09T12:00:00",
  "last_updated": "2025-11-09T12:00:00"
}
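
To show how the nested leg and segment structure might be consumed, here is a small sketch that flattens one result into a per-leg summary. The helper name `summarize_result` is hypothetical. The sketch passes `price` through unchanged because the README does not state its unit.

```python
def summarize_result(result: dict) -> dict:
    """Flatten one flight result into a compact, per-leg summary."""
    legs = []
    for leg in result.get("legs", []):
        hours, minutes = divmod(leg["duration_minutes"], 60)
        legs.append(
            {
                "route": f'{leg["origin"]["code"]}-{leg["destination"]["code"]}',
                "stops": leg.get("stops_count", 0),
                "duration": f"{hours}h {minutes:02d}m",
                # One flight number per segment, in travel order.
                "flights": [s["flight_number"] for s in leg.get("segments", [])],
            }
        )
    return {
        "provider": result.get("cheapest_provider_name"),
        # Raw value as returned; the price unit is not documented here.
        "price": result.get("price"),
        "currency": result.get("currency_code"),
        "legs": legs,
    }
```

Applied to the sample record above, the single leg would be summarized with route `JFK-LAX`, zero stops, and flight `AA123`.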

Location Types

Currently, the actor supports airport-based location specifications:

Airports Location

{
  "locationType": "airports",
  "airports": ["JFK", "LGA", "EWR"]
}

Usage Examples

Round-trip Economy Flight

{
  "legs": [
    {
      "origin": {"locationType": "airports", "airports": ["JFK"]},
      "destination": {"locationType": "airports", "airports": ["LAX"]},
      "date": "2025-12-15",
      "flex": "0"
    },
    {
      "origin": {"locationType": "airports", "airports": ["LAX"]},
      "destination": {"locationType": "airports", "airports": ["JFK"]},
      "date": "2025-12-22",
      "flex": "0"
    }
  ],
  "passengers": ["ADT", "ADT"],
  "cabin_class": "economy"
}

One-way Business Flight with Flexibility

{
  "legs": [
    {
      "origin": {"locationType": "airports", "airports": ["SFO"]},
      "destination": {"locationType": "airports", "airports": ["ORD"]},
      "date": "2025-11-15",
      "flex": "3"
    }
  ],
  "passengers": ["ADT"],
  "cabin_class": "business",
  "max_results": 25
}

Multi-city Premium Economy

{
  "legs": [
    {
      "origin": {"locationType": "airports", "airports": ["JFK"]},
      "destination": {"locationType": "airports", "airports": ["CDG"]},
      "date": "2025-12-01",
      "flex": "2"
    },
    {
      "origin": {"locationType": "airports", "airports": ["CDG"]},
      "destination": {"locationType": "airports", "airports": ["FCO"]},
      "date": "2025-12-10",
      "flex": "1"
    },
    {
      "origin": {"locationType": "airports", "airports": ["FCO"]},
      "destination": {"locationType": "airports", "airports": ["JFK"]},
      "date": "2025-12-20",
      "flex": "2"
    }
  ],
  "passengers": ["ADT", "CHD"],
  "cabin_class": "premium-economy"
}
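
Since every leg in the examples above repeats the same structure, the `legs` array can be generated from a list of stops. This is a convenience sketch only: `airports` and `build_legs` are hypothetical helpers, not part of the actor, and this version applies a single `flex` value to all legs.

```python
def airports(*codes: str) -> dict:
    """Build an airport-based location specification."""
    return {"locationType": "airports", "airports": list(codes)}


def build_legs(stops: list, dates: list, flex: int = 0) -> list:
    """Build the `legs` array from airports in visiting order.

    `stops` has one more entry than `dates`: each consecutive pair of
    stops forms one leg, departing on the matching date.
    """
    if len(stops) < 2 or len(dates) != len(stops) - 1:
        raise ValueError("need exactly one date per consecutive pair of stops")
    return [
        {
            "origin": airports(origin),
            "destination": airports(destination),
            "date": day,
            "flex": str(flex),  # the input schema expects flex as a string
        }
        for origin, destination, day in zip(stops, stops[1:], dates)
    ]


# Round trip: JFK to LAX and back.
round_trip = {
    "legs": build_legs(["JFK", "LAX", "JFK"], ["2025-12-15", "2025-12-22"]),
    "passengers": ["ADT", "ADT"],
    "cabin_class": "economy",
}
```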

Usage

  1. Deploy to Apify: Push this actor to your Apify account
  2. Configure Input: Use the input schema above or provided test files
  3. Run Actor: Execute with your flight search parameters
  4. Retrieve Results: Download comprehensive flight data from the dataset
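
Steps 3 and 4 can also be scripted. The sketch below assumes the `apify-client` Python package and takes the client as a parameter so the function stays easy to test. The function name `fetch_flights` and the actor ID shown in the usage comment are assumptions; check the Apify Store page for the real ID.

```python
def fetch_flights(client, actor_id: str, run_input: dict) -> list:
    """Run the actor, wait for it to finish, and return its dataset items.

    `client` is expected to behave like apify_client.ApifyClient.
    """
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("actor run did not return run details")
    dataset = client.dataset(run["defaultDatasetId"])
    return list(dataset.iterate_items())


# Usage (requires `pip install apify-client` and an API token):
#
#   from apify_client import ApifyClient
#   client = ApifyClient("<YOUR_APIFY_TOKEN>")
#   flights = fetch_flights(
#       client,
#       "axly/kayak-flights-scraper",  # assumed actor ID
#       {"legs": [...], "passengers": ["ADT"], "cabin_class": "economy"},
#   )
```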

Dependencies

  • apify>=1.0.0 - Apify SDK
  • httpx>=0.25.0 - Async HTTP client
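  • dataclasses>=0.6 - Data structure support (backport for Python 3.6; part of the standard library since Python 3.7)
  • typing>=3.7.4 - Type hints support (backport; part of the standard library since Python 3.5)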

Development

Project Structure

kayak-flights-scraper/
├── src/
│   ├── __init__.py
│   ├── main.py              # Main actor implementation
│   └── kayak.py             # Kayak API client
├── extract_flights.py       # Flight data parser (reference)
├── input_schema.json        # Input validation schema
├── output_schema.json       # Output documentation schema
├── requirements.txt         # Python dependencies
├── README.md                # This documentation
├── gen_icon.py              # Icon generation script
├── icon.png                 # Actor icon
├── input.json               # Default test input
├── input_oneway.json        # One-way test input
├── input_multicity.json     # Multi-city test input
└── README_test_inputs.md    # Test input documentation

Key Components

  • KayakFlightParser: Comprehensive parser for Kayak's API response format
  • FlightResult: Structured data model for flight results
  • FlightPollResponse: Complete response data structure
  • Single-request architecture: Efficient API usage with no polling

Error Handling

The actor includes robust error handling:

  • Network Errors: Automatic retries for connection issues
  • API Rate Limiting: Graceful handling of 403/429 responses
  • Server Errors: Retry logic for 5xx status codes
  • Data Validation: Comprehensive input validation
  • Parsing Errors: Graceful handling of malformed API responses
  • Session Management: Automatic session establishment and renewal
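
The retry behavior listed above is commonly implemented as exponential backoff. The following is a generic sketch of that pattern, not the actor's actual code: `with_retries`, its parameters, and the choice to retry only on 429, 5xx, and connection errors are all assumptions. How the actor treats 403 responses is not described here, so this sketch returns them to the caller.

```python
import random
import time

# Assumption: 429 (rate limited) and any 5xx status are worth retrying.
RETRYABLE_STATUS = {429} | set(range(500, 600))


def with_retries(send, max_attempts: int = 4, base_delay: float = 1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status.

    `send` is any zero-argument callable returning (status_code, body);
    it may raise ConnectionError for network failures.
    """
    last_problem = None
    for attempt in range(1, max_attempts + 1):
        try:
            status, body = send()
            if status not in RETRYABLE_STATUS:
                return status, body
            last_problem = f"HTTP {status}"
        except ConnectionError as exc:
            last_problem = f"connection error: {exc}"
        if attempt < max_attempts:
            # Delays grow as 1x, 2x, 4x, ... of base_delay, plus a little jitter.
            sleep(base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.1))
    raise RuntimeError(f"gave up after {max_attempts} attempts ({last_problem})")
```

Passing `sleep` as a parameter keeps the function testable without real waiting.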

Logging

Detailed logging throughout execution:

  • API session establishment
  • Request/response logging
  • Data extraction progress
  • Error conditions and recovery
  • Performance metrics and statistics

Performance & Reliability

  • Single API Call: Efficient single-request architecture
  • Async Processing: Non-blocking I/O operations
  • Memory Efficient: Streams large result sets
  • Session Reuse: Maintains API sessions for reliability
  • Comprehensive Parsing: Handles complex nested flight data structures

Data Quality

  • Complete Flight Details: Airlines, segments, baggage, pricing
  • Airport Information: Codes, names, cities, station types
  • Real-time Data: Current pricing and availability
  • Multiple Providers: Coverage across booking providers
  • Structured Format: Consistent, queryable JSON output

Resources