General Purpose Web Scraping and Metadata Extraction

Pricing

$50.00/month + usage

Developed by

Jamshaid Arif

Maintained by Community

This project uses the Apify platform to scrape data from web pages, collect metadata, and store results in an Apify dataset. It features functions for managing date ranges, encoding identifiers, and handling large datasets, aiming to efficiently extract and store structured data for analysis.

0.0 (0)

0

Monthly users: 1

Runs succeeded: >99%

Last modified: 3 months ago

Airbnb Data Scraper using Apify

This project is an Apify actor designed to scrape data from Airbnb property listings, including availability, pricing, and other details, over a given date range. The actor uses dynamic parameters for flexibility and stores the extracted data in Apify's dataset or a CSV file.


Features

  • Dynamic Date Range: Automatically generates check-in and check-out dates for the specified number of days.
  • Recursive JSON Parsing: Extracts all paths and values from the JSON responses for comprehensive data collection.
  • Data Storage: Pushes the extracted data to the Apify dataset or saves it locally as a CSV.
  • Configurable Inputs: Accepts various input parameters like URLs, stay duration, number of guests, and more.

Input Schema

The script accepts the following inputs via Apify:

| Parameter | Description | Example Value |
|---|---|---|
| startUrls | List of Airbnb listing URLs to scrape. | [{ "url": "https://www.airbnb.com/rooms/12345" }] |
| checkInDate | Starting date for the scraping. | "2024-11-21" |
| Stay_Days | Duration of each stay in days. | 1 |
| numberOfDays | Total number of days to scrape data for. | 60 |
| adults | Number of adults for the booking. | 2 |
| children | Number of children for the booking. | 0 |
| pets | Indicates if pets are included in the booking. | 0 |

How It Works

  1. Dynamic Date Generator:

    • Generates check-in and check-out dates based on the input checkInDate, Stay_Days, and numberOfDays.
  2. Request Construction:

    • Encodes the Airbnb room ID in Base64 format.
    • Constructs GraphQL API requests with dynamically populated variables.
  3. Data Collection:

    • Sends GET requests to Airbnb's API for each listing and date range.
    • Extracts data paths and values using recursive JSON parsing.
  4. Data Storage:

    • Pushes the extracted data to the Apify dataset for further use.
    • Optionally saves data locally as a CSV file.
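
The request-construction step can be illustrated with a minimal sketch. The source does not show the exact identifier layout or the GraphQL variable names, so everything here is an assumption: the `StayListing:<id>` string that gets Base64-encoded, the keys in the variables object, and the helper names `room_id_from_url`, `encode_room_id`, and `build_variables` are all hypothetical. No request is sent.

```python
import base64
import json
import re
from urllib.parse import urlparse


def room_id_from_url(url: str) -> str:
    """Pull the numeric room ID out of a listing URL such as .../rooms/12345."""
    match = re.search(r"/rooms/(\d+)", urlparse(url).path)
    if match is None:
        raise ValueError(f"No room ID found in URL: {url}")
    return match.group(1)


def encode_room_id(room_id: str, prefix: str = "StayListing") -> str:
    """Base64-encode the room ID.

    ASSUMPTION: the "<prefix>:<id>" layout is illustrative only; the Actor's
    real encoding is not documented in this README.
    """
    raw = f"{prefix}:{room_id}".encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def build_variables(url, check_in, check_out, adults, children, pets) -> str:
    """Serialize request variables as compact JSON (key names are hypothetical)."""
    variables = {
        "id": encode_room_id(room_id_from_url(url)),
        "checkIn": check_in,
        "checkOut": check_out,
        # Inputs may arrive as strings ("2"), so normalize them to integers.
        "adults": int(adults),
        "children": int(children),
        "pets": int(pets),
    }
    return json.dumps(variables, separators=(",", ":"))


payload = build_variables(
    "https://www.airbnb.com/rooms/12345", "2024-11-21", "2024-11-22", "2", "0", "0"
)
print(payload)
```

Normalizing the guest counts with `int()` lets the same helper accept either the numeric values from the input table or the quoted values from the example input.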

Output

The script outputs a dataset with the following fields:

| Field | Description |
|---|---|
| Check-In Date | The generated check-in date. |
| Check-Out Date | The corresponding check-out date. |
| Path | JSON path of the extracted data. |
| Value | Value at the extracted JSON path. |
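
To make the Path and Value fields concrete, here is a rough sketch of recursive JSON flattening that produces rows with the four fields above and writes them as CSV. The path notation (dots for keys, brackets for list indexes), the helper names `flatten` and `to_rows`, and the sample response are assumptions for illustration; the Actor's actual path format may differ.

```python
import csv
import io

FIELDS = ["Check-In Date", "Check-Out Date", "Path", "Value"]


def flatten(node, path=""):
    """Yield (path, value) for every leaf value in nested dicts and lists."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from flatten(child, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from flatten(child, f"{path}[{index}]")
    else:
        # Strings, numbers, booleans, and None are treated as leaves.
        yield path, node


def to_rows(response, check_in, check_out):
    """Build one output row per leaf value in a parsed JSON response."""
    return [
        {"Check-In Date": check_in, "Check-Out Date": check_out,
         "Path": path, "Value": value}
        for path, value in flatten(response)
    ]


# Made-up response, used only to demonstrate the row shape.
sample = {
    "data": {
        "price": {"total": 120, "currency": "USD"},
        "available": [True, False],
    }
}
rows = to_rows(sample, "2024-11-21", "2024-11-22")

# Write the rows as CSV into an in-memory buffer instead of a file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Note that empty objects and empty arrays produce no rows in this sketch, since only leaf values are emitted.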

Example Input

{
  "startUrls": [
    { "url": "https://www.airbnb.com/rooms/12345" },
    { "url": "https://www.airbnb.com/rooms/67890" }
  ],
  "checkInDate": "2024-11-21",
  "Stay_Days": 1,
  "numberOfDays": 10,
  "adults": "2",
  "children": "0",
  "pets": "0"
}
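
As a sketch of how the date inputs in this example might expand into check-in and check-out pairs, the snippet below assumes one stay per calendar day: the check-in date advances by one day at a time, and each check-out falls `Stay_Days` later. That stepping rule and the function name `generate_stays` are assumptions; the Actor may step differently.

```python
from datetime import date, timedelta


def generate_stays(check_in_date: str, stay_days: int, number_of_days: int):
    """Return a list of (check_in, check_out) ISO date strings."""
    start = date.fromisoformat(check_in_date)
    stays = []
    for offset in range(number_of_days):
        # ASSUMPTION: the window slides forward one day per iteration.
        check_in = start + timedelta(days=offset)
        check_out = check_in + timedelta(days=stay_days)
        stays.append((check_in.isoformat(), check_out.isoformat()))
    return stays


# Values taken from the example input above.
stays = generate_stays("2024-11-21", stay_days=1, number_of_days=10)
for check_in, check_out in stays:
    print(check_in, check_out)
```

With two start URLs and ten date pairs, this interpretation would lead to twenty listing and date-range combinations to request.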

Logs

The script logs progress and errors to the console, including:

  • Current URL and date range being processed.
  • Any errors encountered during requests or data parsing.
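
The logging behaviour described above could look roughly like the following sketch, which uses Python's standard `logging` module. The logger name, the message wording, and the `process` and `fetch` names are hypothetical; `fetch` stands in for whatever function performs the request and parsing.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("airbnb_scraper")


def process(url, check_in, check_out, fetch):
    """Log the current URL and date range, then run fetch and log any failure."""
    log.info("Processing %s for %s to %s", url, check_in, check_out)
    try:
        return fetch(url, check_in, check_out)
    except Exception:
        # log.exception records the message together with the traceback.
        log.exception("Failed on %s for %s to %s", url, check_in, check_out)
        return None


def failing_fetch(url, check_in, check_out):
    """Stand-in fetcher that always fails, to show the error path."""
    raise RuntimeError("simulated request error")


result = process(
    "https://www.airbnb.com/rooms/12345", "2024-11-21", "2024-11-22", failing_fetch
)
```

Returning `None` on failure lets a run continue with the next URL or date range instead of stopping at the first error.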

Pricing

Pricing model: Rental

To use this Actor, you have to pay a monthly rental fee to the developer. The rent is subtracted from your prepaid usage every month after the free trial period. You also pay for the Apify platform usage.

Free trial: 1 day

Price: $50.00