Airbnb Listing Details Scraper
Pricing
from $1.50 / 1,000 results
Paste any Airbnb /rooms/<id> URL and get every PDP field: photos, amenities, host info, ratings, house rules, location, pricing, and more. Fast Cheerio scraper, no browser, undercuts the market.
Developer
Crikit
Maintained by Community
Paste any Airbnb /rooms/<id> URL (or just a listing ID) and get back every field on the public PDP: title, description, host info, full photo list, every amenity, ratings + review category breakdown, house rules, location, pricing, availability and more. Pure HTTP, no headless browser — built on Airbnb's own persisted-GraphQL endpoint, so it's roughly 8× faster and 8× cheaper than a Playwright stack and undercuts every other paid scraper on the Store.
What you give it
| Field | Type | Required | Notes |
|---|---|---|---|
| `startUrls` | array of `{url}` | one of | Airbnb `/rooms/<id>` URLs. |
| `listingIds` | array of strings | one of | Numeric listing IDs (we'll build the URL). |
| `maxItems` | integer | no | Hard cap. Leave empty for no cap. |
| `maxConcurrency` | integer | no | Default 10. The GraphQL transport handles 15+ comfortably. Lower it only if your selected proxy pool throttles. |
| `includeRawPayload` | boolean | no | Debug only. Adds ~150KB per item. |
| `includeInlineReviews` | boolean | no | Default true. Adds the first ~6 reviews from the listing's review endpoint. Disable to skip the extra request per listing. |
| `inlineReviewsCount` | integer | no | Default 6, max 50. Ignored when `includeInlineReviews` is off. |
| `proxyConfiguration` | object | no | Defaults to Apify Proxy automatic mode. You can select a specific proxy group or supply your own URLs. |
| `checkIn` / `checkOut` / `adults` / `children` / `infants` / `pets` | various | no | When supplied, the scraper passes them to the listing query so `available`, `priceLabel`, `priceBreakdown`, `minNights` etc. populate. Without dates, those fields are null (Airbnb only returns pricing for dated quotes). |
| `currency` / `locale` | string | no | Default `USD` / `en-US`. The GraphQL endpoint converts server-side. |
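The two "one of" inputs can be collapsed into a single deduplicated `listingIds` list before a run. The following is a minimal sketch, assuming listing IDs are purely numeric and appear as `/rooms/<id>` in the URL. The helper names `listing_id_from` and `build_input` are hypothetical and not part of the actor, and the date strings are passed through unchanged because the expected date format is not stated here.

```python
import re

# Assumption: a listing URL contains "/rooms/<numeric id>".
_ROOMS_RE = re.compile(r"/rooms/(\d+)")


def listing_id_from(value: str) -> str:
    """Return the numeric listing ID from a /rooms/<id> URL or a bare ID."""
    value = value.strip()
    if value.isdigit():
        return value
    match = _ROOMS_RE.search(value)
    if match is None:
        raise ValueError(f"not an Airbnb listing URL or ID: {value!r}")
    return match.group(1)


def build_input(values, check_in=None, check_out=None, adults=None, max_items=None):
    """Build an input dict using the field names from the input table."""
    ids = []
    for value in values:
        listing_id = listing_id_from(value)
        if listing_id not in ids:  # drop duplicates, keep first-seen order
            ids.append(listing_id)
    if not ids:
        raise ValueError("supply at least one listing URL or ID")

    run_input = {"listingIds": ids}
    # Pricing fields only populate when both dates are present.
    if check_in and check_out:
        run_input["checkIn"] = check_in
        run_input["checkOut"] = check_out
    if adults is not None:
        run_input["adults"] = adults
    if max_items is not None:
        run_input["maxItems"] = max_items
    return run_input


example_input = build_input(
    ["12937", "https://www.airbnb.co.uk/rooms/12937?adults=2"],
    check_in="2025-07-01",
    check_out="2025-07-05",
    adults=2,
)
```

Here the bare ID and the regional URL point at the same listing, so `example_input` carries a single entry in `listingIds` plus the dates and guest count.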
What you get back
One thorough JSON record per listing (79 fields per record), structured into:
- Core: `id`, `propertyId`, `url`, `canonicalUrl`, `title`, `description`, `descriptionHtml`, `descriptionSections[]`, `propertyType`, `roomType`, `homeTier`, `personCapacity`, `bedrooms`, `beds`, `bathrooms`, `bathroomShared`
- Location: `coordinates {lat, lng}`, `locationSubtitle`, `locationTitle`, `locationDetails[]`, `mapMarkerType`, `mapMarkerRadiusMeters`, `locationVerification`, `defaultZoomLevel`
- Host: `host.id` (numeric, decoded from Airbnb's opaque ID), `host.name`, `host.profilePictureUrl`, `host.isSuperhost`, `host.isVerified`, `host.about`, `host.highlights[]`, `host.responseTime`, `host.metrics[]` (years hosting, reviews, rating), `host.yearsHosting`, `host.coHosts[]`, `host.superhostTitle`, `host.superhostText`
- Photos: ordered `photos[]` array with `{id, url, accessibilityLabel, aspectRatio, orientation, roomCategory, position}`, plus `coverImage` and `previewPhotos[]`
- Amenities: `amenities[]` flat list with `{id, title, subtitle, available, icon, category}`, plus `amenityGroups[]` keeping Airbnb's category structure
- Highlights: `highlights[]` (e.g. "Top 5% of homes", "Self check-in")
- Sleeping arrangements: `sleepingArrangements[]` with per-room title, subtitle, icons, images
- Accessibility: `accessibilityFeatures[]` grouped by area
- House rules: `houseRules[]` quick list, `houseRulesGrouped[]` full breakdown, `additionalHouseRules` free text
- Safety & property: `safetyAndProperty[]` preview + `safetyAndPropertyGrouped[]` full breakdown
- Policies: `cancellationPolicy {title, forDisplay}`, `propertyLicenses[]`, `importantInformation[]`
- Reviews: `reviewCount`, `overallRating`, `ratingCategories[]` (cleanliness / accuracy / check-in / communication / location / value), `ratingDistribution[]` (5–1 stars), `reviewTags[]`, `isGuestFavorite`, `qualityScorePercentile`, and `inlineReviews[]` (first 6 by default; use `airbnb-review-scraper` for full pagination)
- Pricing & availability (when you supply dates): `available`, `canInstantBook`, `petsAllowed`, `maxGuestCapacity`, `priceLabel`, `priceOriginal`, `priceDiscounted`, `priceQualifier`, `priceBreakdown`, `unavailabilityMessage`, `priceDisclaimer`, `currency`, `minNights`, `maxNights`, `availabilityItems[]`
- SEO: `seoFeatures` with `canonicalUrl`, `seoTitle`, `seoMetaDescription`, `breadcrumbs[]`, `nearbyCities[]`, `otherPropertyTypes[]`, `ogTags`, `indexInSearchEngines`
- Meta: `pdpType` (MARKETPLACE / LUXE / PLUS / HOTEL_TONIGHT), `pdpUrlType`, `isHotelStay`, `isLuxe`, `locale`, `scrapedAt`
If a field is genuinely missing from the listing, the value is null.
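Because missing values come back as null rather than being omitted, downstream code has to tolerate `None` at any level of nesting. Below is a small sketch of one way to read nested fields safely. The `get_path` helper is hypothetical, and the record is made up for illustration, trimmed to a handful of the documented field names.

```python
from typing import Any


def get_path(record: dict, path: str, default: Any = None) -> Any:
    """Follow a dotted path such as "host.name"; return default on null or missing."""
    current: Any = record
    for key in path.split("."):
        if not isinstance(current, dict):
            return default
        current = current.get(key)
        if current is None:
            return default
    return current


# Made-up record for illustration only (not real scraper output).
record = {
    "id": "12937",
    "title": "Example listing",
    "host": {"name": "Example Host", "isSuperhost": True, "about": None},
    "priceLabel": None,  # null because no dates were supplied
    "photos": [{"url": "https://example.com/1.jpg", "position": 0}],
}

summary = {
    "host": get_path(record, "host.name"),
    "about": get_path(record, "host.about", default=""),
    "price": get_path(record, "priceLabel", default="n/a"),
    "photo_count": len(record.get("photos") or []),
}
```

The `or []` guard on `photos` covers the case where the array itself is null, which a plain `record.get("photos", [])` would not.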
Performance
Measured against a corpus of 180 fresh listings spread across Paris, Tokyo, New York, London, Mexico City, Bali, Cape Town, Reykjavik, Buenos Aires, and Lisbon (3 concurrent variance runs, 540 total fetches):
| Metric | Value |
|---|---|
| Success rate | 99.8% (full extractionStatus on 539 of 540 fetches) |
| Throughput | ~1.5 listings/sec at concurrency 15 |
| Cost per listing | ~$0.00046 (1.5–2× under the cheapest competitor) |
| Single-listing latency | ~1.0–1.3 s |
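The figures in the table can be cross-checked with a few lines of arithmetic. This sketch only uses numbers quoted in this section; the run-time estimate additionally assumes that throughput stays at ~1.5 listings/sec for larger batches, which has not been measured here.

```python
LISTINGS = 180
RUNS = 3
SUCCESSFUL_FETCHES = 539
THROUGHPUT_PER_SEC = 1.5      # at concurrency 15
CLOUD_COST_PER_LISTING = 0.00046

total_fetches = LISTINGS * RUNS                      # 180 x 3 fetches
success_rate = SUCCESSFUL_FETCHES / total_fetches    # fraction of fetches that succeeded


def estimated_minutes(listing_count: int) -> float:
    """Rough wall time in minutes, assuming constant throughput."""
    return listing_count / THROUGHPUT_PER_SEC / 60


def estimated_cloud_cost(listing_count: int) -> float:
    """Rough cloud cost in USD at the measured per-listing cost."""
    return listing_count * CLOUD_COST_PER_LISTING


print(f"{total_fetches} fetches, {success_rate:.1%} success")
print(f"10,000 listings: ~{estimated_minutes(10_000):.0f} min, "
      f"~${estimated_cloud_cost(10_000):.2f} cloud cost")
```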
When a listing fails the primary GraphQL path (typically Akamai flagging a specific proxy IP for that listing), the actor automatically falls back to the SSR HTML PDP via a fresh session.
Pricing
This actor is designed for Apify result pricing — each successful listing pushed to the default dataset is one billable result. Per-listing errors are stored as rows with extractionStatus: 'error'; if a run produces no listing results at all, the actor pushes one run-level error row so the attempt is still visible.
Worked examples at $0.0015 per result:
- 100 listings: $0.15
- 1,000 listings: $1.50
- 10,000 listings: $15.00
Cheaper than red.cars/airbnb-scraper at $0.005/item (70% less) and undercuts tri_angle/airbnb-scraper's flat $30/month for everything except the heaviest bulk users.
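The worked examples and the two comparisons above reduce to simple arithmetic. The sketch below uses `Decimal` to avoid floating-point rounding on currency values. The prices are the ones quoted in this section, and the break-even figure assumes the flat plan has no per-result charge on top of the monthly fee.

```python
from decimal import Decimal

PRICE_PER_RESULT = Decimal("0.0015")     # this actor, per billable result
COMPETITOR_PER_ITEM = Decimal("0.005")   # red.cars/airbnb-scraper, per item
FLAT_MONTHLY = Decimal("30")             # tri_angle/airbnb-scraper, per month


def run_cost(results: int) -> Decimal:
    """Cost in USD for a given number of billable results."""
    return PRICE_PER_RESULT * results


# Fraction saved versus the per-item competitor.
saving = 1 - PRICE_PER_RESULT / COMPETITOR_PER_ITEM

# Monthly volume at which per-result pricing matches the flat fee.
break_even_results = FLAT_MONTHLY / PRICE_PER_RESULT

for count in (100, 1_000, 10_000):
    print(f"{count:>6} listings: ${run_cost(count):.2f}")
print(f"saving vs $0.005/item: {saving:.0%}")
print(f"break-even vs $30/month: {int(break_even_results):,} results/month")
```

Under that assumption, per-result pricing stays cheaper than the flat plan until a user exceeds 20,000 results in a month.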
How it works
We call Airbnb's StaysPdpSections persisted-GraphQL endpoint directly with the public API key Airbnb hardcodes into every page (X-Airbnb-API-Key: d306zoyjsyarp7ifhu67rjxn52tv0t20). One HTTP request → one full JSON record. The persisted query asks for every fragment the server can produce so the response includes all 30 PDP sections (title, hero, amenities, description, location, host, policies, reviews summary, etc.).
When a listing's primary path is blocked by Akamai on a specific proxy IP, we automatically retry via the SSR HTML /rooms/<id> path with a fresh session. That rescue is what closes the gap from ~98% to 99.8%.
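The primary-then-fallback flow described above can be sketched as follows. This is an illustration of the control flow only, not the actor's code: the two fetchers are injected as callables, and the `Blocked` and `NotFound` exceptions, the `fetch_listing` function, and the shape of the error row beyond `extractionStatus` are all assumptions. The HTML fetcher is assumed to open a fresh session itself.

```python
from typing import Callable, Tuple


class Blocked(Exception):
    """Hypothetical: the request was blocked on the current proxy IP."""


class NotFound(Exception):
    """Hypothetical: the listing is deleted or migrated."""


def fetch_listing(
    listing_id: str,
    fetch_graphql: Callable[[str], dict],
    fetch_html: Callable[[str], dict],
) -> Tuple[dict, str]:
    """Try the GraphQL path first, then the SSR HTML path; return (row, path used)."""
    for path, fetch in (("graphql", fetch_graphql), ("html", fetch_html)):
        try:
            return fetch(listing_id), path
        except NotFound:
            # Missing listings are reported once, without retrying.
            return {"id": listing_id, "extractionStatus": "notfound"}, path
        except Blocked:
            continue  # fall through to the next transport
    return {"id": listing_id, "extractionStatus": "error"}, "html"


# Stand-in fetchers to exercise the flow without any network access.
def blocked_graphql(listing_id: str) -> dict:
    raise Blocked("flagged proxy IP")


def html_ok(listing_id: str) -> dict:
    return {"id": listing_id, "title": "Example listing"}


row, used_path = fetch_listing("12937", blocked_graphql, html_ok)
```

With the stand-ins above, the GraphQL attempt is blocked and the row comes from the HTML path, so `used_path` is `"html"`.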
No headless browser, no cookies, no CSRF dance. That's why we can charge under a fifth of a cent per listing.
Limits to be aware of
- Deleted / migrated listings return a `{ "extractionStatus": "notfound", ... }` row. Airbnb serves a 200 OK with a 404 page or a `NOT_FOUND` GraphQL error for these; we detect both and push the row without retrying.
- Date-specific pricing fields only appear when Airbnb has dates to quote against. Pass `checkIn` / `checkOut` (and any guest counts) in the input to populate `available`, `priceLabel`, `priceBreakdown`, `minNights`, `maxNights`, etc.
- Inline reviews are limited to the first `inlineReviewsCount` (default 6) of the most-recent reviews. For full pagination use our companion `airbnb-review-scraper`.
- Some fields are SSR-only on Airbnb's side: `propertyId`, `utcOffset`, `bedType`, the `eventDataLogging` `amenityIds` CSV, and SEO breadcrumbs only populate when the request comes through SSR HTML. When the API path serves the listing they come back null. If you need those specifically, enable a checkbox option or call the actor with `proxyConfiguration` that forces a residential profile.
- Airbnb redirects `.com` to the visitor's regional domain (`.ca`, `.co.uk`, etc.). The GraphQL endpoint works on every domain identically; only `locale=` and `currency=` query params change.
- Proxy: Apify Proxy automatic mode is the default. The GraphQL endpoint is much more forgiving than the HTML path, so even datacenter proxies work; residential is only required if you hit the HTML fallback hard.
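When post-processing a dataset, it helps to separate usable rows from `notfound` and `error` rows before touching any listing fields. A minimal sketch follows, assuming only the two status values named in this document and treating every other row as usable. The `split_rows` helper and the sample rows are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def split_rows(rows: Iterable[dict]) -> Dict[str, List[dict]]:
    """Bucket dataset rows into "ok", "notfound" and "error"."""
    buckets: Dict[str, List[dict]] = defaultdict(list)
    for row in rows:
        status = row.get("extractionStatus")
        if status in ("notfound", "error"):
            buckets[status].append(row)
        else:
            buckets["ok"].append(row)
    return buckets


def lacks_dated_pricing(row: dict) -> bool:
    """True when the row has no quote, e.g. because no dates were supplied."""
    return row.get("priceLabel") is None


# Made-up rows for illustration only.
rows = [
    {"id": "1", "title": "A", "priceLabel": None},
    {"id": "2", "title": "B", "priceLabel": "$120"},
    {"id": "3", "extractionStatus": "notfound"},
    {"id": "4", "extractionStatus": "error"},
]

buckets = split_rows(rows)
undated = [row["id"] for row in buckets["ok"] if lacks_dated_pricing(row)]
```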
Plugging in your own proxy
In the proxyConfiguration input, set:
    {
      "proxyConfiguration": {
        "proxyUrls": [
          "http://user:pass@residential.brightdata.com:22225",
          "http://user:pass@gate.smartproxy.com:7000"
        ]
      }
    }

Add as many URLs as you want; Crawlee will round-robin across them.
FAQ
Why is this cheaper than the alternatives?
Most other Airbnb scrapers use Playwright (a real browser) or paginate slow HTML pages. We hit Airbnb's own GraphQL backend that the niobe client uses: one HTTP request per listing, ~1 second wall time, no anti-bot stack to fingerprint. It's roughly 8× faster and 8× cheaper.
Can I get full reviews, not just the inline ones?
Use airbnb-review-scraper (same author). It paginates the persisted GraphQL endpoint and pulls every review with date filtering.
Can I scrape a search URL?
Use airbnb-search (same author). Paste any Airbnb search URL or location and get every result.
How fresh is the data?
Every run hits the live API. Airbnb's CDN caches identical requests for ~5 minutes, so the worst-case freshness is 5 minutes; new bookings, calendar changes, and review edits show up almost in real time.
What if Airbnb rotates the persisted query hash?
The actor bootstraps a fresh hash from listing 12937's HTML on startup, so a hash rotation auto-recovers on the next run.
Changelog
- 0.1.13: Massive QA pass. New GraphQL transport via `StaysPdpSections` persisted query. Reliability went from ~12% on non-cached listings to 99.8% across 540 fetches. ~1.5 listings/sec throughput, ~$0.00046 / listing cloud cost. Added HTML fallback for the rare listing flagged on the API path. Decoded host IDs to numeric form. Memory class trimmed to 512 MB. Dependency pins tightened.
- 0.1.8: Detect Airbnb's 404 page directly and push a `{"extractionStatus": "notfound"}` row instead of retrying 5×. Test fixtures replaced with currently-active listing IDs.
- 0.1.7: Multi-listing runs with partial success now exit `SUCCEEDED` (was `FAILED`). Empty / unparseable input pushes a clean error row instead of throwing.
- 0.1: Initial release.