Craigslist Scraper

Pricing

from $1.00 / 1,000 results

Scrape Craigslist search results and individual posts across any city subdomain (sfbay, newyork, chicago, etc.). Extracts titles, prices, descriptions, attributes, coordinates, images, and posted/updated timestamps. HTTP-only, no login, no proxy required.

Rating: 5.0 (16)

Developer: Crawler Bros (Maintained by Community)

Actor stats

  • Bookmarked: 16
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 2 days ago

Scrape Craigslist search results and individual posts across any city subdomain — sfbay, newyork, chicago, losangeles, london, and hundreds more. Returns titles, prices, descriptions, attributes, coordinates, images, and posted/updated timestamps. HTTP-only; no login, no cookies, no proxy required for normal volume.

Output (per post)

  • type = craigslist_listing
  • url, id (numeric post id parsed from <id>.html)
  • title, category, subcategory, location, region
  • postingType — category slug from the URL path (cto, apa, lab, fud, ...)
  • price, priceQualifier — e.g. / 2br - 1083ft², OBO, / month
  • postedAt, updatedAt
  • description — plain-text #postingbody
  • images — large image URLs (_600x450.jpg) from the gallery
  • latitude, longitude — from the embedded map
  • mapAddress — street address displayed next to the map
  • attributes — raw dict parsed from .attrgroup (full key → value set)
  • phoneNumbers, notices, replyUrl
  • Jobs: compensation, employmentType, jobTitle
  • For-sale (generic): condition, size
  • Autos: yearManufactured, makeManufacturer, modelName, odometer, transmission, fuelType, titleStatus, cylinders, paintColor, drive, vehicleType, vin
  • Housing: bedroomCount, bathroomCount, sqft, availableDate, housingType, laundry, parking, rentPeriod, listedBy, applicationFee, brokerFee, openHouseDates, catsOk, dogsOk, furnished, smoking
  • scrapedAt

Fields that are absent on the source page are simply omitted (no nulls). If zero posts are scrapeable, a single craigslist_blocked sentinel is emitted so the run exits with data.
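Because absent fields are omitted and a blocked run still emits one sentinel record, downstream code should filter on type and read fields defensively. The following is a minimal sketch of that pattern, not the actor's own code. The helper names (usable_listings, summarize) and the sample records are hypothetical, and the value types (for example, price as a number) are assumptions.

```python
from typing import Any, Dict, Iterable, List


def usable_listings(items: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep real listings and drop the craigslist_blocked sentinel."""
    return [item for item in items if item.get("type") == "craigslist_listing"]


def summarize(item: Dict[str, Any]) -> str:
    # Absent fields are omitted rather than set to null, so use .get()
    # with a fallback instead of indexing the dict directly.
    title = item.get("title", "(no title)")
    price = item.get("price", "n/a")
    return f"{title} | {price} | {item.get('url', '')}"


# Made-up records shaped like the field list above (value types assumed).
sample = [
    {
        "type": "craigslist_listing",
        "url": "https://sfbay.craigslist.org/sfc/apa/d/example/1111111111.html",
        "title": "Sunny 2br",
        "price": 3200,
    },
    {
        "type": "craigslist_listing",
        "url": "https://newyork.craigslist.org/zip/d/example/2222222222.html",
        "title": "Free couch",  # no price field on this record
    },
    {"type": "craigslist_blocked"},
]

for listing in usable_listings(sample):
    print(summarize(listing))
```

If usable_listings returns an empty list while the dataset is non-empty, the run most likely produced only the sentinel.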

Input

| Field | Type | Description |
| --- | --- | --- |
| startUrls | object[] | Craigslist search URLs or direct post URLs. Prefill: https://sfbay.craigslist.org/search/jjj. |
| searchTerm | string | Optional keyword. Appended to each search URL as ?query=<term>. Ignored for direct post URLs. |
| maxItems | integer | Maximum posts per run. Default 3. Max 1000. |
| scrapeDetails | boolean | Fetch each post's page for full description + attributes. Default true. |
| minPrice | integer | Minimum price filter (USD). |
| maxPrice | integer | Maximum price filter (USD). |
| hasImage | boolean | When true, only include posts with at least one image. |
| proxyConfiguration | object | Apify proxy config. Default off; Craigslist accepts datacenter IPs. |
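As an illustration of the fields above, here is a small sketch that assembles an input object and checks the documented maxItems bounds before a run. The build_input helper is hypothetical. The {"url": ...} shape for startUrls entries and the {"useApifyProxy": true} shape for proxyConfiguration follow the usual Apify conventions and are assumptions here, since this page does not spell them out.

```python
import json
from typing import Any, Dict, List, Optional

MAX_ITEMS_LIMIT = 1000  # documented upper bound for maxItems


def build_input(
    start_urls: List[str],
    search_term: Optional[str] = None,
    max_items: int = 3,  # documented default
    scrape_details: bool = True,  # documented default
    min_price: Optional[int] = None,
    max_price: Optional[int] = None,
    has_image: bool = False,
    use_proxy: bool = False,
) -> Dict[str, Any]:
    """Assemble an input dict, leaving out optional fields that are not set."""
    if not 1 <= max_items <= MAX_ITEMS_LIMIT:
        raise ValueError(f"maxItems must be between 1 and {MAX_ITEMS_LIMIT}")
    if min_price is not None and max_price is not None and min_price > max_price:
        raise ValueError("minPrice must not exceed maxPrice")

    run_input: Dict[str, Any] = {
        # Assumed entry shape: one {"url": ...} object per start URL.
        "startUrls": [{"url": url} for url in start_urls],
        "maxItems": max_items,
        "scrapeDetails": scrape_details,
    }
    if search_term:
        run_input["searchTerm"] = search_term
    if min_price is not None:
        run_input["minPrice"] = min_price
    if max_price is not None:
        run_input["maxPrice"] = max_price
    if has_image:
        run_input["hasImage"] = True
    if use_proxy:
        # Assumed shape, following the common Apify proxy input convention.
        run_input["proxyConfiguration"] = {"useApifyProxy": True}
    return run_input


example = build_input(
    ["https://sfbay.craigslist.org/search/jjj"],
    search_term="python",
    max_items=50,
)
print(json.dumps(example, indent=2))
```

The printed JSON can be pasted into the actor's input editor or passed to whichever client is used to start the run.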

How it works

  1. For each startUrls entry, the scraper classifies the URL as either a search page (/search/<cat>) or a direct post (.../<id>.html).
  2. Search pages are paginated by incrementing the s=N query offset; post URLs are extracted from <li class="cl-static-search-result"> cards.
  3. For each post URL (deduplicated by numeric id), the detail page is fetched and parsed:
    • Title from #titletextonly
    • Price from .price
    • Description from #postingbody
    • Attributes from .attrgroup (label → value pairs)
    • Coordinates from .mapbox[data-latitude]
    • Images from #thumbs a[data-imgid] (large _600x450 URLs)
    • Posted / updated timestamps from .postinginfos time[datetime]
  4. Phone numbers and warning notices are extracted via regex / DOM.
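Steps 1 to 3 can be sketched with the standard library alone. This is a rough illustration of URL classification, s=N offset pagination, and deduplication by numeric post id. It is not the actor's actual implementation: the function names are hypothetical, and the offset step is left to the caller because the page size is not stated here.

```python
import re
from typing import Iterable, Iterator, Optional
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

# A direct post URL ends in <numeric id>.html.
POST_ID_RE = re.compile(r"/(\d+)\.html$")


def classify(url: str) -> str:
    """Return "post", "search", or "unknown" based on the URL path."""
    path = urlparse(url).path
    if POST_ID_RE.search(path):
        return "post"
    if path.startswith("/search/"):
        return "search"
    return "unknown"


def post_id(url: str) -> Optional[str]:
    """Extract the numeric post id from a direct post URL, if present."""
    match = POST_ID_RE.search(urlparse(url).path)
    return match.group(1) if match else None


def with_offset(search_url: str, offset: int) -> str:
    """Return the search URL with its s=N offset set, keeping other params."""
    parts = urlparse(search_url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query["s"] = str(offset)
    return urlunparse(parts._replace(query=urlencode(query)))


def dedupe_by_id(urls: Iterable[str]) -> Iterator[str]:
    """Yield each post URL once, keyed by its numeric id."""
    seen = set()
    for url in urls:
        pid = post_id(url)
        if pid is None or pid in seen:
            continue
        seen.add(pid)
        yield url


print(classify("https://sfbay.craigslist.org/search/jjj"))
print(with_offset("https://sfbay.craigslist.org/search/jjj?query=python", 120))
```

Deduplicating on the id rather than the full URL means the same post is fetched only once even when it appears under different paths or on overlapping result pages.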

FAQ

Do I need a proxy? No. Craigslist responds fine from Apify's datacenter IPs. If you hit a 403 wall, toggle on Apify proxy.

Does this bypass login walls? Craigslist has no login wall for public listings — this actor works out of the box.

Can I scrape just one specific post? Yes. Put the direct post URL (ending in <id>.html) into startUrls and set maxItems: 1.

What cities are supported? Every Craigslist subdomain — the scraper detects the region automatically from the URL.
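One plausible way to detect the region from a URL is to read the subdomain in front of craigslist.org. The sketch below assumes exactly that; region_from_url is a hypothetical helper, and the actor may derive the region differently.

```python
from typing import Optional
from urllib.parse import urlparse

CRAIGSLIST_SUFFIX = ".craigslist.org"


def region_from_url(url: str) -> Optional[str]:
    """Return the city subdomain (e.g. "sfbay"), or None if there is none."""
    host = (urlparse(url).hostname or "").lower()
    if not host.endswith(CRAIGSLIST_SUFFIX):
        return None
    region = host[: -len(CRAIGSLIST_SUFFIX)]
    # Treat "www" or an empty prefix as "no specific city".
    return region if region and region != "www" else None


print(region_from_url("https://sfbay.craigslist.org/search/jjj"))
print(region_from_url("https://london.craigslist.org/search/apa"))
```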