
Real Estate Data Aggregator

Pricing

from $4.00 / 1,000 properties scraped


Redfin property data enriched with 48 government data fields from 13 free APIs: Census demographics, FEMA flood zones, FBI crime, Walk Score, HUD rent benchmarks, and more. USPS-normalized, confidence-scored, analysis-ready. From $4/1K properties.


Developer

Vitki Data


Maintained by Community


Real Estate Scraper + Government Enrichment

Redfin property data enriched with 48 government data fields from 13 free APIs — Census demographics, FEMA flood zones, FBI crime rates, Walk Score, HUD rent benchmarks, and more. Clean, validated, joinable data from the single most reliable US real estate source.

Returns validated records with USPS-normalized addresses, detail-page enrichment (price history, tax history, stories, garage), and joinable geographic identifiers (Census tract, FIPS, CBSA, Placekey). No paid external API subscriptions required — all enrichment fields use free government data sources.

What makes this different from other Redfin scrapers: The 48-field government enrichment layer. Every property gets Census demographics, flood risk, crime rates, mortgage data, employment statistics, and 10+ more datasets — all from free government APIs. No other Apify actor does this.


Built for real estate investors, data analysts, and proptech developers who need clean, enriched property data without managing scraping infrastructure.

Need Zillow, Realtor.com, or additional sources? Visit vitkidata.com for our premium multi-source product with cross-source validation and confidence scoring.

What data do you get?

Each property record includes:

Data Source

| Source | Listing Types | Proxy | Notes |
|---|---|---|---|
| Redfin | for_sale, sold | Datacenter | Primary source, most reliable; detail pages for stories/garage/history |

Redfin is the default and recommended source. It provides the most reliable coverage with full detail-page enrichment (price history, tax history, stories, garage, date listed). Datacenter proxies work — no expensive residential proxy needed.

Additional sources (Zillow, Realtor.com, foreclosure, FSBO, and 7 more) are available as experimental options but may require residential proxies or local hardware. For reliable multi-source data with cross-source validation, visit vitkidata.com.

Property Fields

| Field | Example | Source |
|---|---|---|
| Address (normalized) | 401 HONEYCOMB RDG, AUSTIN, TX 78746 | All sources |
| County | Travis | Geocodio |
| Listing price | $530,000 | All sources |
| Sold price | $510,000 | Redfin, Zillow (sold listings) |
| Price per sqft | $250 | Calculated |
| Beds / Baths | 4 bd / 2.5 ba | All sources |
| Square footage | 2,120 sqft | All sources |
| Lot size | 7,200 sqft | Redfin, Zillow, Realtor.com |
| Year built | 2005 | All sources |
| Property type | Single Family, Condo, Townhouse, etc. | All sources |
| Listing status | for_sale, pending, sold, coming_soon | All sources |
| Days on market | 15 | Redfin, Zillow, Realtor.com |
| Stories | 2 (building-level for condos) | Redfin detail page (includeDetails=true) |
| Garage | 2 | Redfin detail page (includeDetails=true) |
| Date listed | 2025-03-15 | Redfin detail page (includeDetails=true) |
| Price history | Listed $450K → Sold $440K | Redfin detail page (includeDetails=true) |
| Tax history | 2024: assessed $380K | Redfin detail page (includeDetails=true) |
| HOA monthly fee | $350 | Redfin, Zillow |
| Coordinates (lat/lng) | 30.2672, -97.7431 | All sources + Geocodio |
| Census tract | 48453000700 | Geocodio (11-digit GEOID) |
| FIPS code | 48453 | Geocodio |
| CBSA code | 12420 | Geocodio (metro area) |
| Placekey | 226@627-s8y-qzk | Placekey API |
| MLS ID | ACT-2024-001 | Redfin, Realtor.com |
| Canonical key | 1800 LAVACA ST APT 207\|AUSTIN\|TX\|78701 | Computed |
| Confidence score | high, medium, low, single_source | Cross-source analysis |
| Source URLs | Links to each source listing | Per source |

Government Enrichment Fields (48 fields from 13 APIs)

Requires includeGovernmentData=true. Each API degrades gracefully — missing keys or failures skip those fields.

| Field | Example | Source | Geographic Level | Key Required |
|---|---|---|---|---|
| Median household income | $92,941 | Census ACS | Tract | No |
| Median home value | $537,800 | Census ACS | Tract | No |
| Vacancy rate | 0.066 | Census ACS | Tract | No |
| Population | 1,212 | Census ACS | Tract | No |
| Median age | 34.6 | Census ACS | Tract | No |
| Owner/renter occupied pct | 0.388 / 0.612 | Census ACS | Tract | No |
| Walk/Transit/Bike Score | 82 / 45 / 67 | Walk Score | Per-property | WALKSCORE_API_KEY |
| Fair Market Rent (eff-4BR) | $1,280–$2,480/mo | HUD FMR | County | HUD_API_TOKEN |
| Disaster count (10yr) | 3 | OpenFEMA | County | No |
| Flood disaster count | 1 | OpenFEMA | County | No |
| Last disaster date/type | 2023-08-15, Hurricane | OpenFEMA | County | No |
| Flood zone + type | AE (high_risk) | FEMA NFHL | Per-property | No |
| Violent crime rate | 350.0/100K | FBI Crime | County or state | FBI_API_KEY |
| Property crime rate | 1,800.0/100K | FBI Crime | County or state | FBI_API_KEY |
| HPI 12-month change | -2.3% | FHFA HPI | Metro (CBSA) | No |
| ZIP median AGI | $143,448 | IRS SOI | ZIP | No |
| Migration net flow | +5,200 | IRS SOI | County | No |
| Mortgage approval rate | 0.786 | CFPB HMDA | County | No |
| Mean loan amount | $312,000 | CFPB HMDA | County | No |
| Total employment | 885,790 | BLS QCEW | County | No |
| Avg weekly wage | $1,918 | BLS QCEW | County | No |
| Median hourly wage | $25.17 | BLS OES | Metro (CBSA) | BLS_API_KEY |
| 30-year mortgage rate | 6.65% | FRED | National | FRED_API_KEY |
| Brownfield within 1mi | false | EPA FRS | Per-property | No |

Crime data note: FBI Crime rates are county-level when agency data is available (aggregated from FBI CDE ORI reports), otherwise state-level fallback. Check crime_data_granularity field per record. County rates may still differ from city rates (e.g., county=420 vs city=890). Use for regional comparison, not neighborhood-level analysis.
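
One way to act on this caveat is to separate records by granularity before comparing rates. The sketch below assumes records shaped like the actor's output (a government_data object carrying crime_data_granularity and violent_crime_rate). The "state" value and the "missing" bucket name are assumptions for illustration, and the helper name is hypothetical.

```python
from collections import defaultdict


def group_by_crime_granularity(records):
    """Group records by government_data.crime_data_granularity."""
    groups = defaultdict(list)
    for record in records:
        gov = record.get("government_data") or {}
        # Records without crime enrichment fall into an assumed "missing" bucket.
        level = gov.get("crime_data_granularity") or "missing"
        groups[level].append(record)
    return dict(groups)


# Illustrative records only; the rates are taken from examples in this page.
records = [
    {"government_data": {"crime_data_granularity": "county", "violent_crime_rate": 292.9}},
    {"government_data": {"crime_data_granularity": "state", "violent_crime_rate": 350.0}},
    {"government_data": {}},
]

groups = group_by_crime_granularity(records)
county_rates = [
    r["government_data"]["violent_crime_rate"] for r in groups.get("county", [])
]
print(county_rates)
```

Comparing only within one bucket avoids ranking a property with a state-level fallback against one with a county-level rate.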

All addresses are normalized to USPS Publication 28 format with canonical keys for easy deduplication.

How much does it cost?

This actor uses Pay Per Event pricing. You only pay for what you use:

| Event | Price | Description |
|---|---|---|
| Property scraped | $0.004 | Per property returned. Includes USPS-normalized address, deduplication, confidence scoring, and completeness validation. |
| Enrichment applied | $0.005 | Per property enriched with 48 fields from 13 free APIs: Census demographics, FEMA flood zones, FBI crime rates, Walk Score, HUD rent benchmarks, FHFA house prices, and more. |
| Detail page fetched | $0.003 | Per property detail page. Adds price history, tax history, MLS description, parcel ID, and public records cross-validation. |
| Search completed | $0.001 | Per search query executed against Redfin for a location and listing type. |

Typical costs (per 1,000 properties):

  • Basic (search only, no details or enrichment): ~$4/1K
  • With details (details on, enrichment off): ~$7/1K
  • Full (details + enrichment, default settings): ~$12/1K

Platform compute costs (Apify CUs) are additional. A typical run of 100 properties uses 0.1 CU ($0.02-0.03). Redfin uses datacenter proxies — no expensive residential proxy needed.
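
The typical costs above follow directly from the event prices. The sketch below reproduces that arithmetic, assuming every returned property is charged each enabled per-property event and that platform compute is billed separately. The function name and the integer price table are illustrative, not part of the actor.

```python
# Event prices from the pricing table, expressed in USD per 1,000 events
# so the arithmetic stays in integers until the final division.
PRICE_PER_1K = {
    "property_scraped": 4,
    "enrichment_applied": 5,
    "detail_page_fetched": 3,
    "search_completed": 1,
}


def estimate_ppe_cost(properties, searches=1, include_details=True,
                      include_government_data=True):
    """Estimate Pay Per Event charges in USD (excludes Apify compute units)."""
    units = properties * PRICE_PER_1K["property_scraped"]
    units += searches * PRICE_PER_1K["search_completed"]
    if include_details:
        units += properties * PRICE_PER_1K["detail_page_fetched"]
    if include_government_data:
        units += properties * PRICE_PER_1K["enrichment_applied"]
    return units / 1000


basic = estimate_ppe_cost(1000, include_details=False, include_government_data=False)
with_details = estimate_ppe_cost(1000, include_government_data=False)
full = estimate_ppe_cost(1000)
print(basic, with_details, full)
```

For 1,000 properties from a single search this gives roughly $4, $7, and $12, matching the three tiers listed above plus the $0.001 search fee.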

Input parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| location | string | Yes | | US city name (Austin, TX), ZIP code (78701), or city + ZIP (Austin, TX 78701) |
| sources | array | No | ["redfin"] | Data sources. Redfin is the default and recommended source. Other sources (zillow, realtor, foreclosure, etc.) are experimental and may require residential proxies. For reliable multi-source data, visit vitkidata.com. |
| listingType | string | No | for_sale | One of: for_sale, sold, all, for_rent. Rental requests auto-route to Zumper. Sale/sold requests exclude Zumper (rental-only). |
| maxResults | integer | No | 100 | Maximum properties to return (1-10,000) |
| skipCache | boolean | No | false | Bypass geocoding and Placekey cache. Forces fresh API lookups. Useful after address corrections. |
| sourceTimeout | integer | No | 300 | Maximum seconds to wait for each data source (30-900). Sources that exceed this timeout are skipped gracefully. |
| includeDetails | boolean | No | true | Fetch individual property detail pages for price history, tax history, listing date, description, stories, garage, HOA, and parcel ID. Adds 2-5 seconds per property. |
| includeGovernmentData | boolean | No | true | Enrich with 48 fields (35 data + 13 granularity metadata) from 13 free APIs: Census ACS demographics, Walk Score, HUD FMR, FEMA flood zones + disasters, FBI crime, FHFA HPI, IRS income, BLS employment, FRED mortgage rates, EPA brownfields, CFPB mortgage data. No paid API subscriptions required; optional free keys unlock Walk Score, HUD, FBI, BLS, FRED. |
| outputFormat | string | No | json | Output as json, csv, or both. CSV is delivered via Apify KV store (key OUTPUT_CSV); only available when running on the Apify platform, not in local/CLI mode. |
| proxyConfig | object | No | Apify proxy | Proxy settings. Datacenter proxies work for Redfin. Realtor.com requires residential proxies (auto-configured when using Apify proxy). |
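
As an illustration of the parameters above, the following sketch assembles a run input and checks the documented ranges on the client side before a run is started. The helper name build_run_input is hypothetical, and whether the actor itself rejects out-of-range values in the same way is an assumption.

```python
import json

ALLOWED_LISTING_TYPES = {"for_sale", "sold", "all", "for_rent"}
ALLOWED_OUTPUT_FORMATS = {"json", "csv", "both"}


def build_run_input(location, listing_type="for_sale", max_results=100,
                    source_timeout=300, include_details=True,
                    include_government_data=True, output_format="json"):
    """Return an input dict using the documented parameter names and ranges."""
    if not location or not location.strip():
        raise ValueError("location is required")
    if listing_type not in ALLOWED_LISTING_TYPES:
        raise ValueError(f"listingType must be one of {sorted(ALLOWED_LISTING_TYPES)}")
    if not 1 <= max_results <= 10_000:
        raise ValueError("maxResults must be between 1 and 10,000")
    if not 30 <= source_timeout <= 900:
        raise ValueError("sourceTimeout must be between 30 and 900 seconds")
    if output_format not in ALLOWED_OUTPUT_FORMATS:
        raise ValueError(f"outputFormat must be one of {sorted(ALLOWED_OUTPUT_FORMATS)}")
    return {
        "location": location.strip(),
        "sources": ["redfin"],  # documented default and recommended source
        "listingType": listing_type,
        "maxResults": max_results,
        "sourceTimeout": source_timeout,
        "includeDetails": include_details,
        "includeGovernmentData": include_government_data,
        "outputFormat": output_format,
    }


run_input = build_run_input("Austin, TX 78701", max_results=350)
print(json.dumps(run_input, indent=2))
```

The resulting dictionary can be pasted into the actor's JSON input editor or supplied as the run input through the Apify API.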

Supported locations

  • 186 major US cities by name (e.g., Austin, TX, Denver, CO, San Francisco, CA)
  • Any US ZIP code (e.g., 78701, 90210)
  • City + ZIP for precision (e.g., Austin, TX 78701)
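
A client-side pre-check can tell which of the three location forms a string uses before a run is paid for. This is only a sketch: the regular expressions are assumptions about reasonable input shapes, not the actor's own parser, and the function name is hypothetical.

```python
import re

ZIP_ONLY = re.compile(r"^\d{5}$")
CITY_STATE = re.compile(r"^[A-Za-z .'-]+,\s*[A-Z]{2}$")
CITY_STATE_ZIP = re.compile(r"^[A-Za-z .'-]+,\s*[A-Z]{2}\s+\d{5}$")


def classify_location(location):
    """Return 'zip', 'city', 'city_zip', or 'unknown' for a location string."""
    text = location.strip()
    if ZIP_ONLY.match(text):
        return "zip"
    if CITY_STATE_ZIP.match(text):
        return "city_zip"
    if CITY_STATE.match(text):
        return "city"
    return "unknown"


for example in ["78701", "Austin, TX", "Austin, TX 78701", "Austin"]:
    print(example, "->", classify_location(example))
```

A "city" result still depends on the city being among the supported names, so a ZIP code remains the safer input when in doubt.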

Output example

Each record includes all fields shown below. Addresses are USPS Publication 28 normalized. The canonical_key field (format: STREET UNIT|CITY|STATE|ZIP) serves as a unique property identifier for deduplication and cross-dataset joins.

{
"address": {
"street": "1800 LAVACA ST",
"unit": "APT 207",
"city": "AUSTIN",
"state": "TX",
"zip": "78701",
"county": "Travis",
"canonical_key": "1800 LAVACA ST APT 207|AUSTIN|TX|78701",
"address_undisclosed": false
},
"coordinates": {
"lat": 30.2805833,
"lng": -97.7412883,
"accuracy_type": "rooftop",
"source": "search_result",
"match_type": null,
"is_unit_precise": false
},
"source_tier": "full",
"price": {
"listing": 299999,
"sold": null,
"per_sqft": 341
},
"details": {
"beds": 2,
"baths": 2.0,
"sqft": 880,
"lot_sqft": 270,
"lot_acres": 0.006,
"year_built": 1966,
"property_type": "condo",
"stories": 1,
"garage": 1,
"hoa_monthly": 768.0,
"is_studio": false,
"unit_floor": 2
},
"listing": {
"status": "for_sale",
"days_on_market": 1,
"date_listed": "2026-03-31",
"date_sold": null,
"mls_id": "5286876",
"parcel_id": "01-2345-0207",
"parcel_id_type": "apn"
},
"price_history": [
{
"date": "2026-03-31",
"event": "Listed",
"price": 299999,
"source": "MLS",
"price_change": null,
"price_change_pct": null,
"price_change_null_reason": "no_prior_event",
"event_type": "sale"
},
{
"date": "2024-06-15",
"event": "Sold",
"price": 275000,
"source": "MLS",
"price_change": -10000,
"price_change_pct": -3.5,
"event_type": "sale"
}
],
"rental_history": [],
"tax_history": [
{
"year": 2025,
"assessed_value": 265000,
"tax_paid": 6825,
"source": "public_records"
},
{
"year": 2024,
"assessed_value": 250000,
"tax_paid": 6437,
"source": "public_records"
}
],
"tax_year_coverage": {
"start_year": 2024,
"end_year": 2025,
"populated_years": 2,
"total_years_known": 2,
"trimmed_rows": 0,
"missing_years": []
},
"sources": [
{
"source_name": "redfin",
"source_id": "173769598",
"source_url": "https://www.redfin.com/TX/Austin/1800-Lavaca-St-78701/unit-207/home/173769598",
"scraped_at": "2026-04-01T18:35:01.536935"
}
],
"data_quality": {
"overall_confidence": "single_source",
"source_count": 1,
"sources": ["redfin"],
"freshest_scrape": "2026-04-01T18:35:01.536935",
"discrepancies": [],
"is_valid": true,
"completeness_score": 0.9167,
"field_confidence": {
"listing_price": "single_source",
"price_per_sqft": "single_source",
"beds": "single_source",
"baths": "single_source",
"sqft": "single_source",
"year_built": "single_source",
"property_type": "single_source",
"status": "single_source",
"hoa_monthly": "single_source",
"garage": "single_source",
"parcel_id": "single_source",
"description": "single_source",
"date_listed": "single_source",
"lot_sqft": "single_source",
"stories": "single_source"
},
"value_provenance": {
"listing_price": {
"value": 299999,
"source": "redfin",
"source_authority": "mls",
"fetched_at": "2026-04-01T18:35:01.536935",
"confidence": "single_source"
},
"beds": {
"value": 2,
"source": "redfin",
"source_authority": "mls",
"fetched_at": "2026-04-01T18:35:01.536935",
"confidence": "single_source"
}
}
},
"census_tract": "48453000700",
"fips_code": "48453",
"cbsa_code": "12420",
"placekey": "22y@8t2-dxs-fcq",
"description": "Stunning downtown Austin condo with floor-to-ceiling windows...",
"government_data": {
"acs_granularity": "tract",
"bike_score": 87,
"brownfield_granularity": "property",
"brownfield_within_1mi": false,
"crime_data_granularity": "county",
"crime_data_year": 2023,
"disaster_count_10yr": 5,
"disaster_granularity": "county",
"flood_disaster_count": 1,
"flood_zone": "X",
"flood_zone_granularity": "property",
"flood_zone_type": "moderate",
"fmr_1br": 1562,
"fmr_2br": 1852,
"fmr_3br": 2264,
"fmr_4br": 2680,
"fmr_efficiency": 1280,
"fmr_granularity": "county",
"fmr_year": 2025,
"hmda_approval_rate": 0.786,
"hmda_granularity": "county",
"hmda_mean_loan": 312000,
"hpi_12mo_change": -2.3,
"hpi_granularity": "msa",
"irs_soi_granularity": "zip",
"last_disaster_date": "2023-08-15",
"last_disaster_type": "Hurricane",
"median_age": 40.9,
"median_home_value": 673500,
"median_household_income": 183808,
"migration_net_flow": 2683,
"mortgage_rate_30yr": 6.37,
"mortgage_rate_granularity": "national",
"oes_granularity": "msa",
"oes_median_hourly_wage": 25.29,
"owner_occupied_pct": 0.4084,
"population": 3923,
"property_crime_rate": 1369.7,
"qcew_avg_weekly_wage": 1891,
"qcew_granularity": "county",
"qcew_total_employment": 929235,
"renter_occupied_pct": 0.5916,
"transit_score": 67,
"vacancy_rate": 0.1091,
"violent_crime_rate": 292.9,
"walk_score": 95,
"walk_score_granularity": "property",
"zip_median_agi": 143448
},
"enrichment_failures": [],
"validation_warnings": [],
"schema_version": 10,
"build": "98efb2a"
}
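
The census_tract value in a record like the one above is an 11-digit Census GEOID: a 2-digit state FIPS code, a 3-digit county FIPS code, and a 6-digit tract code. A minimal sketch of splitting it for joins follows; the function name and the returned key names are illustrative.

```python
def split_tract_geoid(geoid):
    """Split an 11-digit tract GEOID into state, county, and tract parts."""
    if len(geoid) != 11 or not geoid.isdigit():
        raise ValueError("expected an 11-digit tract GEOID")
    return {
        "state_fips": geoid[:2],
        "county_fips": geoid[2:5],
        "tract_code": geoid[5:],
        "fips_code": geoid[:5],  # same 5-digit county code as the fips_code field
    }


parts = split_tract_geoid("48453000700")
print(parts)
```

Keeping the parts as strings preserves leading zeros, which matter when joining against Census tables.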

Field notes:

  • canonical_key: Format is STREET UNIT|CITY|STATE|ZIP. Useful as a unique property identifier across datasets. Dedup Layer 1 matches on the address portion (without unit), then sub-groups by unit to preserve distinct condo units.
  • source_tier: "full" for MLS-connected sources (Redfin, Realtor.com, Zillow) with detail page enrichment, price/tax history, descriptions, and parcel IDs. "basic" for non-MLS sources (FSBO, foreclosure, government_reo, land_bank, epropertyplus, zeus_auction) with listing-level data only. Basic-tier records have higher null rates on detail fields — use completeness_score alongside this field to weight records appropriately.
  • price.sold: Only populated when listing.status is "sold". Always null for active listings.
  • fips_code: 5-digit county FIPS code (state + county). Example: "48453" = Travis County, TX (state 48 + county 453).
  • census_tract: 11-digit Census GEOID (state FIPS + county FIPS + tract code). Directly joinable to Census ACS data without transformation. Example: "48453000700" = Travis County, TX (state 48 + county 453 + tract 000700).
  • lot_sqft: For single-family homes, this is the lot area. For condos, this may represent a per-unit allocation of the building footprint (often very small or null). Treat lot_sqft as unreliable for condo records.
  • tax_history.assessed_value: This is the jurisdiction's official assessed value for tax purposes. In some states (e.g., Tennessee), assessed value is a fraction of appraised value (TN uses 25% for residential). Downstream consumers expecting full market-appraised value should check the state assessment ratio. A $340K listing with $83K assessed value in TN is consistent, not an error.
  • price_history / tax_history: Only populated when includeDetails=true. Price history contains listing, sale, and price change events. Tax history contains annual assessed values from public records. Both are empty arrays ([]) when detail pages are not fetched. Price history for condos may include building-level transactions (bulk sales) that don't represent individual unit values.
  • stories / garage / date_listed: Only populated when includeDetails=true. These fields come from Redfin detail pages, not the search API. Note: For condos and multi-unit buildings, stories is the building floor count (e.g., 39 for a high-rise), not the unit's interior layout. Single-family homes report the actual number of levels.
  • value_provenance: Per-field audit trail showing the resolved value, source, source authority (mls/public_records/derived), fetch timestamp, and confidence tier. Present for all 16 scored fields in the closed-key invariant. Enables programmatic auditing of any field's origin.
  • is_valid: true when the record passes all hard validation gates. false on records with impossible timelines, major unresolved conflicts, or critically low data quality. Invalid records are filtered from output by default.
  • government_data: Only present when includeGovernmentData=true. Contains 48 fields (35 data + 13 granularity metadata) from 13 free APIs. No paid external API subscriptions required — all use free government data sources, optionally unlocked by free-tier keys (Walk Score, HUD, FBI, BLS, FRED) that take ~5 minutes each to register:
    • Census ACS (tract-level): median income, home value, vacancy rate, population, median age, owner/renter pct. No key needed.
    • Walk Score (per-property): walk, transit, bike scores (0-100). Requires WALKSCORE_API_KEY.
    • HUD FMR (county-level): fair market rent by bedroom count. Requires HUD_API_TOKEN.
    • FEMA Disasters (county-level): disaster count (10yr), flood disasters, last disaster. No key needed.
    • FBI Crime (county or state-level): violent/property crime rates per 100K via FBI CDE agency aggregation (county when agency data available, state fallback). Check crime_data_granularity per record. Requires FBI_API_KEY.
    • FHFA HPI (metro-level): house price index 12-month change. No key needed (bulk CSV).
    • IRS SOI (ZIP/county-level): median AGI, migration net flow. No key needed (bulk CSV).
    • FEMA NFHL (per-property): flood zone code and risk classification. No key needed.
    • CFPB HMDA (county-level): mortgage approval rate, mean loan amount. No key needed.
    • BLS QCEW (county-level): total employment, average weekly wage. No key needed.
    • BLS OES (metro-level): median hourly wage. Requires BLS_API_KEY (free).
    • FRED (national): 30-year mortgage rate. Requires FRED_API_KEY (free).
    • EPA Brownfields (per-property): Superfund site within 1 mile. No key needed.
    Each API degrades gracefully: missing keys or API failures skip those fields without affecting others.
  • validation_warnings: Soft warnings about data quality issues detected during processing (e.g., unusually short addresses, implausible values that were corrected). An empty array means no issues detected.
  • overall_confidence: "single_source" for Redfin-only runs. Multi-source runs produce "high", "medium", or "low" based on cross-source agreement.
  • completeness_score: Tiered weighted score from 0.0 to 1.0. Required (weight 1.0): address, coordinates, price, beds, baths, sqft, property_type, status. Important (weight 0.5): description, year_built, HOA, parcel_id, date_listed, price/tax history. Enrichment (weight 0.2): census_tract, fips_code, cbsa_code, placekey, government_data. Fields that don't apply (e.g., beds/baths for land) are excluded from the denominator. Typical range: 0.67–0.95.
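
The tiered weighting described for completeness_score can be sketched as follows. This is an illustration under stated assumptions, not the actor's implementation: the flat field names are hypothetical stand-ins for the nested record fields, and the exact formula behind published scores (such as 0.9167 in the sample record) is not documented here.

```python
WEIGHTS = {"required": 1.0, "important": 0.5, "enrichment": 0.2}

# Hypothetical flat names for the fields listed in the completeness_score note.
TIERS = {
    "required": ["address", "coordinates", "price", "beds", "baths",
                 "sqft", "property_type", "status"],
    "important": ["description", "year_built", "hoa", "parcel_id",
                  "date_listed", "price_history", "tax_history"],
    "enrichment": ["census_tract", "fips_code", "cbsa_code", "placekey",
                   "government_data"],
}


def completeness_score(present, not_applicable=frozenset()):
    """Weighted share of applicable fields that are populated (0.0 to 1.0)."""
    earned = 0.0
    total = 0.0
    for tier, fields in TIERS.items():
        for field in fields:
            if field in not_applicable:
                continue  # excluded from the denominator, e.g. beds for land
            total += WEIGHTS[tier]
            if field in present:
                earned += WEIGHTS[tier]
    return round(earned / total, 4) if total else 0.0


all_fields = {f for fields in TIERS.values() for f in fields}
score = completeness_score(all_fields - {"parcel_id", "tax_history"})
land_score = completeness_score(all_fields - {"beds", "baths"},
                                not_applicable={"beds", "baths"})
print(score, land_score)
```

In this sketch, missing two important-tier fields costs 1.0 of 12.5 weight points, while a land record with beds and baths excluded can still reach a full score.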

Use cases

  • Real estate investors: Screen properties across markets, build comp databases, track days on market
  • Data analysts: Bulk export to CSV for market trend analysis, price-per-sqft comparisons
  • Proptech developers: Feed property data into apps, CRMs, or investment calculators via Apify API
  • Market researchers: Compare listing inventory across ZIP codes and cities

Integrations

Results are stored in an Apify Dataset and can be:

  • Downloaded as JSON or CSV from the Apify Console
  • Accessed via the Apify API
  • Pushed to Google Sheets, Slack, Webhooks, or any HTTP endpoint via Apify Integrations

This actor extracts factual, publicly listed property data only. It does not extract copyrighted content (photos), proprietary valuations (Zestimates), or personal agent information. MLS descriptions are included as factual listing data. All output is filtered through a legal compliance layer before delivery.

Users are responsible for complying with applicable terms of service and local regulations when using scraped data.

Data accuracy disclaimer

This data is scraped from publicly available real estate listing websites and is provided "as-is" with no warranty of accuracy, completeness, or timeliness.

  • Not a substitute for MLS access, title search, professional appraisal, or licensed real estate advice.
  • Point-in-time snapshot — data may change between runs. Listings may be updated, removed, or repriced at any time on the source websites.
  • No guarantee of accuracy — field values (price, sqft, year built, etc.) are extracted from third-party sources and may contain errors from the source data, parsing differences, or site updates.
  • User assumes all risk for any decisions made based on this data. The actor developer is not liable for any losses arising from reliance on scraped data.

Always verify critical data points against official sources (county records, MLS, or direct contact with listing agents) before making financial decisions.

Privacy policy

What data is collected: This actor collects only publicly listed property data from real estate websites. It does NOT collect personal information about property owners, real estate agents, or end users of this actor.

Personal data handling:

  • Agent names, emails, and photos are actively stripped from output by the legal compliance filter (blocked fields list).
  • Owner names and contact information are never extracted.
  • No personal information about users of this actor is collected, stored, or shared.

Data retention:

  • Output data is stored in your Apify Dataset with the platform's default retention period (7 days for free tier, configurable on paid plans).
  • No long-term storage of scraped data is performed by this actor.
  • Run metadata and audit logs are stored in the run's KV store for the same retention period.

No sale of personal information: We do not sell, share, or disclose any personal information. This actor does not collect personal information in the first place.

CCPA / state privacy law compliance: Because this actor does not collect personal information as defined by CCPA (Cal. Civ. Code 1798.140(v)), the CCPA's consumer rights provisions (access, deletion, opt-out of sale) do not apply to the output data. If you believe your personal information has been inadvertently included in the output, contact us and we will investigate and remove it.

Contact: For data subject requests, privacy inquiries, or to report inadvertently included personal data, email contact@vitkidata.com or open an issue on the actor's GitHub repository.

Limitations & known issues

Source-specific

  • 350 results per location. Redfin limits search results to 350. For larger datasets, run multiple ZIP codes in the same metro area.
  • Rental listings not supported. Redfin only serves for_sale and sold listings. Rental data is available through experimental sources (Zumper).
  • Redfin address aliasing. Some Redfin listings use a different address in the URL than the canonical property address (e.g., URL says "311 Tennessee St" but the property is "335 Riverbluff Pl"). This can cause the detail page response to return incomplete data — specifically, tax_history and sometimes price_history may be empty on these records even when other records from the same run have full history. The source URL will not match address.street on affected records.
  • Basic-tier source limitations. Non-MLS sources (source_tier: "basic") — FSBO, foreclosure, government_reo, land_bank, epropertyplus, zeus_auction — provide listing-level data only. Expect null values for: description, parcel_id, mls_id, date_listed, days_on_market, price_history, and tax_history. The completeness_score field accurately reflects this (~0.74 vs ~0.92 for full-tier records).

Data quality

  • Listing prices are MLS snapshots. Prices reflect the MLS data at scrape time. Properties that are re-listed (new MLS ID, new price) after scraping will show stale prices until the next run. This is inherent to scraping — there is no real-time feed. For investment decisions, verify current pricing independently. The sources[].scraped_at timestamp shows when each record was captured.
  • price.listing equals price.sold on sold records (search only). Without includeDetails=true, Redfin's search API provides only the final sale price for both fields. With details enabled, the original listing price is extracted from price history.
  • ~5-10% geocoding gap. Geocodio cannot resolve some addresses (highway exits, PO boxes, undisclosed). These records have null census_tract, fips_code, and county. Source coordinates are still included.
  • Crime data is county or state-level. FBI crime rates are aggregated from agency ORI reports (county-level when data available, state-level fallback). Check crime_data_granularity field per record. May differ from city-level rates. Source: FBI Crime Data Explorer API. Ref: docs/official/enrichment/fbi-crime-data-explorer.md.
  • Condo lot_sqft. For condos, lot_sqft may represent a per-unit allocation of the building footprint or the whole building parcel, not land belonging to the individual unit. Treat as unreliable for condo records.

Enrichment coverage by market size

  • Metro areas (pop >250K): All 48 government fields available.
  • Small metros/micropolitan: HPI (FHFA) and OES wages (BLS) may be null — these APIs cover ~100-530 MSAs only.
  • Rural areas: Transit score, HPI, and OES wages will be null. This is expected — these APIs don't cover rural markets.

Usage notes & rate limits

  • Geocodio: Free tier allows 2,500 lookups per day. The actor uses ~1 lookup per property for geocoding enrichment. Runs exceeding ~2,200 properties per day may see reduced enrichment (census tract, FIPS code, county will be null for records exceeding the quota). Large runs should be spread across multiple days for full enrichment.
  • Placekey: Free tier allows 100,000 lookups per month. Not typically a bottleneck for most use cases.
  • Residential proxies: Realtor.com requires residential proxies, which cost more than datacenter proxies. Apify auto-configures this when you select the realtor source. Redfin works with cheaper datacenter proxies.
  • Run scheduling: For recurring data collection, consider scheduling runs during off-peak hours and spreading large datasets across multiple runs to stay within enrichment API limits.
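
The advice to spread large runs across days can be turned into a simple batch plan. The sketch below assumes the ~2,200 properties per day figure quoted above as the working budget and about one Geocodio lookup per property; the function name is illustrative.

```python
import math


def plan_daily_batches(total_properties, daily_budget=2200):
    """Split a target property count into per-day batch sizes."""
    if daily_budget <= 0:
        raise ValueError("daily_budget must be positive")
    days = math.ceil(total_properties / daily_budget)
    batches = []
    remaining = total_properties
    for _ in range(days):
        size = min(daily_budget, remaining)
        batches.append(size)
        remaining -= size
    return batches


batches = plan_daily_batches(5000)
print(batches)
```

Each batch size can then be used as maxResults for one scheduled run, keeping geocoding enrichment within the free-tier quota.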

FAQ

Q: What does includeGovernmentData do? When includeGovernmentData=true, the actor enriches each property with 48 fields (35 data + 13 granularity metadata) from 13 free APIs: Census ACS (demographics), Walk Score, HUD FMR (rent benchmarks), FEMA (disasters + flood zones), FBI Crime, FHFA HPI (price appreciation), IRS SOI (income + migration), CFPB HMDA (mortgage data), BLS QCEW (employment), BLS OES (wages), FRED (mortgage rate), and EPA (brownfield proximity). No paid API subscriptions required. Optional free keys: WALKSCORE_API_KEY, HUD_API_TOKEN, FBI_API_KEY, BLS_API_KEY, FRED_API_KEY (all free registration, ~5 minutes each). Each API degrades gracefully — missing keys skip those fields. Data is cached by geography.

Q: What does includeDetails do? When includeDetails=true, the actor fetches each property's individual Redfin detail page after the initial search. This unlocks: price history (listing, sale, and price change events), tax history (annual assessed values), date listed, stories, garage info, and the original list price on sold records. This adds 2-5 seconds per property to the run time and costs ~$0.003 per property in PPE charges.

Q: How fast is it? A typical single-source run of 100 properties completes in under 30 seconds. With includeDetails=true, add 2-5 seconds per property (100 properties takes ~4-9 minutes). Multi-source runs (3-5 sources) take 2-3 minutes due to residential proxy negotiation and rate limiting. Sources run in parallel.

Q: Can I get more than 350 properties per location? Each search returns up to 350 properties per location (a Redfin platform limit). For larger datasets, run multiple searches with different ZIP codes in the same metro area.
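
When several ZIP-code runs cover the same metro area, the same property can appear in more than one result set. A minimal sketch of merging them follows, assuming records shaped like the output example (address.canonical_key and sources[].scraped_at) and that all timestamps share the same ISO 8601 format so string comparison orders them correctly. The function name is hypothetical.

```python
def merge_runs(*runs):
    """Merge result lists, keeping the freshest record per canonical_key."""
    merged = {}
    for run in runs:
        for record in run:
            key = record["address"]["canonical_key"]
            scraped = max(s["scraped_at"] for s in record["sources"])
            kept = merged.get(key)
            if kept is None or scraped > kept[0]:
                merged[key] = (scraped, record)
    return [record for _, record in merged.values()]


# Illustrative records; the second address is made up for the example.
zip_78701 = [
    {"address": {"canonical_key": "1800 LAVACA ST APT 207|AUSTIN|TX|78701"},
     "sources": [{"scraped_at": "2026-04-01T18:35:01.536935"}]},
]
zip_78705 = [
    {"address": {"canonical_key": "1800 LAVACA ST APT 207|AUSTIN|TX|78701"},
     "sources": [{"scraped_at": "2026-04-02T09:00:00.000000"}]},
    {"address": {"canonical_key": "100 EXAMPLE ST|AUSTIN|TX|78705"},
     "sources": [{"scraped_at": "2026-04-02T09:00:05.000000"}]},
]

combined = merge_runs(zip_78701, zip_78705)
print(len(combined))
```

The duplicate condo record is kept once, using the later of the two scrapes.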

Q: Which sources should I use? The default (["redfin"]) is recommended. Redfin provides the most reliable coverage with full detail-page enrichment. Other sources are available as experimental options but may be unreliable on the Apify platform. For multi-source data with Zillow and Realtor.com, visit vitkidata.com.

Q: Do I need residential proxies? No. Redfin works with datacenter proxies, which are included with Apify's default proxy configuration. No extra setup needed.

Q: What happens if a source fails during a run? The actor uses graceful degradation. If one source fails (blocked, timeout, error), the run continues with data from the remaining sources. You'll see a warning in the run metadata indicating which source failed and why. You are not charged for failed sources.

Q: What happens if a location isn't recognized? If a city isn't in the lookup table, use a ZIP code instead. All US ZIP codes are supported.