591.com.tw Scraper | Taiwan Real Estate Data
Pricing
from $0.70 / 1,000 property listings
Extract commercial and residential property listings across Taiwan from 591.com.tw with unlimited coverage, rich listing detail, contact data, media, pricing, and community context. Built for enterprise-grade Taiwan real estate intelligence, deal sourcing and automated analytics pipelines.
Developer
Fatih Tahta
Maintained by Community
591.com.tw Scraper | Taiwan Property Data
Slug: fatihtahta/591-taiwan-scraper
Overview
591.com.tw Scraper | Taiwan Property Data collects structured Taiwan property records from 591 search result pages, including sale, rental, commercial, land, and community-oriented property data. It captures listing identifiers, titles, public listing URLs, pricing, location details, property attributes, media, contact information, amenities, metrics, flags, and source context when available. 591.com.tw is one of Taiwan's major property marketplaces, making its public listing data useful for property research, market monitoring, lead review, and operational reporting. The actor turns repeatable 591 search URLs into consistent JSON records that can be used in analytics, enrichment, monitoring, and downstream data acquisition workflows. It is designed for dependable recurring collection patterns while reflecting the public data available at the time each run is executed.
Why Use This Actor
- Market research and analytics teams: build structured extraction workflows for pricing, availability, location coverage, property types, and market intelligence across Taiwan property segments.
- Product and content teams: maintain property catalogs, comparison experiences, local market pages, or internal review queues with normalized public listing records.
- Developers and data engineering teams: feed repeatable collection outputs into downstream systems, warehouses, search indexes, or enrichment pipelines with predictable JSON structure.
- Lead generation and enrichment teams: identify public listing opportunities, contact signals, location attributes, and property context for review and qualification workflows.
- Monitoring and competitive tracking teams: schedule recurring runs to observe listing movement, pricing changes, geographic coverage, and category-level activity over time.
Common Use Cases
- Market intelligence: monitor property supply, asking prices, unit prices, floor plans, districts, media volume, and listing availability across selected 591 result pages.
- Lead generation: build targeted prospect lists from public property listings that include contact names, contact methods, listing URLs, and location context when available.
- Competitive monitoring: track changes across sale, rental, commercial, land, or community result pages for recurring reporting.
- Catalog and directory building: populate internal databases with structured public property records, media references, pricing fields, and address metadata.
- Data enrichment: add current public listing attributes to CRM, BI, risk, underwriting, or analytics datasets.
- Recurring reporting: schedule periodic runs for dashboards, alerts, segment summaries, or operational reporting.
Quick Start
- Open 591.com.tw and create the search result page that matches your target segment, such as sale, rent, commercial, land, or community data.
- Copy the resulting 591 search URL and add it to `startUrls`.
- Set a small `limit` for the first validation run, such as 10 or 25 records per URL.
- Choose whether `enrich_data` should collect richer listing details, contact context, amenities, photos, and related fields when available.
- Run the actor in Apify Console and inspect the first dataset records to confirm the shape matches your workflow.
- Increase the limit, add additional URLs, or schedule the actor after the output is verified.
Input Parameters
The input schema accepts 591 search result URLs, an optional per-URL result limit, and an enrichment toggle.
| Parameter | Type | Description | Default |
|---|---|---|---|
| `startUrls` | array of strings | Supported 591 result URLs to collect from. Add sale, rental, business, land, or community search pages as separate URLs. Use 591's public filters first, then paste the resulting search URLs here. | – |
| `limit` | integer | Maximum number of records to save for each URL. Minimum value is 1. Leave empty to collect available results without a per-URL cap. | – |
| `enrich_data` | boolean | When enabled, records may include richer public details such as descriptions, layout information, amenities, scores, transaction history, contact details, and additional photos. Disable it when search-level fields are enough. | true |
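The parameter table above can be mirrored in a small helper that assembles the input and rejects obviously invalid values before a run is started. This is a minimal sketch: `build_run_input` is a hypothetical helper, not part of the actor, and the host check assumes that supported pages live on `591.com.tw` or one of its subdomains.

```python
from urllib.parse import urlparse


def build_run_input(start_urls, limit=None, enrich_data=True):
    """Assemble an actor input dict, rejecting obviously invalid values early."""
    if not start_urls:
        raise ValueError("startUrls needs at least one 591 search URL")
    for url in start_urls:
        host = urlparse(url).hostname or ""
        # Assumption: supported pages are served from 591.com.tw or a subdomain.
        if host != "591.com.tw" and not host.endswith(".591.com.tw"):
            raise ValueError(f"not a 591.com.tw URL: {url}")
    run_input = {"startUrls": list(start_urls), "enrich_data": bool(enrich_data)}
    if limit is not None:
        # The table states a minimum of 1; booleans are rejected explicitly
        # because Python treats them as integers.
        if isinstance(limit, bool) or not isinstance(limit, int) or limit < 1:
            raise ValueError("limit must be an integer >= 1")
        run_input["limit"] = limit
    return run_input
```

Omitting `limit` leaves the key out of the input entirely, which matches the "leave empty" behavior described in the table.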
Choosing Inputs
Use startUrls as the primary way to define scope. Create the desired search on 591.com.tw first, including any public filters for location, segment, property category, price, area, or sort options, then add the resulting URL to the actor input. Narrower 591 search URLs produce more targeted datasets, while broader URLs improve discovery and may return a wider range of property records. For first runs, set a conservative limit so you can validate the output shape quickly before scaling collection. Use one segment or geography per URL when you need clean downstream segmentation; add multiple URLs when you want to collect several search scopes in one run.
Example Inputs
Sale Listing Validation Run
{"startUrls": ["https://sale.591.com.tw/?regionid=1&firstRow=0&shType=list"],"limit": 25,"enrich_data": true}
Rental Monitoring Run
{"startUrls": ["https://rent.591.com.tw/list?region=1&section=5"],"limit": 100,"enrich_data": false}
Multi-Segment Discovery Run
{"startUrls": ["https://business.591.com.tw/list?type=1&region=1&kind=5","https://land.591.com.tw/list?region=21&type=2&kind=11","https://market.591.com.tw/list?regionId=1&sectionId=5&shopId=74"],"limit": 50,"enrich_data": true}
Output
Output Destination
The actor writes results to an Apify dataset as JSON records. The dataset is designed for direct consumption by analytics tools, ETL pipelines, and downstream APIs with minimal post-processing.
Record Envelope And Stable Identifiers
Each dataset item is a normalized property_listing record. Top-level fields are stable buckets for provenance, identity, listing context, pricing, location, physical property attributes, availability, media, contact details, linked entities, metrics, and source-specific attributes.
The recommended idempotency key is record_id. Keep source_context.fingerprint and entity.url as supporting audit keys when reconciling repeated runs. Listing, community, housing, contact, and media identifiers are strings so downstream systems do not lose leading zeros or mix numeric IDs with measurements.
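As an illustration of using `record_id` as the idempotency key, the sketch below merges records from repeated runs into an in-memory store. The dict-based store and the `upsert_records` name are assumptions for the example; a warehouse or CRM sync would apply the same keying with its own upsert mechanism.

```python
def upsert_records(store, records):
    """Merge records into `store` (a dict keyed by record_id).

    Returns a tuple (inserted, updated). Later records overwrite earlier
    ones with the same record_id, so re-running a sync is idempotent.
    """
    inserted = updated = 0
    for record in records:
        # IDs are documented as strings; str() guards against stray numerics.
        key = str(record["record_id"])
        if key in store:
            updated += 1
        else:
            inserted += 1
        store[key] = record
    return inserted, updated


store = {}
first_run = [{"record_id": "19991644", "pricing": {"price": 1798}}]
second_run = [
    {"record_id": "19991644", "pricing": {"price": 1750}},
    {"record_id": "20000001", "pricing": {"price": 2100}},
]
print(upsert_records(store, first_run))
print(upsert_records(store, second_run))
```

The second call updates the existing listing and inserts the new one, so the store ends with one entry per `record_id`.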
Examples
Example: sale listing (listing_type = "sale")
{"record_type": "property_listing","record_id": "19991644","source_context": {"source_name": "591 Taiwan","source_domain": "591.com.tw","seed_id": "1f94cb3feb36","seed_type": "url","seed_value": "https://sale.591.com.tw/?firstRow=0&regionid=1&shType=list","page_index": 1,"source_url": "https://bff-house.591.com.tw/v1/web/sale/list?category=1&firstRow=0&regionid=1&shType=list","listing_url": "https://sale.591.com.tw/home/house/detail/2/19991644.html","detail_url": "https://bff-house.591.com.tw/v1/web/sale/detail?id=19991644","external_ids": {"house_id": "19991644","id": "19991644"},"fingerprint": "a052dfe5e63e61c86d78"},"entity": {"title": "台北南港區投資型精品宅","description": "台北東區新門戶 快速抵達101信義商圈","url": "https://sale.591.com.tw/home/house/detail/2/19991644.html","external_ids": {"house_id": "19991644","id": "19991644"}},"listing": {"listing_id": "19991644","listing_type": "sale","refreshed_at": "34分鐘前","is_sponsored": false},"pricing": {"price": 1798,"show_price": "1,798","unit_price": "119.71萬/坪"},"location": {"address": "舊莊街一段145巷","region_id": 1,"region_name": "台北市","section_id": 11,"section_name": "南港區"},"property": {"area": 15.02,"floor": "5F/12F","house_age": 1,"kind": 9,"kind_name": "住宅","main_area": 0,"room": "2房1廳1衛","shape_name": "電梯大樓","show_house_age": "1年","amenities": {"condition_ids": ["9", "29", "43"]}},"media": {"photo_count": 11,"photo_url": "https://img1.591.com.tw/house/2025/07/01/175133721412285106.jpg!400x300.jpg"},"contact_details": {"contact_name": "屋主金小姐","contact_phone": "0972-528-589轉4083674"},"relationships": {"community": {"community_name": "依現場名稱"}},"metrics": {"browse_count": 10541,"community_sale_count": 0},"attributes": {"flags": {"has_carport": false,"is_down_price": false,"is_new": false,"is_vip": false},"source_specific": {"house_type": 1,"sale_type": 1,"operation_tag": {"type": 1,"title": "南港區 住宅收藏排名第2名"}}}}
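The pricing fields in the sample record above are consistent with each other if `pricing.price` is expressed in 萬 (units of 10,000 TWD) and `property.area` in 坪. That reading is inferred from this one sample and is not a documented guarantee. The sketch below parses the display string and reproduces the unit price from price and area; `parse_unit_price` is a hypothetical helper.

```python
import re


def parse_unit_price(text):
    """Return the numeric part of strings like '119.71萬/坪', or None if no match."""
    match = re.fullmatch(r"([\d,]+(?:\.\d+)?)萬/坪", text.strip())
    return float(match.group(1).replace(",", "")) if match else None


# Values copied from the sample sale record.
price = 1798          # pricing.price (assumed to be in 萬)
area = 15.02          # property.area (assumed to be in 坪)
displayed = parse_unit_price("119.71萬/坪")

# Unit price = price / area, rounded to two decimals like the display value.
computed = round(price / area, 2)
print(displayed, computed)
```

A check like this is a cheap sanity test when loading records, but other listing types may format the string differently, so treat a `None` result as "unparsed" rather than as an error.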
Field Reference
record_type (string, required): Stable row discriminator. Normal 591 rows use property_listing.
record_id (string, required): Stable source-specific listing, community, housing, or fallback fingerprint identifier.
source_context (object, required): Provenance and audit context, including source name, source domain, source URL, listing URL, detail URL when enrichment is enabled, seed details, page index, external IDs, and fingerprint.
entity (object, required): Human-facing identity fields such as title, description, url, and external_ids.
listing (object, required): Listing business context such as listing_id, listing_type, deal_type, posted_at, refreshed_at, is_new_listing, is_featured, and is_sponsored.
pricing (object, optional): Price, display price, unit price, deposit, taxes, mortgage, rent includes, and other source-provided price details.
location (object, optional): Address, region, section, street, coordinates, transit, shopping district, and surrounding location context.
property (object, optional): Physical property details such as kind, shape, room layout, area, floor, age, land details, commercial details, parking, legal use, and property.amenities.
availability (object, optional): Move-in, minimum lease, and rental availability fields when present.
media (object, optional): Primary photo, photo URLs, picture details, videos, covers, avatars, 3D map images, floor plans, and media counts.
contact_details (object, optional): Public contact names, phones, emails, contact URLs, company labels, and preferred contact method.
relationships (object, optional): Linked real estate entities, including community/building details, agent records, agency/shop details, and related listing references when available.
metrics (object, optional): Browse counts, sale/rent counts, transaction counts, price counts, community scores, ranks, video plays, and other source-provided counts.
attributes (object, optional): Preservation bucket for meaningful source-specific data. This includes attributes.flags, attributes.market_activity, and attributes.source_specific.
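The required fields in the reference above lend themselves to a lightweight check before ingestion. The following is a sketch only: the type mapping follows the field reference, and `missing_required` is a hypothetical helper name.

```python
# Required top-level fields and their expected JSON types, per the field reference.
REQUIRED_FIELDS = {
    "record_type": str,
    "record_id": str,
    "source_context": dict,
    "entity": dict,
    "listing": dict,
}


def missing_required(record):
    """Return names of required top-level fields that are absent or mistyped."""
    return [
        name
        for name, expected in REQUIRED_FIELDS.items()
        if not isinstance(record.get(name), expected)
    ]


sample = {
    "record_type": "property_listing",
    "record_id": "19991644",
    "source_context": {"source_domain": "591.com.tw"},
    "entity": {"title": "台北南港區投資型精品宅"},
    "listing": {"listing_type": "sale"},
}
print(missing_required(sample))
print(missing_required({"record_id": 19991644}))
```

An empty list means the record carries every required bucket; a numeric `record_id` is reported because identifiers are documented as strings.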
Data Quality, Guarantees, And Handling
- Structured records: results are normalized into predictable JSON objects for downstream use.
- Best-effort extraction: fields may vary by region, session, availability, listing type, account visibility, and 591.com.tw interface changes.
- Optional fields: null-check optional fields in downstream code, especially media, contact, community, metric, and enriched detail fields.
- Deduplication: use `record_id` as the recommended stable key, with `entity.url` or `source_context.fingerprint` as supporting keys when needed.
- Freshness: results reflect the publicly available data at run time.
- Repeated runs: use the recommended idempotency key when syncing data into warehouses, CRMs, or search indexes.
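Because optional buckets such as `media`, `contact_details`, and `relationships` may be missing or null, downstream code benefits from a tolerant accessor. A minimal sketch, with `dig` as a hypothetical helper name:

```python
def dig(record, *path, default=None):
    """Walk nested dicts along `path`; return `default` if any step is missing or null."""
    current = record
    for key in path:
        if not isinstance(current, dict):
            return default
        current = current.get(key)
        if current is None:
            return default
    return current


record = {
    "record_id": "19991644",
    "media": {"photo_count": 11},
    "contact_details": None,  # optional bucket present but null
}
print(dig(record, "media", "photo_count"))
print(dig(record, "contact_details", "contact_phone", default=""))
print(dig(record, "relationships", "community", "community_name"))
```

Falsy but valid values such as `0` or `False` are returned as-is; only missing keys and explicit nulls fall back to the default.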
Tips For Best Results
- Start with a small `limit` to validate output shape before scaling up.
- Use one geography, segment, or 591 search URL per run when you need cleaner segmentation.
- Leave the per-URL `limit` empty only when you are ready to collect available records without a cap.
- Create filters on 591.com.tw first, then paste the resulting URL into `startUrls`.
- Add additional URLs gradually so you can understand how each search scope changes coverage.
- Enable `enrich_data` when you need detailed listing context; disable it when search-level fields are sufficient.
- Use `record_id` for deduplication when storing results over time.
How to Run on Apify
- Open the Actor in Apify Console.
- Configure `startUrls` with the 591 result pages you want to collect.
- Set the maximum number of outputs to collect per URL with `limit`.
- Choose whether to collect richer listing details with `enrich_data`.
- Click Start and wait for the run to finish.
- Download results in JSON, CSV, Excel, or other supported formats.
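The same run can also be started programmatically. The sketch below uses only the Python standard library and the Apify API's synchronous "run and get dataset items" endpoint. The actor ID is derived from the slug by replacing `/` with `~`. The `build_run_request` and `run_actor` names, the bearer-token header, and the 300-second timeout are assumptions chosen for illustration. Synchronous runs are time-limited, so this pattern suits small validation runs rather than large collections.

```python
import json
import urllib.request

# Slug from this page, with "/" replaced by "~" for use in the API path.
ACTOR_ID = "fatihtahta~591-taiwan-scraper"


def build_run_request(token, run_input):
    """Build (but do not send) a POST request that runs the actor synchronously."""
    url = f"https://api.apify.com/v2/acts/{ACTOR_ID}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(token, run_input, timeout=300):
    """Send the request and return the dataset items as a list of dicts."""
    request = build_run_request(token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


example_input = {
    "startUrls": ["https://sale.591.com.tw/?regionid=1&firstRow=0&shType=list"],
    "limit": 10,
    "enrich_data": False,
}
request = build_run_request("YOUR_APIFY_TOKEN", example_input)
print(request.get_method(), request.full_url)
```

Calling `run_actor` with a real token performs the network request; building the request separately makes the payload easy to inspect or log first.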
Scheduling & Automation
Scheduling
Automated Data Collection
Use Apify schedules to run the actor on a recurring basis and keep property datasets fresh for monitoring, reporting, and enrichment workflows. Scheduled runs are useful for recurring market snapshots, lead review queues, and listing-change analysis.
- Navigate to Schedules in Apify Console
- Create a new schedule, such as daily, weekly, or custom cron
- Configure input parameters
- Enable notifications for run completion
- Add webhooks for automated processing
Integration Options
- BI dashboards: monitor pricing, availability, geographic coverage, listing volume, and property attributes over time.
- Data warehouses: store normalized property records for historical analysis, modeling, and operational reporting.
- CRM enrichment: sync public listing, location, contact, and property attributes into account or lead records.
- Google Sheets or Airtable: review selected listings, maintain lightweight research queues, or share curated datasets with non-technical teams.
- Webhooks: trigger validation, ingestion, notification, or enrichment workflows after each completed run.
- Alerts and scheduled reports: notify teams when new records, pricing movement, or segment changes appear in recurring collections.
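For the alerting and monitoring integrations above, one possible approach is to diff two run snapshots by `record_id`. This sketch assumes each snapshot is a list of dataset records already loaded into memory; `price_changes` is a hypothetical helper, and comparing `pricing.price` is just one choice of signal.

```python
def price_changes(previous, current):
    """Compare two snapshots and list new listings and price movements.

    Returns tuples of (record_id, kind, old_price, new_price), where kind
    is "new" or "changed". Records without pricing are treated as price None.
    """
    old_prices = {
        r["record_id"]: (r.get("pricing") or {}).get("price") for r in previous
    }
    changes = []
    for record in current:
        record_id = record["record_id"]
        new_price = (record.get("pricing") or {}).get("price")
        if record_id not in old_prices:
            changes.append((record_id, "new", None, new_price))
        elif old_prices[record_id] != new_price:
            changes.append((record_id, "changed", old_prices[record_id], new_price))
    return changes


yesterday = [{"record_id": "19991644", "pricing": {"price": 1798}}]
today = [
    {"record_id": "19991644", "pricing": {"price": 1750}},
    {"record_id": "20000001", "pricing": {"price": 2100}},
]
for change in price_changes(yesterday, today):
    print(change)
```

Listings that disappear between runs are not reported here; detecting removals would need a second pass over the previous snapshot.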
Export Formats And Downstream Use
Apify datasets can be exported from the run page or consumed by downstream systems. Use the format that matches your delivery and review workflow.
- JSON: for APIs, applications, and data pipelines
- CSV or Excel: for spreadsheet workflows and manual review
- API access: for automated ingestion into internal systems
- BI and warehouses: for reporting, dashboards, and historical analysis
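When a spreadsheet needs only a handful of columns, the nested JSON records can be flattened locally instead of relying on a full export. The column selection below is an arbitrary example and `to_csv` is a hypothetical helper; missing optional fields become empty cells.

```python
import csv
import io

# Output column name -> path into the nested record (example selection).
COLUMNS = {
    "record_id": ("record_id",),
    "title": ("entity", "title"),
    "listing_type": ("listing", "listing_type"),
    "price": ("pricing", "price"),
    "region_name": ("location", "region_name"),
}


def to_csv(records):
    """Flatten selected nested fields into CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS)  # iterating the dict yields the column names
    for record in records:
        row = []
        for path in COLUMNS.values():
            value = record
            for key in path:
                value = value.get(key) if isinstance(value, dict) else None
            row.append("" if value is None else value)
        writer.writerow(row)
    return buffer.getvalue()


records = [
    {
        "record_id": "19991644",
        "entity": {"title": "台北南港區投資型精品宅"},
        "listing": {"listing_type": "sale"},
        "pricing": {"price": 1798},
        "location": {"region_name": "台北市"},
    },
    {"record_id": "20000001", "entity": {"title": "Sparse record"}},
]
print(to_csv(records))
```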
Performance
Estimated execution times:
- Small runs (< 1,000 outputs): ~3-5 minutes
- Medium runs (1,000-5,000 outputs): ~5-15 minutes
- Large runs (5,000+ outputs): ~15-30 minutes
Execution time varies based on filters, result volume, and how much information is returned per record. Highly filtered runs can finish faster, while broad discovery or detail-rich records may take longer.
Limitations
- Availability depends on what https://591.com.tw publicly exposes at run time.
- Some optional fields may be missing on sparse records, older listings, or records without enriched public detail.
- Very broad searches may take longer or require higher limits.
- Target-side changes can affect field availability, naming, or visibility.
- Regional, account, listing type, or availability differences may change visible results.
- Contact, media, community, and metric fields should be treated as optional.
Troubleshooting
- No results returned: check that each `startUrls` value is a supported 591 search result URL and that the target page has matching public records.
- Fewer results than expected: broaden the 591 filters, raise `limit`, or verify that the target contains enough matching records.
- Some fields are empty: optional fields depend on what each record publicly provides and whether enriched details are available.
- Run takes longer than expected: reduce scope, lower `limit` for validation, or split broad collection into smaller URL segments.
- Output changed: compare the current output with the field reference and report a small sample if support is needed.
FAQ
What data does this actor collect?
It collects public 591.com.tw property records from supported search result URLs, including listing identifiers, titles, URLs, pricing, locations, property attributes, media, contacts, amenities, metrics, and enriched detail fields when available.
Can I filter by location, category, date, price, or other criteria?
Yes, when those filters are available on 591.com.tw. Apply the filters on 591 first, then paste the resulting search URL into startUrls.
Why did I receive fewer results than my limit?
The limit is a maximum per URL, not a guarantee. A run may return fewer records if the search page contains fewer matching public listings or if some records are unavailable at run time.
Can I schedule recurring runs?
Yes. Use Apify schedules to run the actor daily, weekly, or on a custom cron schedule for monitoring, reporting, and recurring enrichment workflows.
How do I avoid duplicates across runs?
Use record_id as the primary idempotency key. Keep entity.url and source_context.fingerprint as supporting keys when syncing records into warehouses, CRMs, or search indexes.
Can I export the data to CSV, Excel, or JSON?
Yes. Apify datasets support exports in JSON, CSV, Excel, and other formats from the run page.
Does this actor collect private data?
The actor is intended to collect publicly available property listing information from 591.com.tw. Users are responsible for using collected data lawfully and responsibly.
Should I enable enrichment?
Enable enrich_data when you need richer property, media, contact, amenity, or community context. Disable it for faster runs when search-level fields are enough.
What should I include when reporting an issue?
Include the input used, the run ID, expected versus actual behavior, and a small output sample when relevant. Redact any sensitive workflow details before sharing.
Compliance & Ethics
Responsible Data Collection
This actor collects publicly available Taiwan property listing information from https://591.com.tw for legitimate business purposes, including:
- Real estate research and market analysis
- Property monitoring and operational reporting
- Lead review, enrichment, and data quality workflows
This section is informational and not legal advice. Users are responsible for ensuring their use of the actor and collected data complies with applicable laws, regulations, contractual obligations, and platform terms.
Best Practices
- Use collected data in accordance with applicable laws, regulations, and the target site's terms
- Respect individual privacy and personal information
- Use data responsibly and avoid disruptive or excessive collection
- Do not use this actor for spamming, harassment, or other harmful purposes
- Follow relevant data protection requirements where applicable, including GDPR and CCPA
Support
For help, use the actor page or Issues section. Include the input used with sensitive details redacted, the run ID, expected versus actual behavior, and a small output sample if it helps illustrate the issue.