3 days trial then $120.00/month - No credit card required now

Archived1 Scraper

dhrumil/archived1-scraper

Scrape and crawl millions of real estate properties listed for sale or rent. This real estate scraper also lets you monitor specific listings for newly added properties. You can provide multiple search result listings to scrape or monitor.

🏡 What is Real Estate Properties Scraper?

This properties scraper lets you scrape any sale or rent listing.

Simply copy the listing URL from your browser and enter it into this actor. The actor crawls through all pages of that listing and generates a dataset for you.

A listing URL is the URL you get when you perform a search on the site.

🚪 What can this Scraper do?

📈 Extract market data listings

👀 This actor is not just a scraper; it also has monitoring capability. Turn on monitoring mode and it returns only the properties newly added since your previous scrapes.

📩 This actor also helps you identify which properties are no longer listed. Please refer to Identifying delisted properties.

⬇️ Download real estate data in Excel, CSV, JSON, and other formats

🌳 What data can I extract using this tool?

| 📝 | 📝 |
| --- | --- |
| Listing Title | Full Address |
| Listing URL | ReferenceNo |
| reraDedLicenceNo | Completion Status |
| Bathrooms | Bedrooms |
| Agent Name | Agent Phone |
| Listing Type | Property Type |
| Latitude | Longitude |
| Completion | Agency Email |
| Text Description | Formatted HTML Description |
| Amenities | Images |
| Price | Size |
| Furnishing | Listing Date |

⬇️ Input

For a simple use case, you only need to provide the browser URL of a search result page. You can leave the other fields at their sensible defaults.

Input example

{
    "listUrls": [
        {
            "url": ""
        }
    ],
    "propertyUrls": [
        {
            "url": ""
        }
    ],
    "fullScrape": true,
    "monitoringMode": false,
    "includePriceHistory": false,
    "enableDelistingTracker": false
}

You can either provide listUrls to search properties from or provide propertyUrls directly to crawl.

Understanding monitoring mode:

fullScrape : On by default. When enabled, it always forces the actor to scrape the complete listing from all pagination pages, regardless of whether monitoring is enabled.
monitoringMode : When turned on, the actor scrapes only property listings newly added since its previous scrapes. It's important to turn off fullScrape when using this mode; if you keep fullScrape on, the actor re-scrapes the complete listing.
includePriceHistory : When turned on, the actor also scrapes the price history of each property when available. This may slow scraping considerably, so turn it on only if you need this data.
enableDelistingTracker : When turned on, the actor records a last-seen date for each property in an Apify Key-value store. This store can be queried later to find out which properties are delisted.
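
As a rough illustration of how these options interact, the sketch below builds an input object in Python. The field names come from the input example above; the helper `build_input`, its parameters, and the example search URL are hypothetical, and it assumes that an empty `propertyUrls` list is acceptable. It simply keeps `fullScrape` and `monitoringMode` mutually exclusive, as the option descriptions recommend.

```python
import json


def build_input(list_urls, property_urls=None, monitoring=False,
                price_history=False, track_delisting=False):
    """Build an input dict using the field names from the input example."""
    return {
        "listUrls": [{"url": u} for u in list_urls],
        # Assumption: an empty propertyUrls list is accepted by the actor.
        "propertyUrls": [{"url": u} for u in (property_urls or [])],
        # Monitoring only returns new listings when fullScrape is off,
        # so this helper never enables both at once.
        "fullScrape": not monitoring,
        "monitoringMode": monitoring,
        "includePriceHistory": price_history,
        "enableDelistingTracker": track_delisting,
    }


# Hypothetical search URL, used purely for illustration.
run_input = build_input(["https://example.com/search?page=1"], monitoring=True)
print(json.dumps(run_input, indent=4))
```

The printed JSON can be pasted into the actor's input editor or passed to whichever client you use to start the run.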

⬆️ Output

The scraped data is stored in the dataset of each run. The data can be viewed or downloaded in many popular formats, such as JSON, CSV, Excel, XML, RSS, and HTML.

Output example

The result of scraping a single property looks like this:

{
	"title": "Super Deluxe Villa for sale in Helwan \\ corner on two streets, the second piece of the main street",
	"url": "",
	"images": [],
	"price": 3000000,
	"bedrooms": 5,
	"bathrooms": 8,
	"size": 8000,
	"coordinates": {
		"lat": 25.348499298096,
		"lng": 55.416801452637
	},
	"location": "Al Ghubaiba, Sharjah, UAE",
	"type": "sale",
	"description": "Super Deluxe Villa for sale in Helwan area in Sharjah <br> The land area is 8000 square feet, a corner on two streets, the second piece of the main street <br> The building area is 4600 square feet<br> The villa consists of:<br>5 master rooms, each room has a private bathroom with a dressing <br>4 on the top floor and one room on the ground floor <br>Two halls <br>big sitting room <br>A kitchen with two doors, internal and external. <br>A maid's room and an ironing room have an external door <br>The materials used in the villa are first class <br>Air conditioning duct general <br>Interlock from Arabic German does not change color <br><br>3 million dirhams required",
	"isVerified": false,
	"hasDLDHistory": false,
	"dldValidationUrl": null,
	"addedOn": "November 6, 2023",
	"agent": "mohamed",
	"agencyName": "Al Wasl Estste",
	"agencyEmail": "alwaslrealestate112@gmail.com",
	"amenities": [
		"Maids Room",
		"Central A/C & Heating",
		"Balcony",
		"Private Garden",
		"Pets Allowed",
		"Double Glazed Windows",
		"Laundry Room",
		"Broadband Internet",
		"Satellite / Cable TV",
		"Maintenance Staff",
		"Storage Areas",
		"Waste Disposal"
	],
	"propertyType": "Villa",
	"purpose": "Sale",
	"furnished": "Unfurnished",
	"updatedAt": "1699994557",
	"completionStatus": "Ready",
	"reraDedLicenceNo": "511946",
	"propertyReference": "9873-L0fN5K",
	"id": "2023-11-5-super-deluxe-villa-for-sale-in-helwan-corn-12-569"
}
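
Some fields in the record above arrive as strings and may need converting before analysis. The following is a minimal sketch, not part of the actor: the helper `normalize` and the derived `pricePerSize` field are hypothetical, and it assumes `updatedAt` is a Unix timestamp in seconds and `addedOn` always follows the "Month D, YYYY" pattern shown in the example.

```python
from datetime import datetime, timezone

# A subset of the fields from the output example above.
record = {
    "price": 3000000,
    "size": 8000,
    "addedOn": "November 6, 2023",
    "updatedAt": "1699994557",
}


def normalize(rec):
    """Return a copy of the record with typed date fields."""
    out = dict(rec)
    # Assumption: addedOn uses English month names, e.g. "November 6, 2023".
    out["addedOn"] = datetime.strptime(rec["addedOn"], "%B %d, %Y").date()
    # Assumption: updatedAt is Unix epoch seconds stored as a string.
    out["updatedAt"] = datetime.fromtimestamp(int(rec["updatedAt"]), tz=timezone.utc)
    # Hypothetical derived field: price per unit of size, when size is present.
    out["pricePerSize"] = rec["price"] / rec["size"] if rec.get("size") else None
    return out


clean = normalize(record)
print(clean["addedOn"], clean["updatedAt"].isoformat(), clean["pricePerSize"])
```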

❓Limitations

Since the site allows only 80,000 properties per listing/search result, you may want to break your listing URLs down into smaller areas if a search has more than 80K results. The good news is that even if multiple list URLs contain overlapping results, they are deduplicated within the same run's data.

🔎 Identifying delisted properties

This actor provides a monitoring mode that gives you only incremental updates about newly added properties. If you also want to identify which properties have been delisted from the platform, you can use either of the following techniques with the help of this actor.

Always running in full scrape mode : Run this actor in full scrape mode every time and cross-check each new batch of data against your existing database. If a property exists in your database but not in the newly scraped batch, it is no longer listed.
Use the Key-value store generated by the scraper : If you are monitoring a very large batch of data and don't want to scrape everything every time, this method is a bit more technical but achieves the goal effectively. Apify has a storage feature called Key-value store. When you run this scraper, it stores every property in a Key-value store along with a timestamp. Inside this store, the key is the property id and the value is a timestamp like this
```
{ lastSeen : '2023-11-02T05:59:25.763Z'}
```
Whenever you run this scraper, it updates the timestamp for a given id if it finds that property on the platform. For example, if we have 2 properties with ids prop1 and prop2 and we scraped them both on November 1, the Key-value store would look like this:
```
prop1 -> { lastSeen : '2023-11-01T05:59:25.763Z'}
prop2 -> { lastSeen : '2023-11-01T05:59:25.763Z'}
```
Now if you run this scraper again on December 1 and prop1 is no longer on the platform but prop2 is still there, the Key-value store would change like this:
```
prop1 -> { lastSeen : '2023-11-01T05:59:25.763Z'}
prop2 -> { lastSeen : '2023-12-01T05:59:25.763Z'}
```
That means any property whose lastSeen is earlier than the latest batch you loaded is now delisted. You can iterate through the whole Key-value store using the Apify Key-value store API to identify these. Please refer to this API documentation to do the same. Please remember store name generated by this scrape will be.
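
The lastSeen comparison described above can be sketched in Python as follows. This is only an illustration: it assumes the store contents have already been loaded into a plain dict keyed by property id (fetching them through the Apify API is left out), and the helper names `parse_ts` and `find_delisted` are hypothetical.

```python
from datetime import datetime, timezone


def parse_ts(value):
    """Parse an ISO timestamp such as '2023-11-01T05:59:25.763Z'."""
    # Older Python versions do not accept a trailing 'Z' in fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def find_delisted(store, batch_started_at):
    """Return ids whose lastSeen is older than the start of the latest run."""
    return sorted(
        prop_id
        for prop_id, value in store.items()
        if parse_ts(value["lastSeen"]) < batch_started_at
    )


# Store contents after the December 1 run from the example above.
store = {
    "prop1": {"lastSeen": "2023-11-01T05:59:25.763Z"},
    "prop2": {"lastSeen": "2023-12-01T05:59:25.763Z"},
}

# Assumption: you record when the latest run started and compare against it.
latest_run_start = datetime(2023, 12, 1, tzinfo=timezone.utc)
delisted = find_delisted(store, latest_run_start)
print(delisted)
```

Comparing against the start time of the latest run, rather than the current time, avoids flagging properties that were seen early in a long run.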

Alternatively, you can iterate through the active properties in your existing database and use this API to check the status of each listing.

For this approach to work, it's important that you enable this feature via enableDelistingTracker (Enable Delisting tracker) input.
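
For the first technique (always running a full scrape and cross-checking against your own database), the comparison amounts to a set difference on property ids. Here is a minimal sketch, assuming your database can supply the ids of properties you consider active and that each scraped record carries the `id` field shown in the output example; the helper `find_missing` and the sample ids are hypothetical.

```python
def find_missing(known_ids, scraped_records):
    """Return ids present in your database but absent from the new batch."""
    scraped_ids = {rec["id"] for rec in scraped_records}
    return sorted(set(known_ids) - scraped_ids)


# Hypothetical data: ids you already store, and the latest full-scrape batch.
known_ids = ["prop1", "prop2", "prop3"]
latest_batch = [{"id": "prop2"}, {"id": "prop3"}, {"id": "prop4"}]

missing = find_missing(known_ids, latest_batch)
print(missing)
```

This only works when the new batch is a complete scrape of the same search; a partial or failed run would make active properties look delisted.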

🙋‍♀️ For custom solutions

In case you need some custom solution, you can contact me : dhrumil@techvasu.com

Or learn more about me on github : https://github.com/dhrumil4u360

Developer

Dhrumil Bhankhar

Actor metrics

2 monthly users
1 star
100.0% runs succeeded
Created in Jan 2024
Modified 4 months ago

Categories

Real estate

Automation

Business
