Website Content Crawler

apify/website-content-crawler

Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗 LangChain, LlamaIndex, and the wider LLM ecosystem.

Scraping only a few elements on the page and saving them in separate fields

Closed

igocza opened this issue

I wanted to scrape only a few texts from the page https://listings.icarlton.com/en/property/apartment-for-rent-in-hidd-140.html. I added their CSS classes to "Keep HTML elements (CSS selector)", but the output does not contain just these values; it contains all the content scraped from the page. Also, is there a way to instruct the Actor to structure the output so that I set the name of column A in Excel, e.g. "property name", and cell A2 then holds the value from the class "wtp_text_center_mob tw-text-4xl tw-font-bold tw-text-left tw-mt-0"?

Jiří Spilka (jiri.spilka)

Hi,

Thank you for using the Website Content Crawler.

I’ve looked into your runs, and if I understand correctly, you need to scrape structured information from the listing.

The Website Content Crawler is primarily designed to retrieve all content from a website. For extracting structured data, a better tool is the Web Scraper.

I’ve put together a basic Web Scraper to save the data into Apify's dataset. Please check this run—you can copy the input from the run and paste it into your Web Scraper, and it should work.

The scraper includes glob patterns to scrape only property data and contains a few selectors to extract the required information.
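To illustrate what "a few selectors to extract the required information" can look like, here is a minimal sketch in plain Python, independent of Apify. It maps an output column name to the CSS classes quoted in the question and collects the text of the first matching element. The sample HTML, the heading text, and the names `FIELDS`, `FieldExtractor`, `extract_row`, and `to_csv` are assumptions made up for illustration; this is not the input of the linked run. In Web Scraper, the equivalent logic would live in the page function.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical mapping: output column name -> CSS classes the element must carry.
FIELDS = {
    "property name": "wtp_text_center_mob tw-text-4xl tw-font-bold tw-text-left tw-mt-0",
}

# Tags without a closing tag; they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class FieldExtractor(HTMLParser):
    """Collects the text of the first element matching each configured class set."""

    def __init__(self, fields):
        super().__init__()
        self.fields = {name: set(classes.split()) for name, classes in fields.items()}
        self.result = {}
        self._current = None  # column currently being captured
        self._depth = 0       # nesting depth inside the captured element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._current is not None:
            self._depth += 1
            return
        classes = set((dict(attrs).get("class") or "").split())
        for column, wanted in self.fields.items():
            if column not in self.result and wanted <= classes:
                self._current, self._depth, self._chunks = column, 1, []
                break

    def handle_endtag(self, tag):
        if self._current is None or tag in VOID_TAGS:
            return
        self._depth -= 1
        if self._depth == 0:
            # Collapse whitespace so the value fits in a single cell.
            self.result[self._current] = " ".join("".join(self._chunks).split())
            self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self._chunks.append(data)


def extract_row(html, fields=FIELDS):
    """Return one record with a key per configured column (empty if not found)."""
    parser = FieldExtractor(fields)
    parser.feed(html)
    parser.close()
    return {column: parser.result.get(column, "") for column in fields}


def to_csv(rows, fields=FIELDS):
    """Serialize records as CSV; the header row becomes row 1 in a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(fields), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample page; the real listing markup may differ.
SAMPLE_HTML = """
<html><body>
  <h1 class="wtp_text_center_mob tw-text-4xl tw-font-bold tw-text-left tw-mt-0">
    Apartment for rent in Hidd
  </h1>
  <p class="description">Other page content that should be ignored.</p>
</body></html>
"""

if __name__ == "__main__":
    print(to_csv([extract_row(SAMPLE_HTML)]))
```

Each extracted record has one key per column, so exporting the records as CSV or Excel puts "property name" in cell A1 and the extracted value in A2, which is the layout asked about in the question.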

I hope this helps! I’ll go ahead and close this issue for now, but please feel free to ask any additional questions.

igocza

Thank you very much, Jiri.

Developer

Apify

Actor Metrics

3.9k monthly users
718 stars
>99% runs succeeded
2.2 days response time
Created in Mar 2023
Modified 15 hours ago

Categories

Developer tools

Fast Website Content Crawler

6sigmag/fast-website-content-crawler

A high-performance web scraper that rapidly extracts and analyzes content from multiple websites simultaneously. Perfect for competitive research, content aggregation, and website structure analysis.

David Deng

Deep Website Content Crawler

6sigmag/deep-website-content-crawler

Scrape Failed Killer! A high-performance web scraper that rapidly extracts and analyzes content from multiple websites simultaneously. Perfect for competitive research, content aggregation, and website structure analysis.

David Deng

AI Website Content Markdown Scraper

quaking_pail/ai-website-content-markdown-scraper

This Apify Actor, "Website Content Crawler with Markdown Extraction," is designed to perform a comprehensive crawl of specified websites, extract their text content, convert it into Markdown format, and store it in a structured dataset. The extracted content is suitable for feeding LLMs.

AI_Builder

217

Example Website Screenshot Crawler

dz_omar/example-website-screenshot-crawler

Automated website screenshot crawler using Pyppeteer and Apify. This open-source actor captures screenshots from specified URLs, uploads them to the Apify Key-Value Store, and provides easy access to the results, making it ideal for monitoring website changes and archiving web content.

Omar Abdlhakim

(New) Best Website Traffic Generator

harshmaur/website-traffic-generator

Draw targeted traffic to your websites, mimic real user behaviour, and view the stats in Google Analytics. Auto-discover links on the page and crawl entire websites in a few minutes while keeping complete control. Use the tool to generate page views and stress test the website.

Harsh Maur

1.4k

Google Maps Scraper

compass/crawler-google-places

Extract data from hundreds of Google Maps locations and businesses. Get Google Maps data including reviews, images, contact info, opening hours, location, popular times, prices & more. Export scraped data, run the scraper via API, schedule and monitor runs, or integrate with other tools.

Compass

80.3k

637

📩📍 Google Maps Email Extractor

lukaskrivka/google-maps-with-contact-details

Extract Google Maps contact details. Scrape websites of Google Maps places for contact details and get email addresses, website, location, address, zipcode, phone number, social media links. Export scraped data, run the scraper via API, schedule and monitor runs or integrate with other tools.

Lukáš Křivka

8.8k

228

Instagram Scraper

apify/instagram-scraper

Scrape and download Instagram posts, profiles, places, hashtags, photos, and comments. Get data from Instagram using one or more Instagram URLs or search queries. Export scraped data, run the scraper via API, schedule and monitor runs or integrate with other tools.

Apify

62.2k

603

Google Maps Reviews Scraper

compass/Google-Maps-Reviews-Scraper

Extract all reviews of Google Maps places using place URLs. Get review text, published date, response from owner, review URL, and reviewer's details. Download scraped data, run the scraper via API, schedule and monitor runs or integrate with other tools.

Compass

4.7k

Tripadvisor Scraper

maxcopell/tripadvisor

This unofficial Tripadvisor API is a data extraction tool able to get data on hotels, restaurants, things to do, vacation rentals, attractions, tours, and public trips. Get pricing, contact details, amenities, awards, ratings, and more. Download your data in Excel, JSON, CSV, and other formats.