Pricing

$5.00/month + usage

Wordpress Post Scraper - NEW

Developed by

Paco

This actor scrapes blog posts from one or more WordPress websites, cleans the HTML content, and pushes flattened JSON data containing every field it can find in each post. It uses Selenium to handle pages that require JavaScript rendering.

0.0 (0)

Total users

109

Monthly users

Runs succeeded

>99%

Issues response

7.5 hours

Last modified

5 days ago

Automation

E-commerce

SEO tools

WordPress Scraper Actor

The WordPress Scraper Actor lets you scrape content from one or more WordPress websites, including blogs, articles, author details, categories, comments and media. It uses the WordPress REST API, the Requests library and, if necessary, Selenium for accurate data extraction. It only works on WordPress sites that accept REST API calls.

Features

Extract blog posts, articles, author information, products, categories, comments and images from WordPress websites.
Uses the WordPress REST API and Selenium for complete data extraction.
Outputs cleaned HTML content as plain text in JSON format.
Supports pagination for comprehensive scraping.

How It Works

The actor takes one or more website URLs as input, queries each site's REST API to gather data, and uses Selenium to handle JavaScript-rendered pages. The scraped data is cleaned and formatted as structured JSON.
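
The REST API step can be pictured with the short Python sketch below. It is not the actor's source code: the helper names (build_endpoint, iter_items, requests_fetch) are hypothetical, and it assumes the target site exposes the standard WordPress REST API routes under /wp-json/wp/v2/ and reports the page count in the X-WP-TotalPages response header.

```python
from typing import Callable, Iterator, Optional
from urllib.parse import urlencode

# Hypothetical type: a fetcher takes a URL and returns
# (decoded JSON items for that page, total number of pages).
Fetcher = Callable[[str], tuple[list[dict], int]]


def build_endpoint(site_url: str, resource: str, page: int, per_page: int = 100) -> str:
    """Build a collection URL such as <site>/wp-json/wp/v2/posts?per_page=100&page=1."""
    query = urlencode({"per_page": per_page, "page": page})
    return f"{site_url.rstrip('/')}/wp-json/wp/v2/{resource}?{query}"


def iter_items(
    site_url: str,
    resource: str,
    fetch: Fetcher,
    max_results: Optional[int] = None,  # None stands in for the 'all' setting
) -> Iterator[dict]:
    """Walk the paginated collection page by page, stopping at max_results."""
    page, total_pages, yielded = 1, 1, 0
    while page <= total_pages:
        items, total_pages = fetch(build_endpoint(site_url, resource, page))
        for item in items:
            if max_results is not None and yielded >= max_results:
                return
            yield item
            yielded += 1
        page += 1


def requests_fetch(url: str) -> tuple[list[dict], int]:
    """Fetcher backed by the Requests library (imported lazily)."""
    import requests

    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.json(), int(response.headers.get("X-WP-TotalPages", 1))
```

For example, list(iter_items("https://example.com", "posts", requests_fetch, max_results=10)) would collect up to ten posts. Passing the fetcher as a parameter keeps the pagination logic easy to test without network access.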

Input Parameters

start_urls (required): List of website URLs to scrape (for example company1.com, company2.com).
max_results (optional): Maximum number of posts to retrieve per site. Set to 'all' for all posts.
scrape_mode (required, default 'posts'): The type of data to scrape. Choose one of 'posts', 'media', 'categories' or 'comments'.
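
As an illustration of how these parameters fit together, the sketch below assembles and checks an input object in Python. The function build_run_input and its validation rules are assumptions made for this example, not part of the actor; the {"url": ...} shape follows the Actor Input Example further down, and prepending https:// to bare domains is likewise an assumption.

```python
VALID_MODES = {"posts", "media", "categories", "comments"}


def build_run_input(urls, max_results="all", scrape_mode="posts") -> dict:
    """Assemble an actor input dict from plain URL strings (hypothetical helper)."""
    if not urls:
        raise ValueError("start_urls must contain at least one URL")
    if scrape_mode not in VALID_MODES:
        raise ValueError(f"scrape_mode must be one of {sorted(VALID_MODES)}")
    if max_results != "all" and (not isinstance(max_results, int) or max_results < 1):
        raise ValueError("max_results must be a positive integer or 'all'")

    start_urls = []
    for url in urls:
        url = url.strip()
        # Assumption: bare domains such as company1.com are meant as HTTPS sites.
        if not url.startswith(("http://", "https://")):
            url = "https://" + url
        start_urls.append({"url": url})

    return {
        "start_urls": start_urls,
        "max_results": max_results,
        "scrape_mode": scrape_mode,
    }
```

Calling build_run_input(["company1.com", "company2.com"], max_results=50) returns a dict that can be serialized with json.dumps and pasted into the actor's input editor.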

Output

The actor outputs (cleaned) JSON data for each post, including:

Title
Cleaned Content
Metadata (author, publication date, tags, categories)
Media Links
All post data: The raw post data, available in the "All fields" tab
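
The sketch below shows one way the cleaning and flattening could work, using only the Python standard library. It is an approximation under stated assumptions rather than the actor's implementation: clean_html, flatten and to_record are hypothetical names, the "all_fields" key is invented for the example, and the input is assumed to be a post object as returned by the WordPress REST API (with title.rendered, content.rendered and date fields).

```python
from html.parser import HTMLParser

# Tags treated as text boundaries so adjacent paragraphs do not run together.
BLOCK_TAGS = {"p", "br", "div", "li", "ul", "ol", "h1", "h2", "h3", "h4", "h5", "h6"}


class _TextExtractor(HTMLParser):
    """Collects text nodes and drops all markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._chunks.append(" ")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self._chunks.append(" ")

    def handle_data(self, data):
        self._chunks.append(data)

    def text(self) -> str:
        # Collapse every run of whitespace into a single space.
        return " ".join("".join(self._chunks).split())


def clean_html(html: str) -> str:
    """Return the plain text of an HTML fragment."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def flatten(value, prefix: str = "", sep: str = ".") -> dict:
    """Flatten nested dicts into dotted keys; lists are kept as values."""
    if not isinstance(value, dict):
        return {prefix: value}
    flat = {}
    for key, child in value.items():
        flat.update(flatten(child, f"{prefix}{sep}{key}" if prefix else str(key), sep))
    return flat


def to_record(post: dict) -> dict:
    """Build a cleaned output record from a raw REST API post object."""
    flat = flatten(post)
    return {
        "title": clean_html(flat.get("title.rendered", "")),
        "cleaned_content": clean_html(flat.get("content.rendered", "")),
        # Assumption: keep only the date part of the ISO timestamp.
        "date_published": str(flat.get("date", ""))[:10],
        "all_fields": flat,  # hypothetical key holding the raw flattened data
    }
```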

Getting Started

Create an Actor Task: On Apify, create a new actor task and provide the list of URLs to scrape.
Input Configuration: Set start_urls and scrape_mode, and optionally max_results.
Run the Actor: Execute the actor to start scraping.
Review Results: Download the results as a JSON file.
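
Once the results are downloaded, they can be inspected with a few lines of Python. This sketch assumes the export is a JSON array of records shaped like the cleaned output example on this page; the file name results.json and both helper names are placeholders.

```python
import json
from pathlib import Path


def load_results(path) -> list:
    """Load a downloaded results file, assumed to be a JSON array of records."""
    records = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of records")
    return records


def titles_by_date(records) -> list:
    """Return post titles ordered by date_published, skipping undated records."""
    dated = [r for r in records if r.get("date_published")]
    return [r.get("title", "") for r in sorted(dated, key=lambda r: r["date_published"])]
```

For instance, titles_by_date(load_results("results.json")) lists the scraped titles from oldest to newest.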

Use Cases

Content Aggregation: Collect articles or blog posts from multiple WordPress sites.
Market Research: Scrape product descriptions and reviews from WordPress-powered e-commerce sites.
Data Analysis: Gather articles for analysis or summarization.

Important Notes

Respecting Site Policies: Always ensure you have permission to scrape data from a website, and respect the site's robots.txt policies.
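
A robots.txt check can be done before a run with Python's standard urllib.robotparser module. The sketch below is a suggestion only: is_allowed is a hypothetical helper, and nothing on this page states whether the actor performs such a check itself.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Check a URL against the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Example rules that block the REST API path for every crawler.
EXAMPLE_ROBOTS = "User-agent: *\nDisallow: /wp-json/\n"
REST_BLOCKED = not is_allowed(EXAMPLE_ROBOTS, "https://example.com/wp-json/wp/v2/posts")
```

RobotFileParser also provides set_url and read for fetching a site's robots.txt directly instead of passing its text in.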

Actor Input Example

{
  "start_urls": [
    { "url": "https://example.com" },
    { "url": "https://another-example.com" }
  ],
  "max_results": "all"
}

Actor Output Example (CLEANED)

{
  "title": "Sample Blog Post",
  "cleaned_content": "This is the content of the blog post, without HTML tags.",
  "date_published": "2023-10-01"
}
