Pricing

Pay per event

WordPress Articles Scraper

Developed by

Extreme Scrapes

The WordPress Articles Scraper is an Apify actor that extracts posts and metadata from WordPress websites that expose the WordPress REST API. It handles pagination automatically and fetches additional information such as author details, categories, tags, and featured images.

Last modified

3 months ago

Automation

E-commerce

News

Overview

This actor is perfect for researchers, content aggregators, and developers who need structured data from WordPress sites.

How It Works

1. You provide the WordPress URL.
2. The actor fetches posts, handling pagination automatically.
3. If a search keyword is provided, it filters results accordingly.
4. It extracts metadata such as author name, categories, tags, and featured images.
5. The final structured JSON output includes all relevant post details.
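
The fetch-and-paginate steps above can be sketched in Python. This is only an illustration of how a client might page through the standard WordPress REST API posts endpoint (`/wp-json/wp/v2/posts` with the `page`, `per_page`, `search`, and `_embed` query parameters), not the actor's actual source code. The function names `build_posts_url`, `fetch_json`, and `iter_posts` are hypothetical, as is the choice to stop on a short or empty page.

```python
import json
from typing import Callable, Iterator, Optional
from urllib.error import HTTPError
from urllib.parse import urlencode
from urllib.request import urlopen


def build_posts_url(site_url: str, page: int, per_page: int = 10,
                    search: Optional[str] = None) -> str:
    """Build a URL for the WordPress REST API posts endpoint (hypothetical helper)."""
    params = {"page": page, "per_page": per_page, "_embed": "1"}
    if search:
        params["search"] = search  # server-side keyword filter
    return f"{site_url.rstrip('/')}/wp-json/wp/v2/posts?{urlencode(params)}"


def fetch_json(url: str) -> list:
    """Fetch one page of posts; treat a 'page past the end' error as an empty page."""
    try:
        with urlopen(url, timeout=30) as response:
            return json.load(response)
    except HTTPError as error:
        if error.code == 400:  # WordPress rejects page numbers beyond the last page
            return []
        raise


def iter_posts(site_url: str, search: Optional[str] = None, per_page: int = 10,
               fetch: Callable[[str], list] = fetch_json) -> Iterator[dict]:
    """Yield posts page by page until an empty or short page is returned."""
    page = 1
    while True:
        posts = fetch(build_posts_url(site_url, page, per_page, search))
        if not posts:
            return
        yield from posts
        if len(posts) < per_page:  # a short page means this was the last one
            return
        page += 1
```

Passing `fetch` as a parameter keeps the paging logic testable without network access. A real run would simply call `list(iter_posts("https://example.com", search="ai"))`.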

Features

✅ Fetches posts from any WordPress site
✅ Supports pagination until all posts are retrieved
✅ Filters posts based on search terms
✅ Extracts metadata like author, categories, tags, and featured images
✅ Provides clean and structured JSON output
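
To illustrate the metadata extraction feature, here is a minimal sketch of how a raw REST API post (requested with `_embed`) could be flattened into a record with author, categories, tags, and featured image. It assumes the standard `_embedded` layout of the WordPress REST API (`author`, `wp:term`, `wp:featuredmedia`). The function name `flatten_post` is hypothetical, and the actor itself may do this differently.

```python
def flatten_post(post: dict) -> dict:
    """Flatten a WordPress REST API post (fetched with _embed) into a simple record."""
    embedded = post.get("_embedded", {})
    authors = embedded.get("author") or [{}]
    media = embedded.get("wp:featuredmedia") or [{}]
    # "wp:term" is a list of term groups (one per taxonomy); merge them into one list.
    terms = [term for group in embedded.get("wp:term", []) for term in group]

    def names(taxonomy: str) -> list:
        return [t.get("name") for t in terms if t.get("taxonomy") == taxonomy]

    return {
        "id": post.get("id"),
        "date": post.get("date"),
        "modified": post.get("modified"),
        "slug": post.get("slug"),
        "link": post.get("link"),
        "title": post.get("title", {}).get("rendered"),
        "content": post.get("content", {}).get("rendered"),
        "excerpt": post.get("excerpt", {}).get("rendered"),
        "author": authors[0].get("name"),
        "categories": names("category"),
        "tags": names("post_tag"),
        "featured_image": media[0].get("source_url"),
    }
```

Missing pieces (for example a post without a featured image) come back as `None` or an empty list rather than raising, which is one reasonable design choice for scraping sites with uneven data.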

Getting Started

1. Input Parameters

To use the scraper, provide the following inputs:

Parameter	Type	Required	Description
`startUrls`	Array	✅	List of URLs to start crawling from (e.g., `[{"url": "https://example.com", "method": "GET"}]`).

2. Running the Actor

You can run the actor directly on Apify or via API:

Using Apify Interface

1. Navigate to the actor's Apify page.
2. Enter the required parameters.
3. Click Run and wait for the data to be scraped.

Using Apify API

curl -X POST -H "Content-Type: application/json" \
     -d '{"maxRequestsPerCrawl": 1, "perPage": 10, "startUrls": [{"url": "https://example.com", "method": "GET"}]}' \
     "https://api.apify.com/v2/acts/YOUR_ACTOR_ID/runs?token=YOUR_API_TOKEN"
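
For readers who prefer Python over curl, the same request could be built with the standard library as sketched below. The endpoint, placeholders, and input fields mirror the curl command above. The helper names `build_run_request` and `start_run` are hypothetical, and the shape of the response is not documented here, so the sketch just returns the parsed JSON.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, actor_input: dict) -> Request:
    """Build the POST request that starts an actor run (hypothetical helper)."""
    url = f"{API_BASE}/acts/{actor_id}/runs?{urlencode({'token': token})}"
    return Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def start_run(actor_id: str, token: str, actor_input: dict) -> dict:
    """Send the request and return the parsed JSON response."""
    with urlopen(build_run_request(actor_id, token, actor_input), timeout=60) as response:
        return json.load(response)


# Same input as the curl example; replace the placeholders before running.
EXAMPLE_INPUT = {
    "maxRequestsPerCrawl": 1,
    "perPage": 10,
    "startUrls": [{"url": "https://example.com", "method": "GET"}],
}
```

Calling `start_run("YOUR_ACTOR_ID", "YOUR_API_TOKEN", EXAMPLE_INPUT)` would send the request. Building the request separately makes it easy to inspect the URL and body first.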

Output Format

The output is a JSON dataset containing structured post details:

[
    {
        "id": 123,
        "date": "2025-03-28T12:00:00",
        "modified": "2025-03-28T14:00:00",
        "slug": "example-post",
        "link": "https://example.com/example-post",
        "title": "Example Post Title",
        "content": "<p>This is an example post content...</p>",
        "excerpt": "This is a short summary...",
        "author": "John Doe",
        "categories": ["Technology", "News"],
        "tags": ["AI", "Programming"],
        "featured_image": "https://example.com/wp-content/uploads/featured-image.jpg",
        "extra_metadata": {
            "author_bio": "John Doe is a technology journalist...",
            "category_description": "Latest news in tech industry..."
        }
    }
]
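
As a small illustration of working with this output, the sketch below tallies how often each category appears across the dataset items. It assumes the items have already been downloaded and parsed into a list of dictionaries with the shape shown above. The function name `count_categories` is hypothetical.

```python
import json
from collections import Counter


def count_categories(items: list) -> Counter:
    """Count category occurrences across scraped posts; posts without categories are skipped."""
    counts = Counter()
    for item in items:
        counts.update(item.get("categories", []))
    return counts


# A trimmed-down item in the same shape as the sample output above.
SAMPLE = json.loads("""
[
    {"id": 123, "title": "Example Post Title",
     "categories": ["Technology", "News"], "tags": ["AI", "Programming"]}
]
""")

category_counts = count_categories(SAMPLE)
```

The same pattern works for the `tags` field by swapping the key.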

Use Cases

Content Aggregation – Collect and analyze posts from different WordPress sites.
SEO Research – Extract content and metadata for SEO analysis.
Data Science – Gather datasets for NLP or sentiment analysis.
Backup and Archiving – Store blog content for future reference.

Support & Contributions

If you encounter any issues or have feature requests, feel free to open an issue or contribute to the project. Happy scraping! 🚀
