WordPress Post Scraper
Pricing
from $4.00 / 1,000 results
Extract every blog post from any WordPress site — title, content, date, author, image, categories and tags.
Developer
Harish Garg
Maintained by Community
Extract every blog post from any WordPress website in minutes — no code, no logins, no setup, no plugins to install. WordPress Post Scraper turns any WordPress blog into clean, structured data (JSON, CSV, Excel, HTML or XML) you can drop straight into a spreadsheet, database, AI workflow or content pipeline. Built to handle sites that block ordinary scrapers.
What does WordPress Post Scraper do?
WordPress Post Scraper extracts posts from any public WordPress site, including blogs hosted on wordpress.com and self-hosted WordPress installations. Paste a website URL, click Start, and the scraper returns a clean list of every post on the site with title, full content, publish date, author, featured image, categories and tags.
It works on millions of sites: news outlets, company blogs, product update pages, agency portfolios, personal blogs, niche publications and more — anywhere WordPress powers the content. The scraper is engineered to look and behave like a real visitor, so it works reliably on sites that block generic scraping tools. You can try it on wordpress.org/news in seconds, with zero configuration.
Why use WordPress Post Scraper?
WordPress powers more than 40% of all websites, so a large share of the world's blog content lives on it. This scraper lets you turn that content into data for whatever you're building:
- Competitor and market research — monitor what your competitors publish, how often, and on which topics.
- SEO and content analysis — analyze keyword usage, posting frequency, internal linking and content strategy across entire sites.
- AI training data and RAG pipelines — bulk-collect high-quality long-form content to feed into LLM fine-tunes, embeddings, summarization tools or knowledge bases.
- News and trend monitoring — build automated news feeds, alerts and dashboards from industry publications.
- Content migration and backup — export an entire WordPress blog before a redesign, platform migration, or as a recurring offsite archive.
- Lead generation — extract author names, bylines and contact pages from target blogs.
- Newsletter and social automation — pull fresh posts into Make, Zapier, n8n, Google Sheets or your CMS on a schedule.
Because the scraper runs on Apify, you also get scheduled runs, a REST API, webhooks, proxy rotation, dataset storage and integrations with Make, Zapier, Google Drive, Slack, Airtable and more — all included.
How to use WordPress Post Scraper
- Click Try for free at the top of this page (you'll be asked to sign in — it's free).
- Paste the homepage URL of the WordPress site you want to scrape into the WordPress site URL field (e.g. `https://wordpress.org/news`).
- Optionally set Maximum posts to limit how many posts you want. Leave it at `0` to scrape the entire blog.
- Click Start.
- When the run finishes, open the Output tab and download the data as JSON, CSV, Excel, HTML or XML — or pull it via the Apify API.
That's it. No API keys, no plugins and no installation on the target site, and no accounts beyond your free Apify sign-in.
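For the API route mentioned in the last step, here is a minimal sketch that builds the dataset-items URL for a finished run and downloads the rows using only the Python standard library. The dataset ID and token are placeholders you would copy from your own run and account. The sketch relies on the generic Apify dataset endpoint and assumes nothing specific to this Actor.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id, fmt="json", limit=None):
    """Build the URL for downloading items from an Apify dataset."""
    params = {"format": fmt, "clean": "true"}
    if limit is not None:
        params["limit"] = str(limit)
    query = urllib.parse.urlencode(params)
    return f"{API_BASE}/datasets/{dataset_id}/items?{query}"


def fetch_posts(dataset_id, token, limit=None):
    """Download dataset items as a list of dicts (performs a network request)."""
    request = urllib.request.Request(
        dataset_items_url(dataset_id, "json", limit),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Placeholder ID: replace with the default dataset ID of your own run.
    print(dataset_items_url("YOUR_DATASET_ID", limit=10))
```

Calling `fetch_posts("YOUR_DATASET_ID", "YOUR_API_TOKEN", limit=10)` with real values should return the same rows you see in the Output tab.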
Input
You only need to provide a website URL. Everything else has a sensible default.
| Setting | What it does |
|---|---|
| WordPress site URL | The homepage of the WordPress blog you want to scrape. |
| Maximum posts | Total cap on posts returned. Use 0 to scrape the entire site. |
| Posts per page | How many posts to fetch per request. Higher values mean fewer requests and faster runs. |
| Include full post content | Turn off to get only titles, excerpts and metadata (smaller, cheaper output). |
| Delay between requests | Pause between requests, so the scraper doesn't overload the target site. |
| Proxy configuration | Optional Apify Proxy — useful for sites that rate-limit cloud IPs. |
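If you start runs programmatically, the settings above become a JSON input object. The table lists only display labels, so the key names in this sketch (`siteUrl`, `maxPosts`, `includeContent`) are hypothetical placeholders; check the Actor's input schema for the real ones before using it.

```python
from urllib.parse import urlparse


def build_input(site_url, max_posts=0, include_content=True):
    """Assemble an input object mirroring the settings table.

    All key names are HYPOTHETICAL placeholders, not the Actor's real schema.
    """
    parsed = urlparse(site_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"Expected a full http(s) URL, got: {site_url!r}")
    if max_posts < 0:
        raise ValueError("max_posts must be 0 (no limit) or a positive number")
    return {
        "siteUrl": site_url,                # hypothetical key
        "maxPosts": max_posts,              # hypothetical key; 0 = entire site
        "includeContent": include_content,  # hypothetical key
    }


print(build_input("https://wordpress.org/news", max_posts=10))
```

Validating the URL and the cap locally catches typos before a run is started and billed.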
Output
Each post is saved as one row in the dataset. Here's a real example:
```json
{
  "id": 18432,
  "title": "WordPress 6.7 \"Rollins\" Released",
  "slug": "wordpress-6-7-rollins",
  "url": "https://wordpress.org/news/2024/11/rollins/",
  "date": "2024-11-12T15:00:00",
  "modified": "2024-11-13T10:22:00",
  "excerpt": "Say hello to WordPress 6.7 \"Rollins\"...",
  "content": "<p>Full rendered HTML of the post body...</p>",
  "author": "WordPress Core Team",
  "featuredImage": "https://wordpress.org/news/files/2024/11/rollins.jpg",
  "categories": ["Releases"],
  "tags": ["wordpress-6-7"]
}
```
You can download the dataset in various formats such as JSON, CSV, Excel, HTML or XML, or query it directly through the Apify API and integrations.
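If you work from the JSON export and want your own column selection, a small sketch like the following flattens posts into CSV text. The column list and the `"; "` separator for `categories` and `tags` are illustrative choices, not something the Actor prescribes.

```python
import csv
import io

# Columns to keep; other fields in each post are ignored.
FIELDS = ["id", "title", "url", "date", "author", "categories", "tags"]


def posts_to_csv(posts):
    """Render posts as CSV text, joining list fields with '; '."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=FIELDS, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for post in posts:
        row = dict(post)
        for key in ("categories", "tags"):
            row[key] = "; ".join(row.get(key) or [])
        writer.writerow(row)
    return buffer.getvalue()


sample = [{
    "id": 18432,
    "title": 'WordPress 6.7 "Rollins" Released',
    "url": "https://wordpress.org/news/2024/11/rollins/",
    "date": "2024-11-12T15:00:00",
    "author": "WordPress Core Team",
    "categories": ["Releases"],
    "tags": ["wordpress-6-7"],
}]
print(posts_to_csv(sample))
```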
Data fields
| Field | Description |
|---|---|
| `id` | Unique WordPress post ID. |
| `title` | Post headline. |
| `slug` | URL-friendly post identifier. |
| `url` | Full public URL of the post. |
| `date` | When the post was originally published. |
| `modified` | When the post was last updated, which is useful for change tracking. |
| `excerpt` | Short summary of the post. |
| `content` | Full HTML body of the post. |
| `author` | Display name of the post's author. |
| `featuredImage` | URL of the post's main image. |
| `categories` | List of WordPress categories the post belongs to. |
| `tags` | List of tags attached to the post. |
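Because `content` holds rendered HTML, text-oriented uses such as embeddings or summarization usually need a plain-text version. Here is a minimal sketch using only the standard library. It is one possible approach, and it inserts a space at every tag boundary, which can split a word that is partly wrapped in inline markup.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        # Join chunks with spaces, then collapse runs of whitespace.
        return " ".join(" ".join(self._chunks).split())


def html_to_text(html):
    """Convert a post's HTML `content` into whitespace-normalized plain text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


print(html_to_text("<p>Full rendered HTML of the post body...</p>"))
```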
How much does it cost to scrape WordPress posts?
WordPress Post Scraper is highly efficient — it uses a lightweight fetch strategy that needs far less compute than browser-based scrapers. In practice, thousands of posts can be scraped well within the free tier's monthly platform credits.
Costs grow with:
- The total number of posts you scrape.
- Whether you include the full post body (long articles take more storage).
- Whether you enable Apify Proxy (recommended for protected sites).
For an exact estimate before a big run, try a small Maximum posts value first and check the run summary.
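For a rough back-of-the-envelope figure, the sketch below assumes the listed "from $4.00 / 1,000 results" price applies as a flat per-result rate. That is an assumption: actual billing can depend on your plan and on the cost drivers listed above, so treat the output as an estimate only.

```python
def estimate_cost(num_posts, price_per_1000=4.00):
    """Rough result-based cost in USD, assuming a flat per-1,000-results price."""
    if num_posts < 0:
        raise ValueError("num_posts must not be negative")
    return round(num_posts / 1000 * price_per_1000, 2)


print(estimate_cost(2500))  # 10.0
```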
Tips for the best results
- Start small. Set `Maximum posts` to `10` on your first run to verify the site is supported before doing a full export.
- Use Apify Proxy for protected sites. If a site returns errors, switch the proxy on; this resolves most rate-limit issues.
- Schedule recurring runs. Use Apify's built-in scheduler to refresh your dataset daily, hourly, or whenever the target site publishes new content.
- Connect to your stack. Push results straight to Google Sheets, Airtable, Slack, your database or your CMS using Apify's no-code integrations.
- Track changes over time. The `modified` field lets you detect when existing posts are updated, not just when new ones are published.
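The change-tracking tip can be sketched as a comparison between two exported snapshots. This assumes each post dict carries the `id` and `modified` fields in the ISO format shown in the output example; the function name `diff_snapshots` is illustrative.

```python
from datetime import datetime


def diff_snapshots(previous, current):
    """Compare two lists of post dicts keyed by 'id'.

    Returns (new_ids, updated_ids): posts absent from the previous snapshot,
    and posts whose 'modified' timestamp is later than before.
    """
    seen = {
        post["id"]: datetime.fromisoformat(post["modified"]) for post in previous
    }
    new_ids, updated_ids = [], []
    for post in current:
        modified = datetime.fromisoformat(post["modified"])
        if post["id"] not in seen:
            new_ids.append(post["id"])
        elif modified > seen[post["id"]]:
            updated_ids.append(post["id"])
    return new_ids, updated_ids


previous = [
    {"id": 1, "modified": "2024-11-12T15:00:00"},
    {"id": 2, "modified": "2024-11-01T09:00:00"},
]
current = [
    {"id": 1, "modified": "2024-11-13T10:22:00"},
    {"id": 2, "modified": "2024-11-01T09:00:00"},
    {"id": 3, "modified": "2024-11-14T08:00:00"},
]
print(diff_snapshots(previous, current))  # ([3], [1])
```

Paired with a scheduled run, this lets you act only on posts that are new or have changed since the last export.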
FAQ
Which WordPress sites does this work on? Any public WordPress site where posts are visible to anonymous visitors — which is the default for both self-hosted WordPress and wordpress.com blogs. The vast majority of WordPress sites on the internet are supported out of the box, including many that block ordinary scraping tools.
Do I need login credentials or an API key? No. The scraper only reads publicly available content, so no logins, tokens or plugins are required.
Can I scrape an entire blog at once? Yes. Set Maximum posts to 0 and the scraper will paginate through the site until every post has been collected.
Can I get just the most recent posts? Yes — set Maximum posts to however many recent posts you want. The scraper returns the newest posts first.
Is web scraping WordPress sites legal? Scraping publicly available content is generally permitted, but you should respect each site's terms of service, robots.txt, copyright and applicable privacy laws. Don't republish copyrighted content without permission.
My target site doesn't seem to work — what now? A small number of WordPress sites are configured to hide their posts from automated visitors entirely. If your target site is one of them, please open an issue on the Issues tab and we'll take a look — or reach out for a custom solution.
Can I get help or request features? Yes. Use the Issues tab on this Actor's page to report bugs, ask questions or request new features. We read every report.