WordPress Articles Scraper
Pricing
$5.00 / 1,000 results
The WordPress Articles Scraper is an Apify actor that extracts posts and metadata from any WordPress website using the WordPress REST API. It automatically handles pagination and fetches additional information like author details, categories, tags, and featured images.
Monthly users: 2
Runs succeeded: >99%
Last modified: 3 days ago
Overview
This actor is perfect for researchers, content aggregators, and developers who need structured data from WordPress sites.
How It Works
- You provide the WordPress URL.
- The actor fetches posts, handling pagination automatically.
- If a search keyword is provided, it filters results accordingly.
- It extracts metadata such as author name, categories, tags, and featured images.
- The final structured JSON output includes all relevant post details.
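The steps above can be sketched in Python. This is a minimal illustration, not the actor's actual source: it assumes the target site exposes the standard WordPress REST route `/wp-json/wp/v2/posts`, and that a page requested past the end comes back as HTTP 400, which is how WordPress core behaves. The names `iter_posts`, `http_fetch`, and the injectable `fetch` parameter are hypothetical.

```python
import json
import urllib.error
import urllib.parse
import urllib.request
from typing import Callable, Iterator, List, Optional

# A fetcher takes a URL and returns the decoded list of posts for that page,
# or None when the requested page does not exist.
Fetcher = Callable[[str], Optional[List[dict]]]


def http_fetch(url: str) -> Optional[List[dict]]:
    """Fetch one page of posts over HTTP (assumes a public, unauthenticated API)."""
    try:
        with urllib.request.urlopen(url, timeout=30) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 400:  # WordPress rejects page numbers past the last page
            return None
        raise


def iter_posts(
    site_url: str,
    search: Optional[str] = None,
    per_page: int = 10,
    fetch: Fetcher = http_fetch,
) -> Iterator[dict]:
    """Yield every post, requesting page after page until the site runs out."""
    page = 1
    while True:
        params = {"per_page": per_page, "page": page, "_embed": "1"}
        if search:
            params["search"] = search  # server-side keyword filter
        query = urllib.parse.urlencode(params)
        url = f"{site_url.rstrip('/')}/wp-json/wp/v2/posts?{query}"
        batch = fetch(url)
        if not batch:
            return
        yield from batch
        if len(batch) < per_page:  # a short page is the last page
            return
        page += 1
```

Passing the fetcher as a parameter keeps the pagination logic testable without network access; `list(iter_posts("https://example.com", search="ai"))` would collect every matching post from a live site.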
Features
- ✅ Fetches posts from any WordPress site
- ✅ Supports pagination until all posts are retrieved
- ✅ Filters posts based on search terms
- ✅ Extracts metadata like author, categories, tags, and featured images
- ✅ Provides clean and structured JSON output
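To illustrate the metadata extraction: when a request carries `_embed`, the WordPress REST API nests the author, featured media, and taxonomy terms under `_embedded`. The sketch below flattens one such raw post into the field names used in this README's output. `flatten_post` is a hypothetical helper, and whether the actor works this way internally is an assumption.

```python
from typing import Any, Dict, List


def flatten_post(post: Dict[str, Any]) -> Dict[str, Any]:
    """Map one raw WP REST post (fetched with _embed) to a flat record."""
    embedded = post.get("_embedded", {})
    authors: List[dict] = embedded.get("author", [])
    media: List[dict] = embedded.get("wp:featuredmedia", [])
    # "wp:term" is a list of lists: one inner list per taxonomy.
    terms = [term for group in embedded.get("wp:term", []) for term in group]

    def names(taxonomy: str) -> List[str]:
        return [t["name"] for t in terms if t.get("taxonomy") == taxonomy]

    return {
        "id": post.get("id"),
        "date": post.get("date"),
        "modified": post.get("modified"),
        "slug": post.get("slug"),
        "link": post.get("link"),
        # Text fields arrive as objects with a "rendered" HTML string.
        "title": post.get("title", {}).get("rendered"),
        "content": post.get("content", {}).get("rendered"),
        "excerpt": post.get("excerpt", {}).get("rendered"),
        "author": authors[0].get("name") if authors else None,
        "categories": names("category"),
        "tags": names("post_tag"),  # WordPress names the tag taxonomy "post_tag"
        "featured_image": media[0].get("source_url") if media else None,
    }
```

Posts without a featured image or embedded author simply yield `None` for those fields rather than raising.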
Getting Started
1. Input Parameters
To use the scraper, provide the following inputs:
Parameter | Type | Required | Description |
---|---|---|---|
startUrls | Array | ✅ | List of URLs to start crawling from (e.g., [{"url": "https://example.com", "method": "GET"}] ). |
2. Running the Actor
You can run the actor directly on Apify or via API:
Using Apify Interface
- Navigate to the actor's Apify page.
- Enter the required parameters.
- Click Run and wait for the data to be scraped.
Using Apify API
    curl -X POST -H "Content-Type: application/json" \
      -d '{"maxRequestsPerCrawl": 1, "perPage": 10, "startUrls": [{"url": "https://example.com", "method": "GET"}]}' \
      "https://api.apify.com/v2/acts/YOUR_ACTOR_ID/runs?token=YOUR_API_TOKEN"
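The same request can be issued from Python using only the standard library. This is a sketch under the assumption that the endpoint and input fields are exactly those in the curl command; `build_run_request` and `start_run` are hypothetical helper names, and the actor ID and token remain placeholders you must supply.

```python
import json
import urllib.parse
import urllib.request


def build_run_request(actor_id: str, token: str, actor_input: dict) -> urllib.request.Request:
    """Build the POST request that starts an actor run with the given input."""
    query = urllib.parse.urlencode({"token": token})
    url = f"https://api.apify.com/v2/acts/{actor_id}/runs?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def start_run(actor_id: str, token: str, actor_input: dict) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_run_request(actor_id, token, actor_input)
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.load(resp)


# Example input, copied from the curl command:
# start_run("YOUR_ACTOR_ID", "YOUR_API_TOKEN", {
#     "maxRequestsPerCrawl": 1,
#     "perPage": 10,
#     "startUrls": [{"url": "https://example.com", "method": "GET"}],
# })
```

Separating request construction from sending makes it easy to inspect the URL and body before spending any results.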
Output Format
The output is a JSON dataset containing structured post details:
    [
      {
        "id": 123,
        "date": "2025-03-28T12:00:00",
        "modified": "2025-03-28T14:00:00",
        "slug": "example-post",
        "link": "https://example.com/example-post",
        "title": "Example Post Title",
        "content": "<p>This is an example post content...</p>",
        "excerpt": "This is a short summary...",
        "author": "John Doe",
        "categories": ["Technology", "News"],
        "tags": ["AI", "Programming"],
        "featured_image": "https://example.com/wp-content/uploads/featured-image.jpg",
        "extra_metadata": {
          "author_bio": "John Doe is a technology journalist...",
          "category_description": "Latest news in tech industry..."
        }
      }
    ]
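Once the dataset is downloaded as JSON, it can be processed with ordinary tooling. As a small sketch, the snippet below tallies posts per category; it assumes each item has the `categories` list shown in the sample output, and `count_by_category` is a hypothetical helper name.

```python
import json
from collections import Counter
from typing import Iterable


def count_by_category(items: Iterable[dict]) -> Counter:
    """Count how many posts fall under each category name."""
    counts: Counter = Counter()
    for item in items:
        # Items without a "categories" field contribute nothing.
        counts.update(item.get("categories", []))
    return counts


# A trimmed copy of the sample output, used here as stand-in data.
sample = json.loads("""
[
  {"id": 123, "title": "Example Post Title",
   "categories": ["Technology", "News"], "tags": ["AI", "Programming"]}
]
""")

for name, total in count_by_category(sample).items():
    print(f"{name}: {total}")
```

The same pattern works for the `tags` field by swapping the key.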
Use Cases
- Content Aggregation – Collect and analyze posts from different WordPress sites.
- SEO Research – Extract content and metadata for SEO analysis.
- Data Science – Gather datasets for NLP or sentiment analysis.
- Backup and Archiving – Store blog content for future reference.
Support & Contributions
If you encounter any issues or have feature requests, feel free to open an issue or contribute to the project. Happy scraping! 🚀
Pricing
Pricing model
Pay per result
This Actor is paid per result. You are not charged for Apify platform usage; you pay only a fixed price for every 1,000 items in the Actor's output dataset.
Price per 1,000 items
$5.00
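As a rough worked example of this pricing, the snippet below estimates the charge for a given number of results. It assumes the cost scales linearly per item (so 250 items cost a quarter of the 1,000-item price), which this page does not state explicitly; `estimated_cost` is a hypothetical helper.

```python
from decimal import Decimal

# Price taken from this page: $5.00 per 1,000 results.
PRICE_PER_1000 = Decimal("5.00")


def estimated_cost(result_count: int) -> Decimal:
    """Estimate the charge in USD, assuming linear per-item billing."""
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    return PRICE_PER_1000 * result_count / 1000


print(estimated_cost(250))
```

`Decimal` is used instead of `float` so that currency arithmetic stays exact.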