Pricing

$19.99/month + usage

Article Content Extractor 📄

Developed by EasyApi

Extract clean article content, metadata and structured information from any web page. Supports multiple URLs and returns well-formatted JSON with title, description, content, author, publish date and more. 🔍📄

Runs succeeded: >99%

Last modified: 5 months ago

Developer tools

Integrations

Other

Extract clean article content and metadata from any web page automatically. This actor returns structured content from news sites, blogs, and other article-based websites.

Features ✨

Extract article content and metadata from any URL
Support batch processing of multiple URLs
Clean and structured JSON output
Built-in rate limiting to avoid overloading target sites
Robust error handling and validation
Fast and efficient processing

Output Data Structure 📊

The actor extracts the following information from each article:

Title
Description
Main content (both HTML and plain text)
Author
Publication date
Source domain
Featured image URL
Related links
Tags
Scraping timestamp

Use Cases 💡

Content aggregation and syndication
News monitoring and analysis
Research and data collection
Content migration
SEO analysis
Digital archiving

Limitations ⚠️

Respects robots.txt and implements polite scraping
2-second delay between requests to avoid overwhelming target servers
URLs must be valid and accessible
Content extraction quality depends on page structure
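
To make the first two limitations concrete, here is a minimal Python sketch of what a robots.txt check and a fixed delay between requests can look like. It is not the actor's own code, which is not shown on this page. The names is_allowed, Throttle, and DELAY_SECONDS are hypothetical and exist only for illustration; only the 2-second figure comes from the list above.

```python
import time
from urllib.robotparser import RobotFileParser

DELAY_SECONDS = 2.0  # assumption: mirrors the 2-second delay stated above


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class Throttle:
    """Enforce a minimum gap between consecutive requests (hypothetical helper)."""

    def __init__(self, delay=DELAY_SECONDS, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self) -> float:
        """Sleep if the previous request was too recent; return seconds slept."""
        now = self._clock()
        pause = 0.0
        if self._last is not None:
            pause = max(0.0, self.delay - (now - self._last))
            if pause:
                self._sleep(pause)
        self._last = now + pause
        return pause
```

A caller would check is_allowed for each URL, then call wait() on a shared Throttle before each fetch. The clock and sleep parameters are injectable so the pacing logic can be exercised without real waiting.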

Tips for Best Results 💪

Provide valid, accessible URLs
Use for public content only
Consider target website's terms of service
Monitor execution logs for any issues

Need help or have questions? Feel free to reach out!

Input Example

The input is a JSON object with a single urls array listing the pages to extract:

{
    "urls": [
        "https://cleartax.in/s/gst-hsn-lookup",
        "https://www.fancode.com/pickleball/schedule"
    ]
}
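
Since URLs must be valid and accessible, it can help to check the list before starting a run. The sketch below builds the input object shown above in Python. The build_input helper is hypothetical and not part of the actor, and it only checks URL syntax, not whether a page is reachable.

```python
import json
from urllib.parse import urlparse


def build_input(urls):
    """Return an input dict containing valid, de-duplicated http(s) URLs."""
    cleaned = []
    for raw in urls:
        url = raw.strip()
        parts = urlparse(url)
        # Reject anything without an http(s) scheme and a host name.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"Not a valid http(s) URL: {raw!r}")
        if url not in cleaned:  # keep first occurrence, preserve order
            cleaned.append(url)
    if not cleaned:
        raise ValueError("At least one URL is required")
    return {"urls": cleaned}


payload = build_input([
    "https://cleartax.in/s/gst-hsn-lookup",
    "https://www.fancode.com/pickleball/schedule",
])
print(json.dumps(payload, indent=4))
```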

Output sample

The results are stored in a dataset, which you can find in the Storage tab. Below is an excerpt of the data returned for the input above, in JSON. You can download the data as JSON, JSONL, Excel spreadsheet, HTML table, CSV, or XML.

[
    {
        "url": "https://www.fancode.com/pickleball/schedule",
        "title": "Pickleball Schedule - Check International and Domestic matches on FanCode",
        "description": "ABOUT FANCODEIndia's Premium Live Streaming, Live Scores &amp; Sports Merchandise Shopping platform FanCode has grown to become one of the most loved and followed all-sports destination in the last few years....",
        "content": "<div><p><label>ABOUT FANCODE</label><label>India's Premium Live Streaming, Live Scores &amp; Sports Merchandise Shopping platform FanCode has grown to become one of the most loved and followed all-sports destination in the last few years. The FanCode app has been downloaded by more than 3+ crore users. It offers interactive live streaming of all major sporting events, premier cricket tournaments, women's cricket, live football, basketball, baseball, wrestling, badminton, and other major sports. It also offer real-time match highlights, match videos, cricket videos, India cricket highlights, highlights of today's match, highlights of yesterday's match, cricket data, statistics, cricket analysis, fantasy insights, cricket updates, breaking news from India cricket and world of sports. It also offers sports merchandise for all major sporting leagues and teams from across the world.</label></p></div>",
        "author": "",
        "publishedDate": "",
        "source": "fancode.com",
        "image": "https://www.fancode.com/skillup-uploads/fc-web/home-page-new-arc/hero-image/v1/hero-image-dweb-v4.png",
        "links": [
            "https://www.fancode.com/pickleball/schedule"
        ],
        "tags": [],
        "scrapedAt": "2025-02-05T07:19:26.119Z"
    },
    ...
]
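
In the sample above, content is HTML with escaped entities such as &amp;, and missing fields such as author come back as empty strings. The following Python sketch shows one way to post-process a downloaded record using only the standard library. The helper names html_to_text and summarize are hypothetical, and the timestamp format is assumed from the single scrapedAt value shown.

```python
import html
from datetime import datetime, timezone
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collect text nodes, ignoring tags; entities are decoded by the parser."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_text(markup: str) -> str:
    """Strip tags from markup and collapse runs of whitespace."""
    collector = _TextCollector()
    collector.feed(markup)
    collector.close()
    return " ".join(" ".join(collector.chunks).split())


def summarize(record: dict) -> dict:
    """Normalize one output record; empty strings become None."""
    # Assumed format, based on the sample value "2025-02-05T07:19:26.119Z".
    scraped = datetime.strptime(
        record["scrapedAt"], "%Y-%m-%dT%H:%M:%S.%fZ"
    ).replace(tzinfo=timezone.utc)
    return {
        "url": record["url"],
        "title": html.unescape(record.get("title", "")),
        "text": html_to_text(record.get("content", "")),
        "author": record.get("author") or None,
        "published": record.get("publishedDate") or None,
        "scraped_at": scraped,
    }
```

Applied to each item of the downloaded JSON array, summarize yields plain text and a timezone-aware timestamp that are easier to index or compare than the raw fields.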

📄 URL Metadata Crawler - Extract comprehensive metadata from web pages including meta tags, favicons, and Open Graph tags.
🔍 Google News Scraper - Collect up to 5000 news articles with flexible search options and language support.
📚 arXiv Search Scraper - Extract comprehensive research paper data including titles, authors, and abstracts.
🔬 Nature Search Results Scraper - Extract research article data from Nature.com with detailed metadata.
📚 Medium Posts Search Scraper - Get detailed information about articles, authors, and engagement metrics from Medium.
📚 Substack Posts Scraper - Extract comprehensive post data including title, author, and publication details.
🔍 PubMed Search Scraper - Scrape research papers and academic articles with comprehensive metadata.
📄 WikiHow Article Scraper - Extract article titles, dates, views, and detailed step-by-step content.
🔍 Cointelegraph Search Scraper - Extract comprehensive article data including titles, authors, and publish dates.
📚 Medium User Posts Scraper - Extract detailed post data including engagement metrics and publication details.
🎯 Keyword Discovery Tool - Discover new keyword ideas and uncover valuable search insights.
🔍 Keyword Density Checker - Analyze webpage content to calculate keyword density and frequency.
🔍 AI-powered Search - Transform search queries into structured, AI-powered summaries with references.
📝 Text Summarization - Automatically generate concise summaries of documents while preserving original content.
🌐 Website Content to Markdown for LLM Training - Transform web content into clean, LLM-ready Markdown format.
