Extracts real news headlines from the CNN homepage or section pages
Optionally follows links to extract full article content, author, and publish date
Outputs clean, structured data for further processing or analysis

Usage

1. Input Options

Configure your run using the following input fields (see .actor/input_schema.json for details):

Field	Type	Description	Default
`startUrls`	array	List of URLs to start scraping from (homepage or section pages)	`["https://www.cnn.com/"]`
`maxHeadlines`	integer	Maximum number of headlines to extract and visit	`20`
`includeArticleDetails`	boolean	If true, scrape full article details for each headline	`false`

Example input:

{
  "startUrls": [
    { "url": "https://edition.cnn.com/" }
  ],
  "maxHeadlines": 10,
  "includeArticleDetails": true
}
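
The table lists the startUrls default as plain strings, while the example input uses objects with a url key. A minimal sketch of how input could be normalized to accept both forms is shown below. It is an illustration only, not the Actor's actual code: the normalize_input helper and the DEFAULTS mapping are hypothetical names, and the validation rules are assumptions based on the field descriptions above.

```python
from typing import Any, Dict, Optional

# Defaults copied from the input table above.
DEFAULTS: Dict[str, Any] = {
    "startUrls": ["https://www.cnn.com/"],
    "maxHeadlines": 20,
    "includeArticleDetails": False,
}


def normalize_input(raw: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge user input with the documented defaults and flatten startUrls to strings."""
    merged = {**DEFAULTS, **(raw or {})}

    urls = []
    for entry in merged["startUrls"]:
        # Accept both "https://..." and {"url": "https://..."} entries.
        url = entry.get("url") if isinstance(entry, dict) else entry
        if isinstance(url, str) and url.strip():
            urls.append(url.strip())
    if not urls:
        raise ValueError("startUrls must contain at least one URL")

    max_headlines = merged["maxHeadlines"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_headlines, bool) or not isinstance(max_headlines, int) or max_headlines < 1:
        raise ValueError("maxHeadlines must be a positive integer")

    return {
        "startUrls": urls,
        "maxHeadlines": max_headlines,
        "includeArticleDetails": bool(merged["includeArticleDetails"]),
    }
```

Calling normalize_input with the example input above would yield a single start URL string, maxHeadlines of 10, and includeArticleDetails set to True; calling it with no input falls back to the documented defaults.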

2. Output Format

Each result in the dataset has one of two shapes, depending on includeArticleDetails:

Headline only:

{
  "title": "‘Superman’ smashes box office expectations, soaring towards $130 million opening",
  "url": "https://www.cnn.com/2025/07/13/entertainment/superman-box-office-intl",
  "source": "CNN",
  "scrapedAt": "2025-07-13T12:56:40.535Z"
}

With article details (if enabled):

{
  "title": "‘Superman’ smashes box office expectations, soaring towards $130 million opening",
  "content": "Full article text ...",
  "author": "CNN Staff",
  "publishedDate": "2025-07-13T10:00:00Z",
  "url": "https://www.cnn.com/2025/07/13/entertainment/superman-box-office-intl",
  "source": "CNN",
  "scrapedAt": "2025-07-13T12:56:40.535Z"
}

How It Works

The Actor visits each provided start URL and extracts up to maxHeadlines news headlines.
If includeArticleDetails is true, it follows each headline link and scrapes the article's content, author, and publish date.
Results are saved to the default Apify dataset for download in JSON, CSV, or Excel formats.
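
The headline-extraction step described above can be sketched roughly as follows. This is a simplified illustration using only the Python standard library, not the Actor's real implementation: the HeadlineLinkParser class and extract_headlines function are hypothetical, and treating every link with visible text as a headline is an assumption (the Actor presumably uses CNN-specific selectors that are not documented here).

```python
from datetime import datetime, timezone
from html.parser import HTMLParser
from urllib.parse import urljoin


class HeadlineLinkParser(HTMLParser):
    """Collects (text, href) pairs for every <a> element that has an href and visible text."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace inside the link text.
            title = " ".join("".join(self._text).split())
            if title:
                self.links.append((title, self._href))
            self._href = None


def extract_headlines(html, base_url, max_headlines):
    """Return up to max_headlines records shaped like the 'Headline only' output."""
    parser = HeadlineLinkParser()
    parser.feed(html)
    scraped_at = (
        datetime.now(timezone.utc).isoformat(timespec="milliseconds").replace("+00:00", "Z")
    )
    records, seen = [], set()
    for title, href in parser.links:
        if len(records) >= max_headlines:
            break
        url = urljoin(base_url, href)  # resolve relative links against the start URL
        if url in seen:  # skip links that point to an already collected article
            continue
        seen.add(url)
        records.append({"title": title, "url": url, "source": "CNN", "scrapedAt": scraped_at})
    return records
```

With includeArticleDetails enabled, each returned url would then be fetched in turn to fill in content, author, and publishedDate.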

Customization

To target a specific section (e.g., World, Business), set the appropriate startUrls.
Adjust maxHeadlines to control how many headlines are extracted (and, with includeArticleDetails enabled, how many article pages are visited).
Set includeArticleDetails to true for full article scraping.

⚠️ Important Notes

Respect CNN's Terms of Service - Use this Actor responsibly and in accordance with CNN's policies.
Rate Limiting - Avoid making too many requests in a short period to prevent overloading CNN's servers.
Proxy Usage - For large-scale scraping, consider using proxies to avoid IP blocking.
Data Usage - Ensure you have permission to use scraped data for your intended purpose.
Public Content Only - This Actor can only scrape publicly accessible CNN news articles.
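
For the rate-limiting note above, one simple approach is to enforce a minimum delay between consecutive requests. The sketch below is a generic illustration and not part of this Actor: the Throttle class is a hypothetical helper, and any interval you pick (for example 2 seconds) is an arbitrary choice, not a limit published by CNN.

```python
import time


class Throttle:
    """Enforces a minimum interval between consecutive requests."""

    def __init__(self, min_interval_s, clock=time.monotonic, sleep=time.sleep):
        self.min_interval_s = min_interval_s
        self._clock = clock  # injectable for testing
        self._sleep = sleep  # injectable for testing
        self._last = None    # time of the previous request, if any

    def wait(self):
        """Block until at least min_interval_s has passed since the previous call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval_s - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

A caller would create one Throttle per target host and call wait() immediately before each request, so the first request goes out at once and later ones are spaced out.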

⚖️ Legal Disclaimer

This project is intended for educational and research purposes only. When using this Actor, please comply with CNN's Terms of Service and relevant robots.txt policies. Use this tool responsibly and avoid aggressive scraping that could negatively impact CNN's website infrastructure.
