Celebrity Mention Scraper

Pricing

from $0.10 / site scrape

Find out how often celebrities and public figures are talked about across any website. Celebrity Mention Scraper crawls the pages you specify, counts every mention of each name, and tells you exactly which pages they appear on — sorted by frequency.

Developer

Kostiantyn (maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 15 days ago

Before you start: This scraper uses plain HTTP requests — it cannot execute JavaScript or log into websites. Content loaded dynamically (infinite scroll, JS-rendered articles) and pages behind a login, paywall, or cookie consent wall will not be accessible. It works best on public, server-rendered sites such as news outlets, blogs, forums, and wikis.

What does it do?

Give it a list of websites and a list of names. It crawls each site up to a configurable depth, scans every page for whole-word name matches, and returns a ranked breakdown: who is mentioned most, on which pages, and how many times.

Typical use cases:

  • Track how much coverage a celebrity gets on entertainment or news sites
  • Monitor public figures across fan forums, blogs, or review sites
  • Measure influencer presence on a media property before a sponsorship deal
  • Research how often athletes or musicians appear in sports or music publications
  • Compare celebrity coverage across competing publications

How to use

  1. Open the Actor and click Try for free
  2. Enter the websites you want to scan under Start URLs
  3. Add the celebrity or person names you want to track
  4. Click Start — results appear as pages are crawled
  5. Download from the Output tab in JSON, CSV, or Excel

Input

Field      Type   Description
startUrls  array  Websites to crawl (required)
names      array  Celebrity / person names to search for (required)

Use full names for best accuracy — "Taylor Swift" matches more precisely than "Taylor", which would also pick up unrelated content.

Example:

{
  "startUrls": [
    { "url": "https://rollingstone.com" },
    { "url": "https://billboard.com" }
  ],
  "names": ["Taylor Swift", "Beyoncé", "Billie Eilish", "Sabrina Carpenter"]
}
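
For scripted runs, the same input can be assembled in code. The following is a minimal Python sketch, not part of the Actor: `build_run_input` is a hypothetical helper that only mirrors the two documented fields, and the de-duplication step is an assumption about what is convenient rather than something the Actor requires.

```python
import json


def build_run_input(urls, names):
    """Build an input dict in the documented shape (hypothetical helper)."""
    # Drop blank entries and duplicates while keeping the original order.
    clean_names = list(dict.fromkeys(n.strip() for n in names if n.strip()))
    if not urls or not clean_names:
        raise ValueError("startUrls and names are both required")
    return {
        "startUrls": [{"url": u} for u in urls],
        "names": clean_names,
    }


run_input = build_run_input(
    ["https://rollingstone.com", "https://billboard.com"],
    ["Taylor Swift", "Beyoncé", "Taylor Swift "],
)
print(json.dumps(run_input, ensure_ascii=False, indent=2))
```

The printed JSON can be pasted into the Actor's input editor.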

Output

One row per celebrity per site, sorted by total mentions descending.

[
  {
    "name": "Taylor Swift",
    "startUrl": "https://rollingstone.com",
    "totalMentions": 47,
    "pageCount": 12,
    "pages": [
      { "url": "https://rollingstone.com/music/eras-tour-review/", "mentions": 9 },
      { "url": "https://rollingstone.com/music/album-of-year/", "mentions": 6 }
    ],
    "scrapedAt": "2026-04-15T10:00:00.000Z"
  }
]

Field          Description
name           The celebrity name as entered
startUrl       The site that was crawled
totalMentions  Total name occurrences across all pages on that site
pageCount      Number of distinct pages the name appeared on
pages          Per-page breakdown, sorted by mention count
scrapedAt      Timestamp of the crawl
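
Because each row covers one name on one site, comparing a name across sites means summing rows. Here is a minimal Python sketch, assuming the dataset has been downloaded as JSON and loaded into a list of dicts. The sample values are illustrative, not real results, and `mentions_by_name` is a hypothetical helper.

```python
from collections import Counter

# Sample rows in the documented output shape (values are illustrative).
rows = [
    {"name": "Taylor Swift", "startUrl": "https://rollingstone.com",
     "totalMentions": 47, "pageCount": 12},
    {"name": "Taylor Swift", "startUrl": "https://billboard.com",
     "totalMentions": 30, "pageCount": 8},
    {"name": "Beyoncé", "startUrl": "https://rollingstone.com",
     "totalMentions": 21, "pageCount": 5},
]


def mentions_by_name(rows):
    """Sum totalMentions per name across all crawled sites."""
    totals = Counter()
    for row in rows:
        totals[row["name"]] += row["totalMentions"]
    # most_common() returns (name, total) pairs, highest total first.
    return totals.most_common()


ranking = mentions_by_name(rows)
for name, total in ranking:
    print(f"{name}: {total}")
```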

How matching works

  • Case-insensitive — "beyoncé", "Beyoncé", and "BEYONCÉ" all count
  • Whole-word — "Swift" won't match inside "swiftly"
  • Full name recommended — multi-word names like "LeBron James" are matched as a phrase
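
The rules above can be approximated with a regular expression. The Actor's actual implementation is not shown on this page, so this is only a sketch under assumptions: `count_mentions` is a hypothetical function, and tolerating any whitespace between the words of a name is an assumed behavior.

```python
import re


def count_mentions(text, name):
    """Count case-insensitive, whole-word occurrences of name in text."""
    words = [re.escape(w) for w in name.split()]
    if not words:
        return 0
    # Allow any run of whitespace between the words of a multi-word name,
    # and use lookarounds to reject matches glued to other word characters.
    pattern = r"(?<!\w)" + r"\s+".join(words) + r"(?!\w)"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


sample = "Beyoncé and BEYONCÉ toured. Swift moved swiftly. LeBron\nJames scored."
print(count_mentions(sample, "Beyoncé"))
print(count_mentions(sample, "Swift"))
print(count_mentions(sample, "LeBron James"))
```

The lookarounds are used instead of `\b` so the boundary check still works when a name starts or ends with a non-word character.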

Pricing

$0.10 per site that returned at least one result. Sites crawled with zero mentions are not charged.
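
The billing rule can be sketched as a quick estimate over the output rows. This assumes a site is billed once no matter how many names matched on it, and it uses the $0.10 rate quoted above. The listing says "from $0.10", so the actual charge may differ. `estimate_cost` is a hypothetical helper and the sample rows are illustrative.

```python
from decimal import Decimal

PRICE_PER_SITE = Decimal("0.10")  # rate quoted above; actual rate may differ


def estimate_cost(rows):
    """Charge once per start URL that produced at least one mention."""
    billable_sites = {
        row["startUrl"] for row in rows if row["totalMentions"] > 0
    }
    return PRICE_PER_SITE * len(billable_sites)


# Illustrative rows: two names on one site, one name on another.
rows = [
    {"name": "Taylor Swift", "startUrl": "https://rollingstone.com", "totalMentions": 47},
    {"name": "Beyoncé", "startUrl": "https://rollingstone.com", "totalMentions": 21},
    {"name": "Taylor Swift", "startUrl": "https://billboard.com", "totalMentions": 30},
]
cost = estimate_cost(rows)
print(f"Estimated cost: ${cost}")
```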

Advanced settings

Field               Default  Description
maxDepth            2        Number of link levels to follow. 0 = start URL only, 1 = directly linked pages, 2 = one level deeper. Max: 3
maxRequestsPerSite  200      Page cap per site. Max: 300
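
To make the interaction of the two limits concrete, here is a minimal Python sketch of a depth-limited crawl over a toy in-memory link graph. The breadth-first order, the `LINKS` graph, and the `crawl` function are all assumptions for illustration. The Actor's real crawl order is not documented here.

```python
from collections import deque

# Toy link graph standing in for a website (hypothetical pages).
LINKS = {
    "/": ["/news", "/music"],
    "/news": ["/news/a", "/"],
    "/music": ["/music/b"],
    "/news/a": ["/news/a/deep"],
    "/music/b": [],
    "/news/a/deep": [],
}


def crawl(start, max_depth=2, max_requests=200):
    """Return pages visited breadth-first within the depth and page caps."""
    visited = []
    seen = {start}
    queue = deque([(start, 0)])
    while queue and len(visited) < max_requests:
        page, depth = queue.popleft()
        visited.append(page)
        if depth == max_depth:
            continue  # do not follow links past the depth limit
        for link in LINKS.get(page, []):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited


print(crawl("/", max_depth=0))
print(crawl("/", max_depth=2))
print(crawl("/", max_depth=2, max_requests=3))
```

With `max_depth=2`, the page `/news/a/deep` sits three links from the start URL and is never fetched. Lowering `max_requests` cuts the crawl short even when deeper pages are still within the depth limit.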

Notes

This Actor uses plain HTTP requests — no browser, no JavaScript execution. It works well on server-rendered sites (news sites, blogs, forums, Wikipedia) but won't capture content loaded dynamically by JavaScript. Social media platforms are not accessible via plain HTTP.