YouTube Shorts Sponsorship Signals Scraper

Detect brand-deal signals in a YouTube channel's recent Shorts — spoken sponsor mentions in the transcript, disclosure hashtags, @brand/domain mentions — and score each Short's sponsorship likelihood (0.0–1.0).

Developer

DevilScrapes

Maintained by Community

🎯 What this scrapes

Influencer-intelligence SaaS dashboards charge $200–$2,000/mo to tell agencies which creators run paid placements. This Actor surfaces the same signal from public data for a fraction of the cost. For every recent Short on a channel, it pulls the title, view count, duration, tags, description, and, crucially, the spoken transcript, then scores how likely the Short is to be sponsored and lists the exact phrases that fired.

Shorts descriptions are mostly empty, so description-only scrapers miss most deals. We read what the creator actually says, which is where Shorts sponsorships live.

🔥 What we handle for you

  • 🛡️ Browser fingerprint impersonation: curl-cffi mimics a real Chrome TLS + HTTP/2 handshake, so YouTube sees a browser, not a bot.
  • 🍪 Consent-gate handling: we send the consent cookie YouTube requires before it will list a channel's Shorts. Skip it and you get zero results; we don't skip it.
  • 🗣️ Transcript-first detection: spoken sponsor language is the primary signal, caught even when the description is blank.
  • 🌐 Optional proxy rotation via Apify Proxy: runs go direct (no proxy) by default, which works at low volume; a residential proxy helps if you hit rate limits at high volume.
  • 🧱 Graceful degradation: a Short with no captions still emits a row (has_transcript: false) instead of crashing the run.
  • 💰 Pay-Per-Event pricing: you only pay for Shorts that land in your dataset, plus a small per-run start charge.

💡 Use cases

  • Influencer-agency intel — see which creators run brand deals and which brands keep coming back.
  • Competitive brand monitoring — track where a competitor's products show up across creators' Shorts.
  • Sponsorship benchmarking — measure how often a niche's top channels run paid placements.
  • Disclosure compliance audits — flag Shorts with spoken sponsorships but no #ad disclosure.
  • Creator vetting — gauge a channel's commercial density before signing a deal.
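
The disclosure-audit use case can be sketched against the output fields this Actor emits (sponsorship_signals, hashtags, is_likely_sponsored). This is a minimal sketch, not part of the Actor: find_undisclosed is a hypothetical helper, and it assumes that entries in sponsorship_signals starting with "#" are hashtags while all other entries are spoken or written phrases.

```python
# Disclosure hashtags as listed in the scoring table of this page.
DISCLOSURE_TAGS = {"#ad", "#sponsored", "#spon", "#partner", "#partnership", "#collab"}


def find_undisclosed(rows):
    """Return likely-sponsored rows that carry no disclosure hashtag.

    Hypothetical helper: `rows` is a list of dataset rows (dicts) as exported
    from the run's dataset.
    """
    flagged = []
    for row in rows:
        # Assumption: non-hashtag signals are the spoken/written phrases.
        phrases = [s for s in row.get("sponsorship_signals", []) if not s.startswith("#")]
        tags = {t.lower() for t in row.get("hashtags", [])}
        if row.get("is_likely_sponsored") and phrases and not (tags & DISCLOSURE_TAGS):
            flagged.append(row)
    return flagged
```

Rows returned here are leads for manual review, since the score itself is a heuristic.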

⚙️ How to use it

  1. Click Try for free at the top of the page.
  2. Add one or more channel handles — @mkbhd, mkbhd, or a full youtube.com/@mkbhd URL all work.
  3. Click Start. Rows stream into the run's dataset as each Short is scored.
  4. Export from Storage → Dataset as JSON, CSV, or Excel — or fetch via the API.
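
For the API route in step 4, the sketch below only builds the download URL. It assumes the Apify API v2 dataset-items endpoint and its format query parameter; dataset_items_url is a hypothetical helper, and the dataset ID must come from your own run. Private datasets also need your API token, which is not handled here.

```python
from urllib.parse import quote, urlencode

API_BASE = "https://api.apify.com/v2"
# Only the three export formats named in step 4 are allowed in this sketch.
FORMATS = {"json", "csv", "xlsx"}


def dataset_items_url(dataset_id, fmt="json"):
    """Build the URL for downloading a dataset's items in the given format."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    query = urlencode({"format": fmt})
    return f"{API_BASE}/datasets/{quote(dataset_id, safe='')}/items?{query}"
```

Any HTTP client can then fetch the resulting URL.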

📥 Input

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| channels | array | yes | ["@mkbhd"] | Channel handles or URLs. Normalised to @handle. |
| maxShortsPerChannel | integer | no | 20 | Range 1–200. How many recent Shorts to inspect per channel. |
| language | string | no | "en" | Preferred transcript language; falls back to any available track. |
| minSponsorshipScore | number | no | 0.0 | Range 0–1. Only rows scoring ≥ this value are emitted. Set to 0.5 to keep only likely-sponsored Shorts. |
| proxyConfiguration | object | no | {"useApifyProxy": false} | Optional. Direct (no proxy) by default, because YouTube degrades the player and transcript endpoints for shared datacenter IPs. Enable a residential proxy only if you hit rate limits at high volume. |

Example input

{
  "channels": ["@mkbhd"],
  "maxShortsPerChannel": 20,
  "language": "en",
  "minSponsorshipScore": 0.0,
  "proxyConfiguration": { "useApifyProxy": false }
}
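
Before starting a run, you may want to check an input object against the documented ranges and defaults. The following is a minimal sketch under that assumption: validate_input is a hypothetical client-side helper, not the Actor's own validation, and it checks only the constraints stated in the input table.

```python
def validate_input(raw):
    """Apply the documented defaults and range checks to an input dict."""
    channels = raw.get("channels")
    if not isinstance(channels, list) or not channels:
        raise ValueError("channels must be a non-empty array")

    max_shorts = raw.get("maxShortsPerChannel", 20)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_shorts, bool) or not isinstance(max_shorts, int) or not 1 <= max_shorts <= 200:
        raise ValueError("maxShortsPerChannel must be an integer in 1-200")

    min_score = raw.get("minSponsorshipScore", 0.0)
    if isinstance(min_score, bool) or not isinstance(min_score, (int, float)) or not 0 <= min_score <= 1:
        raise ValueError("minSponsorshipScore must be a number in 0-1")

    return {
        "channels": channels,
        "maxShortsPerChannel": max_shorts,
        "language": raw.get("language", "en"),
        "minSponsorshipScore": float(min_score),
        "proxyConfiguration": raw.get("proxyConfiguration", {"useApifyProxy": False}),
    }
```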

📤 Output

Every row is one Short.

| Field | Type | Notes |
| --- | --- | --- |
| channel | string | Channel handle (@handle). |
| video_id | string | YouTube Short ID (11 chars). |
| url | string | Canonical Shorts URL. |
| title | string | Short title. |
| published_text | string or null | Human-readable publish text when available. |
| view_count | integer or null | Views at scrape time. |
| length_seconds | integer or null | Duration in seconds. |
| description | string | Short description (often empty for Shorts). |
| tags | array | Video keywords/tags. |
| has_transcript | boolean | True when a spoken transcript was retrieved. |
| transcript_chars | integer | Character length of the transcript. |
| hashtags | array | #hashtags found in description/transcript. |
| mentions | array | @mentions found in the description. |
| detected_brands | array | Brands inferred from mentions, domains, and tokens following "code" or "sponsored by". |
| sponsorship_signals | array | The literal phrases/tokens that fired. |
| sponsorship_score | number | Heuristic sponsorship likelihood, 0.0–1.0. |
| is_likely_sponsored | boolean | True when sponsorship_score >= 0.5. |
| scraped_at | string | ISO-8601 timestamp of the scrape. |

Example output

{
  "channel": "@mkbhd",
  "video_id": "n3V3LZh_r40",
  "url": "https://www.youtube.com/shorts/n3V3LZh_r40",
  "title": "My everyday carry, sponsored",
  "published_text": null,
  "view_count": 2805352,
  "length_seconds": 58,
  "description": "#ad",
  "tags": ["tech", "edc"],
  "has_transcript": true,
  "transcript_chars": 712,
  "hashtags": ["#ad"],
  "mentions": [],
  "detected_brands": ["Surfshark", "MARQUES"],
  "sponsorship_signals": ["this video is sponsored", "use code", "#ad"],
  "sponsorship_score": 1.0,
  "is_likely_sponsored": true,
  "scraped_at": "2026-06-13T10:00:00+00:00"
}
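
Rows in this shape can be aggregated to see which brands keep coming back across a channel's Shorts. The sketch below is one possible approach, not a feature of the Actor: recurring_brands and its min_count parameter are hypothetical, and it counts each brand at most once per Short.

```python
from collections import Counter


def recurring_brands(rows, min_count=2):
    """Return (brand, count) pairs for brands seen in at least `min_count`
    likely-sponsored Shorts, most frequent first, ties in alphabetical order."""
    counts = Counter()
    for row in rows:
        if row.get("is_likely_sponsored"):
            # set() avoids double-counting a brand within a single Short.
            counts.update(set(row.get("detected_brands", [])))
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(brand, n) for brand, n in ranked if n >= min_count]
```

Because detected_brands is inferred heuristically, for example from tokens after "code", expect some entries such as promo codes that are not brand names.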

🧮 How the score works

A transparent weighted sum over transcript + description + title, clamped to [0, 1]:

| Signal | Examples | Weight |
| --- | --- | --- |
| Spoken strong | sponsored by, this video is sponsored, for sponsoring, use code, promo code, paid partnership | +0.5 each |
| Spoken weak | check out, link in bio, link in description, discount, % off | +0.15 each |
| Disclosure hashtag | #ad, #sponsored, #spon, #partner, #partnership, #collab | +0.4 each |
| @mention / brand domain | @brand, brand.com (in description) | +0.1 each |

A single strong spoken signal (+0.5) reaches the 0.5 threshold on its own, so it is enough to mark the Short as likely sponsored. The fired phrases are returned in sponsorship_signals so you can audit every call.
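
The weighted sum described above can be sketched as follows. This is an illustration built only from the table, not the Actor's actual implementation: score_short is a hypothetical function, and the matching rules (case-insensitive substring matching, each distinct phrase counted once, only .com domains recognised) are assumptions.

```python
import re

STRONG = ("sponsored by", "this video is sponsored", "for sponsoring",
          "use code", "promo code", "paid partnership")
WEAK = ("check out", "link in bio", "link in description", "discount", "% off")
DISCLOSURE = {"#ad", "#sponsored", "#spon", "#partner", "#partnership", "#collab"}


def score_short(transcript, description, title):
    """Return (score, fired_signals) using the documented weights, clamped to [0, 1]."""
    text = " ".join((transcript, description, title)).lower()
    score, signals = 0.0, []

    # Spoken phrases: assumed case-insensitive substring match, once per phrase.
    for phrases, weight in ((STRONG, 0.5), (WEAK, 0.15)):
        for phrase in phrases:
            if phrase in text:
                score += weight
                signals.append(phrase)

    # Disclosure hashtags: whole-hashtag match, so "#advert" does not fire "#ad".
    for tag in sorted(set(re.findall(r"#\w+", text))):
        if tag in DISCLOSURE:
            score += 0.4
            signals.append(tag)

    # @mentions and brand domains, description only (assumed: .com domains only).
    for token in sorted(set(re.findall(r"@\w+|\b[\w-]+\.com\b", description.lower()))):
        score += 0.1
        signals.append(token)

    return round(min(1.0, max(0.0, score)), 2), signals
```

With these assumptions, a transcript containing "this video is sponsored by ..." fires two strong phrases at once, which already saturates the score at 1.0.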

💰 Pricing

Pay-Per-Event — you pay only when these events fire:

| Event | USD | What it is |
| --- | --- | --- |
| actor-start | $0.005 | One-off warm-up charge per run |
| result | $0.004 | Per Short written to the dataset |

Example: 1,000 Shorts ≈ $4.00 in result events, plus the $0.005 actor-start charge for the run. No subscription, no minimum, no card to start: Apify gives every new account $5 of free credit.
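
The arithmetic behind that example can be written out as a small estimator. It is a sketch that uses only the two event prices listed above; estimate_cost is a hypothetical helper, and it does not account for any proxy or platform costs that may apply separately.

```python
from decimal import Decimal

ACTOR_START_USD = Decimal("0.005")  # charged once per run
RESULT_USD = Decimal("0.004")       # charged per Short written to the dataset


def estimate_cost(shorts_emitted, runs=1):
    """Estimate the Pay-Per-Event charge in USD for a number of emitted Shorts."""
    if shorts_emitted < 0 or runs < 1:
        raise ValueError("shorts_emitted must be >= 0 and runs >= 1")
    return ACTOR_START_USD * runs + RESULT_USD * shorts_emitted


print(estimate_cost(1000))  # 4.005
```

Only emitted rows count as results, so raising minSponsorshipScore lowers the number of billed Shorts.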

🚧 Limitations

The score is a heuristic, not a legal determination: treat is_likely_sponsored as a strong lead, not proof. Some Shorts (music-only, no narration) have no captions; those still emit a row with has_transcript: false and are scored only on title, description, and hashtag signals. YouTube's official "Includes paid promotion" flag is not accessible to third parties, so we infer from spoken and textual signals instead.

❓ FAQ

Why transcripts instead of descriptions?

Because Shorts descriptions are almost always empty. The sponsorship lives in what the creator says, so we read the transcript first and treat description/hashtags as secondary signals.

What handle formats are accepted?

@mkbhd, mkbhd, youtube.com/@mkbhd, and full https://…/@mkbhd/shorts URLs all normalise to @mkbhd.
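
The normalisation described here could look like the sketch below. It is an assumption about the behaviour, not the Actor's code: normalise_handle is a hypothetical function, and the allowed handle characters (letters, digits, period, underscore, hyphen) are assumed.

```python
import re

# Assumed handle alphabet: letters, digits, period, underscore, hyphen.
HANDLE_IN_TEXT = re.compile(r"@([A-Za-z0-9._-]+)")
BARE_HANDLE = re.compile(r"[A-Za-z0-9._-]+")


def normalise_handle(value):
    """Map a handle, bare name, or channel URL to the @handle form."""
    value = value.strip()
    match = HANDLE_IN_TEXT.search(value)
    if match:
        # Covers "@mkbhd" and any URL that contains "/@mkbhd".
        return "@" + match.group(1)
    if BARE_HANDLE.fullmatch(value):
        # Covers a bare name such as "mkbhd".
        return "@" + value
    raise ValueError(f"cannot extract a channel handle from {value!r}")
```

Inputs without an @handle, such as /channel/ URLs, raise ValueError in this sketch rather than being guessed at.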

Do I need a proxy or API key?

No. The data path is keyless and works without a proxy at low volume. Add a proxy if you're scanning many channels.

Can I keep only the sponsored Shorts?

Yes — set minSponsorshipScore to 0.5 and only likely-sponsored Shorts are emitted (and billed).

💬 Your feedback

Spotted a bug, hit a weird edge case, or need a new field? Open an issue in the Actor's Issues tab in Apify Console. We ship fixes weekly and read every report.