Twitch VOD Chat Archive

Export the full timestamped chat replay attached to any public Twitch VOD as a one-row-per-message dataset. Includes user color, badges, emote IDs, and message offsets. No login. The go-to Twitch chat scraper after earlier Apify Twitch actors were deprecated.

Pricing

Pay per event

Developer

DevilScrapes

Maintained by Community

🎯 What this scrapes

Twitch retains a complete timestamped chat replay for every VOD as long as the VOD itself exists. This Actor walks the same VideoCommentsByOffsetOrCursor GraphQL endpoint that Twitch's own VOD player uses, paginates by content offset (the only mode that avoids Twitch's integrity-check challenge), and emits one clean row per chat message — ready for analytics, moderation-classifier training, or post-broadcast review.
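The offset-based walk described above can be sketched roughly as follows. This is an illustration only, not the Actor's source: `fetch_page` is a hypothetical callable standing in for the GraphQL request, and the page shape (a list of dicts with `id` and `offset` keys) is an assumption made for the example.

```python
from typing import Callable, Iterator

# Assumed page shape: a list of {"id": str, "offset": int} dicts (hypothetical).
Page = list[dict]


def walk_chat(fetch_page: Callable[[int], Page],
              start_offset: int = 0,
              max_messages: int = 5000) -> Iterator[dict]:
    """Yield each chat message once, paging by content offset (seconds)."""
    seen: set[str] = set()
    offset = start_offset
    emitted = 0
    while emitted < max_messages:
        page = fetch_page(offset)
        if not page:
            break  # nothing at or after this offset: end of the replay
        for msg in page:
            if msg["id"] in seen:
                continue  # pages overlap at the boundary second, so dedupe by ID
            seen.add(msg["id"])
            yield msg
            emitted += 1
            if emitted >= max_messages:
                return
        last = max(m["offset"] for m in page)
        # Resume from the last second seen; step one second if the page did not advance.
        offset = last if last > offset else offset + 1
```

Because consecutive pages overlap at the boundary second, the sketch deduplicates on message ID. A second holding more messages than one page returns would lose its tail in this simplified version.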

🔥 What we handle for you

  • 🛡️ Browser fingerprint rotation: curl-cffi impersonates real Chrome / Firefox / Safari TLS handshakes so the target sees a browser, not Python.
  • 🌐 Residential proxy rotation via Apify Proxy — fresh session and exit IP on every block.
  • 🔁 Retries with exponential backoff on 408 / 429 / 5xx — up to 5 attempts per page, Retry-After honoured.
  • 🧱 Rate-limit-aware pacing — when the target pushes back, we slow down instead of getting banned.
  • 🧊 Clean, typed dataset rows — Pydantic-validated, ISO-8601 timestamps, stable IDs, JSON / CSV / Excel export straight from the Apify Console.
  • 💰 Pay-Per-Event pricing — you only pay for results that hit your dataset. No data, no charge.
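For readers curious what the retry policy in the list above looks like in practice, here is a minimal sketch. It assumes a hypothetical `send` callable returning `(status, headers, body)`, and the base delay and cap are illustrative values rather than the Actor's real settings. Only the numeric form of `Retry-After` is handled.

```python
import time
from typing import Callable, Optional

# Status codes treated as retryable: 408, 429, and any 5xx.
RETRYABLE = {408, 429} | set(range(500, 600))

Response = tuple[int, dict, bytes]  # (status, headers, body), assumed shape


def backoff_delay(attempt: int, retry_after: Optional[float] = None,
                  base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait after failed attempt number `attempt` (1-based)."""
    if retry_after is not None:
        return min(retry_after, cap)  # the server's hint wins when present
    return min(cap, base * 2 ** (attempt - 1))  # 1s, 2s, 4s, 8s, ...


def request_with_retries(send: Callable[[], Response],
                         max_attempts: int = 5,
                         sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `send` until it returns a non-retryable status or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status not in RETRYABLE:
            return status, headers, body
        if attempt < max_attempts:
            raw = headers.get("Retry-After", "")
            hint = float(raw) if raw.isdigit() else None
            sleep(backoff_delay(attempt, hint))
    raise RuntimeError(f"gave up after {max_attempts} attempts (last status {status})")
```

Passing `sleep` as a parameter keeps the function easy to exercise without real waiting. A production version would typically add random jitter to the delay.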

💡 Use cases

  • Community sentiment analysis — bulk-export chat from your last 20 streams and run NLP for hype-moment detection.
  • Moderation-classifier training — gather positive / negative chat samples at scale for an in-house spam or toxicity model.
  • Esports analytics — quantify hype peaks against game events by joining message density to the VOD timeline.
  • Post-broadcast review — streamers and mods download chat for an after-action review, search for usernames, or extract clips with chat context.
  • Academic research — public-record dataset of streamer-viewer conversation for media-studies research.
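As a starting point for the hype-peak use cases above, message density can be computed directly from `message_offset_seconds`. The sketch below assumes the rows have already been exported and loaded as a list of dicts. The function names are illustrative.

```python
from collections import Counter
from typing import Iterable


def messages_per_minute(rows: Iterable[dict]) -> dict[int, int]:
    """Map each VOD minute (0-based) to the number of chat messages in it."""
    counts = Counter(row["message_offset_seconds"] // 60 for row in rows)
    return dict(sorted(counts.items()))


def peak_minutes(rows: Iterable[dict], top: int = 3) -> list[tuple[int, int]]:
    """Return the `top` busiest minutes as (minute, message_count) pairs."""
    counts = Counter(row["message_offset_seconds"] // 60 for row in rows)
    return counts.most_common(top)
```

Joining the busiest minutes against a list of in-game events, keyed by the same VOD minute, gives a first approximation of which moments drove chat activity.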

⚙️ How to use it

  1. Click Try for free at the top of the page.
  2. Fill in the input form — most fields have sensible defaults.
  3. Click Start. Output streams into the run's dataset.
  4. Export from Storage → Dataset as JSON, CSV, or Excel — or fetch via the API.

📥 Input

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| vodIds | array | no | [] | List of Twitch VOD IDs or full VOD URLs (e.g. https://www.twitch.tv/videos/2773625679). Either this or … |
| channelLogin | string | no | none | Twitch channel login (the URL slug, e.g. shroud). Used only when vodIds is empty. The Actor fe… |
| maxRecentVods | integer | no | 5 | When channel mode is used, how many most-recent ARCHIVE VODs to pull (1–50). |
| maxMessagesPerVod | integer | no | 5000 | Stop walking chat once this many messages have been emitted for a single VOD (1–200000). A 6-hour stream typically has 5… |
| startOffsetSeconds | integer | no | 0 | Skip chat messages whose offset within the VOD is less than this value. Use 0 to start from the beginning. |
| proxyConfiguration | object | no | {"useApifyProxy": true, "apifyProxyGroups": ["RESIDENTIAL"]} | Twitch rate-limits a single IP aggressively past ~10k chat messages. Residential proxy strongly recommended for long VOD… |

Example input

{
  "vodIds": [
    "2773625679"
  ],
  "maxMessagesPerVod": 100,
  "startOffsetSeconds": 0,
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}
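If you build the input programmatically, a small helper can check the documented ranges before a run is started. This is a sketch: the helper names are made up for illustration, and the rule that at least one of `vodIds` or `channelLogin` must be present is an assumption based on the field notes. The Actor accepts full VOD URLs itself, so the ID normalization is optional.

```python
import re

VOD_URL = re.compile(r"twitch\.tv/videos/(\d+)")


def normalize_vod_id(value: str) -> str:
    """Accept a bare numeric VOD ID or a full VOD URL and return the ID."""
    value = value.strip()
    if value.isdigit():
        return value
    match = VOD_URL.search(value)
    if match:
        return match.group(1)
    raise ValueError(f"not a Twitch VOD ID or URL: {value!r}")


def build_input(vod_ids=(), channel_login=None, max_recent_vods=5,
                max_messages_per_vod=5000, start_offset_seconds=0) -> dict:
    """Validate values against the documented ranges and return the input dict."""
    if not vod_ids and not channel_login:
        raise ValueError("provide vod_ids or channel_login")  # assumed rule
    if not 1 <= max_recent_vods <= 50:
        raise ValueError("maxRecentVods must be between 1 and 50")
    if not 1 <= max_messages_per_vod <= 200_000:
        raise ValueError("maxMessagesPerVod must be between 1 and 200000")
    if start_offset_seconds < 0:
        raise ValueError("startOffsetSeconds must be 0 or greater")
    run_input = {
        "vodIds": [normalize_vod_id(v) for v in vod_ids],
        "maxRecentVods": max_recent_vods,
        "maxMessagesPerVod": max_messages_per_vod,
        "startOffsetSeconds": start_offset_seconds,
    }
    if channel_login:
        run_input["channelLogin"] = channel_login
    return run_input
```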

📤 Output

Every row is one dataset item.

| Field | Type | Notes |
| --- | --- | --- |
| vod_id | string | Twitch VOD ID (numeric string). |
| vod_title | string or null | VOD title. Populated when channel mode is used or when a metadata pre-fetch resolved it. |
| channel_login | string or null | Channel login (URL slug) for the VOD. Populated in channel mode. |
| message_id | string | Unique chat message UUID. |
| message_offset_seconds | integer | Position within the VOD when this message was posted, in seconds. |
| posted_at | string | Wall-clock UTC timestamp the message was posted (ISO 8601 with milliseconds, verbatim from Twitch). |
| commenter_id | string or null | Twitch user ID of the commenter. Null for deleted users. |
| commenter_login | string or null | Commenter login (URL slug). |
| commenter_display_name | string or null | Commenter display name. |
| message_text | string | Concatenated plain-text body of the message (emote shortcodes preserved as their literal text). |
| message_fragments | array | Structured fragments: list of {type: 'text' … |
| user_color | string or null | User's chat color (hex, e.g. '#DAA520'). Null when not set. |
| badges | array | List of {set_id, version} dicts. Empty list when the user has none. |
| is_subscriber | boolean | Convenience: true when 'subscriber' is in badges. |
| scraped_at | string | When this row was emitted (ISO 8601 UTC). |

Example output

{
  "vod_id": "2773625679",
  "vod_title": "never played forza but i definitely have a drivers license so it should be easy",
  "channel_login": "shroud",
  "message_id": "1292e052-0561-4db5-86c7-adfc4556d628",
  "message_offset_seconds": 12,
  "posted_at": "2026-05-16T18:42:35.297Z",
  "commenter_id": "142680597",
  "commenter_login": "tabrexs",
  "commenter_display_name": "tabrexs",
  "message_text": "PewPewPew",
  "message_fragments": [
    {
      "type": "emote",
      "text": "PewPewPew",
      "emote_id": "emotesv2_587405136a8147148c77df74baaa1bf4"
    }
  ],
  "user_color": "#DAA520",
  "badges": [],
  "is_subscriber": false,
  "scraped_at": "2026-05-16T19:00:00Z"
}
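A few helpers show how a row like the one above might be consumed downstream. They are a sketch under assumptions: the helper names are illustrative, text fragments are assumed to carry `type` and `text` keys like the emote fragment shown, and the VOD start time is an approximation because the offset is in whole seconds.

```python
from datetime import datetime, timedelta


def parse_posted_at(value: str) -> datetime:
    """Parse the ISO 8601 timestamp into a timezone-aware datetime."""
    # Older Python versions reject a trailing "Z", so rewrite it as an offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def approx_vod_start(row: dict) -> datetime:
    """Estimate when the VOD began: posted_at minus the message offset."""
    offset = timedelta(seconds=row["message_offset_seconds"])
    return parse_posted_at(row["posted_at"]) - offset


def emote_ids(row: dict) -> list[str]:
    """Collect the emote IDs used in a message, in order of appearance."""
    return [f["emote_id"] for f in row["message_fragments"]
            if f.get("type") == "emote" and f.get("emote_id")]


def has_badge(row: dict, set_id: str) -> bool:
    """True when the commenter carries a badge with the given set_id."""
    return any(b.get("set_id") == set_id for b in row["badges"])
```

For the example row, `has_badge(row, "subscriber")` agrees with the `is_subscriber` convenience field, and the same helper works for any other badge set a row contains.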

💰 Pricing

Pay-Per-Event — you pay only when these events fire:

| Event | USD | What it is |
| --- | --- | --- |
| actor-start | $0.05 | One-off warm-up charge per run |
| result-row | $0.001 | PPE event |

Example: 1 000 results at the rates above cost about $1.05 ($0.05 actor-start + 1 000 × $0.001 result-row). No subscription, no minimum, no card to start: Apify gives every new account $5 of free credit.
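The two event prices in the table combine into a simple estimate. The sketch below just applies those listed rates. `Decimal` is used to avoid floating-point rounding on currency amounts, and the function name is illustrative.

```python
from decimal import Decimal

ACTOR_START_USD = Decimal("0.05")   # charged once per run
RESULT_ROW_USD = Decimal("0.001")   # charged per dataset row


def estimate_cost(rows: int, runs: int = 1) -> Decimal:
    """Estimated charge in USD for `rows` results spread over `runs` runs."""
    return ACTOR_START_USD * runs + RESULT_ROW_USD * rows
```

For instance, `estimate_cost(5000)` covers one VOD capped at the default `maxMessagesPerVod` of 5000.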

🚧 Limitations

Twitch's public VOD chat replay endpoint is the only data source — no OAuth, no moderator-action log, no live chat, no DMs. On the FREE Apify plan only the BUYPROXIES94952 datacenter group is provisioned (5 IPs); residential proxy gives much better tolerance for long VODs and is recommended on paid plans. The persisted-query hash this Actor uses is a public Twitch web-player constant — if Twitch rotates it on a schema change, we ship a same-day patch.

❓ FAQ

Does this scrape live chat?

No — VOD chat replays only. Live chat is a different IRC-over-WSS protocol. Once the broadcast ends and the VOD is processed, its chat replay becomes accessible via this Actor.

Why are some VODs returning zero messages?

Most common causes: (a) the VOD has subscriber-only chat enabled, so anonymous queries get nothing; (b) the VOD has expired (Twitch keeps past broadcasts for 7 days on standard accounts, 14 days for Affiliates, and 60 days for Partners, Turbo, and Prime users); (c) the channel disabled chat replay. We surface the cause in the run status message.

Why does a long VOD take so long?

Twitch returns about 50–60 messages per page and rate-limits a single IP aggressively past ~10k messages. The Actor keeps one request in flight at a time and backs off on 429. For long VODs, use a residential proxy.

What about emote images?

We return the Twitch emote_id in each emote fragment. You construct the CDN URL yourself: https://static-cdn.jtvnw.net/emoticons/v2/<emote_id>/default/dark/3.0.
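Building that URL takes one line of code. The sketch below follows the template given above. The helper name is illustrative, and the alternative theme and scale values it allows ("light", "1.0", "2.0") are the other variants Twitch's CDN template commonly uses.

```python
EMOTE_CDN = "https://static-cdn.jtvnw.net/emoticons/v2"


def emote_url(emote_id: str, theme: str = "dark", scale: str = "3.0") -> str:
    """Return the CDN image URL for an emote ID taken from an emote fragment."""
    if theme not in {"dark", "light"}:
        raise ValueError("theme must be 'dark' or 'light'")
    if scale not in {"1.0", "2.0", "3.0"}:
        raise ValueError("scale must be '1.0', '2.0' or '3.0'")
    return f"{EMOTE_CDN}/{emote_id}/default/{theme}/{scale}"
```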

Are moderator actions (bans, timeouts) included?

No — the public chat-replay endpoint does not expose moderator action logs. Deleted messages may appear as <message deleted> or not at all, depending on when they were removed.

💬 Your feedback

Spotted a bug, hit a weird edge case, or need a new field? Open an issue on the Actor's Issues tab on Apify Console — we ship fixes weekly and we read every report.