Telegram Scraper + AI Analysis — Posts, Sentiment, MCP-Ready

Scrape any public Telegram channel and enrich each post with Gemini AI — sentiment, topics, summaries, translation, entities, and image descriptions. Built-in content moderation. MCP-ready for Claude Desktop and AI agents.

Pricing: Pay per event
Rating: 0.0 (0)
Developer: ML Boost (Maintained by Community)
Actor stats: 1 bookmark, 1 total user, 0 monthly active users; last modified 3 days ago

Telegram Scraper + AI Analysis

Scrape any public Telegram channel and get full message metadata — reactions, views, link previews, media URLs, forwards, replies — with optional Gemini AI enrichment for sentiment, topics, summaries, translation, and named entities.

Built for AI agents, data pipelines, and analytics workflows. No Telegram API credentials required — works on public channel previews.


What you get

  • 17+ fields per message: plain text, HTML with formatting, ISO timestamps, views as integers, full reactions array (including paid Telegram Stars), media URLs, video duration, link previews, extracted URLs and emails
  • Channel metadata: title, description, subscribers, photo/video/link counts, verification status, avatar
  • 6 AI enrichments (paid Apify plans only): sentiment, topic tags, one-line summaries, translation, entity extraction, and image descriptions (Gemini vision), each enabled independently. Free-tier users get full scraping; their AI flags are skipped and a notice record is emitted instead
  • Built-in content moderation via OpenAI (text + image): policy-violating posts are flagged and excluded from AI enrichment before any Gemini call, so you don't pay tokens on abusive content
  • Two output formats: JSON (default, full fidelity) or Markdown with YAML frontmatter (drop-in for RAG pipelines)
  • Incremental mode: store the last-seen message ID per channel and only fetch new posts on subsequent runs — perfect for Apify Schedules

Quick start

Minimal input:

{
  "channels": ["durov"],
  "maxMessagesPerChannel": 200
}

With AI enrichment:

{
  "channels": ["durov", "telegram"],
  "maxMessagesPerChannel": 500,
  "enrichSentiment": true,
  "enrichTopics": true,
  "enrichSummary": true
}

Channel inputs accept any of: durov, @durov, https://t.me/durov, t.me/s/durov.
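If you collect channel references from mixed sources, it can help to reduce them to one form before building the input. The sketch below shows one way to do that; `normalize_channel` is a hypothetical client-side helper, not part of the actor, and it only covers the four input forms listed above.

```python
import re

# Matches an optional scheme, an optional "www.", the t.me host, and the
# optional "s/" preview segment. Hypothetical helper, not the actor's own parser.
_T_ME_PREFIX = re.compile(r"^(?:https?://)?(?:www\.)?t\.me/(?:s/)?", re.IGNORECASE)


def normalize_channel(raw: str) -> str:
    """Reduce 'durov', '@durov', 'https://t.me/durov' or 't.me/s/durov' to 'durov'."""
    value = _T_ME_PREFIX.sub("", raw.strip())
    value = value.lstrip("@")
    # Drop anything after the username, such as a message ID or query string.
    value = value.split("/")[0].split("?")[0]
    if not re.fullmatch(r"[A-Za-z0-9_]+", value):
        raise ValueError(f"cannot extract a channel username from {raw!r}")
    return value


if __name__ == "__main__":
    for raw in ["durov", "@durov", "https://t.me/durov", "t.me/s/durov"]:
        print(raw, "->", normalize_channel(raw))
```

Normalizing first also makes it easy to deduplicate a channel list, so the same channel is not scraped and charged twice in one run.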


Pricing (pay-per-event)

| Event | Price | Marketing display |
| --- | --- | --- |
| Actor run start | $0.003 | $0.003 per run |
| Channel info record | $0.001 | $1 per 1,000 channels |
| Message record | $0.003 | $3 per 1,000 messages |
| AI enrichment call | $0.0008 | $0.80 per 1,000 calls |
| Gemini input tokens (per 100) | $0.0003 | $3 per 1M tokens |
| Gemini output tokens (per 100) | $0.0015 | $15 per 1M tokens |

Typical costs

| Scenario | Approx. cost |
| --- | --- |
| 1,000 messages, no AI | ~$3.00 |
| 1,000 messages with sentiment | ~$5.00 |
| 1,000 messages with sentiment + topics | ~$7.00 |
| 1,000 messages with all 5 text enrichments | ~$13.00 |
| 1,000 messages with 1 image each, descriptions only | ~$10.00 |
| 10,000 messages with sentiment (typical run) | ~$50.00 |

AI enrichment is opt-in per flag — pure scraping never triggers any AI cost. Token charges use Google's exact reported usage, with our 5–6× markup priced in.
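To see how the per-event prices add up to the typical costs, here is a minimal estimator sketch. The prices are copied from the pricing table above; `estimate_cost` is a hypothetical helper, the token counts in the example are illustrative guesses rather than measured values, and token charges are assumed to scale proportionally (how partial blocks of 100 tokens are rounded is not stated here).

```python
from decimal import Decimal

# Per-event prices in USD, copied from the pricing table above.
PRICES = {
    "run_start": Decimal("0.003"),
    "channel_info": Decimal("0.001"),
    "message": Decimal("0.003"),
    "ai_call": Decimal("0.0008"),
    "input_tokens_per_100": Decimal("0.0003"),
    "output_tokens_per_100": Decimal("0.0015"),
}


def estimate_cost(messages: int, channels: int = 1, ai_calls: int = 0,
                  input_tokens: int = 0, output_tokens: int = 0) -> Decimal:
    """Rough cost of one run; an estimate, not a billing guarantee."""
    total = PRICES["run_start"]
    total += PRICES["channel_info"] * channels
    total += PRICES["message"] * messages
    total += PRICES["ai_call"] * ai_calls
    # Assumption: token charges are proportional to the reported token counts.
    total += PRICES["input_tokens_per_100"] * input_tokens / 100
    total += PRICES["output_tokens_per_100"] * output_tokens / 100
    return total


if __name__ == "__main__":
    # Pure scraping: 1,000 messages from one channel.
    print(estimate_cost(messages=1000))
    # Sentiment on every message, assuming ~300 input and ~50 output tokens each.
    print(estimate_cost(messages=1000, ai_calls=1000,
                        input_tokens=300_000, output_tokens=50_000))
```

With those assumed token counts, the sentiment scenario comes to about $5.45, in the same range as the ~$5.00 listed above; real token usage depends on post length.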


Input parameters

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| channels | string[] | required | Channel usernames or t.me/... URLs |
| maxMessagesPerChannel | int | 200 | Capped at 5,000 |
| outputFormat | enum | json | json / markdown / both |
| includeChannelInfo | bool | true | Emit one channel metadata record per channel |
| incrementalMode | bool | false | Resume from KV-store cursor |
| enrichSentiment | bool | false | {label, score, rationale} |
| enrichTopics | bool | false | 1–5 tags from controlled vocabulary |
| enrichSummary | bool | false | One-sentence summary |
| enrichTranslation | bool | false | Translate to translationTarget |
| translationTarget | string | en | Language code (en, es, de, ru, zh, …) |
| enrichEntities | bool | false | {companies, people, places, tokens} |
| enrichImageDescriptions | bool | false | 1–2 sentence vision description for each image (up to 4/post) |
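The parameters above can also be assembled and submitted from code. The following is a sketch, assuming the official `apify-client` Python package is installed and that the actor ID is `ml_boost/tg-apify-actor` (the ID used in the MCP URLs in this README). `build_run_input` and `run_actor` are hypothetical helper names.

```python
from __future__ import annotations

from typing import Any

ACTOR_ID = "ml_boost/tg-apify-actor"  # assumption: same ID as in the MCP URLs
MAX_MESSAGES_CAP = 5000               # "Capped at 5,000" in the table above
OPTIONAL_FIELDS = {
    "outputFormat", "includeChannelInfo", "incrementalMode",
    "enrichSentiment", "enrichTopics", "enrichSummary",
    "enrichTranslation", "translationTarget",
    "enrichEntities", "enrichImageDescriptions",
}


def build_run_input(channels: list[str], max_messages: int = 200,
                    **options: Any) -> dict[str, Any]:
    """Build an input dict, rejecting field names not listed in the table."""
    if not channels:
        raise ValueError("channels is required and must not be empty")
    unknown = set(options) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown input fields: {sorted(unknown)}")
    return {
        "channels": list(channels),
        "maxMessagesPerChannel": min(max_messages, MAX_MESSAGES_CAP),
        **options,
    }


def run_actor(token: str, run_input: dict[str, Any]) -> list[dict[str, Any]]:
    """Start a run, wait for it to finish, and return all dataset items."""
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


if __name__ == "__main__":
    print(build_run_input(["durov", "telegram"], 500, enrichSentiment=True))
```

Checking field names locally catches typos such as `enrichSentiments` before a run is started and its start event is charged.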

Output schema

Each dataset record has a recordType field: channel_info, message, or error (returned for channels that cannot be scraped; see "What channels work").

channel_info record

{
  "recordType": "channel_info",
  "channel": "durov",
  "title": "Pavel Durov",
  "username": "durov",
  "description": "Founder of Telegram.",
  "subscribers": 11600000,
  "photos": 96,
  "videos": 40,
  "links": 183,
  "verified": true,
  "avatarUrl": "https://cdn4.telesco.pe/...",
  "url": "https://t.me/durov"
}

message record

{
  "recordType": "message",
  "channel": "durov",
  "messageId": 489,
  "url": "https://t.me/durov/489",
  "text": "Group admins can assign tags to users …",
  "textHtml": "<div ...>...</div>",
  "postedAt": "2026-04-15T01:20:29+00:00",
  "views": 2580000,
  "reactions": [
    {"emojiId": "5265077361648368841", "count": 23300, "isPaid": false},
    {"count": 3870, "isPaid": true}
  ],
  "reactionsTotal": 42280,
  "isForwarded": false,
  "isEdited": false,
  "author": "Pavel Durov",
  "mediaType": "video",
  "mediaUrl": "https://cdn4.telesco.pe/...mp4",
  "videoDuration": "0:09",
  "imageUrls": [],
  "linkPreview": null,
  "extractedLinks": [],
  "extractedEmails": [],
  "sentiment": {"label": "positive", "score": 0.95, "rationale": "..."},
  "topics": ["tech", "business"],
  "summary": "Telegram group admins can now assign user tags…",
  "translation": "Los administradores de grupo pueden…",
  "entities": {"companies": [], "people": [], "places": [], "tokens": []},
  "imageDescriptions": [
    {
      "url": "https://cdn4.telesco.pe/file/…",
      "description": "A smartphone screen displays an audio selection menu …",
      "contains_text": true
    }
  ]
}

AI enrichment fields are only present when their corresponding flag is enabled.

If a post fails moderation, all AI fields are skipped and a moderation field is set:

{
  "moderation": {"flagged": true, "categories": ["harassment", "violence"]}
}
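Because AI fields are optional and flagged posts carry a moderation field instead, downstream code should not assume any enrichment key is present. The sketch below shows one defensive way to walk a dataset; `summarize_records` is a hypothetical helper, and it relies only on the field names shown in the examples above.

```python
from __future__ import annotations

from collections import Counter
from typing import Any, Iterable


def summarize_records(items: Iterable[dict[str, Any]]) -> dict[str, Any]:
    """Count record types, collect flagged message IDs, and tally sentiment labels."""
    counts = Counter()
    sentiment_labels = Counter()
    flagged_ids = []
    for item in items:
        record_type = item.get("recordType", "unknown")
        counts[record_type] += 1
        if record_type != "message":
            continue
        # Flagged posts have no AI fields, so handle them before reading any.
        if item.get("moderation", {}).get("flagged"):
            flagged_ids.append(item["messageId"])
            continue
        # "sentiment" exists only when enrichSentiment was enabled for the run.
        sentiment = item.get("sentiment")
        if sentiment:
            sentiment_labels[sentiment["label"]] += 1
    return {
        "counts": dict(counts),
        "flagged_ids": flagged_ids,
        "sentiment_labels": dict(sentiment_labels),
    }


if __name__ == "__main__":
    sample = [
        {"recordType": "channel_info", "channel": "durov"},
        {"recordType": "message", "messageId": 489,
         "sentiment": {"label": "positive", "score": 0.95, "rationale": "..."}},
        {"recordType": "message", "messageId": 490,
         "moderation": {"flagged": True, "categories": ["harassment"]}},
    ]
    print(summarize_records(sample))
```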

Hard safety caps

These prevent any accidental runaway cost:

  • 5,000 messages per channel per run
  • 50,000 total messages per run
  • 10,000 total AI enrichments per run (safety cap; typical paid runs are well below)
  • 2,000-character input clamp before each Gemini call
  • 500-token output cap per Gemini call
  • 4 images max per post when image descriptions are enabled
  • 6 MB max image size for vision processing
  • 60 Gemini calls per minute (self-rate-limited)
  • 30-minute run timeout
  • Posts flagged by OpenAI content moderation skip all AI calls automatically

Use it from AI agents (MCP)

This actor is exposed as a tool via the Apify MCP server — Claude Desktop, Cursor, ChatGPT, n8n, and LangChain can all call it with natural language. The input schema is written so an LLM can pick the right flags without needing additional prompting.

Claude Desktop

Add this to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "apify": {
      "url": "https://mcp.apify.com?tools=ml_boost/tg-apify-actor",
      "headers": {
        "Authorization": "Bearer <YOUR_APIFY_TOKEN>"
      }
    }
  }
}

Restart Claude Desktop and try: "Scrape the last 100 posts from @durov and give me sentiment and topics for each."

Cursor

In Cursor settings → MCP, add a new server:

  • URL: https://mcp.apify.com?tools=ml_boost/tg-apify-actor
  • Headers: Authorization: Bearer <YOUR_APIFY_TOKEN>

Local stdio (npx) — alternative

If you prefer a local wrapper instead of the hosted HTTP endpoint:

{
  "mcpServers": {
    "apify": {
      "command": "npx",
      "args": ["-y", "@apify/actors-mcp-server", "--tools", "ml_boost/tg-apify-actor"],
      "env": { "APIFY_TOKEN": "<YOUR_APIFY_TOKEN>" }
    }
  }
}

The hosted endpoint is recommended — it supports output-schema inference, so the agent sees the structured response shape, not just raw JSON.

n8n / LangChain / Zapier

Use the Apify integration node and pick this actor. The input schema renders as a form with sensible defaults; pass channels and toggle enrichments as needed.

ChatGPT Custom GPT

Create a Custom GPT with an Action pointing at https://api.apify.com/v2/acts/ml_boost~tg-apify-actor/run-sync-get-dataset-items?token=YOUR_TOKEN. ChatGPT will read the input schema automatically.
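The same synchronous endpoint can be called from any HTTP client. Here is a minimal sketch using only the Python standard library; `build_request` and `fetch_items` are hypothetical helper names, and the token is sent as a Bearer header (which the Apify API also accepts) rather than in the query string so it stays out of logged URLs.

```python
from __future__ import annotations

import json
import urllib.request
from typing import Any

# Endpoint taken from the paragraph above, without the token query parameter.
ENDPOINT = ("https://api.apify.com/v2/acts/ml_boost~tg-apify-actor"
            "/run-sync-get-dataset-items")


def build_request(token: str, run_input: dict[str, Any]) -> urllib.request.Request:
    """Prepare a POST request whose JSON body is the actor input."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def fetch_items(token: str, run_input: dict[str, Any],
                timeout: float = 300.0) -> list[dict[str, Any]]:
    """Run the actor synchronously and return the dataset items as a list."""
    with urllib.request.urlopen(build_request(token, run_input), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    req = build_request("<YOUR_APIFY_TOKEN>", {"channels": ["durov"]})
    print(req.get_method(), req.full_url)
```

Large or AI-enriched runs may outlast the synchronous endpoint's time limit; for those, starting the run asynchronously and reading the dataset afterwards is likely the safer pattern.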

Why this actor is good for AI agents

  • Structured output: Every record has stable, documented field names — no string parsing required.
  • Opt-in enrichment: Agents enable only the AI fields they need, keeping cost low for simple queries.
  • Idempotent incremental mode: Run the same query once per hour and get only the posts published since the previous run, with no duplicates.
  • Built-in moderation: The actor refuses to enrich content that fails OpenAI's policy check, so agents don't accidentally launder abusive content through downstream AI calls.

Scheduling

Combine incrementalMode: true with an Apify Schedule (e.g. every 6 hours) to build a near-real-time monitor for any channel; shorten the interval if you need fresher data. The actor stores per-channel cursors in its Key-Value store; only new posts since the last run are fetched and charged.
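The cursor idea can be illustrated in a few lines. This is a conceptual sketch only, not the actor's implementation: a plain dict stands in for the Key-Value store, `select_new` is a hypothetical name, and message IDs are assumed to increase over time within a channel.

```python
from __future__ import annotations

from typing import Any


def select_new(messages: list[dict[str, Any]], cursors: dict[str, int],
               channel: str) -> list[dict[str, Any]]:
    """Return messages newer than the stored cursor and advance the cursor."""
    last_seen = cursors.get(channel, 0)
    fresh = sorted(
        (m for m in messages if m["messageId"] > last_seen),
        key=lambda m: m["messageId"],
    )
    if fresh:
        # Persist the highest ID seen so the next run starts after it.
        cursors[channel] = fresh[-1]["messageId"]
    return fresh


if __name__ == "__main__":
    cursors: dict[str, int] = {}
    first = select_new([{"messageId": 488}, {"messageId": 489}], cursors, "durov")
    second = select_new([{"messageId": 489}, {"messageId": 490}], cursors, "durov")
    print(len(first), len(second), cursors)
```

Running the same selection twice with no new posts returns an empty list and leaves the cursor unchanged, which is what makes scheduled reruns safe to repeat.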


What channels work

Any public Telegram channel with a t.me preview page — that's most of them. The actor will return a clear error record (recordType: "error") for:

  • Private channels (no preview available)
  • Telegram users or bots (not channels)
  • Channels that don't exist
  • Groups (out of scope for v1 — use Telegram's Bot API or MTProto for groups)

Limits and known scope

This is v1. Features explicitly deferred to later releases:

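  • Dedicated MCP server layer (coming next; the actor is already callable through the Apify MCP server, see "Use it from AI agents (MCP)")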
  • Comments / discussion replies under posts
  • Group chats (channels only for now)
  • Cross-channel forward deduplication (forwardChainId)
  • Member list extraction (not exposed on t.me previews)
  • Spam / bot scoring

Support

Leave a review or contact the maintainer via the Apify Store actor page. Reviews and bug reports are read and responded to within 24h.