Telegram Scraper + AI Analysis — Posts, Sentiment, MCP-Ready
Pricing
Pay per event
Scrape any public Telegram channel and enrich each post with Gemini AI — sentiment, topics, summaries, translation, entities, and image descriptions. Built-in content moderation. MCP-ready for Claude Desktop and AI agents.
Developer
ML Boost
Maintained by Community
Telegram Scraper + AI Analysis
Scrape any public Telegram channel and get full message metadata — reactions, views, link previews, media URLs, forwards, replies — with optional Gemini AI enrichment for sentiment, topics, summaries, translation, and named entities.
Built for AI agents, data pipelines, and analytics workflows. No Telegram API credentials required — works on public channel previews.
What you get
- 17+ fields per message: plain text, HTML with formatting, ISO timestamps, views as integers, full reactions array (including paid Telegram Stars), media URLs, video duration, link previews, extracted URLs and emails
- Channel metadata: title, description, subscribers, photo/video/link counts, verification status, avatar
- 6 AI enrichments (paid Apify plans only): sentiment, topic tags, one-line summaries, translation, entity extraction, and image descriptions (Gemini vision), each enabled independently. Free-tier users get full scraping; on the free tier, AI flags are skipped without failing the run and a notice record is emitted
- Built-in content moderation via OpenAI (text + image): policy-violating posts are flagged and excluded from enrichment before any Gemini call, so you don't pay for tokens on abusive content
- Two output formats: JSON (default, full fidelity) or Markdown with YAML frontmatter (drop-in for RAG pipelines)
- Incremental mode: store the last-seen message ID per channel and only fetch new posts on subsequent runs — perfect for Apify Schedules
Quick start
Minimal input:
{"channels": ["durov"],"maxMessagesPerChannel": 200}
With AI enrichment:
{"channels": ["durov", "telegram"],"maxMessagesPerChannel": 500,"enrichSentiment": true,"enrichTopics": true,"enrichSummary": true}
Channel inputs accept any of: durov, @durov, https://t.me/durov, t.me/s/durov.
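The actor accepts all four forms itself, so no client-side cleanup is required. If you want to de-duplicate a channel list before starting a run, the following is a minimal sketch. The helper names, the 4 to 32 character username pattern, and the lowercasing step are assumptions made for this example, not part of the actor.

```python
import re

# Hypothetical client-side helper; the actor does its own parsing.
# Assumption: usernames start with a letter and are 4-32 characters of
# letters, digits, or underscores, and are treated as case-insensitive.
_CHANNEL_RE = re.compile(
    r"^(?:(?:https?://)?t\.me/(?:s/)?|@)?(?P<name>[A-Za-z][A-Za-z0-9_]{3,31})/?$"
)


def normalize_channel(ref: str) -> str:
    """Reduce durov, @durov, https://t.me/durov, or t.me/s/durov to 'durov'."""
    match = _CHANNEL_RE.match(ref.strip())
    if match is None:
        raise ValueError(f"not a recognizable public channel reference: {ref!r}")
    return match.group("name").lower()


def unique_channels(refs):
    """Normalize every reference and drop duplicates, keeping first-seen order."""
    return list(dict.fromkeys(normalize_channel(ref) for ref in refs))
```

Passing the result of unique_channels as the channels input avoids listing the same channel twice under two spellings.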
Pricing (pay-per-event)
| Event | Price | Marketing display |
|---|---|---|
| Actor run start | $0.003 | $0.003 per run |
| Channel info record | $0.001 | $1 per 1,000 channels |
| Message record | $0.003 | $3 per 1,000 messages |
| AI enrichment call | $0.0008 | $0.80 per 1,000 calls |
| Gemini input tokens (per 100) | $0.0003 | $3 per 1M tokens |
| Gemini output tokens (per 100) | $0.0015 | $15 per 1M tokens |
Typical costs
| Scenario | Approx cost |
|---|---|
| 1,000 messages, no AI | ~$3.00 |
| 1,000 messages with sentiment | ~$5.00 |
| 1,000 messages with sentiment + topics | ~$7.00 |
| 1,000 messages with all 5 text enrichments | ~$13.00 |
| 1,000 messages with 1 image each, descriptions only | ~$10.00 |
| 10,000 messages with sentiment (typical run) | ~$50.00 |
AI enrichment is opt-in per flag — pure scraping never triggers any AI cost. Token charges use Google's exact reported usage, with our 5–6× markup priced in.
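To see how the event prices combine, here is a rough estimator built only from the pricing table. It assumes one AI enrichment call per message per enabled flag, and the default token counts (200 input, 40 output per call) are placeholder averages picked for illustration, not measured values. Real runs are billed on the token usage Google reports.

```python
# Event prices in USD, copied from the pricing table.
PRICE_RUN_START = 0.003
PRICE_CHANNEL_INFO = 0.001
PRICE_MESSAGE = 0.003
PRICE_AI_CALL = 0.0008
PRICE_INPUT_PER_100_TOKENS = 0.0003
PRICE_OUTPUT_PER_100_TOKENS = 0.0015


def estimate_cost(messages, channels=1, enrichments=0,
                  input_tokens_per_call=200, output_tokens_per_call=40):
    """Rough USD estimate for one run.

    Assumption: each enabled enrichment flag costs one AI call per message.
    The token defaults are illustrative placeholders only.
    """
    scraping = (PRICE_RUN_START
                + channels * PRICE_CHANNEL_INFO
                + messages * PRICE_MESSAGE)
    per_call = (PRICE_AI_CALL
                + input_tokens_per_call / 100 * PRICE_INPUT_PER_100_TOKENS
                + output_tokens_per_call / 100 * PRICE_OUTPUT_PER_100_TOKENS)
    return round(scraping + messages * enrichments * per_call, 4)


print(estimate_cost(1000))                 # scraping only
print(estimate_cost(1000, enrichments=1))  # one enrichment flag enabled
```

With these placeholder token counts, each enrichment flag adds about $2 per 1,000 messages, which is in line with the steps in the typical-cost table. Longer posts or longer model outputs will shift the figure.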
Input parameters
| Field | Type | Default | Description |
|---|---|---|---|
| channels | string[] | required | Channel usernames or t.me/... URLs |
| maxMessagesPerChannel | int | 200 | Maximum messages fetched per channel; capped at 5,000 |
| outputFormat | enum | json | json / markdown / both |
| includeChannelInfo | bool | true | Emit one channel metadata record per channel |
| incrementalMode | bool | false | Resume from KV-store cursor |
| enrichSentiment | bool | false | {label, score, rationale} |
| enrichTopics | bool | false | 1–5 tags from controlled vocabulary |
| enrichSummary | bool | false | One-sentence summary |
| enrichTranslation | bool | false | Translate to translationTarget |
| translationTarget | string | en | Language code (en, es, de, ru, zh, …) |
| enrichEntities | bool | false | {companies, people, places, tokens} |
| enrichImageDescriptions | bool | false | 1–2 sentence vision description for each image (up to 4/post) |
Output schema
Each dataset record has a recordType field: channel_info or message for scraped data, or error when a channel cannot be scraped (see What channels work).
channel_info record
{"recordType": "channel_info","channel": "durov","title": "Pavel Durov","username": "durov","description": "Founder of Telegram.","subscribers": 11600000,"photos": 96,"videos": 40,"links": 183,"verified": true,"avatarUrl": "https://cdn4.telesco.pe/...","url": "https://t.me/durov"}
message record
{"recordType": "message","channel": "durov","messageId": 489,"url": "https://t.me/durov/489","text": "Group admins can assign tags to users …","textHtml": "<div ...>...</div>","postedAt": "2026-04-15T01:20:29+00:00","views": 2580000,"reactions": [{"emojiId": "5265077361648368841", "count": 23300, "isPaid": false},{"count": 3870, "isPaid": true}],"reactionsTotal": 42280,"isForwarded": false,"isEdited": false,"author": "Pavel Durov","mediaType": "video","mediaUrl": "https://cdn4.telesco.pe/...mp4","videoDuration": "0:09","imageUrls": [],"linkPreview": null,"extractedLinks": [],"extractedEmails": [],"sentiment": {"label": "positive", "score": 0.95, "rationale": "..."},"topics": ["tech", "business"],"summary": "Telegram group admins can now assign user tags…","translation": "Los administradores de grupo pueden…","entities": {"companies": [], "people": [], "places": [], "tokens": []},"imageDescriptions": [{"url": "https://cdn4.telesco.pe/file/…","description": "A smartphone screen displays an audio selection menu …","contains_text": true}]}
AI enrichment fields are only present when their corresponding flag is enabled.
If a post fails moderation, all AI fields are skipped and a moderation field is set:
{"moderation": {"flagged": true, "categories": ["harassment", "violence"]}}
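Because enrichment fields are optional and moderated posts carry a moderation field instead, downstream code should not assume that sentiment is always present. The sketch below shows one way to bucket dataset items and count sentiment labels. The function names and the sample items are made up for this example and only mimic the documented record shape.

```python
from collections import Counter, defaultdict


def group_by_record_type(records):
    """Bucket dataset items by their recordType field."""
    groups = defaultdict(list)
    for record in records:
        groups[record.get("recordType", "unknown")].append(record)
    return dict(groups)


def sentiment_breakdown(messages):
    """Count sentiment labels, keeping moderated and un-enriched posts separate."""
    counts = Counter()
    for message in messages:
        if message.get("moderation", {}).get("flagged"):
            counts["moderated"] += 1       # AI fields were skipped for this post
        elif "sentiment" in message:
            counts[message["sentiment"]["label"]] += 1
        else:
            counts["not_enriched"] += 1    # enrichSentiment was off for this run
    return dict(counts)


# Hand-written sample items (hypothetical data in the documented shape).
sample = [
    {"recordType": "channel_info", "channel": "durov", "subscribers": 11600000},
    {"recordType": "message", "channel": "durov", "messageId": 489,
     "sentiment": {"label": "positive", "score": 0.95, "rationale": "..."}},
    {"recordType": "message", "channel": "durov", "messageId": 490,
     "moderation": {"flagged": True, "categories": ["harassment"]}},
    {"recordType": "message", "channel": "durov", "messageId": 491},
]

groups = group_by_record_type(sample)
print(sentiment_breakdown(groups["message"]))
```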
Hard safety caps
These prevent any accidental runaway cost:
- 5,000 messages per channel per run
- 50,000 total messages per run
- 10,000 total AI enrichments per run (safety cap; typical paid runs are well below)
- 2,000-character input clamp before each Gemini call
- 500-token output cap per Gemini call
- 4 images max per post when image descriptions are enabled
- 6 MB max image size for vision processing
- 60 Gemini calls per minute (self-rate-limited)
- 30-minute run timeout
- Posts flagged by OpenAI content moderation skip all AI calls automatically
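The caps interact, so it can help to check a planned input against them before starting a run. The sketch below is plain arithmetic on the numbers listed above. It assumes one Gemini call per message per enabled flag and that the 60 calls per minute limit applies to the run as a whole. How the actor behaves when a cap is reached (truncating or stopping) is not stated here, so treat the result as an upper bound rather than a prediction.

```python
import math

# Limits copied from the list of hard safety caps.
MAX_MESSAGES_PER_CHANNEL = 5_000
MAX_MESSAGES_PER_RUN = 50_000
MAX_AI_CALLS_PER_RUN = 10_000
GEMINI_CALLS_PER_MINUTE = 60
RUN_TIMEOUT_MINUTES = 30


def plan_run(channels, max_messages_per_channel, enrichment_flags):
    """Upper-bound a run's message count, AI calls, and Gemini time.

    Assumption: every enabled flag costs one Gemini call per message.
    """
    per_channel = min(max_messages_per_channel, MAX_MESSAGES_PER_CHANNEL)
    messages = min(channels * per_channel, MAX_MESSAGES_PER_RUN)
    ai_calls = min(messages * enrichment_flags, MAX_AI_CALLS_PER_RUN)
    gemini_minutes = math.ceil(ai_calls / GEMINI_CALLS_PER_MINUTE)
    return {
        "messages": messages,
        "ai_calls": ai_calls,
        "gemini_minutes": gemini_minutes,
        "fits_timeout": gemini_minutes <= RUN_TIMEOUT_MINUTES,
    }


print(plan_run(channels=1, max_messages_per_channel=200, enrichment_flags=1))
```

Under these assumptions, 30 minutes at 60 calls per minute allows at most 1,800 Gemini calls, so heavily enriched runs may be better split across several smaller runs or schedules.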
Use it from AI agents (MCP)
This actor is exposed as a tool via the Apify MCP server — Claude Desktop, Cursor, ChatGPT, n8n, and LangChain can all call it with natural language. The input schema is written so an LLM can pick the right flags without needing additional prompting.
Claude Desktop
Add this to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):
{"mcpServers": {"apify": {"url": "https://mcp.apify.com?tools=ml_boost/tg-apify-actor","headers": {"Authorization": "Bearer <YOUR_APIFY_TOKEN>"}}}}
Restart Claude Desktop and try: "Scrape the last 100 posts from @durov and give me sentiment and topics for each."
Cursor
In Cursor settings → MCP, add a new server:
- URL: https://mcp.apify.com?tools=ml_boost/tg-apify-actor
- Headers: Authorization: Bearer <YOUR_APIFY_TOKEN>
Local stdio (npx) — alternative
If you prefer a local wrapper instead of the hosted HTTP endpoint:
{"mcpServers": {"apify": {"command": "npx","args": ["-y", "@apify/actors-mcp-server", "--tools", "ml_boost/tg-apify-actor"],"env": { "APIFY_TOKEN": "<YOUR_APIFY_TOKEN>" }}}}
The hosted endpoint is recommended — it supports output-schema inference, so the agent sees the structured response shape, not just raw JSON.
n8n / LangChain / Zapier
Use the Apify integration node and pick this actor. The input schema renders as a form with sensible defaults; pass channels and toggle enrichments as needed.
ChatGPT Custom GPT
Create a Custom GPT with an Action pointing at https://api.apify.com/v2/acts/ml_boost~tg-apify-actor/run-sync-get-dataset-items?token=<YOUR_APIFY_TOKEN> (replace the placeholder with your Apify API token). ChatGPT will read the input schema automatically.
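The same run-sync-get-dataset-items endpoint can also be called from a script. The sketch below uses only the Python standard library and sends the token in an Authorization header instead of the query string. The function names are made up for this example, and the 300-second client timeout is an assumption: synchronous run endpoints are intended for short runs, so larger scrapes may need Apify's asynchronous run API instead.

```python
import json
import urllib.request

ENDPOINT = (
    "https://api.apify.com/v2/acts/ml_boost~tg-apify-actor"
    "/run-sync-get-dataset-items"
)


def build_request(token, run_input):
    """Build the POST request; the actor input is sent as the JSON body."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(token, run_input, timeout=300):
    """Start a run, wait for it to finish, and return the dataset items."""
    request = build_request(token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example (performs a real, billable run, so it is left commented out):
# items = run_actor("<YOUR_APIFY_TOKEN>",
#                   {"channels": ["durov"], "maxMessagesPerChannel": 200})
```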
Why this actor is good for AI agents
- Structured output: Every record has stable, documented field names — no string parsing required.
- Opt-in enrichment: Agents enable only the AI fields they need, keeping cost low for simple queries.
- Idempotent incremental mode: Run the same query once per hour and get only the new posts each time, with no duplicates.
- Built-in moderation: The actor refuses to enrich content that fails OpenAI's policy check, so agents don't accidentally launder abusive content through downstream AI calls.
Scheduling
Combine incrementalMode: true with an Apify Schedule (e.g. every 6 hours) to build a recurring monitor for any channel. The actor stores per-channel cursors in its Key-Value store; only new posts since the last run are fetched and charged.
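The cursor idea can be illustrated in a few lines. This is a conceptual sketch, not the actor's implementation: the function name and the in-memory cursor dictionary are hypothetical, and it assumes message IDs only increase within a channel.

```python
def select_new_messages(messages, cursors):
    """Return the messages newer than each channel's cursor, plus updated cursors.

    cursors maps a channel name to the highest messageId seen so far.
    Assumption: message IDs only increase within a channel.
    """
    updated = dict(cursors)
    fresh = []
    for message in messages:
        channel, message_id = message["channel"], message["messageId"]
        if message_id > cursors.get(channel, 0):
            fresh.append(message)
            updated[channel] = max(updated.get(channel, 0), message_id)
    return fresh, updated


# Hypothetical batch: message 488 was seen on an earlier run, 489 is new.
batch = [
    {"channel": "durov", "messageId": 488},
    {"channel": "durov", "messageId": 489},
]
fresh, cursors = select_new_messages(batch, {"durov": 488})
print([m["messageId"] for m in fresh], cursors)
```

Feeding the same batch through again with the updated cursors returns nothing, which is the property that keeps scheduled runs free of duplicates.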
What channels work
Any public Telegram channel with a t.me preview page — that's most of them. The actor will return a clear error record (recordType: "error") for:
- Private channels (no preview available)
- Telegram users or bots (not channels)
- Channels that don't exist
- Groups (out of scope for v1 — use Telegram's Bot API or MTProto for groups)
Limits and known scope
This is v1. Features explicitly deferred to later releases:
- Dedicated MCP server layer, coming next (for now the actor is reachable through the Apify MCP server described above)
- Comments / discussion replies under posts
- Group chats (channels only for now)
- Cross-channel forward deduplication (forwardChainId)
- Member list extraction (not exposed on t.me previews)
- Spam / bot scoring
Support
Leave a review or contact the maintainer via the Apify Store actor page. Reviews and bug reports are read and responded to within 24h.