Telegram Channel Content & Media Scraper

Pricing

$10.00 / 1,000 posts

Extract Telegram posts with media — text, views, reactions, media links, channel name, bio, subscribers, post metadata. Ideal for creators, analysts, data engineers, automation. Updated May 3, 2026 — forwards, hashtags, mentions, channel info added. See README.

Rating: 5.0 (7)

Developer: Olga (Maintained by Community)

Actor stats: 21 bookmarked · 644 total users · 76 monthly active users · 7.1 hours issues response · last modified 16 days ago

📦 Telegram Channel Content & Media Scraper

Extract channel metadata, recent posts, views, reactions, forwards, hashtags, mentions, replies, and all media with direct download links. This actor scrapes structured content from public Telegram channels using the Telegram API via a pre-authorized session string — no setup required from you.


✅ What This Actor Does

  • Connects to Telegram automatically (no credentials needed from your side)
  • Parses public channels (by username or link)
  • Retrieves for each channel:
    • Channel metadata: display title, description (about), subscriber count
    • Posts: text, date, post URL, views, reactions (with emoji breakdown + total)
    • Forwards: detects native Telegram forwards and exposes the source (isForward, forwardFromType, forwardFromId, forwardFromName, forwardChannelPost, forwardPostAuthor, forwardDate)
    • Hashtags & mentions: extracted from post text (Unicode-aware — supports Cyrillic, e.g. #вакансия, #Менеджер_по_продажам)
    • Edit metadata: editDate if the post was edited
    • Reply chain: replyToMsgId and replyToPostUrl for posts that reply to another post
    • Links: all URLs found in the post text
    • Media: photos, videos, documents — with direct downloadable URLs (.jpg, .mp4, .pdf, etc.) and mediaTypes
  • Handles:
    • Text-only and media-only messages
    • Telegram FloodWait limits gracefully (with explanatory record in the dataset)
    • Cross-run conflicts via a per-session queue (see "Queue behavior" below)
  • Skips:
    • Empty service messages
    • Posts older than daysRange
  • Starts from the most recent post and works backwards
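
The Unicode-aware hashtag and mention extraction described above can be approximated as follows. This is only a sketch: the regular expressions and function names are assumptions for illustration, not the actor's actual implementation. It relies on the fact that `\w` in Python 3 matches letters from any script, so Cyrillic tags are captured.

```python
import re

# Assumed patterns (not taken from the actor's source).
# (?<!\w) avoids matching inside words, e.g. the "@b" in "a@b.com".
HASHTAG_RE = re.compile(r"(?<!\w)#\w+")
MENTION_RE = re.compile(r"(?<!\w)@\w+")


def extract_hashtags(text: str) -> list[str]:
    """Return hashtags in order of appearance, including the leading '#'."""
    return HASHTAG_RE.findall(text)


def extract_mentions(text: str) -> list[str]:
    """Return @username mentions in order of appearance."""
    return MENTION_RE.findall(text)
```

The actor's own rules for edge cases (punctuation inside tags, very long usernames) may differ from this sketch.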

🧾 Input

You don't need to enter any credentials — they're already preconfigured.

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| channels | string | | | Comma-separated list of channel usernames or links (e.g. @channel1, @channel2) |
| maxPosts | number | | 10 | Max number of posts per channel (1–200) |
| daysRange | number | | 3 | Number of days to look back from today (1–30) |
| includeText | boolean | | true | Whether to include post text in the output |
| mediaOnly | boolean | | false | If true, skip posts that have no media |
| downloadMedia | boolean | | false | If true, download media files into the run's Key-Value Store |

📌 Example input

{
  "channels": "@channel1, @channel2, @channel3",
  "maxPosts": 10,
  "daysRange": 3,
  "includeText": true,
  "mediaOnly": false,
  "downloadMedia": true
}

⚠️ Up to 10 channels per run. To avoid Telegram contacts.ResolveUsername FloodWait limits, the actor processes a maximum of 10 channels per run. If you need to scrape more, split into multiple runs.
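
If you have more than 10 channels, one way to prepare the inputs is to split the list into batches and build one input object per run. The helper below is a hypothetical client-side sketch (the function name and its parameters are not part of the actor). It only uses the field names and ranges documented in the input table.

```python
from typing import Iterable

MAX_CHANNELS_PER_RUN = 10  # per-run limit stated in this README


def build_run_inputs(channels: Iterable[str], max_posts: int = 10,
                     days_range: int = 3, **options) -> list[dict]:
    """Split channels into batches of at most 10 and build one input per run."""
    if not 1 <= max_posts <= 200:
        raise ValueError("maxPosts must be between 1 and 200")
    if not 1 <= days_range <= 30:
        raise ValueError("daysRange must be between 1 and 30")
    names = [c.strip() for c in channels if c.strip()]
    inputs = []
    for start in range(0, len(names), MAX_CHANNELS_PER_RUN):
        batch = names[start:start + MAX_CHANNELS_PER_RUN]
        inputs.append({
            "channels": ", ".join(batch),  # the actor expects a comma-separated string
            "maxPosts": max_posts,
            "daysRange": days_range,
            **options,  # e.g. includeText, mediaOnly, downloadMedia
        })
    return inputs
```

Each returned dictionary can be used as the input of a separate run. Running the batches one after another rather than in parallel is probably the safer choice, given the shared session.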


▶️ How to Run

  1. Open the actor in Apify Console
  2. Fill in the input fields
  3. Click Save, then Run
  4. Results are available in:
    • Dataset — all posts and channel metadata as JSON records
    • Key-Value Store — downloaded media files (when downloadMedia: true)

📤 Output

Each post is saved as a separate record in the default dataset (JSON format).

{
  "channel": "@channel_name",
  "channelLink": "https://t.me/channel_name",
  "channelTitle": "Channel Display Name",
  "channelDescription": "Channel about / bio text",
  "postId": 101,
  "postUrl": "https://t.me/channel_name/101",
  "date": "2026-05-03T17:38:00.000Z",
  "views": 3456,
  "subscribers": 12345,
  "text": "Example post content with #hashtag and @mention",
  "reactions": [
    { "emoji": "👍", "count": 42 },
    { "emoji": "❤️", "count": 18 }
  ],
  "totalReactions": 60,
  "links": ["https://example.com"],
  "hasMedia": true,
  "mediaUrls": [
    "https://api.apify.com/v2/key-value-stores/<storeId>/records/channel_101.jpg?disableRedirect=1"
  ],
  "mediaTypes": ["jpg"],
  "isForward": false,
  "forwardFromType": "",
  "forwardFromId": "",
  "forwardFromName": "",
  "forwardChannelPost": 0,
  "forwardPostAuthor": "",
  "forwardDate": "",
  "hashtags": ["#hashtag"],
  "mentions": ["@mention"],
  "editDate": "",
  "replyToMsgId": 0,
  "replyToPostUrl": ""
}
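
When saving media locally, it can help to derive a file name from each entry in mediaUrls. The sketch below assumes the URL shape shown in the example record, where the last path segment is the Key-Value Store record key. The function name is hypothetical.

```python
from urllib.parse import unquote, urlsplit


def media_record_key(media_url: str) -> str:
    """Return the last path segment of a media URL, e.g. 'channel_101.jpg'."""
    path = urlsplit(media_url).path  # drops the ?disableRedirect=1 query part
    return unquote(path.rsplit("/", 1)[-1])
```

For the example record, the result would be `channel_101.jpg`, whose extension matches the corresponding mediaTypes entry.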

For forwarded posts, isForward: true and the forward* fields are populated:

{
  "isForward": true,
  "forwardFromType": "channel",
  "forwardFromId": "1009232144",
  "forwardFromName": "",
  "forwardChannelPost": 181471,
  "forwardPostAuthor": "",
  "forwardDate": "2026-04-27T15:06:48.000Z"
}
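
As an illustration of how the forward fields might be used, the sketch below counts forwarded posts per source. It assumes `records` is a list of dataset items already loaded as Python dictionaries. The function name is hypothetical.

```python
from collections import Counter


def count_forward_sources(records: list[dict]) -> Counter:
    """Count forwarded posts per (forwardFromType, forwardFromId) pair."""
    counts: Counter = Counter()
    for record in records:
        if record.get("isForward"):
            key = (record.get("forwardFromType", ""), record.get("forwardFromId", ""))
            counts[key] += 1
    return counts
```

Calling `count_forward_sources(records).most_common(5)` would then list the five sources that appear most often among the scraped posts.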

⏳ Queue Behavior (Important for parallel users)

The actor uses a shared Telegram session preconfigured by the author. Telegram allows only one active connection per session at any time — if multiple users run the actor simultaneously, the second connection would get AUTH_KEY_DUPLICATED.

To prevent this, the actor uses a cross-run lock: parallel runs are automatically queued and processed one at a time.

What this means in practice:

  • If no one else is running the actor → starts immediately
  • If another user is currently scraping → your run waits in the queue and you'll see logs like:
    Another run is using this Telegram session. Waiting in queue... ttl=120s
  • A typical wait is a few minutes (depends on what the previous run is doing)
  • Maximum queue wait is 15 minutes, after which the run fails with a clear message

If your run waits too long:

  • Just retry later — the queue clears as soon as the busy run finishes
  • This is a hard limit of how Telegram works with shared sessions, not a bug
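
If you start runs from your own code, the "retry later" advice can be automated with a small wrapper. This is a client-side sketch under stated assumptions: `start_run` is a hypothetical callable that you supply, which starts one run, waits for it to finish, and returns its final status string (Apify reports finished runs with statuses such as "SUCCEEDED" or "FAILED"). The default delay of 5 minutes is an arbitrary choice.

```python
import time
from typing import Callable


def run_with_retry(start_run: Callable[[], str], max_attempts: int = 3,
                   delay_seconds: float = 300.0,
                   sleep: Callable[[float], None] = time.sleep) -> str:
    """Call start_run until it returns "SUCCEEDED" or attempts run out."""
    status = ""
    for attempt in range(1, max_attempts + 1):
        status = start_run()
        if status == "SUCCEEDED":
            break
        if attempt < max_attempts:
            sleep(delay_seconds)  # give the busy run time to release the session
    return status
```

Note that this retries on any non-successful status, not only on queue timeouts. Check the run log before retrying repeatedly, since a rate limit needs a much longer wait.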

🛡️ Telegram Rate Limit Handling

If Telegram returns a FloodWait on contacts.ResolveUsername (caused by too many username lookups in a short period), the actor:

  1. Stops processing further channels in this run (to protect the session)
  2. Adds a telegram_rate_limit record to the dataset with the wait duration and a human-readable explanation

Wait the requested time (usually 10–11 hours) before retrying with the same set of channels, or use fewer channels per run.
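
Because a dataset may then contain a record that is not a post, downstream code may want to separate the two kinds before processing. The exact shape of the telegram_rate_limit record is not documented here, so the sketch below does not guess its field names. Instead it assumes (and this is only an assumption) that such a record has no postId, while every post record does.

```python
def split_records(records: list[dict]) -> tuple[list[dict], list[dict]]:
    """Separate post records from any other records in the dataset.

    Assumption: only post records carry a "postId" key.
    """
    posts = [r for r in records if "postId" in r]
    others = [r for r in records if "postId" not in r]
    return posts, others
```

If `others` is non-empty after a run, inspect those records for the wait duration and explanation before scheduling the next run.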

🛠 Use Cases

  • Competitive content monitoring
  • Telegram market research
  • Dataset building for AI/ML models (with hashtags + mentions for labels)
  • Public media archiving
  • Social listening and trend analysis
  • News aggregation and curation
  • Forward chain analysis (track which posts get republished)

🆕 Updates (2026-05-03)

This release adds significant new fields and improvements:

✅ Channel metadata - every post now includes channelTitle and channelDescription (the channel's About text)
✅ Forwards detection - full set of forward fields (isForward, forwardFromType, forwardFromId, forwardFromName, forwardChannelPost, forwardPostAuthor, forwardDate)
✅ Hashtags extraction - hashtags: [] field with Unicode support (works with Cyrillic and other scripts: #работа, #Менеджер_по_продажам)
✅ Mentions extraction - mentions: [] field with all @username references in post text
✅ Edit date - editDate shows when a post was last edited (empty if never edited)
✅ Reply chain - replyToMsgId and replyToPostUrl for posts that reply to another post
✅ Improved cross-run queue - proper lock now works for all users (previously only worked for the actor owner; rented users got conflicts)
✅ Better failure visibility - runs no longer silently succeed with empty output when an internal error occurs; failures are now properly reported with clear messages
✅ Hang prevention - added 15-second timeout on Telegram authentication, preventing rare cases where the actor would hang for many minutes if the session was invalidated
✅ Faster disconnect on errors - disabled autoReconnect, so failed runs end quickly instead of retrying in the background


📜 Changelog

2026-05-03

✅ Added channel metadata (channelTitle, channelDescription)
✅ Added forwards detection (7 new forward* fields)
✅ Added hashtags + mentions extraction
✅ Added editDate and reply chain fields (replyToMsgId, replyToPostUrl)
✅ Cross-run lock now uses author-scoped Apify token, so the queue works for all users (not only the actor owner)
✅ Replaced silent exit handling with explicit Actor.fail() so errors are visible in run status
✅ Added 15-second timeout on Telegram getMe() to prevent multi-minute hangs on dead sessions
✅ Disabled autoReconnect to fail fast and surface real errors

2025-12-05

✅ Added explicit handling and logging for Telegram contacts.ResolveUsername FloodWait errors
✅ When Telegram returns a FloodWait, the actor now stops processing further channels in that run and adds a telegram_rate_limit record to the dataset with a human-readable explanation
✅ Limited the number of channels processed per run to 10 to reduce the risk of hitting global FloodWait on the shared Telegram session
✅ Updated input description to recommend no more than 10 channels per run and documented this behavior in the README

2025-10-10

✅ Fixed AUTH_KEY_DUPLICATED by adding a per-session lock (prevents concurrent reuse of the same Telegram auth key)
✅ Improved error messages for session conflicts
✅ Internal stability tweaks for Telegram requests and media downloads


❓ Troubleshooting

Empty results / 0 items

  • Increase daysRange to 7–14, since the channel may not have posts in the default 3-day window
  • Increase maxPosts to 20–50
  • Make sure the channel is public and the username is correct

Run waits 5+ minutes in queue

  • Another user is currently using the actor; this is normal under load
  • Retry in a few minutes when the queue clears

telegram_rate_limit record in dataset

  • Telegram applied a FloodWait — wait the indicated number of hours before retrying
  • For future runs, use fewer channels per run

Run failed with AUTH_KEY_DUPLICATED

  • Should not happen with the built-in queue, but if it does — wait 10 minutes and retry

⚠️ Telegram Post Grouping Behavior

Telegram's UI sometimes shows a single post as a combination of text + media, but via the Telegram API these are delivered as two separate messages.

You may see two records in the dataset with:

  • The same or very close date timestamps
  • Consecutive postId values
  • One record containing only text, the other containing only media

This is expected behavior. The actor captures exactly what the Telegram API provides without merging.

To reconstruct grouped posts visually, merge adjacent records with close timestamps and consecutive postId.
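
One possible way to do this merge is sketched below. The 2-second threshold, the helper names, and the extra mergedPostIds field are all assumptions made for illustration. The actor itself does not merge records, and you may need a different threshold for your channels.

```python
from datetime import datetime


def _parse(ts: str) -> datetime:
    # Dates in the output look like "2026-05-03T17:38:00.000Z".
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def _is_pair(a: dict, b: dict, max_gap_seconds: float) -> bool:
    """True if a and b look like the two halves of one grouped post."""
    if a.get("channel") != b.get("channel") or b["postId"] - a["postId"] != 1:
        return False
    gap = abs((_parse(b["date"]) - _parse(a["date"])).total_seconds())
    if gap > max_gap_seconds:
        return False
    a_text_only = bool(a.get("text")) and not a.get("hasMedia")
    a_media_only = bool(a.get("hasMedia")) and not a.get("text")
    b_text_only = bool(b.get("text")) and not b.get("hasMedia")
    b_media_only = bool(b.get("hasMedia")) and not b.get("text")
    return (a_text_only and b_media_only) or (a_media_only and b_text_only)


def _combine(a: dict, b: dict) -> dict:
    """Keep the text record's fields and attach the media record's media."""
    text_rec, media_rec = (a, b) if a.get("text") else (b, a)
    combined = dict(text_rec)
    combined["hasMedia"] = True
    combined["mediaUrls"] = list(media_rec.get("mediaUrls", []))
    combined["mediaTypes"] = list(media_rec.get("mediaTypes", []))
    combined["mergedPostIds"] = [a["postId"], b["postId"]]  # hypothetical field
    return combined


def merge_grouped(records: list[dict], max_gap_seconds: float = 2.0) -> list[dict]:
    """Merge a text-only record with an adjacent media-only record."""
    ordered = sorted(records, key=lambda r: (r.get("channel", ""), r["postId"]))
    merged: list[dict] = []
    for record in ordered:
        if merged and _is_pair(merged[-1], record, max_gap_seconds):
            merged[-1] = _combine(merged[-1], record)
        else:
            merged.append(dict(record))
    return merged
```

A merged record already has both text and media, so it is never merged a second time. Views and reactions are taken from the text record only, which may or may not suit your analysis.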


🔐 License & Usage

This actor is a proprietary paid product.

  • ❗ It is not open source
  • 📦 It is licensed for private and commercial use by subscribers only
  • 🛑 Any unauthorized copying, reselling, or redistribution is strictly prohibited
  • 🧑‍💻 All activity is logged for abuse prevention and fair use enforcement

To get access or report a bug, please contact the creator.


Made with 💙 for Telegram research & automation.