Yandex Zen Scraper
Pricing
from $20.00 / 1,000 results
Yandex Zen (Dzen) scraper: extract public stats from articles and videos in bulk (views, likes, comments, author info, subscribers). Paste a list of URLs.
Developer
Sasha Ebashu
Maintained by Community
Actor stats
- Bookmarked: 0
- Total users: 3
- Monthly active users: 1
- Last modified: 6 days ago
What does Yandex Zen Scraper do?
Yandex Zen Scraper extracts public stats from Yandex Zen / Dzen publications without an API key or login. Paste a list of links (articles or videos) and it returns clean, structured data for every one — built for bulk runs.
Yandex Zen Scraper can extract:
- View counts
- Likes
- Comment counts
- Engagement rate (calculated)
- Author / channel name, ID, profile URL and subscriber count
- Title, category and publish date
It works for articles (/a/…), posts (/b/…), videos (/video/watch/…) and shorts (/shorts/…).
How does it work?
Unlike many sites, Dzen gates its publication pages behind Yandex sign-in when they are fetched with plain HTTP requests. Yandex Zen Scraper therefore uses a real headless browser (Playwright) to render each page anonymously, exactly as a normal reader would see it, and reads the stats from the rendered page. No Yandex account is needed.
Because Yandex aggressively blocks datacenter IPs with SmartCaptcha, the scraper uses Apify Residential Proxy (country RU) by default.
How to use it
- Click Start
- Paste your Dzen links into the Yandex Zen URLs field — one per line, or a whole blob separated by spaces/commas/new lines (they are split automatically). You can also upload a file or link a Google Sheet via the second field.
- Click Run
- When the run finishes, preview or download your data from the Output tab (JSON, CSV, Excel)
Supported URL formats (query strings ?… and anchors #… are ignored):
- https://dzen.ru/a/<id>: articles
- https://dzen.ru/b/<id>: posts (short text publications)
- https://dzen.ru/video/watch/<id>: videos
- https://dzen.ru/shorts/<id>: shorts
Duplicate URLs are removed automatically, so bulk lists with repeats are fine.
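The input handling described above (splitting a pasted blob, ignoring query strings and anchors, removing duplicates) can be approximated offline before a run. The following is a minimal sketch, not the actor's actual code: the function names `split_blob`, `normalize` and `prepare` are hypothetical, and it assumes that only the `dzen.ru` host and the four path shapes listed above are accepted.

```python
import re
from urllib.parse import urlsplit

# Path shapes taken from the supported URL formats listed above.
_PATTERNS = [
    ("article", re.compile(r"^/a/([^/]+)/?$")),
    ("post", re.compile(r"^/b/([^/]+)/?$")),
    ("video", re.compile(r"^/video/watch/([^/]+)/?$")),
    ("short", re.compile(r"^/shorts/([^/]+)/?$")),
]


def split_blob(blob: str) -> list[str]:
    """Split pasted text on whitespace and commas, dropping empty pieces."""
    return [part for part in re.split(r"[\s,]+", blob) if part]


def normalize(url: str):
    """Return (content_type, content_id, canonical_url), or None if unrecognised."""
    parts = urlsplit(url.strip())
    # Assumption: only http(s) links on the dzen.ru host count as valid.
    if parts.scheme not in ("http", "https") or parts.hostname != "dzen.ru":
        return None
    for content_type, pattern in _PATTERNS:
        match = pattern.match(parts.path)
        if match:
            # Rebuilding from the path alone drops the query string and anchor.
            canonical = f"https://dzen.ru{parts.path.rstrip('/')}"
            return content_type, match.group(1), canonical
    return None


def prepare(blob: str):
    """Split, normalise and de-duplicate; returns (valid_items, invalid_inputs)."""
    seen = set()
    valid, invalid = [], []
    for raw in split_blob(blob):
        item = normalize(raw)
        if item is None:
            invalid.append(raw)
        elif item[2] not in seen:
            seen.add(item[2])
            valid.append(item)
    return valid, invalid
```

Running `prepare` on a list before pasting it gives a rough preview of how many unique publications a run would process and which entries would likely come back as `invalid_url`.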
Input parameters
| Parameter | Type | Description | Default |
|---|---|---|---|
| urls | array | Dzen URLs, bulk paste (one per line or a separated blob) | — |
| startUrls | array | Alternative bulk input: list / file / Google Sheet | — |
| maxItems | integer | Max publications per run (1–5000) | 1000 |
| maxConcurrency | integer | Parallel browser pages (keep 1–3) | 2 |
| proxyConfiguration | object | Proxy settings | Residential RU |
Proxy note. Yandex blocks datacenter IPs quickly, so Residential proxy (RU) is enabled by default. You can turn it off, but expect SmartCaptcha on anything beyond a few requests.
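For runs started programmatically, the parameters in the table above can be assembled into a run input object. The sketch below is illustrative only: `build_input` is a hypothetical helper, and the `proxyConfiguration` keys follow the standard Apify proxy input fields, which is an assumption about this actor's schema rather than something stated in this README.

```python
def build_input(urls, max_items=1000, max_concurrency=2, residential_ru=True):
    """Assemble a run input dict using the parameter names from the table above."""
    if not 1 <= max_items <= 5000:
        raise ValueError("maxItems must be between 1 and 5000")
    if max_concurrency < 1:
        raise ValueError("maxConcurrency must be at least 1")

    run_input = {
        "urls": list(urls),
        "maxItems": max_items,
        # The table recommends keeping this between 1 and 3.
        "maxConcurrency": max_concurrency,
    }
    if residential_ru:
        # Assumption: standard Apify proxy fields; the actor's exact schema may differ.
        run_input["proxyConfiguration"] = {
            "useApifyProxy": True,
            "apifyProxyGroups": ["RESIDENTIAL"],
            "apifyProxyCountry": "RU",
        }
    else:
        # Expect SmartCaptcha beyond a few requests when the proxy is off.
        run_input["proxyConfiguration"] = {"useApifyProxy": False}
    return run_input
```

The defaults mirror the table (1000 items, concurrency 2, Residential RU), so `build_input(["https://dzen.ru/a/<id>"])` with real IDs substituted should correspond to a default run.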
Results
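One JSON object is returned per unique input URL. Items that are unavailable or failed are written to a separate ERRORS dataset rather than the main output (see Special cases & caveats).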
Example output:
[
  {
    "inputUrl": "https://dzen.ru/a/afRiHXixAFfUTmXO",
    "platform": "dzen",
    "contentId": "afRiHXixAFfUTmXO",
    "contentType": "article",
    "contentUrl": "https://dzen.ru/a/afRiHXixAFfUTmXO",
    "title": "Овощи, которые стоит сажать рядом друг с другом 🍅🌿",
    "author": "Садовый клуб. Вероника Поливкина",
    "authorId": "5ddebbefc429860a2f8e4c3c",
    "authorUrl": "https://dzen.ru/id/5ddebbefc429860a2f8e4c3c",
    "authorSubscribers": 86128,
    "views": 16158,
    "likes": 263,
    "dislikes": null,
    "comments": 3,
    "reposts": null,
    "favorites": null,
    "engagementRate": 1.65,
    "category": "Садоводство",
    "publishedAt": "2026-05-06T05:30:37.272Z",
    "scrapedAt": "2026-06-14T12:08:19.363Z",
    "status": "success",
    "error": null
  }
]
Field description
| Field | Description |
|---|---|
| inputUrl | Original Dzen URL provided as input |
| platform | Always "dzen" |
| contentId | Dzen publication id |
| contentType | article, post, video or short |
| contentUrl | Canonical publication URL |
| title | Publication title |
| author | Channel / author name |
| authorId | Channel id |
| authorUrl | Link to the channel |
| authorSubscribers | Channel subscriber count (see caveat below) |
| views | View count |
| likes | Like count |
| dislikes | Not exposed publicly by Dzen → always null |
| comments | Comment count |
| reposts | Not exposed publicly by Dzen → always null |
| favorites | Not exposed publicly by Dzen → always null |
| engagementRate | (likes + comments) / views × 100 |
| category | Category / tag |
| publishedAt | Publish timestamp |
| scrapedAt | Timestamp of when the data was collected |
| status | success, unavailable or failed |
| error | Reason string when not successful, otherwise null |
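The `engagementRate` formula in the table can be checked against the example output. This is a small sketch under two assumptions not stated in the README: the value is rounded to two decimals (which matches the sample), and a publication with zero or missing views yields `None` instead of a division error.

```python
def engagement_rate(views, likes, comments):
    """(likes + comments) / views × 100, rounded to two decimals."""
    if not views:
        # Assumption: no views (0 or None) means the rate is undefined.
        return None
    # Assumption: missing like/comment counts are treated as zero.
    return round(((likes or 0) + (comments or 0)) / views * 100, 2)


# Values from the example output: 263 likes, 3 comments, 16158 views.
print(engagement_rate(16158, 263, 3))  # 1.65
```

Here (263 + 3) / 16158 × 100 ≈ 1.646, which rounds to the 1.65 shown in the example.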
Which metrics are available?
| ✅ Available | 🚫 Not available |
|---|---|
| views, likes, comments, title, author, authorId, authorUrl, category, publishedAt | dislikes, reposts, favorites — Dzen does not expose these publicly (always null) |
Special cases & caveats
- Deleted / private / unavailable publication → status: "unavailable", error: "likely_deleted".
- Unrecognised URL → status: "failed", error: "invalid_url".
- Blocked by Yandex (captcha/SSO) → the request is retried with a fresh proxy session; if it keeps failing the item is marked failed.
- Subscriber count is accurate for both articles and videos. (For video/short pages the count is fetched from the public channel feed API, because the player embeds recommended channels on those pages.)
- Billing: unavailable/failed items go to a separate ERRORS dataset (Storage → Datasets → ERRORS), not the main output — so you are not charged for deleted/private/invalid URLs.
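After a run, the main dataset and the ERRORS dataset can be summarised together to see what was charged and why items failed. The sketch below assumes both datasets have already been downloaded as lists of dicts with the fields described above; `summarize` is a hypothetical helper, and the cost figure uses the listed "from $20.00 / 1,000 results" rate, so treat it as a rough estimate only.

```python
from collections import Counter

# Listed starting price; the price actually charged may differ.
PRICE_PER_1000_USD = 20.00


def summarize(main_items, error_items):
    """Count charged results and group uncharged items by their error reason."""
    reasons = Counter(item.get("error") or "unknown" for item in error_items)
    charged = len(main_items)
    return {
        "charged_results": charged,
        "estimated_cost_usd": round(charged * PRICE_PER_1000_USD / 1000, 2),
        "not_charged": len(error_items),
        "errors_by_reason": dict(reasons),
    }
```

A high share of `likely_deleted` or `invalid_url` entries usually points at a stale or malformed input list, while repeated block-related failures suggest checking the proxy settings.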
Notes
- Built with the Apify SDK + Crawlee + Playwright (headless Chromium).
- Endpoints and extraction are documented in ./RESEARCH.md.
- Only public data is collected — do not scrape personal data without a legitimate reason.