📰 Toutiao Article Detail Scraper

Pricing

$3.00 / 1,000 results

Extract Toutiao article detail data: title, URL, engagement counts, author info, and more. Scrape by article URL or group ID. Export to JSON, CSV & Excel, use the API, schedule runs and integrate. No code required.

Rating: 0.0 (0)

Developer: Jackie Chen

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 2 days ago

Toutiao Article Detail Scraper (今日头条文章详情)

toutiao-article-detail

Scrape Toutiao (今日头条) article details by article group ID or article URL. For each article it returns clean, structured data: title, source / author, read / comment / digg / like / repost counts, author and media info, and — optionally — the article's HTML content and images.

Unofficial. This Actor is not affiliated with, authorized, or endorsed by Toutiao (今日头条) or ByteDance (字节跳动). It is an independent tool that retrieves publicly available data via a third-party API. Use it in compliance with Toutiao's terms and all applicable laws; you are responsible for how you use the retrieved data.

What it does

Toutiao articles are addressed by a 19-digit group ID (the ByteDance content ID). Give the Actor one or more of these IDs, or full article URLs (it parses the ID out for you), and it fetches each article's detail:

  • Engagement & metadata — read count, comment / digg / like / repost / bury counts, title, source, canonical URL, content hash.
  • Author / media info — author name, user id, media id, fan count, avatar, verification.
  • Content & images (optional) — when includeContent is on (default), the Actor also calls the web endpoint to add the article's HTML content, titleImage, and image list.
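
The URL-to-ID step can be illustrated with a small sketch. This is not the Actor's own parser, which is not documented here; the function name `extract_group_id` and both regular expressions are assumptions based only on the URL shapes listed in the Input section.

```python
import re
from typing import Optional

# Assumed URL shapes: toutiao.com/group/<id>/ and toutiao.com/article/<id>/
_URL_ID = re.compile(r"toutiao\.com/(?:group|article)/(\d+)")
# Assumption: a bare group ID is exactly 19 digits, as the text describes.
_BARE_ID = re.compile(r"\d{19}")


def extract_group_id(value: str) -> Optional[str]:
    """Return the group ID from a bare ID or an article URL, or None if neither matches."""
    value = value.strip()
    if _BARE_ID.fullmatch(value):
        return value
    match = _URL_ID.search(value)
    return match.group(1) if match else None


print(extract_group_id("https://www.toutiao.com/group/7036185404340437511/"))
```

Returning `None` for unrecognized input lets a caller skip bad entries rather than abort a whole batch.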

Input

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| groupIds | string[] | ["7036185404340437511"] | Article group IDs (19-digit) or article URLs (toutiao.com/group/<id>/, toutiao.com/article/<id>/). URLs are parsed to their ID automatically. |
| includeContent | boolean | true | Also fetch the article HTML content + images via the web endpoint (one extra API call per article). |
| maxItems | integer | 50 | Max total articles to scrape across all IDs. |

Example input

{
  "groupIds": [
    "7036185404340437511",
    "https://www.toutiao.com/group/7036185404340437511/"
  ],
  "includeContent": true,
  "maxItems": 100
}
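
When building the input programmatically, it can help to apply the documented defaults and check types before starting a run. The sketch below is a hypothetical local helper (`normalize_input` is not part of the Actor or of any Apify library). The defaults come from the Input table; the requirement that `maxItems` be at least 1 is an assumption.

```python
import json

# Defaults copied from the Input table.
DEFAULTS = {"includeContent": True, "maxItems": 50}


def normalize_input(raw: dict) -> dict:
    """Fill in documented defaults and run basic type checks (hypothetical helper)."""
    merged = {**DEFAULTS, **raw}
    group_ids = merged.get("groupIds")
    if not isinstance(group_ids, list) or not all(isinstance(g, str) for g in group_ids):
        raise ValueError("groupIds must be a list of strings")
    if not isinstance(merged["includeContent"], bool):
        raise ValueError("includeContent must be a boolean")
    max_items = merged["maxItems"]
    # bool is a subclass of int in Python, so reject it explicitly.
    # Assumption: maxItems must be at least 1.
    if isinstance(max_items, bool) or not isinstance(max_items, int) or max_items < 1:
        raise ValueError("maxItems must be a positive integer")
    return merged


payload = normalize_input({"groupIds": ["7036185404340437511"]})
print(json.dumps(payload, indent=2))
```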

Output

One dataset item per article:

{
  "groupId": "7036185404340437511",
  "title": "全家一起品尝麻辣海鲜汤,软嫩蟹肉搭配美味鱼饼,味道鲜美无比!",
  "url": "https://toutiao.com/group/7036185404340437511/",
  "displayUrl": "https://toutiao.com/group/7036185404340437511/",
  "shareUrl": "https://m.toutiaoimg.cn/i7036185404340437511/",
  "source": "兴森一家",
  "readCount": 1450943,
  "commentCount": 1010,
  "diggCount": 12049,
  "likeCount": 12049,
  "repinCount": 1226,
  "buryCount": 1,
  "contentHash": "3b08368c",
  "author": {
    "userId": "1398172224077304",
    "mediaId": "1655138971451403",
    "name": "兴森一家",
    "fansCount": 1501681,
    "avatarUrl": "https://...",
    "verified": "True",
    "description": "..."
  },
  "content": "<p>...</p>",
  "titleImage": "https://p11-sign.toutiaoimg.com/...",
  "imageUrls": ["https://..."],
  "source_endpoint": "app"
}
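
As an illustration of consuming a dataset item, the sketch below derives a few summary values from the fields shown above. The `summarize` helper and its output keys are hypothetical. It assumes `diggCount` and `likeCount` report the same metric (they are equal in the sample), so only one is counted. It also assumes `verified` may arrive as the string "True", as in the sample.

```python
def summarize(item: dict) -> dict:
    """Derive a small summary from one dataset item (hypothetical helper)."""
    reads = item.get("readCount") or 0
    # Count diggCount but not likeCount to avoid double counting (assumption: same metric).
    interactions = sum(item.get(key) or 0 for key in ("commentCount", "diggCount", "repinCount"))
    author = item.get("author") or {}
    return {
        "groupId": item.get("groupId"),
        "interactionsPer1kReads": round(1000 * interactions / reads, 2) if reads else None,
        # Compare text rather than truthiness, since any non-empty string is truthy.
        "authorVerified": str(author.get("verified", "")).lower() == "true",
        # content is only present when includeContent was enabled.
        "hasContent": bool(item.get("content")),
    }


sample = {
    "groupId": "7036185404340437511",
    "readCount": 1450943,
    "commentCount": 1010,
    "diggCount": 12049,
    "likeCount": 12049,
    "repinCount": 1226,
    "author": {"verified": "True"},
    "content": "<p>...</p>",
}
print(summarize(sample))
```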

Notes

  • Data is sourced live. Toutiao occasionally rate-limits requests, so the Actor sends a browser User-Agent and retries transient blocks with exponential backoff.
  • This Actor is ID/URL-driven — there is no keyword search. Collect the group IDs of the articles you want first (the long number in any Toutiao article link).
  • Group IDs are de-duplicated within a run.
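
For readers unfamiliar with the retry behaviour mentioned in the notes, here is a generic sketch of exponential backoff. It is not the Actor's implementation. `TransientBlock`, `retry_with_backoff`, the attempt limit, the base delay, and the jitter are all assumptions chosen for illustration.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientBlock(Exception):
    """Hypothetical marker for a rate-limit or temporary block response."""


def retry_with_backoff(
    fetch: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(), retrying on TransientBlock with exponentially growing delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fetch()
        except TransientBlock:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the last error to the caller.
            # Delay doubles each attempt (base, 2*base, 4*base, ...) plus a little jitter.
            sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))
    raise RuntimeError("unreachable")


# Demo: a fetch that is blocked twice before succeeding; delays are recorded instead of slept.
calls = {"n": 0}


def flaky_fetch() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientBlock()
    return "ok"


recorded: list = []
print(retry_with_backoff(flaky_fetch, sleep=recorded.append), len(recorded))
```

Passing `sleep` as a parameter keeps the function testable without real waiting.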