Pricing

Pay per event

Threads Reply Scraper — Conversation Graph

Export the full reply tree of any public Threads post — no Meta login — as a conversation graph plus engagement counts, to JSON or CSV. A Threads post scraper built on the threads.net SSR HTML payload. We retry and rotate so the thread lands.

Developer

DevilScrapes

Last modified: a month ago

Threads Reply Scraper — Full Conversation Graph Export

We do the dirty work so your dataset stays clean. 😈

$5.05 / 1,000 rows — Export the full reply tree of any public Threads (threads.net) post. No Meta account. No API key. Every visible reply chain, depth-linked, with engagement counts on every node — ready for conversation-graph analysis, brand triage, or NLP research.

Existing Threads scrapers on the Apify Store cap at ~20 posts per profile with no reply-tree expansion. This Actor goes the opposite direction: pick one post, get the whole conversation underneath it, parent-pointed and depth-indexed.

🎯 What this scrapes

You pass one or more Threads post URLs. For each post, this Actor:

Fetches https://www.threads.net/@{username}/post/{code} with a real browser fingerprint.
Extracts the server-rendered conversation payload Threads embeds inside <script type="application/json" data-sjs> blocks.
Walks the payload's edges -> thread_items tree and emits one flat row per node — the root post plus every reply Threads inlined into the initial HTML, including nested chains (depth 2+).
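The three steps above can be sketched in plain Python. This is an illustrative outline, not the Actor's own code: the payload shape (a `post` object with a `pk` inside each `thread_items` entry, the first chain starting with the root post) and the names `SjsScriptParser`, `find_thread_items`, and `rows_from_html` are assumptions made for the example. Only the `data-sjs` script selector and the `thread_items` key come from the description above.

```python
import json
from html.parser import HTMLParser


class SjsScriptParser(HTMLParser):
    """Collect the parsed JSON of every <script type="application/json" data-sjs> block."""

    def __init__(self):
        super().__init__()
        self._buffer = None  # text chunks while inside a matching <script>, else None
        self.payloads = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "script" and attr_map.get("type") == "application/json" and "data-sjs" in attr_map:
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            self.payloads.append(json.loads("".join(self._buffer)))
            self._buffer = None


def find_thread_items(node):
    """Yield every list stored under a 'thread_items' key, at any nesting level."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "thread_items" and isinstance(value, list):
                yield value
            else:
                yield from find_thread_items(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_thread_items(item)


def rows_from_html(html):
    """Flatten all chains into rows; assumes the first chain starts with the root post."""
    parser = SjsScriptParser()
    parser.feed(html)
    parser.close()
    rows, root_pk = [], None
    for payload in parser.payloads:
        for chain in find_thread_items(payload):
            parent, depth = root_pk, 1  # a new chain hangs off the root
            for item in chain:
                pk = str(item["post"]["pk"])  # assumed item shape
                if root_pk is None:
                    rows.append({"row_type": "post", "reply_id": pk,
                                 "parent_reply_id": None, "depth": 0})
                    root_pk = pk
                else:
                    rows.append({"row_type": "reply", "reply_id": pk,
                                 "parent_reply_id": parent, "depth": depth})
                    depth += 1
                parent = pk  # the next item in the chain replies to this one
    return rows
```

Within a chain, each item points at its predecessor, which mirrors the parent_reply_id rule in the field table: the root `pk` for direct replies and the predecessor `pk` for nested ones.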

Every row carries a row_type discriminator ("post" for the root, "reply" for everything else), a depth integer (0 for root, 1 for direct replies, 2+ for nested chain replies), and a parent_reply_id pointer so consumers can reconstruct the conversation graph with a single LEFT JOIN.

This is the threads.net scraper that goes where the official Threads API won't: third-party conversation trees, no developer-account review required.

Field	Type	Description
`row_type`	string enum	`"post"` for the root, `"reply"` for everything else
`root_post_id`	string	Threads `pk` of the input post this row belongs to
`root_post_url`	string	Input URL (normalized) — same value on every row from one input
`parent_reply_id`	string | null	`null` for root; root `pk` for direct replies; predecessor `pk` for nested
`reply_id`	string	This node's `pk`
`reply_url`	string	Public Threads URL of this node
`reply_text`	string	Caption body (empty string if media-only)
`author_username`	string	Author handle
`author_display_name`	string | null	Author full name
`author_user_id`	string | null	Internal Threads user `pk`
`author_followers`	integer | null	Follower count when Threads surfaces it (usually only root author)
`posted_at`	string	ISO 8601 UTC timestamp
`like_count`	integer	Likes at scrape time
`reply_count`	integer	Direct reply count at scrape time
`repost_count`	integer	Repost count at scrape time
`quote_count`	integer	Quote-post count at scrape time
`depth`	integer	`0`=root, `1`=direct, `2+`=nested chain
`scraped_at`	string	ISO 8601 UTC when the row was written

🔥 Features

Full reply tree per post — root + every reply chain Threads server-renders into the initial HTML, including nested depth-2/3+ chains.
Depth-linked output — depth + parent_reply_id let you rebuild the conversation graph trivially in SQL, pandas, or networkx.
Engagement counts on every node — likes, direct replies, reposts, and quote-posts captured per post and per reply.
We rotate browser fingerprints — curl-cffi Chrome 131 TLS + HTTP/2 impersonation so the target sees a real browser, not Python. Fingerprint profiles cycle per session.
We rotate residential proxies — BUYPROXIES94952 residential pool is on by default; fresh session ID on every block. Meta bans datacenter IPs within minutes; we route around it.
We retry with exponential backoff — up to 5 attempts per URL on 408 / 429 / 5xx with Retry-After honoured. You get results, not empty datasets.
Per-post cost control — maxDepth (1–10) and maxRepliesPerNode (1–500) caps so you pay exactly for what you need.
Pydantic v2 input validation — bad URLs, empty lists, and out-of-range caps fail fast before any network call, not after you've paid for a run.
Clean typed rows — ISO 8601 timestamps, stable pk-based IDs, nullable fields declared — no surprise nulls or mixed types in your dataset.
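The retry behaviour listed above (up to 5 attempts on 408 / 429 / 5xx, with Retry-After honoured) can be illustrated with a minimal sketch. The base delay of 1 second, the 60-second cap, and the function names `backoff_delay` and `fetch_with_retries` are assumptions chosen for the example; the Actor's real timings are not documented here. The `fetch` and `sleep` callables are injected so the logic can be exercised without network access.

```python
RETRYABLE_STATUSES = {408, 429} | set(range(500, 600))
MAX_ATTEMPTS = 5  # matches the documented per-URL attempt limit


def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0):
    """Seconds to wait after failed attempt number `attempt` (1-based)."""
    if retry_after is not None:
        try:
            # Only the delta-seconds form of Retry-After is handled in this sketch.
            return min(float(retry_after), cap)
        except ValueError:
            pass  # e.g. an HTTP-date value: fall back to exponential backoff
    return min(base * 2 ** (attempt - 1), cap)


def fetch_with_retries(fetch, url, sleep):
    """Call fetch(url) -> (status, headers, body) until it succeeds or attempts run out."""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        status, headers, body = fetch(url)
        if status not in RETRYABLE_STATUSES:
            return status, body
        if attempt < MAX_ATTEMPTS:
            sleep(backoff_delay(attempt, headers.get("Retry-After")))
    return status, None  # every attempt hit a retryable status
```

In real use `sleep` would be `time.sleep` and `fetch` would wrap the HTTP client; a production version would usually add random jitter to the delay as well.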

💡 Use cases

Brand reputation monitoring — pull the entire reply pile-on under a viral brand mention and triage by like count or reach in 2 minutes, not 90.
Crisis communications — export every visible reply to a controversial post for PR review without manual scrolling.
Social-listening dashboards — feed conversation-graph rows into Slack, Looker, Tableau, or Hex for real-time sentiment tracking on Threads.
Competitive intelligence — track reply sentiment under competitor product launches on Threads.
Creator analytics — see which of your own replies sparked sub-conversations vs which died after one comment.
Academic research — bootstrap conversation-tree datasets for NLP and argument-mining models from public Threads discussions.
Meta-policy research — measure conversation topology on policy-adjacent posts (fanout, nested-debate sub-threads, engagement decay).
OSINT investigation — track public discussion threads around named events or accounts.

⚙️ How to use it

Open the Actor input form.
Paste one or more Threads post URLs into the "Threads post URLs" field, e.g. https://www.threads.net/@mosseri/post/DYX3oNcAO4r. Up to 50 URLs per run.
Set "Maximum reply depth" to control how deep into nested chains the Actor goes (default 3).
Set "Max top-level reply threads per post" to cap how many direct replies are exported per post (default 50).
Leave "Use Apify Proxy" ON, because Meta blocks datacenter IPs within minutes. We handle the rotation.
Click Start. Results stream into the default dataset in real time and are downloadable as JSON, CSV, Excel, or XML.

Finding a Threads post URL

Open any public post on threads.net or in the Threads app and copy the share link. The URL format is https://www.threads.net/@{username}/post/{code}. Trailing query strings (?xmt=...) and fragments are stripped automatically — paste the raw share URL straight in.
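If you want to pre-clean URLs on your side before submitting them, the normalization described above might look roughly like this. It is a sketch under assumptions: the allowed characters in the username and post code, the accepted hosts, and the helper name `normalize_post_url` are guesses for illustration, not the Actor's actual rules.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Assumed character sets for the handle and the post code.
POST_PATH = re.compile(r"^/@(?P<username>[A-Za-z0-9._]+)/post/(?P<code>[A-Za-z0-9_-]+)/?$")
ALLOWED_HOSTS = {"threads.net", "www.threads.net"}


def normalize_post_url(raw_url):
    """Return the canonical post URL with query string and fragment removed."""
    parts = urlsplit(raw_url.strip())
    match = POST_PATH.match(parts.path)
    if parts.scheme != "https" or parts.netloc.lower() not in ALLOWED_HOSTS or match is None:
        raise ValueError(f"not a Threads post URL: {raw_url!r}")
    path = f"/@{match['username']}/post/{match['code']}"
    # Empty query and fragment components drop ?xmt=... and #... from the result.
    return urlunsplit(("https", "www.threads.net", path, "", ""))
```

For example, `normalize_post_url("https://www.threads.net/@mosseri/post/DYX3oNcAO4r?xmt=abc")` returns the bare post URL, and anything that does not match the `/@{username}/post/{code}` pattern raises `ValueError`.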

📥 Input

Field	Type	Required	Default	Description
`postUrls`	array of strings	yes	—	1–50 Threads post URLs
`maxDepth`	integer	no	`3`	Reply-depth cap (1–10)
`maxRepliesPerNode`	integer	no	`50`	Top-level reply cap per post (1–500)
`useProxy`	boolean	no	`true`	Route via BUYPROXIES94952 residential

Single-URL example

{
  "postUrls": [
    "https://www.threads.net/@mosseri/post/DYX3oNcAO4r"
  ],
  "maxDepth": 3,
  "maxRepliesPerNode": 50,
  "useProxy": true
}

Batch example

{
  "postUrls": [
    "https://www.threads.net/@mosseri/post/DYX3oNcAO4r",
    "https://www.threads.net/@threads/post/AAA",
    "https://www.threads.net/@zuck/post/BBB"
  ],
  "maxDepth": 2,
  "maxRepliesPerNode": 25,
  "useProxy": true
}
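To catch out-of-range values before starting a run, you could mirror the documented bounds locally. The Actor itself is described as using Pydantic v2; the sketch below deliberately swaps in a plain dataclass so it has no dependencies, and the class name `ActorInput` is hypothetical. It checks only the numeric bounds from the input table, not URL format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActorInput:
    """Local mirror of the input table; field names match the JSON keys."""

    postUrls: list
    maxDepth: int = 3
    maxRepliesPerNode: int = 50
    useProxy: bool = True

    def __post_init__(self):
        if not 1 <= len(self.postUrls) <= 50:
            raise ValueError("postUrls must contain 1-50 URLs")
        if not 1 <= self.maxDepth <= 10:
            raise ValueError("maxDepth must be between 1 and 10")
        if not 1 <= self.maxRepliesPerNode <= 500:
            raise ValueError("maxRepliesPerNode must be between 1 and 500")
```

Because the field names match the JSON keys, either example above can be checked with `ActorInput(**json.loads(text))`.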

📤 Output

One flat dataset row per post or reply. Use row_type to filter to roots only, or depth to filter to direct replies or specific nested layers.

{
  "row_type": "post",
  "root_post_id": "3897828658278100523",
  "root_post_url": "https://www.threads.net/@mosseri/post/DYX3oNcAO4r",
  "parent_reply_id": null,
  "reply_id": "3897828658278100523",
  "reply_url": "https://www.threads.net/@mosseri/post/DYX3oNcAO4r",
  "reply_text": "Does DMing people back help with reach?",
  "author_username": "mosseri",
  "author_display_name": "Adam Mosseri",
  "author_user_id": "63482099442",
  "author_followers": null,
  "posted_at": "2026-05-15T13:36:48+00:00",
  "like_count": 427,
  "reply_count": 98,
  "repost_count": 12,
  "quote_count": 2,
  "depth": 0,
  "scraped_at": "2026-05-16T12:00:00+00:00"
}

Export formats

After a run completes, click Export in the Apify Console for JSON (full fidelity), CSV (flat, ideal for spreadsheets), Excel (.xlsx), or XML. The same data is available from the Apify REST API via GET /datasets/{id}/items?format=csv&clean=true; change the format parameter to request a different format.
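As a small illustration of the REST route, the helper below builds the dataset-items URL for a given format. The base URL `https://api.apify.com/v2` is Apify's public API root; the helper name `dataset_items_url` and the placeholder dataset ID are made up for the example. Authentication is left out: a private dataset also needs your API token.

```python
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id, fmt="csv", clean=True):
    """Build the GET URL for a dataset's items in the requested format."""
    query = urlencode({"format": fmt, "clean": "true" if clean else "false"})
    return f"{API_BASE}/datasets/{dataset_id}/items?{query}"


# Example with a placeholder ID; fetch the result with any HTTP client.
csv_url = dataset_items_url("YOUR_DATASET_ID")
```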

Reconstructing the conversation tree

In SQL:

SELECT parent.reply_text  AS parent_text,
       child.reply_text   AS reply_text,
       child.depth,
       child.like_count
FROM   rows child
LEFT JOIN rows parent
  ON   child.parent_reply_id = parent.reply_id
WHERE  child.root_post_id = '3897828658278100523'
ORDER BY child.depth, child.like_count DESC;

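In pandas: df.merge(df, how="left", left_on="parent_reply_id", right_on="reply_id", suffixes=("", "_parent")). The how="left" argument keeps the root row, whose parent_reply_id is null, so the result matches the SQL LEFT JOIN; the default inner merge would drop it.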
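If you would rather walk the tree than join it, a dependency-free sketch is shown below. It assumes `rows` is a list of dicts loaded from the JSON export with the documented fields; the function names and the choice to sort siblings by like_count are illustrative, not part of the Actor.

```python
from collections import defaultdict


def build_children_index(rows):
    """Map each parent_reply_id (None for roots) to the rows that point at it."""
    children = defaultdict(list)
    for row in rows:
        children[row["parent_reply_id"]].append(row)
    return children


def render_tree(rows):
    """Return an indented outline of the conversation, most-liked siblings first."""
    children = build_children_index(rows)
    lines = []

    def visit(parent_id, indent):
        siblings = sorted(children.get(parent_id, []), key=lambda r: -r["like_count"])
        for row in siblings:
            lines.append(f"{'  ' * indent}@{row['author_username']}: {row['reply_text']}")
            visit(row["reply_id"], indent + 1)

    visit(None, 0)  # roots are the rows whose parent_reply_id is null
    return lines
```

Printing `"\n".join(render_tree(rows))` gives a quick human-readable view of a thread. The same children index can feed a graph library such as networkx if you need metrics like fanout.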

💰 Pricing

Pay-Per-Event (PPE) — you pay only for what you scrape. No result → no charge beyond the small start fee.

Event	Price (USD)	When
`actor-start`	$0.05	Once per run, at boot
`result-row`	$0.005	Per post or reply row written

Example costs

Rows scraped	Actor starts	Total cost
100	1	$0.55
500	1	$2.55
1,000	1	$5.05
5,000	1	$25.05

A typical run on a single mid-sized post (root + ~50 direct replies + a handful of nested chains) emits 60–120 rows, costing $0.35–$0.65.
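The arithmetic behind the table is one start fee plus a per-row fee. A minimal estimator, using the two event prices listed above and `Decimal` to avoid floating-point rounding, could look like this (the function name `run_cost` is illustrative):

```python
from decimal import Decimal

ACTOR_START_FEE = Decimal("0.05")   # actor-start, once per run
RESULT_ROW_FEE = Decimal("0.005")   # result-row, per post or reply row


def run_cost(rows, starts=1):
    """Estimated USD cost for `rows` result rows across `starts` runs."""
    return ACTOR_START_FEE * starts + RESULT_ROW_FEE * rows
```

For instance, `run_cost(1000)` gives 5.05 and `run_cost(60)` gives 0.35, matching the figures above. Splitting the same rows across several runs adds $0.05 per extra start.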

🚧 Limitations

No "Show replies" pagination. Threads renders some deeply nested replies behind a "Show replies" click that triggers a client-side XHR. This Actor emits exactly what threads.net serves in the initial HTML — typically the root, all direct replies, and any inline-expanded depth-2/3 chains Threads already included. Deeper hidden replies require their own XHR and are not fetched in this version.
No reposter user list. Threads renders the /reposts/ sub-page entirely client-side from a private endpoint using rotating tokens. Repost counts are captured on every row (repost_count), but the list of accounts that reposted is out of scope.
No quote-post bodies. quote_count is captured per row; the bodies of quote-posts referencing the input are not included.
No media (images, videos). Only reply_text is captured. Image ALT text, video transcripts, and external link cards are not extracted.
Private profiles / login-walled posts return zero rows. If the page returns a login wall instead of the conversation payload, the Actor logs a WARNING and skips that URL. Enable useProxy to maximise success rate.
Very large batches may encounter rate-limit windows. With residential proxy and per-URL session rotation the Actor handles single-post scrapes reliably. Batches larger than 20 posts in one run may trigger short pauses — we retry with exponential backoff up to 5 attempts per URL.
Not real-time. The Actor reads what threads.net serves in its current SSR HTML. Replies posted seconds before the scrape may not yet be inlined in that snapshot.
Apify FREE plan retains run-scoped storage for 7 days only. Export your dataset immediately after the run or use a named dataset to retain longer.
ToS responsibility. Meta's Terms of Service prohibit scraping. The threads.net post URL is publicly accessible without login, but you remain responsible for verifying your jurisdiction's data-protection rules and Meta's current Terms before using scraped data commercially.

❓ FAQ

Do I need a Threads or Instagram account?

No. The Actor fetches threads.net directly with a real Chrome browser fingerprint. No Meta login, no API key, no OAuth flow.

Is this a Meta Threads API alternative?

It is complementary to the official API. Meta's Threads API is gated behind a developer-account review and exposes only the post owner's own data — it does not support third-party conversation-tree reads. This Actor reads the same public SSR HTML any browser renders when visiting a threads.net post URL. Use both where each fits.

Does this work as a threads.net scraper for any public post?

Yes, as long as the post is reachable at its public URL without a login wall. Private accounts, deleted posts, and posts behind an age-gate return zero rows and a clear status message.

How deep into the reply tree does this go?

By default, depth 3 — root post (depth 0), direct replies (depth 1), and the first two layers of inline-expanded nested replies Threads embeds in the initial HTML (depth 2 and 3). Increase maxDepth up to 10 if you need every embedded chain.

Why isn't the reposter list included?

Threads' /reposts/ sub-page loads its user list via an internal client-side request with rotating tokens. Implementing that would create constant breakage as the tokens rotate. Repost counts are still captured on every row.

Why is residential proxy on by default?

Meta blocks repeated requests from the same datacenter IP within minutes. The residential pool rotates IPs per URL — that is the difference between consistent results and an empty dataset. We manage the rotation so you don't have to.

What happens if Meta blocks a request?

We retry with exponential backoff — up to 5 attempts per URL. If all retries fail or the page returns a login wall, that URL is skipped with a WARNING log and the run continues. If every URL fails, the Actor exits non-zero with a clear status message so you always know what happened.

Can I rebuild the conversation graph from the output?

Yes — every row carries parent_reply_id, so a single LEFT JOIN on reply_id reconstructs the tree. See the Output → Reconstructing the conversation tree section above for SQL and pandas examples.

Can I download Threads replies as a spreadsheet?

Yes. After the run completes, click Export and choose CSV or Excel. The flat row-per-reply structure maps directly to a spreadsheet without any transformation needed.

💬 Your feedback

Found a bug, hit a rate-limit pattern, or need a new field on the output row? Open an issue on the Actor's Apify Store page or contact the Devil Scrapes team at apify.com/DevilScrapes. We ship updates within days of validated reports.
