Meta Threads Scraper — CSV, No Login, No Rate Limits

Pricing

Pay per usage

Meta Threads (threads.net) data as JSON/CSV — POSTs (author, text, source) + PROFILEs (followers, biography, avatar) by username/search. 23+ runs. For audience research + brand mentions + competitor content. No API waitlist. Custom fork: spinov001@gmail.com · blog.spinov.online · t.me/scraping_ai

Rating: 0.0 (0)

Developer: Alex (Maintained by Community)

Actor stats

Bookmarked: 0
Total users: 6
Monthly active users: 2
Last modified: 6 days ago

Threads Scraper — Profiles + Posts from Public Threads Pages (No Login, No API Waitlist)

Pull Threads profile metadata and post text from public Threads pages — without a Meta developer account, without the Threads API waitlist, and without browser automation. The actor reads what a logged-out visitor sees: the server-rendered HTML and embedded JSON blob.

Who this is for: PR teams tracking brand mentions. Marketing teams sizing up influencer partners. Market researchers benchmarking competitor content cadence. Founders watching a niche community grow in real time.


What you actually get

The actor pushes two record types to the dataset (verified against src/main.js):

_type: "PROFILE" — one per username

Two parsing paths, depending on whether Meta ships the embedded JSON blob in this run:

| Path | Fields pushed | Triggered when |
| --- | --- | --- |
| A (JSON regex) | _type, username, followers, following, biography, url, scrapedAt | The page contains the React/Next.js SSR JSON with follower_count / following_count |
| B (Open Graph fallback) | _type, username, displayName, description, avatar, url, scrapedAt | JSON regex fails; OG meta tags only |

The two paths are mutually exclusive per record, so a profile that falls through to Path B carries no follower count. Expect this occasionally. If your analysis depends on follower counts, drop Path B records or re-run those usernames.
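A minimal sketch of how a consumer might separate the two record shapes after a run. It assumes Path A records carry a non-null `followers` value and Path B records omit the key entirely; the helper names and sample values are hypothetical, not part of the actor.

```python
def has_follower_count(record: dict) -> bool:
    """True for PROFILE records that carry a follower count (Path A)."""
    return record.get("_type") == "PROFILE" and record.get("followers") is not None


def split_profiles(records: list[dict]) -> tuple[list[dict], list[str]]:
    """Return (Path A profile records, usernames that fell through to Path B)."""
    path_a, retry = [], []
    for rec in records:
        if rec.get("_type") != "PROFILE":
            continue  # POST records are ignored here
        if has_follower_count(rec):
            path_a.append(rec)
        else:
            retry.append(rec["username"])
    return path_a, retry


# Hypothetical sample records shaped like the two paths in the table above.
sample = [
    {"_type": "PROFILE", "username": "zuck", "followers": 1000,
     "following": 10, "biography": "bio"},
    {"_type": "PROFILE", "username": "mosseri", "displayName": "Adam",
     "description": "desc", "avatar": "https://example.com/a.jpg"},
    {"_type": "POST", "author": "zuck", "text": "hello", "source": "profile:zuck"},
]
usable, retry_usernames = split_profiles(sample)
```

The `retry_usernames` list can be passed back as the `usernames` input of a follow-up run.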

_type: "POST" — one per extracted post

{
  "_type": "POST",
  "author": "zuck",
  "text": "Threads passes 200M monthly users.",
  "source": "profile:zuck",
  "scrapedAt": "2026-04-29T12:00:00.000Z"
}

source is profile:<username> for posts harvested from a profile page, or search:<query> for posts harvested from the search endpoint.
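For downstream grouping, the `source` string can be split into its kind and value. This is a small sketch; `parse_source` is a hypothetical helper, and it assumes the prefix is always exactly `profile` or `search` followed by a colon.

```python
def parse_source(source: str) -> tuple[str, str]:
    """Split 'profile:<username>' or 'search:<query>' into (kind, value)."""
    kind, sep, value = source.partition(":")
    if not sep or kind not in ("profile", "search"):
        raise ValueError(f"unexpected source value: {source!r}")
    return kind, value


profile_kind, profile_value = parse_source("profile:zuck")
search_kind, search_value = parse_source("search:AI agents")
```

Using `partition` splits on the first colon only, so a search query that itself contains a colon stays intact.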

Honest disclosure on what's NOT extracted: likes, replies, reshares, post timestamps, post URLs, image attachments, and quoted-thread context are not parsed. Only text and author. Meta exposes these counts inconsistently on logged-out pages — if you need engagement metrics at scale, see Custom scraping below.


Input

{
  "usernames": ["zuck", "mosseri"],
  "searchQueries": ["AI agents"],
  "maxPostsPerSource": 50
}
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| usernames | Array | [] | Threads handles. A leading @ or threads.net/ prefix is stripped. |
| searchQueries | Array | [] | Keywords searched against threads.net/search. |
| maxPostsPerSource | Number | 50 | Cap per profile or query. |

The crawler runs with maxConcurrency=3, maxRequestsPerCrawl=200, and requestHandlerTimeoutSecs=30. No proxy is used; requests are direct fetches over CheerioCrawler's default agent.
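If you want to clean handles before submitting them, the stripping behavior described for `usernames` can be approximated as below. This is a sketch under assumptions, not the actor's code: `normalize_handle` and `build_input` are hypothetical helpers, and tolerating an `https://` or `www.` prefix goes beyond what the README documents.

```python
import re

# Assumption: an optional scheme and "www." may precede threads.net/.
_PREFIX_RE = re.compile(r"^(?:https?://)?(?:www\.)?threads\.net/")


def normalize_handle(raw: str) -> str:
    """Reduce '@zuck' or 'threads.net/@zuck' to the bare handle 'zuck'."""
    handle = _PREFIX_RE.sub("", raw.strip())
    return handle.lstrip("@").rstrip("/")


def build_input(handles: list[str], queries: list[str], max_posts: int = 50) -> dict:
    """Assemble an actor input dict with de-duplicated, normalized handles."""
    unique = list(dict.fromkeys(normalize_handle(h) for h in handles))
    return {
        "usernames": unique,
        "searchQueries": queries,
        "maxPostsPerSource": max_posts,
    }


actor_input = build_input(["@zuck", "threads.net/@zuck", "mosseri"], ["AI agents"])
```

`dict.fromkeys` removes duplicates while keeping the original order, so the same profile is not fetched twice.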


Common questions

Q: Will this trigger Apify or Meta abuse flags? A: The actor only hits publicly accessible Threads URLs (the same pages a logged-out visitor sees) and parses the server-rendered JSON already embedded in the HTML. No login, no auth tokens, no private endpoints. That said: any large-volume crawler against Meta surfaces eventually rate-limits. We've seen clean runs at the default maxConcurrency=3; bursting harder is on you.

Q: What if Meta changes the page layout? A: Two parsers run in sequence — (1) regex against the embedded React/Next.js JSON, (2) Open Graph meta tags as fallback. When one layer breaks, the other usually still returns at least the username + display-name pair. Email if both layers stop returning data and we'll patch within a session.
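To illustrate the two-layer idea, here is a rough Python sketch of a parser that tries the JSON regex first and falls back to Open Graph tags. It is not the code in src/main.js. The regexes, the mapping of og:title / og:description / og:image onto displayName / description / avatar, and the sample HTML are all assumptions made for illustration.

```python
import re
from typing import Optional

FOLLOWERS_RE = re.compile(r'"follower_count":\s*(\d+)')
FOLLOWING_RE = re.compile(r'"following_count":\s*(\d+)')
# Assumes property comes before content inside the meta tag.
OG_RE = re.compile(r'<meta\s+property="og:(title|description|image)"\s+content="([^"]*)"')


def parse_profile(html: str, username: str) -> Optional[dict]:
    """Try the embedded-JSON regex first, then Open Graph tags, else None."""
    record = {"_type": "PROFILE", "username": username}
    followers = FOLLOWERS_RE.search(html)
    if followers:  # layer 1: counts found in the embedded JSON
        following = FOLLOWING_RE.search(html)
        record["followers"] = int(followers.group(1))
        record["following"] = int(following.group(1)) if following else None
        return record
    og = dict(OG_RE.findall(html))
    if og:  # layer 2: Open Graph fallback
        record["displayName"] = og.get("title")
        record["description"] = og.get("description")
        record["avatar"] = og.get("image")
        return record
    return None  # both layers failed


html_a = '<script>{"follower_count":12345,"following_count":67}</script>'
html_b = '<meta property="og:title" content="Adam Mosseri (@mosseri)">'
rec_a = parse_profile(html_a, "zuck")
rec_b = parse_profile(html_b, "mosseri")
```

Returning `None` when both layers fail gives the caller an explicit signal instead of an empty record.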

Q: Can I get likes / replies / reshares? A: Not from this actor — engagement counts are inconsistently rendered on logged-out Threads pages, and the regex extraction here only captures text. For guaranteed engagement metrics at scale, request a custom build (see below).

Q: Bulk run cost? A: Apify charges by compute units, not per-profile. Each profile run is a single CheerioCrawler request (~1-3 seconds). Free tier covers small batches. For large batches (1000+ profiles), email and we can quote a fixed-price custom build instead.


Honest Limitations (regex extraction edge cases)

The JSON-based extraction (Path A profile fields, profile posts, and search posts) uses regex against the embedded JSON blob, not a JSON parser. Known consequences:

  • Posts shorter than 2 chars or longer than 500 chars are silently dropped. The regex range [^"]{2,500} is a deliberate noise filter, but it does drop one-emoji posts and long-form posts.
  • Biographies / post text with escaped double-quotes (\") get truncated at the first \". Only \n is decoded back to newline; other JSON escape sequences (\", \u0000, \\) are left as-is or break the match.
  • Search branch regex "text":"([^"]{10,500})".*?"username":"([^"]+)" uses lazy .*? to bridge text → username. On dense JSON pages this can occasionally cross a record boundary and pair text with the WRONG username. If you see suspicious pairings, drop them by deduping on text + author and re-running.
  • The PROFILE record's POST extraction matches "text_post_app_info":{...}.*?"text":"([^"]{2,500})" — the .*? similarly can drift across fields. On well-structured pages this is fine; on malformed/partial responses it may capture unrelated text values.
  • Search-branch records do NOT include profile metadata — only _type, text, author, source, scrapedAt (5 fields). If you need follower counts for search-result authors, run a second pass through the profile branch with the deduped author list.
  • No proxy. If Threads escalates anti-bot on a particular IP range, the actor returns 0 records silently — no flag, no error. Re-run from a fresh Apify run (which uses a different egress IP) or commission a custom build with residential-proxy routing.
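Two of the edge cases above are easy to reproduce in isolation. The sketch below applies the patterns quoted in this section to small hand-written JSON strings (the sample data is invented) and compares the result with a real JSON parser.

```python
import json
import re

TEXT_RE = re.compile(r'"text":"([^"]{2,500})"')
SEARCH_RE = re.compile(r'"text":"([^"]{10,500})".*?"username":"([^"]+)"')

# Case 1: an escaped double-quote ends the regex match early.
blob = r'{"text":"He said \"hi\" to everyone","username":"alice"}'
regex_text = TEXT_RE.search(blob).group(1)   # stops at the first \"
parsed_text = json.loads(blob)["text"]       # a JSON parser decodes it fully

# Case 2: lazy .*? bridges into the next record when the first one
# has no username, pairing its text with the wrong author.
dense = ('[{"text":"first post without author"},'
         '{"text":"second post here","username":"bob"}]')
pairs = SEARCH_RE.findall(dense)

print(repr(regex_text))   # truncated text, ends with a backslash
print(repr(parsed_text))  # full text with the inner quotes
print(pairs)              # first post's text paired with "bob"
```

In the second case the first post's text is attributed to `bob`, and the second post is never emitted because the match already consumed it.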

Step-by-step

  1. Open Threads Scraper → click "Try for free".
  2. Paste usernames: ["zuck", "mosseri"].
  3. Click Start → download JSON / CSV when the run finishes.

For programmatic use:

from apify_client import ApifyClient

# Authenticate with your Apify API token.
client = ApifyClient("YOUR_APIFY_TOKEN")

# Start the actor and wait for the run to finish.
run = client.actor("knotless_cadence/threads-scraper").call(
    run_input={"usernames": ["zuck", "mosseri"], "maxPostsPerSource": 30}
)

# Iterate over the records in the run's default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    if item["_type"] == "PROFILE":
        # Path A records carry followers/biography; Path B records carry description.
        print(item["username"], "→", item.get("followers"), "followers,",
              item.get("biography") or item.get("description"))
    else:
        print(f"  POST by {item['author']}: {item['text'][:80]}")
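Once the dataset items are collected into a list, the cleanup steps suggested under the limitations (dedupe on text + author, then a second pass over search-result authors) might look like the following. The helper names and sample items are hypothetical. Only the field names come from the record schema described earlier.

```python
def dedupe_posts(items: list[dict]) -> list[dict]:
    """Keep the first POST record for each (text, author) pair."""
    seen, unique = set(), []
    for item in items:
        if item.get("_type") != "POST":
            continue
        key = (item["text"], item["author"])
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique


def search_authors(posts: list[dict]) -> list[str]:
    """Ordered, de-duplicated authors of posts that came from a search query."""
    authors = (p["author"] for p in posts if p["source"].startswith("search:"))
    return list(dict.fromkeys(authors))


# Invented sample items shaped like the POST records above.
items = [
    {"_type": "POST", "author": "alice", "text": "AI agents are here", "source": "search:AI agents"},
    {"_type": "POST", "author": "alice", "text": "AI agents are here", "source": "search:AI agents"},
    {"_type": "POST", "author": "bob", "text": "Shipping an agent today", "source": "search:AI agents"},
    {"_type": "POST", "author": "zuck", "text": "Hello Threads", "source": "profile:zuck"},
]
posts = dedupe_posts(items)
second_pass_input = {"usernames": search_authors(posts)}
```

`second_pass_input` can then be used as the `run_input` of a second actor call to fetch profile metadata for those authors.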

| Platform | Tool |
| --- | --- |
| Threads (this tool) | Meta's text-post network |
| Reddit Discussion | Community discussions, public JSON API |
| Bluesky Scraper | AT Protocol, open API |
| YouTube Comments | Video audience reactions |
| Hacker News | Developer sentiment |

All 31 published actors are free to inspect on the Apify Store.


Proof of delivery

24 lifetime runs on this actor — but the broader portfolio is what backs every pilot:

  • 31 published / 78 total Apify scrapers across socials, B2B, dev tools.
  • Flagship: Trustpilot Review Scraper, with 951 lifetime runs and 0 bot-detection failures across 30 days.
  • Recent paid series: $150 / 3-article postmortem for a client in the proxy industry (March 2026, delivered).
  • Code-honest READMEs: every claim in this readme is verified against src/. No "supports X" without proof.

Pilot pricing locked through May 2026:

  • 1 case-study article (1100w+, code blocks): $50
  • 3-article series: $150
  • Custom build (this actor → your variant: follower-list pulls, comment trees, multi-source enrichment with Instagram + Bluesky): from $50 depending on schema delta.

Email "sample" to spinov001@gmail.com to receive 2 published case-study articles within 24h. No commitment.


Custom scraping — pilot tiers

Need engagement metrics, multi-platform fan-out, or a different schema? Three tiers:

  • Pilot — $97 · 1 actor, basic config, 7-day support. Good entry point — useful for a single Threads + Instagram fan-out or a one-off competitor cadence report.
  • Standard — $297 · custom actor + Slack/email alerts on results, 30-day support. Most social-listening projects fit here.
  • Premium — $797 · custom actor + dashboard + 90-day support + 1 modification round. For ongoing pipelines (daily competitor sentiment, multi-source enrichment with Instagram + Threads + Bluesky).

Email: spinov001@gmail.com — drop specs, schema, or target handles and get a quote within 48h.

Proof of work: 31 published Apify scrapers (78 total in portfolio) — Trustpilot 951 / Reddit 82 / Google News 45 / Glassdoor 39 / Email Extractor 107 / Hacker News 27 / Bluesky 25. Recently delivered a paid 3-article series for a client in the proxy industry ($150).

More tips: t.me/scraping_ai · blog.spinov.online


Disclaimer

Designed for market-research, brand-monitoring, and academic use. Respect Threads' / Meta's Terms of Service, applicable data-protection law (GDPR, CCPA), and scrape publicly visible content only. Not affiliated with Meta Platforms, Inc.

Honest disclosure: extracts only text + author for posts and username + (followers/following/biography) | (displayName/description/avatar) for profiles — engagement counts, post URLs, post timestamps, and image attachments are not in the schema. Two parsing paths run with mutually exclusive output fields per profile record.