# Threads Fast Scraper (`sones/threads-scraper`) Actor

Fast scraper for Meta Threads. Extract profiles, posts, replies, and engagement metrics. HTTP-only, optimized for cost.

- **URL**: https://apify.com/sones/threads-scraper.md
- **Developed by:** [Samy](https://apify.com/sones) (community)
- **Categories:** Social media, Developer tools, Automation
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

from $3.80 / 1,000 profiles scraped

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
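
As a rough illustration of the per-event arithmetic, the sketch below estimates the cost of scraping a given number of profiles. It treats the listed "from" price of $3.80 per 1,000 profiles as the rate, so the result is best read as a lower bound. The helper name `estimate_profile_cost` is hypothetical, and other event types may be priced differently.

```python
from decimal import Decimal

# Listed starting price from this page: $3.80 per 1,000 scraped profiles.
# Assumption: the "from" price applies; your plan may be billed at a higher rate.
PRICE_PER_1000_PROFILES = Decimal("3.80")


def estimate_profile_cost(profile_count: int) -> Decimal:
    """Estimate the cost in USD of scraping `profile_count` profiles (hypothetical helper)."""
    if profile_count < 0:
        raise ValueError("profile_count must be non-negative")
    cost = PRICE_PER_1000_PROFILES * profile_count / 1000
    return cost.quantize(Decimal("0.01"))


print(estimate_profile_cost(250))   # 0.95
print(estimate_profile_cost(5000))  # 19.00
```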

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server that can be used as a website, an API, or an MCP server.
"Actor" is written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Threads Scraper

Fast HTTP-only scraper for public Meta Threads data. It extracts profiles, user posts, replies, search users, search threads, and engagement metrics without launching a browser.

### Supported Actions

| Action | Input | Output |
| --- | --- | --- |
| `getProfiles` | `usernames` | Profile metadata for each user |
| `getUserPosts` | `usernames`, optional `sessionId` | Posts/threads from each user |
| `getReplies` | `postUrls` | Replies or the main post for each URL |
| `searchUsers` | `searchQuery` | Matching public usernames |
| `searchThreads` | `searchQuery` | Matching public threads found in search HTML |

### Input

```json
{
    "action": "getUserPosts",
    "usernames": ["zuck", "mosseri"],
    "maxResults": 50,
    "concurrency": 12,
    "proxy": {
        "useApifyProxy": true,
        "apifyProxyGroups": ["RESIDENTIAL"]
    }
}
```

For deeper user-post pagination, provide an Instagram/Threads `sessionid` cookie:

```json
{
    "action": "getUserPosts",
    "usernames": ["zuck"],
    "maxResults": 200,
    "sessionId": "YOUR_SESSIONID_COOKIE"
}
```

### Fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `action` | enum | Yes | `getProfiles`, `getUserPosts`, `getReplies`, `searchUsers`, or `searchThreads` |
| `usernames` | string\[] | For profiles/posts | Threads usernames, with or without `@` |
| `postUrls` | string\[] | For replies | Full Threads post URLs from `threads.net` or `threads.com` |
| `searchQuery` | string | For search | Search term |
| `maxResults` | integer | No | Max results per user/query. Default: `25`, max: `1000` |
| `concurrency` | integer | No | Parallel targets to process. Default: `10`, max: `25` |
| `sessionId` | string | No | Optional `sessionid` cookie for authenticated mobile API pagination |
| `proxy` | object | No | Apify proxy or custom proxy URLs. Residential proxies are recommended |

### Session ID

`searchThreads`, `searchUsers`, and `getProfiles` are fully public and need **no** Session ID. You only need one for `getReplies` and for deeper `getUserPosts` pagination (beyond ~25 posts).

When you do need it, paste the `sessionid` cookie value (not the whole Cookie header) into the `sessionId` input.

#### How to get your Session ID

1. In a desktop browser, log in to https://www.threads.com (or https://www.instagram.com — they share the cookie).
2. Open DevTools (F12 or right-click then Inspect).
3. Go to the **Application** tab (Chrome/Edge) or **Storage** tab (Firefox).
4. In the sidebar, expand **Cookies** and select the site (e.g. `https://www.threads.com`).
5. Find the row named `sessionid` and copy its **Value**. It looks like `12345678%3AAbCdEf...%3A12%3A...`.
6. Paste that value into the `sessionId` field.

> **Use a throwaway account, not your main one.** Meta can flag, rate-limit, or ban accounts used for scraping. Create a dedicated secondary Threads/Instagram account and use its `sessionid`. Only use a session you are authorized to use. The cookie grants account access, so treat it like a password — the input is stored as a secret.
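
Because the `sessionId` input expects only the cookie value, a small guard can catch the common mistake of pasting the whole Cookie header. This is a sketch with a hypothetical helper name, `extract_sessionid`. It assumes the bare value contains no `=` or `;` characters, which holds for the URL-encoded shape shown in step 5.

```python
def extract_sessionid(pasted: str) -> str:
    """Return the bare `sessionid` value from a pasted value or a full Cookie header."""
    pasted = pasted.strip()
    # A Cookie header looks like "name1=value1; sessionid=...; name2=value2".
    for part in pasted.split(";"):
        name, sep, value = part.strip().partition("=")
        if sep and name == "sessionid":
            return value
    if "=" in pasted or ";" in pasted:
        # Looks like a cookie header, but without a sessionid entry.
        raise ValueError("no sessionid cookie found in the pasted text")
    return pasted  # already a bare value


print(extract_sessionid("csrftoken=abc; sessionid=123%3Axyz%3A12; ds_user_id=1"))  # 123%3Axyz%3A12
```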

### Pagination

Public, logged-out scraping is limited by Threads. The logged-out search page returns a single page (~10-25 results) with no working public next-page request, and the logged-out GraphQL/profile paths hit a login wall after ~25 posts.

When `sessionId` is provided, the scraper uses the authenticated mobile API and paginates until `maxResults`, the API stops returning pages, or rate limiting/session expiry occurs.
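
The three stop conditions can be pictured as a cursor loop. The sketch below is not the Actor's code: `fetch_page`, `SessionExpired`, and the cursor format are all hypothetical stand-ins, and a fake fetcher replaces the real API so the example runs offline.

```python
from typing import Callable, Optional

Page = tuple[list[dict], Optional[str]]  # (items, cursor for the next page or None)


class SessionExpired(Exception):
    """Hypothetical error for an expired session or exhausted rate-limit retries."""


def collect(fetch_page: Callable[[Optional[str]], Page], max_results: int) -> list[dict]:
    results: list[dict] = []
    cursor: Optional[str] = None
    while len(results) < max_results:      # stop 1: maxResults reached
        try:
            items, cursor = fetch_page(cursor)
        except SessionExpired:
            break                          # stop 3: keep what was collected so far
        results.extend(items)
        if not items or cursor is None:
            break                          # stop 2: the API stopped returning pages
    return results[:max_results]


def fake_fetch(cursor: Optional[str]) -> Page:
    """Stand-in API: 30 items in total, served 10 per page."""
    start = int(cursor or 0)
    items = [{"id": str(i)} for i in range(start, min(start + 10, 30))]
    return items, (str(start + 10) if start + 10 < 30 else None)


print(len(collect(fake_fetch, max_results=25)))   # 25
print(len(collect(fake_fetch, max_results=200)))  # 30
```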

### Output Examples

#### Profile

```json
{
    "id": "314216",
    "username": "zuck",
    "fullName": "Mark Zuckerberg",
    "bio": "CEO of Meta",
    "followers": 3200000,
    "following": 523,
    "isVerified": true,
    "profilePicUrl": "https://...",
    "threadCount": 1234,
    "url": "https://www.threads.net/@zuck",
    "scrapedAt": "2026-06-26T10:30:00.000Z"
}
```

#### Post

```json
{
    "id": "3456789012345678",
    "shortcode": "CuXyz123",
    "text": "Post content here...",
    "timestamp": "2026-06-26T08:00:00.000Z",
    "likes": 12500,
    "replies": 340,
    "reposts": 890,
    "author": {
        "username": "zuck",
        "fullName": "Mark Zuckerberg",
        "profilePicUrl": "https://...",
        "isVerified": true
    },
    "mediaUrls": ["https://..."],
    "url": "https://www.threads.net/@zuck/post/CuXyz123",
    "scrapedAt": "2026-06-26T10:30:00.000Z"
}
```
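
Once items are in a dataset, the engagement fields can be combined however your analysis needs. The sketch below uses the field names from the post example above. The "total engagement" definition (likes plus replies plus reposts) and the helper names are assumptions made for illustration.

```python
from datetime import datetime

# A trimmed copy of the post example above.
post = {
    "likes": 12500,
    "replies": 340,
    "reposts": 890,
    "timestamp": "2026-06-26T08:00:00.000Z",
    "scrapedAt": "2026-06-26T10:30:00.000Z",
}


def parse_iso(value: str) -> datetime:
    """Parse the ISO 8601 timestamps used in the output ("Z" means UTC)."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def total_engagement(item: dict) -> int:
    """Assumed definition: likes + replies + reposts."""
    return item["likes"] + item["replies"] + item["reposts"]


def age_hours(item: dict) -> float:
    """Hours between posting and scraping."""
    delta = parse_iso(item["scrapedAt"]) - parse_iso(item["timestamp"])
    return delta.total_seconds() / 3600


print(total_engagement(post))  # 13730
print(age_hours(post))         # 2.5
```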

### Development

```bash
bun install
bun test
bun run typecheck
bun run lint
```

### Notes

- Residential proxies are recommended because Meta frequently blocks datacenter IPs.
- The scraper retries rate limits and transient server errors with exponential backoff.
- Private accounts and login-only content are not available in public mode.
- Threads internal endpoints can change without notice; keep parser tests current when response shapes change.
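
The retry note above refers to exponential backoff. A generic sketch of that technique follows; the base delay, factor, cap, jitter, and the `TransientError` class are illustrative assumptions, not the Actor's actual settings.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for a rate limit or transient server error."""


def backoff_delays(retries: int, base: float = 1.0, factor: float = 2.0, cap: float = 30.0) -> list[float]:
    """Delay before each retry: base * factor**attempt, capped at `cap` seconds."""
    return [min(cap, base * factor ** attempt) for attempt in range(retries)]


def call_with_retries(fn: Callable[[], T], retries: int = 5, sleep: Callable[[float], None] = time.sleep) -> T:
    for delay in backoff_delays(retries):
        try:
            return fn()
        except TransientError:
            # Add random jitter so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
    return fn()  # final attempt; any error now propagates to the caller


print(backoff_delays(5))  # [1.0, 2.0, 4.0, 8.0, 16.0]
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.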

# Actor input Schema

## `action` (type: `string`):

Choose one workflow. Fill only the target field used by that workflow.

## `usernames` (type: `array`):

Used by Get user posts and Get profiles. Add one Threads username per row; @ is optional. Start with a mix of active public accounts for better sample output.

## `postUrls` (type: `array`):

Used by Get replies. Add full Threads post URLs from threads.net or threads.com. Reply pagination requires Session ID.
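
To catch malformed entries before a run, post URLs can be checked locally. This is a sketch under assumptions: the accepted hosts come from the description above, while the `/@username/post/shortcode` path shape is inferred from the output example in the README, and the Actor itself may accept other forms.

```python
import re
from urllib.parse import urlparse

ALLOWED_HOSTS = {"threads.net", "www.threads.net", "threads.com", "www.threads.com"}
# Assumed path shape, based on the README's example post URL.
POST_PATH = re.compile(r"^/@[^/]+/post/[^/]+/?$")


def is_threads_post_url(url: str) -> bool:
    """Return True if `url` looks like a full Threads post URL (hypothetical helper)."""
    parsed = urlparse(url)
    return (
        parsed.scheme == "https"
        and parsed.hostname in ALLOWED_HOSTS
        and POST_PATH.match(parsed.path) is not None
    )


print(is_threads_post_url("https://www.threads.net/@zuck/post/CuXyz123"))  # True
print(is_threads_post_url("https://www.threads.net/@zuck"))                # False
```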

## `searchQuery` (type: `string`):

Used by Search users and Search threads.

## `maxResults` (type: `integer`):

For user posts, this applies per username. Public user-post scraping usually stops around 25 results without Session ID.

## `concurrency` (type: `integer`):

How many targets to process in parallel. Increase with strong residential proxies; reduce if you see rate limits.
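
Bounded parallelism of this kind is commonly built on a semaphore. The sketch below shows the general technique with `asyncio`; it is not the Actor's implementation, and `process_all` and the fake worker are hypothetical names.

```python
import asyncio


async def process_all(targets, worker, concurrency):
    """Run `worker` over all targets with at most `concurrency` in flight."""
    semaphore = asyncio.Semaphore(concurrency)

    async def guarded(target):
        async with semaphore:
            return await worker(target)

    # gather() returns results in the same order as the targets.
    return await asyncio.gather(*(guarded(t) for t in targets))


async def demo():
    in_flight = peak = 0

    async def fake_worker(name):
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)  # stands in for a network request
        in_flight -= 1
        return name.upper()

    results = await process_all(["zuck", "mosseri", "nasa", "nba"], fake_worker, concurrency=2)
    return results, peak


print(asyncio.run(demo()))  # (['ZUCK', 'MOSSERI', 'NASA', 'NBA'], 2)
```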

## `sessionId` (type: `string`):

Only needed for Get replies and deeper user-post pagination. Search and profile actions do not need it. Paste the sessionid cookie value (not the full Cookie header). Use a throwaway Threads/Instagram account, not your main one. See the README for how to copy it from your browser's DevTools.

## `proxy` (type: `object`):

Residential proxies are recommended because Meta often blocks datacenter IPs.

## Actor input object example

```json
{
  "action": "getUserPosts",
  "usernames": [
    "zuck",
    "mosseri",
    "instagram",
    "threads",
    "nasa",
    "natgeo",
    "nba",
    "f1",
    "openai",
    "github",
    "vercel",
    "spotify"
  ],
  "searchQuery": "startup funding",
  "maxResults": 25,
  "concurrency": 10,
  "proxy": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  }
}
```

# Actor output Schema

## `dataset` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "usernames": [
        "zuck",
        "mosseri",
        "instagram",
        "threads",
        "nasa",
        "natgeo",
        "nba",
        "f1",
        "openai",
        "github",
        "vercel",
        "spotify"
    ],
    "searchQuery": "startup funding",
    "proxy": {
        "useApifyProxy": true,
        "apifyProxyGroups": [
            "RESIDENTIAL"
        ]
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("sones/threads-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "usernames": [
        "zuck",
        "mosseri",
        "instagram",
        "threads",
        "nasa",
        "natgeo",
        "nba",
        "f1",
        "openai",
        "github",
        "vercel",
        "spotify",
    ],
    "searchQuery": "startup funding",
    "proxy": {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"],
    },
}

# Run the Actor and wait for it to finish
run = client.actor("sones/threads-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "usernames": [
    "zuck",
    "mosseri",
    "instagram",
    "threads",
    "nasa",
    "natgeo",
    "nba",
    "f1",
    "openai",
    "github",
    "vercel",
    "spotify"
  ],
  "searchQuery": "startup funding",
  "proxy": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  }
}' |
apify call sones/threads-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=sones/threads-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Threads Fast Scraper",
        "description": "Fast scraper for Meta Threads. Extract profiles, posts, replies, and engagement metrics. HTTP-only, optimized for cost.",
        "version": "1.1",
        "x-build-id": "7Sh23p3xFX7dEw1Hc"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/sones~threads-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-sones-threads-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/sones~threads-scraper/runs": {
            "post": {
                "operationId": "runs-sync-sones-threads-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/sones~threads-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-sones-threads-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "action"
                ],
                "properties": {
                    "action": {
                        "title": "Action",
                        "enum": [
                            "getUserPosts",
                            "getProfiles",
                            "getReplies",
                            "searchUsers",
                            "searchThreads"
                        ],
                        "type": "string",
                        "description": "Choose one workflow. Fill only the target field used by that workflow.",
                        "default": "getUserPosts"
                    },
                    "usernames": {
                        "title": "Usernames",
                        "type": "array",
                        "description": "Used by Get user posts and Get profiles. Add one Threads username per row; @ is optional. Start with a mix of active public accounts for better sample output.",
                        "items": {
                            "type": "string"
                        },
                        "default": [
                            "zuck",
                            "mosseri",
                            "instagram",
                            "threads",
                            "nasa",
                            "natgeo",
                            "nba",
                            "f1"
                        ]
                    },
                    "postUrls": {
                        "title": "Post URLs",
                        "type": "array",
                        "description": "Used by Get replies. Add full Threads post URLs from threads.net or threads.com. Reply pagination requires Session ID.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "searchQuery": {
                        "title": "Search query",
                        "type": "string",
                        "description": "Used by Search users and Search threads.",
                        "default": "artificial intelligence"
                    },
                    "maxResults": {
                        "title": "Max results per target",
                        "minimum": 1,
                        "maximum": 1000,
                        "type": "integer",
                        "description": "For user posts, this applies per username. Public user-post scraping usually stops around 25 results without Session ID.",
                        "default": 25
                    },
                    "concurrency": {
                        "title": "Concurrency",
                        "minimum": 1,
                        "maximum": 25,
                        "type": "integer",
                        "description": "How many targets to process in parallel. Increase with strong residential proxies; reduce if you see rate limits.",
                        "default": 10
                    },
                    "sessionId": {
                        "title": "Session ID (optional)",
                        "type": "string",
                        "description": "Only needed for Get replies and deeper user-post pagination. Search and profile actions do not need it. Paste the sessionid cookie value (not the full Cookie header). Use a throwaway Threads/Instagram account, not your main one. See the README for how to copy it from your browser's DevTools."
                    },
                    "proxy": {
                        "title": "Proxy",
                        "type": "object",
                        "description": "Residential proxies are recommended because Meta often blocks datacenter IPs.",
                        "default": {
                            "useApifyProxy": true,
                            "apifyProxyGroups": [
                                "RESIDENTIAL"
                            ]
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
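
For projects without a client library, the `run-sync-get-dataset-items` path from the specification above can be called directly. The sketch below only builds the request with the Python standard library; the `build_request` and `run` helper names are hypothetical, and the response is assumed to be a JSON array of dataset items, which the specification does not spell out.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/sones~threads-scraper/run-sync-get-dataset-items"


def build_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build a POST request; the token goes in the query string, per the spec."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run(token: str, actor_input: dict):
    """Send the request and decode the body (assumed to be JSON dataset items)."""
    with urllib.request.urlopen(build_request(token, actor_input)) as response:
        return json.load(response)


request = build_request("<YOUR_API_TOKEN>", {"action": "getProfiles", "usernames": ["zuck"]})
print(request.get_method(), request.full_url)
```

Calling `run(...)` performs a real, billable Actor run, so the example stops at building the request.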
