Apify X Profile Period Scraper Actor
Pricing
from $0.00005 / actor start
X Profile Period Scraper Actor
A production-focused Apify Actor that scrapes posts from X (Twitter) profile URLs for a specific date range, with strict ownership validation and scalable multi-profile support.
Why this actor exists
Most profile scrapers fail in at least one of these areas:
- weak input validation
- cross-profile leakage in results
- slow collection for larger date windows
- unstable links that can redirect unexpectedly
This actor is built to solve those problems cleanly.
Core capabilities
- Scrape one profile or many profiles in one run
- Enforce strict date filtering in UTC (fromDate to toDate)
- Validate profile ownership before storing a post
- Add canonical redirect-safe link per post: https://x.com/i/status/<id>
- Capture rich post metadata (author, metrics, media, entities)
- Include media-derived thumbnail in JSON output
- Run profiles concurrently for high throughput
Input
Required:
- fromDate (YYYY-MM-DD)
- toDate (YYYY-MM-DD)
- and at least one of:
  - profileUrl
  - profileUrls[]
Optional:
- profileConcurrency (default 3)
- maxPosts (per profile, 0 = unlimited)
- maxScrolls
- scrollDelayMs
- proxyConfiguration
Example:
{
  "profileUrls": ["https://x.com/narendramodi", "https://x.com/jack"],
  "fromDate": "2026-01-01",
  "toDate": "2026-02-01",
  "profileConcurrency": 2,
  "maxPosts": 0,
  "maxScrolls": 300,
  "scrollDelayMs": 1000
}
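The input rules above can be checked before a run starts. The following is a minimal sketch, not the actor's actual validation code: validateInput and its return shape are hypothetical names chosen for illustration, and only the field names come from the input description above.

```javascript
// Hypothetical helper: illustrates the required-field rules, not the actor's real code.
const DATE_RE = /^\d{4}-\d{2}-\d{2}$/;

function validateInput(input) {
  const errors = [];

  // fromDate and toDate are both required and must look like YYYY-MM-DD.
  for (const key of ['fromDate', 'toDate']) {
    if (typeof input[key] !== 'string' || !DATE_RE.test(input[key])) {
      errors.push(`${key} must be a YYYY-MM-DD string`);
    }
  }

  // Merge the single-URL and multi-URL fields into one list.
  const urls = [];
  if (typeof input.profileUrl === 'string' && input.profileUrl.trim()) {
    urls.push(input.profileUrl.trim());
  }
  if (Array.isArray(input.profileUrls)) {
    for (const u of input.profileUrls) {
      if (typeof u === 'string' && u.trim()) urls.push(u.trim());
    }
  }
  if (urls.length === 0) {
    errors.push('provide profileUrl or at least one entry in profileUrls');
  }

  // YYYY-MM-DD strings sort chronologically, so a string comparison is enough here.
  if (errors.length === 0 && input.fromDate > input.toDate) {
    errors.push('fromDate must not be after toDate');
  }

  // Duplicates are dropped so each profile is scraped once.
  return { ok: errors.length === 0, errors, profileUrls: [...new Set(urls)] };
}
```

Calling validateInput with the example input above would return ok set to true and both profile URLs in profileUrls; omitting both URL fields would return ok set to false with one error message.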
Output (per post)
Each dataset item includes:
- identity: id, postUrl, redirectLink, createdAt
- content: text, lang, source, thumbnail
- author: author.id, author.username, etc.
- metrics: replies, reposts, likes, quotes, bookmarks, views
- entities: hashtags, mentions, links, media
- ownership tracing: requestedProfileUsername, requestedProfileUserId, requestedProfileUrl
A run summary is written to the Key-Value Store under the key OUTPUT.
Ownership safety model
This actor prevents false positives from URL-like username collisions:
- It resolves requested profile identity from X GraphQL responses
- It validates each post against the requested profile identity
- It drops mismatched records automatically
- It emits canonical redirectLink (x.com/i/status/<id>) so link resolution is stable
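The ownership steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actor's source: isOwnedBy, toRedirectLink, keepOwnedPosts and the shape of the profile object are hypothetical, while the output field names are the ones listed in the output section.

```javascript
// Hypothetical shapes: `profile` is the identity resolved for the requested URL,
// `post` is a parsed timeline entry with an `author` object.
function isOwnedBy(post, profile) {
  if (!post || !post.author || !profile) return false;
  // Prefer the user id when both sides have one, since usernames can change.
  if (post.author.id && profile.userId) {
    return String(post.author.id) === String(profile.userId);
  }
  // Fall back to an exact, case-insensitive username match (never a prefix match).
  if (post.author.username && profile.username) {
    return post.author.username.toLowerCase() === profile.username.toLowerCase();
  }
  return false;
}

// Builds the canonical link format described above.
function toRedirectLink(postId) {
  if (!/^\d+$/.test(String(postId))) {
    throw new Error(`invalid post id: ${postId}`);
  }
  return `https://x.com/i/status/${postId}`;
}

// Drops mismatched posts and attaches the ownership-tracing fields to the rest.
function keepOwnedPosts(posts, profile) {
  return posts
    .filter((post) => isOwnedBy(post, profile))
    .map((post) => ({
      ...post,
      redirectLink: toRedirectLink(post.id),
      requestedProfileUsername: profile.username,
      requestedProfileUserId: profile.userId,
      requestedProfileUrl: profile.url,
    }));
}
```

Comparing ids rather than matching on URL text is what keeps a post by a user named jackson out of the results for the profile jack in this sketch.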
Performance strategy
- Captures data from GraphQL timeline responses (not DOM-only scraping)
- Blocks heavy resources (image, media, font) while crawling
- Uses adaptive stop conditions to avoid premature cutoffs
- Supports parallel profile scraping through profileConcurrency
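To illustrate what a profileConcurrency limit means in practice, here is a minimal sketch of a bounded worker pool. It is an assumption about one way such a limit could work, not a description of how the actor schedules profiles; runWithConcurrency and the worker callback are hypothetical names.

```javascript
// Hypothetical helper: runs `worker` over `items`, with at most `limit` calls in flight.
async function runWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  // Each lane repeatedly claims the next item until none remain.
  async function lane() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results; // same order as `items`
}
```

With a limit of 2 and five profile URLs, two workers start immediately and each picks up the next URL as soon as it finishes its current one. In this sketch a single failing worker rejects the whole run; a real crawler would more likely record the failure and continue.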
Local run
npm install
APIFY_LOCAL_STORAGE_DIR=./storage node src/main.js
Deploy to Apify
npm i -g apify-cli
apify login
apify push
Repository layout
- src/main.js - crawler and extraction logic
- INPUT_SCHEMA.json - Apify input schema
- .actor/actor.json - actor metadata
- package.json - runtime dependencies
Notes
- Date filtering is UTC-based:
  - fromDate starts at 00:00:00.000Z
  - toDate ends at 23:59:59.999Z
- Private/protected X profiles cannot be scraped publicly.
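The UTC boundaries in the notes above translate into an inclusive time window. The sketch below shows one way to build and apply that window; toUtcRange and isInRange are hypothetical names, and the createdAt value is assumed to be something the JavaScript Date constructor can parse, such as an ISO 8601 timestamp.

```javascript
// Hypothetical helper: turns YYYY-MM-DD strings into an inclusive UTC window.
function toUtcRange(fromDate, toDate) {
  const start = new Date(`${fromDate}T00:00:00.000Z`); // first millisecond of fromDate
  const end = new Date(`${toDate}T23:59:59.999Z`);     // last millisecond of toDate
  if (Number.isNaN(start.getTime()) || Number.isNaN(end.getTime())) {
    throw new Error('fromDate and toDate must be parseable YYYY-MM-DD dates');
  }
  return { start, end };
}

// Returns true when the post timestamp falls inside the window, boundaries included.
function isInRange(createdAt, range) {
  const t = new Date(createdAt).getTime();
  return t >= range.start.getTime() && t <= range.end.getTime();
}
```

For the example input (2026-01-01 to 2026-02-01), a post created at 2026-02-01T23:59:59.999Z is kept and one created at 2026-02-02T00:00:00.000Z is dropped, regardless of the time zone of the machine running the code.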
