Instagram Scraper
Pricing
Pay per usage
A fast, reliable, production-ready Instagram scraper. It extracts public profile metadata (followers, following, bio, verified status, profile pic) and recent posts (captions, likes, comments, media URLs, timestamps) without login, and supports residential proxy rotation.
Developer
Pratyush Kumar
Maintained by Community
14 days ago
Last modified
🚀 High-Performance Instagram Scraper
A lightweight, reliable, and fast Instagram scraper built with TypeScript, Crawlee, and the Apify SDK. It extracts public Instagram profile details and recent posts without requiring you to log in, working around Instagram's aggressive login walls and scraping limits.
⚡ Key Features
- Hybrid Crawling Engine (Fast Path + Browser Fallback):
  - Phase 1 (Fast Path): Attempts to scrape profile information directly via Instagram's JSON API using optimized HTTP requests (`gotScraping`). This consumes almost zero CPU/memory and executes in under 1 second per profile.
  - Phase 2 (Browser Fallback): If the API blocks the request, the scraper automatically spins up a headless Chromium browser page via Playwright, navigates to the profile page, accepts cookie prompts, and intercepts the network requests to retrieve the profile data.
- Proxy Grouping & Rotation: Seamlessly integrates with Apify Proxy (Residential Proxies recommended) to prevent rate limits and blocks.
- Post Limitation: Control how many posts you want to extract per profile to save bandwidth and runtime.
- No Login Required: Avoid the risk of getting your own accounts flagged or banned.
🛠 How It Works
```mermaid
graph TD
    A[Start Scraper] --> B[Parse & Clean Usernames]
    B --> C[Phase 1: Fast HTTP API Request]
    C -->|Success| D[Save Data & Mark Done]
    C -->|Blocked/Failed| E[Phase 2: Fallback Playwright Browser]
    E -->|Navigate & Intercept API| F{Successfully Intercepted?}
    F -->|Yes| D
    F -->|No| G[Direct API call inside Browser Session]
    G -->|Success| D
    G -->|Failed| H[Retry with next Proxy / Fail]
    D --> I[Export JSON Result]
```
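The flow above amounts to trying a list of strategies in order until one succeeds. The following is a minimal sketch of that idea, not the actor's actual source: `Strategy`, `FallbackResult`, and `scrapeWithFallback` are hypothetical names, and the real fast path (`gotScraping`) and Playwright fallback are stood in for by plain async functions that either resolve with data or throw when blocked.

```typescript
// Hypothetical sketch of the fast-path / browser-fallback flow.
// A strategy resolves with data, or throws when it is blocked or fails.
type Strategy<T> = {
    name: string;
    run: (username: string) => Promise<T>;
};

interface FallbackResult<T> {
    data: T;
    usedStrategy: string;
    errors: string[]; // messages from strategies that failed before the winner
}

async function scrapeWithFallback<T>(
    username: string,
    strategies: Strategy<T>[],
): Promise<FallbackResult<T>> {
    const errors: string[] = [];
    for (const strategy of strategies) {
        try {
            // The first strategy that resolves wins; later ones are never started.
            const data = await strategy.run(username);
            return { data, usedStrategy: strategy.name, errors };
        } catch (err) {
            const message = err instanceof Error ? err.message : String(err);
            errors.push(`${strategy.name}: ${message}`);
        }
    }
    // Every strategy failed: surface all collected reasons to the caller,
    // which may then retry with another proxy or give up.
    throw new Error(`All strategies failed for "${username}": ${errors.join("; ")}`);
}
```

In this sketch the HTTP request would be passed as the first strategy and the browser-based steps as later ones, so the expensive browser only starts when the cheap request has already failed.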
📥 Input Configuration
The scraper accepts the following input parameters:
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `usernames` | array | Yes | - | A list of Instagram usernames to scrape (e.g. `["instagram", "nasa"]`). Supports a leading `@` and full profile URLs. |
| `maxPosts` | integer | No | `12` | The maximum number of recent posts to extract from each profile. |
| `proxyConfiguration` | object | No | `{ "useApifyProxy": true }` | Proxy configuration options. Residential proxies are strongly recommended to avoid login walls. |
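As an illustration of how the defaults in this table could be applied, here is a small sketch. The `ScraperInput` type and `resolveInput` helper are hypothetical names rather than the actor's real code, and the validation rules (a non-empty `usernames` array, a non-negative integer `maxPosts`) are assumptions that go beyond what the table states.

```typescript
// Hypothetical types mirroring the input table above.
interface ProxyConfiguration {
    useApifyProxy?: boolean;
    apifyProxyGroups?: string[];
}

interface ScraperInput {
    usernames: string[];
    maxPosts?: number;
    proxyConfiguration?: ProxyConfiguration;
}

type ResolvedInput = Required<ScraperInput>;

function resolveInput(input: ScraperInput): ResolvedInput {
    // Assumption: an empty list is treated the same as a missing required field.
    if (!Array.isArray(input.usernames) || input.usernames.length === 0) {
        throw new Error("`usernames` is required and must be a non-empty array");
    }
    // Default taken from the table: 12 posts per profile.
    const maxPosts = input.maxPosts ?? 12;
    // Assumption: negative or fractional limits are rejected.
    if (!Number.isInteger(maxPosts) || maxPosts < 0) {
        throw new Error("`maxPosts` must be a non-negative integer");
    }
    return {
        usernames: input.usernames,
        maxPosts,
        // Default taken from the table: Apify Proxy enabled.
        proxyConfiguration: input.proxyConfiguration ?? { useApifyProxy: true },
    };
}
```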
Input Example (JSON)
```json
{
  "usernames": [
    "instagram",
    "@nasa",
    "https://www.instagram.com/nationalgeographic/"
  ],
  "maxPosts": 5,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
```
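The three username forms in this example (plain, leading `@`, full profile URL) can be reduced to bare usernames with a small normalization step. The sketch below shows one way to do that; `normalizeUsername` and `normalizeUsernames` are hypothetical helpers, and lowercasing and de-duplication are assumptions, not documented behavior of the actor.

```typescript
// Hypothetical helper: turn "@nasa" or a profile URL into a bare username.
function normalizeUsername(raw: string): string {
    let value = raw.trim();
    // Full profile URL: keep only the first path segment after the host.
    const match = /^(?:https?:\/\/)?(?:www\.)?instagram\.com\/([^\/?#]+)/i.exec(value);
    if (match && match[1]) {
        value = match[1];
    }
    // Strip a single leading "@".
    value = value.replace(/^@/, "");
    // Assumption: usernames are compared case-insensitively, so lowercase them.
    return value.toLowerCase();
}

// Normalize a whole list, dropping empty entries and duplicates
// while keeping the original order.
function normalizeUsernames(rawList: string[]): string[] {
    const seen = new Set<string>();
    for (const raw of rawList) {
        const name = normalizeUsername(raw);
        if (name) {
            seen.add(name);
        }
    }
    return Array.from(seen);
}
```

Applied to the `usernames` array from the input example, this yields `instagram`, `nasa`, and `nationalgeographic`.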
📤 Output Format
Results are written to the default dataset. Each item contains the profile's metadata and an array of its recent posts.
Output Example (JSON)
```json
{
  "profile": {
    "id": "25025320",
    "username": "instagram",
    "fullName": "Instagram",
    "biography": "Bring you closer to the people and things you love. ❤️",
    "externalUrl": "https://linkin.bio/instagram",
    "followersCount": 673920182,
    "followingCount": 84,
    "postsCount": 7450,
    "profilePicUrl": "https://instagram.famd1-1.fna.fbcdn.net/v/t51.2885-19/...",
    "isVerified": true,
    "isPrivate": false,
    "scrapedAt": "2026-06-06T17:10:00.000Z"
  },
  "posts": [
    {
      "id": "3128954703928120348",
      "shortcode": "C71abcXYZ",
      "url": "https://www.instagram.com/p/C71abcXYZ/",
      "type": "GraphImage",
      "caption": "Exploring the wonders of the digital world. #instagram #community",
      "displayUrl": "https://instagram.famd1-1.fna.fbcdn.net/v/t51.2885-15/...",
      "videoUrl": null,
      "likesCount": 142805,
      "commentsCount": 1250,
      "timestamp": "2026-06-05T18:00:00.000Z",
      "dimensions": {
        "height": 1080,
        "width": 1080
      },
      "location": "San Francisco, California"
    }
  ]
}
```
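For consumers of the dataset, the item shape above can be described with TypeScript types. The sketch below infers those types from the example only, so they are not an official schema: which fields may be `null` is an assumption (the example only shows `videoUrl` as `null`), and `summarizeItem` is a hypothetical helper that illustrates reading an item.

```typescript
// Types inferred from the output example; nullability is an assumption.
interface Profile {
    id: string;
    username: string;
    fullName: string;
    biography: string;
    externalUrl: string | null;
    followersCount: number;
    followingCount: number;
    postsCount: number;
    profilePicUrl: string;
    isVerified: boolean;
    isPrivate: boolean;
    scrapedAt: string; // ISO 8601 timestamp
}

interface Post {
    id: string;
    shortcode: string;
    url: string;
    type: string;
    caption: string | null;
    displayUrl: string;
    videoUrl: string | null;
    likesCount: number;
    commentsCount: number;
    timestamp: string; // ISO 8601 timestamp
    dimensions: { height: number; width: number };
    location: string | null;
}

interface ScrapedItem {
    profile: Profile;
    posts: Post[];
}

interface ItemSummary {
    username: string;
    postCount: number;
    totalLikes: number;
    totalComments: number;
}

// Hypothetical helper: aggregate likes and comments across the scraped posts.
function summarizeItem(item: ScrapedItem): ItemSummary {
    let totalLikes = 0;
    let totalComments = 0;
    for (const post of item.posts) {
        totalLikes += post.likesCount;
        totalComments += post.commentsCount;
    }
    return {
        username: item.profile.username,
        postCount: item.posts.length,
        totalLikes,
        totalComments,
    };
}
```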
🚀 Local Development
Follow these steps to run the scraper on your own computer:
Prerequisites
- Node.js (v18 or v20 recommended)
- npm
Setup Instructions
1. Clone the repository and install dependencies:

   ```bash
   npm install
   ```

2. Configure local input: create a folder named `storage/key_value_stores/default` in the root of the project if it doesn't exist. Then create a file named `INPUT.json` inside it:

   ```json
   {
     "usernames": ["instagram"],
     "maxPosts": 3
   }
   ```

3. Build the TypeScript files:

   ```bash
   npm run build
   ```

4. Run the actor locally:

   ```bash
   npm start
   ```

   The results will be written to `storage/datasets/default/`.
💡 Performance Optimization & Tips
> [!TIP]
> Use Residential Proxies: Instagram uses sophisticated anti-bot filters, and regular datacenter IPs tend to trigger login walls almost immediately. For large-scale scraping, always enable residential proxies.

Keep maxPosts Moderate: Requesting a high number of posts requires more data parsing and memory. Set `maxPosts` to only what you actually need.