Instagram User Posts Dataset (Full History) cookieless
Pricing
from $1.50 / 1,000 results
Extract high-fidelity Instagram post metadata with granular engagement metrics. Captures hidden fields, comprehensive timestamps, and precise interaction data for advanced social media performance analysis and strategic content optimization.
Developer: Surge Street
Last modified: 5 days ago
Instagram User Posts Dataset (Full History)
Overview
This actor performs a deep extraction of Instagram post data, capturing comprehensive engagement metrics, content metadata, and audience demographics from user profiles. The extraction pipeline ensures data integrity through structured validation and timestamp verification, delivering a complete historical record of post performance suitable for quantitative analysis and machine learning applications.
Data Dictionary
| Field Name | Data Type | Definition |
|---|---|---|
| post_id | String | Unique Instagram post identifier assigned by the platform |
| external_id | String | Internal tracking identifier for cross-reference and deduplication |
| username | String | Instagram handle of the account that published the post |
| scraped_at | String (ISO 8601) | UTC timestamp indicating when the data extraction occurred |
| post_type | String | Content format classification (e.g., carousel, image, video, reel) |
| caption | String | User-generated text content accompanying the post |
| language_code | String | ISO 639-1 two-letter language code detected in caption |
| is_verified | Boolean | Indicates whether the account has Instagram verification status (blue check) |
| media_count | Integer | Number of media items included in the post (relevant for carousels) |
| engagement_metrics | Object | Nested object containing quantitative interaction data |
| engagement_metrics.likes | Integer | Total number of likes received on the post |
| engagement_metrics.comments | Integer | Total number of comments on the post |
| engagement_metrics.shares | Integer | Number of times the post was shared via direct message or story |
| engagement_metrics.saves | Integer | Number of users who bookmarked the post |
| engagement_metrics.reach | Integer | Estimated unique accounts that viewed the post |
| location | Object | Geographic metadata associated with the post |
| location.id | String | Instagram location identifier |
| location.name | String | Human-readable location name |
| location.lat | Float | Latitude coordinate in decimal degrees |
| location.lng | Float | Longitude coordinate in decimal degrees |
| hashtags | Array[String] | List of hashtags extracted from caption (without # symbol) |
| mentions | Array[String] | List of @-mentioned usernames in caption |
| sentiment_score | Float | Computed sentiment polarity score ranging from -1.0 (negative) to 1.0 (positive) |
| accessibility_caption | String | Auto-generated or user-provided alt text describing visual content |
| is_sponsored | Boolean | Indicates whether the post is marked as paid partnership content |
| content_category | String | Classified content vertical (e.g., Technology, Fashion, Food) |
| engagement_rate | Float | Calculated percentage: (likes + comments) / reach × 100 |
| audience_demographics | Object | Aggregated viewer demographic data |
| audience_demographics.age_range | String | Dominant age bracket of engaged users |
| audience_demographics.top_locations | Array[String] | ISO 3166-1 alpha-2 country codes of primary audience |
| audience_demographics.gender_split | Object | Percentage distribution of audience by gender |
| audience_demographics.gender_split.male | Float | Percentage of male-identified viewers |
| audience_demographics.gender_split.female | Float | Percentage of female-identified viewers |
Sample Dataset
Below is a sample of the high-fidelity JSON output:
{"post_id": "987654321098765","external_id": "IGP_2025121945X8B7C","username": "tech_explorer","scraped_at": "2025-12-19T15:30:22Z","post_type": "carousel","caption": "Exploring the future of AI #techinnovation","language_code": "en","is_verified": true,"media_count": 3,"engagement_metrics": {"likes": 7823,"comments": 342,"shares": 156,"saves": 891,"reach": 45678},"location": {"id": "567891234","name": "Silicon Valley","lat": 37.4419,"lng": -122.1419},"hashtags": ["techinnovation", "ai", "future"],"mentions": ["@techweekly", "@airesearch"],"sentiment_score": 0.87,"accessibility_caption": "Person demonstrating new technology in lab setting","is_sponsored": false,"content_category": "Technology","engagement_rate": 4.52,"audience_demographics": {"age_range": "25-34","top_locations": ["US", "UK", "IN"],"gender_split": {"male": 65.4,"female": 34.6}}}
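As a quick sanity check on a downloaded record, the documented engagement_rate formula can be recomputed from the nested engagement_metrics object. The sketch below uses a trimmed copy of the sample record; the helper name `recompute_engagement_rate` is hypothetical. For this sample the documented formula gives roughly 17.9 rather than the 4.52 shown, so the published value may be derived differently; it seems safer to recompute the rate yourself than to rely on the stored field.

```python
import json
from typing import Optional

# Trimmed copy of the sample record above; only the fields used here are kept.
SAMPLE = (
    '{"post_id": "987654321098765", '
    '"engagement_metrics": {"likes": 7823, "comments": 342, '
    '"shares": 156, "saves": 891, "reach": 45678}, '
    '"engagement_rate": 4.52}'
)


def recompute_engagement_rate(post: dict) -> Optional[float]:
    """Apply the Data Dictionary formula: (likes + comments) / reach * 100."""
    metrics = post.get("engagement_metrics") or {}
    reach = metrics.get("reach")
    if not reach:  # missing or zero reach: the rate is undefined
        return None
    interactions = metrics.get("likes", 0) + metrics.get("comments", 0)
    return interactions / reach * 100


post = json.loads(SAMPLE)
rate = recompute_engagement_rate(post)
print(f"reported={post['engagement_rate']} recomputed={rate:.2f}")
```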
Configuration Parameters
To ensure optimal data depth, configure the following:
| Parameter | Field Name | Data Type | Required | Example | Description |
|---|---|---|---|---|---|
| Username | username | String | Yes | Ronaldo | Instagram username, user ID, or profile URL to extract post history from |
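A minimal sketch of starting a run from Python, assuming the official `apify-client` package is installed and that the actor accepts the single `username` field from the table above. `ACTOR_ID` is a placeholder, not the real identifier; copy the actual one from the actor's page. `build_run_input` and `fetch_posts` are hypothetical helper names.

```python
# Placeholder only: replace with the actor's real ID from its store page.
ACTOR_ID = "<ACTOR_ID>"


def build_run_input(username: str) -> dict:
    """Build the actor input; `username` is the only documented parameter."""
    username = username.strip()
    if not username:
        raise ValueError("username is required")
    return {"username": username}


def fetch_posts(username: str, token: str) -> list:
    """Run the actor and return its dataset items as a list of dicts."""
    # Imported lazily so build_run_input works without the third-party package.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=build_run_input(username))
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

Calling `fetch_posts("Ronaldo", token)` with a valid API token would then return one dictionary per post, shaped like the sample record.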
Analytical Use Cases
Researchers and data scientists can leverage this dataset for:
- Sentiment Analysis: Correlate sentiment scores with engagement metrics to identify content tone preferences across audience segments
- Content Performance Modeling: Build predictive models using post_type, hashtags, and timing variables to forecast engagement rates
- Audience Segmentation: Cluster posts by audience_demographics to identify distinct follower cohorts and their content preferences
- Longitudinal Studies: Track engagement_rate across repeated extractions, ordering snapshots by their scraped_at timestamps, to measure account growth trajectories
- Network Mapping: Construct mention graphs from the mentions array to visualize influencer collaboration networks
- Geographic Analysis: Map location data to identify regional content performance patterns and optimize posting strategies by geography
- A/B Testing: Compare is_sponsored posts against organic content to quantify paid promotion effectiveness
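Two of the use cases above, network mapping and the sponsored versus organic comparison, can be sketched with the standard library alone. The records below are made up for illustration and follow the field names in the Data Dictionary; `mention_edges` and `mean_rate_by_sponsorship` are hypothetical helper names. A comparison of means like this is descriptive only and is not a controlled A/B test.

```python
from collections import Counter
from statistics import mean


def mention_edges(posts):
    """Count (author, mentioned account) pairs to use as weighted graph edges."""
    edges = Counter()
    for post in posts:
        author = post.get("username")
        for mention in post.get("mentions") or []:
            # The sample output keeps the leading "@"; strip it for node names.
            edges[(author, mention.lstrip("@"))] += 1
    return edges


def mean_rate_by_sponsorship(posts):
    """Average engagement_rate for sponsored and organic posts separately."""
    groups = {"sponsored": [], "organic": []}
    for post in posts:
        rate = post.get("engagement_rate")
        if rate is None:
            continue
        key = "sponsored" if post.get("is_sponsored") else "organic"
        groups[key].append(rate)
    return {key: (mean(vals) if vals else None) for key, vals in groups.items()}


# Made-up records for illustration only.
posts = [
    {"username": "tech_explorer", "mentions": ["@techweekly", "@airesearch"],
     "is_sponsored": False, "engagement_rate": 4.0},
    {"username": "tech_explorer", "mentions": ["@techweekly"],
     "is_sponsored": True, "engagement_rate": 3.0},
    {"username": "tech_explorer", "mentions": [],
     "is_sponsored": False, "engagement_rate": 6.0},
]

edges = mention_edges(posts)
rates = mean_rate_by_sponsorship(posts)
print(edges.most_common())
print(rates)
```

The edge counter can be passed to a graph library of your choice to draw the collaboration network.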
Technical Limitations
Important Considerations:
- Rate Limiting: Instagram's API enforces request throttling; bulk extractions may require distributed execution across multiple IP addresses
- Data Freshness: Engagement metrics reflect point-in-time snapshots at `scraped_at`; real-time updates require continuous polling
- Private Accounts: Extraction is limited to public profiles; private accounts return null datasets unless authenticated access is granted
- Historical Depth: Post history retrieval is bounded by Instagram's pagination limits (typically 2,000-5,000 posts per profile)
- Demographic Accuracy: `audience_demographics` represents aggregated estimates and may not reflect individual post viewer composition
- Deleted Content: Posts removed after initial scraping will not appear in subsequent extractions; maintain versioned datasets for completeness
- Verification Status: `is_verified` reflects account status at extraction time and may change without historical tracking
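One way to handle the snapshot and deleted-content caveats above is to keep every run and compare them by post_id. The sketch below assumes each run is a list of records shaped like the sample output; the helper names and the example records are hypothetical. A post missing from a later run is only a candidate for deletion, since pagination limits can also hide older posts.

```python
from datetime import datetime


def parse_ts(value: str) -> datetime:
    """Parse scraped_at; older Python versions reject a trailing 'Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def latest_snapshots(records):
    """Keep only the most recently scraped record for each post_id."""
    latest = {}
    for rec in records:
        current = latest.get(rec["post_id"])
        if current is None or parse_ts(rec["scraped_at"]) > parse_ts(current["scraped_at"]):
            latest[rec["post_id"]] = rec
    return latest


def missing_since(previous_run, current_run):
    """post_ids present in the earlier run but absent from the later one."""
    return {r["post_id"] for r in previous_run} - {r["post_id"] for r in current_run}


# Made-up runs for illustration only.
run_1 = [
    {"post_id": "1", "scraped_at": "2025-12-19T15:30:22Z", "is_verified": False},
    {"post_id": "2", "scraped_at": "2025-12-19T15:30:25Z", "is_verified": False},
]
run_2 = [
    {"post_id": "1", "scraped_at": "2025-12-26T09:00:00Z", "is_verified": True},
]

merged = latest_snapshots(run_1 + run_2)
gone = missing_since(run_1, run_2)
print(sorted(merged), gone)
```

Storing the raw runs alongside the merged view also preserves earlier `is_verified` values, which the actor does not track historically.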