Instagram User Posts Dataset (Full History) cookieless
Under maintenance

Pricing

from $1.50 / 1,000 results

Extracts Instagram post metadata with granular engagement metrics. The actor captures hidden fields, comprehensive timestamps, and precise interaction data for social media performance analysis and content optimization.

Rating: 0.0 (0)

Developer: Surge Street (Maintained by Community)

Actor stats

Bookmarked: 0
Total users: 2
Monthly active users: 1
Last modified: 5 days ago


Instagram User Posts Dataset (Full History)

Overview

This actor performs a deep extraction of Instagram post data, capturing comprehensive engagement metrics, content metadata, and audience demographics from user profiles. The extraction pipeline ensures data integrity through structured validation and timestamp verification, delivering a complete historical record of post performance suitable for quantitative analysis and machine learning applications.

Data Dictionary

| Field Name | Data Type | Definition |
| --- | --- | --- |
| post_id | String | Unique Instagram post identifier assigned by the platform |
| external_id | String | Internal tracking identifier for cross-reference and deduplication |
| username | String | Instagram handle of the account that published the post |
| scraped_at | String (ISO 8601) | UTC timestamp indicating when the data extraction occurred |
| post_type | String | Content format classification (e.g., carousel, image, video, reel) |
| caption | String | User-generated text content accompanying the post |
| language_code | String | ISO 639-1 two-letter language code detected in caption |
| is_verified | Boolean | Indicates whether the account has Instagram verification status (blue check) |
| media_count | Integer | Number of media items included in the post (relevant for carousels) |
| engagement_metrics | Object | Nested object containing quantitative interaction data |
| engagement_metrics.likes | Integer | Total number of likes received on the post |
| engagement_metrics.comments | Integer | Total number of comments on the post |
| engagement_metrics.shares | Integer | Number of times the post was shared via direct message or story |
| engagement_metrics.saves | Integer | Number of users who bookmarked the post |
| engagement_metrics.reach | Integer | Estimated unique accounts that viewed the post |
| location | Object | Geographic metadata associated with the post |
| location.id | String | Instagram location identifier |
| location.name | String | Human-readable location name |
| location.lat | Float | Latitude coordinate in decimal degrees |
| location.lng | Float | Longitude coordinate in decimal degrees |
| hashtags | Array[String] | List of hashtags extracted from caption (without # symbol) |
| mentions | Array[String] | List of @-mentioned usernames in caption |
| sentiment_score | Float | Computed sentiment polarity score ranging from -1.0 (negative) to 1.0 (positive) |
| accessibility_caption | String | Auto-generated or user-provided alt text describing visual content |
| is_sponsored | Boolean | Indicates whether the post is marked as paid partnership content |
| content_category | String | Classified content vertical (e.g., Technology, Fashion, Food) |
| engagement_rate | Float | Calculated percentage: (likes + comments) / reach × 100 |
| audience_demographics | Object | Aggregated viewer demographic data |
| audience_demographics.age_range | String | Dominant age bracket of engaged users |
| audience_demographics.top_locations | Array[String] | ISO 3166-1 alpha-2 country codes of primary audience |
| audience_demographics.gender_split | Object | Percentage distribution of audience by gender |
| audience_demographics.gender_split.male | Float | Percentage of male-identified viewers |
| audience_demographics.gender_split.female | Float | Percentage of female-identified viewers |
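
The dictionary above names nested fields with dotted paths (for example engagement_metrics.likes). As a minimal sketch, assuming each result is delivered as a nested JSON object, a record could be flattened into those dotted names for loading into a table or CSV. The helper name flatten_record is hypothetical and not part of the actor.

```python
from typing import Any


def flatten_record(record: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten nested objects into dotted keys such as engagement_metrics.likes.

    Hypothetical helper for illustration; it is not provided by the actor.
    """
    flat: dict[str, Any] = {}
    for key, value in record.items():
        dotted = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested objects such as location or gender_split.
            flat.update(flatten_record(value, prefix=f"{dotted}."))
        else:
            # Scalars and arrays (hashtags, mentions) are kept unchanged.
            flat[dotted] = value
    return flat


# A trimmed record using field names from the data dictionary.
example = {
    "post_id": "987654321098765",
    "hashtags": ["techinnovation", "ai", "future"],
    "engagement_metrics": {"likes": 7823, "comments": 342},
    "audience_demographics": {"gender_split": {"male": 65.4, "female": 34.6}},
}

flat = flatten_record(example)
for name in sorted(flat):
    print(name, "=", flat[name])
```

Arrays are left intact here; whether to explode them into one row per hashtag or mention depends on the analysis.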

Sample Dataset

Below is a sample of the high-fidelity JSON output:

{
  "post_id": "987654321098765",
  "external_id": "IGP_2025121945X8B7C",
  "username": "tech_explorer",
  "scraped_at": "2025-12-19T15:30:22Z",
  "post_type": "carousel",
  "caption": "Exploring the future of AI #techinnovation",
  "language_code": "en",
  "is_verified": true,
  "media_count": 3,
  "engagement_metrics": {
    "likes": 7823,
    "comments": 342,
    "shares": 156,
    "saves": 891,
    "reach": 45678
  },
  "location": {
    "id": "567891234",
    "name": "Silicon Valley",
    "lat": 37.4419,
    "lng": -122.1419
  },
  "hashtags": ["techinnovation", "ai", "future"],
  "mentions": ["@techweekly", "@airesearch"],
  "sentiment_score": 0.87,
  "accessibility_caption": "Person demonstrating new technology in lab setting",
  "is_sponsored": false,
  "content_category": "Technology",
  "engagement_rate": 4.52,
  "audience_demographics": {
    "age_range": "25-34",
    "top_locations": ["US", "UK", "IN"],
    "gender_split": {
      "male": 65.4,
      "female": 34.6
    }
  }
}
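
Because downstream analysis depends on engagement_rate and scraped_at, it can be worth recomputing and parsing them on load. The sketch below applies the documented formula, (likes + comments) / reach × 100, to the sample's engagement_metrics. Note that this yields about 17.88 for the sample record rather than the 4.52 shown, so the delivered value may be derived differently or the sample may be purely illustrative. The function names are hypothetical.

```python
import json
from datetime import datetime
from typing import Optional

# Subset of the sample record shown above.
sample = json.loads("""
{
  "scraped_at": "2025-12-19T15:30:22Z",
  "engagement_metrics": {
    "likes": 7823,
    "comments": 342,
    "shares": 156,
    "saves": 891,
    "reach": 45678
  },
  "engagement_rate": 4.52
}
""")


def engagement_rate(metrics: dict) -> Optional[float]:
    """Apply the documented formula: (likes + comments) / reach * 100."""
    reach = metrics.get("reach") or 0
    if reach <= 0:
        # Avoid dividing by zero when reach is missing or zero.
        return None
    return round((metrics["likes"] + metrics["comments"]) / reach * 100, 2)


def parse_scraped_at(value: str) -> datetime:
    """Parse the ISO 8601 UTC timestamp into a timezone-aware datetime."""
    # Replace the trailing "Z" so older Python versions accept the string.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


recomputed = engagement_rate(sample["engagement_metrics"])
scraped_at = parse_scraped_at(sample["scraped_at"])
print("reported:", sample["engagement_rate"], "recomputed:", recomputed)
print("scraped at:", scraped_at.isoformat())
```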

Configuration Parameters

The actor takes a single input parameter:

| Parameter | Field Name | Data Type | Required | Example | Description |
| --- | --- | --- | --- | --- | --- |
| Username | username | String | Yes | Ronaldo | Instagram username, user ID, or profile URL to extract post history from |
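
As a minimal sketch of the input described above, the run input could be built as a JSON object with the single required username field. The helper build_run_input is hypothetical; only the field name and the example value come from the table.

```python
import json


def build_run_input(username: str) -> dict[str, str]:
    """Build the actor input; username is the only documented field."""
    value = username.strip()
    if not value:
        # The table marks username as required.
        raise ValueError("username is required")
    # A username, user ID, or profile URL is passed through unchanged.
    return {"username": value}


run_input = build_run_input("Ronaldo")
print(json.dumps(run_input, indent=2))
```

With the apify-client Python package, a dict like this would typically be passed as run_input to ApifyClient(token).actor(actor_id).call(...); the actor ID is not stated in this listing, so it is left out here.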

Analytical Use Cases

Researchers and data scientists can leverage this dataset for:

  • Sentiment Analysis: Correlate sentiment scores with engagement metrics to identify content tone preferences across audience segments
  • Content Performance Modeling: Build predictive models using post_type, hashtags, and timing variables to forecast engagement rates
  • Audience Segmentation: Cluster posts by audience_demographics to identify distinct follower cohorts and their content preferences
  • Longitudinal Studies: Track engagement_rate trends over time using scraped_at timestamps to measure account growth trajectories
  • Network Mapping: Construct mention graphs from the mentions array to visualize influencer collaboration networks
  • Geographic Analysis: Map location data to identify regional content performance patterns and optimize posting strategies by geography
  • A/B Testing: Compare is_sponsored posts against organic content to quantify paid promotion effectiveness
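
Two of the use cases above, network mapping and the sponsored versus organic comparison, can be sketched with the standard library alone. This assumes records shaped like the sample output; the post values below are made up for illustration, and the function names are hypothetical.

```python
from collections import Counter, defaultdict
from statistics import mean


def mention_edges(posts: list[dict]) -> Counter:
    """Count directed (author, mentioned account) pairs across posts."""
    edges: Counter = Counter()
    for post in posts:
        author = post["username"].lower()
        for mention in post.get("mentions", []):
            # The sample output keeps the leading "@", so strip it here.
            target = mention.lstrip("@").lower()
            if target and target != author:
                edges[(author, target)] += 1
    return edges


def mean_rate_by_sponsorship(posts: list[dict]) -> dict[str, float]:
    """Average engagement_rate separately for sponsored and organic posts."""
    groups: dict[str, list[float]] = defaultdict(list)
    for post in posts:
        rate = post.get("engagement_rate")
        if rate is not None:
            label = "sponsored" if post.get("is_sponsored") else "organic"
            groups[label].append(rate)
    return {label: mean(rates) for label, rates in groups.items()}


# Illustrative records only; these values are invented for the example.
posts = [
    {"username": "tech_explorer", "mentions": ["@techweekly", "@airesearch"],
     "is_sponsored": False, "engagement_rate": 4.0},
    {"username": "tech_explorer", "mentions": ["@techweekly"],
     "is_sponsored": False, "engagement_rate": 6.0},
    {"username": "tech_explorer", "mentions": [],
     "is_sponsored": True, "engagement_rate": 3.0},
]

edges = mention_edges(posts)
rates = mean_rate_by_sponsorship(posts)
print(edges.most_common())
print(rates)
```

A difference in group means is only descriptive; comparing sponsored and organic posts rigorously would also need to account for post type, timing, and sample size.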

Technical Limitations

Important Considerations:

  • Rate Limiting: Instagram's API enforces request throttling; bulk extractions may require distributed execution across multiple IP addresses
  • Data Freshness: Engagement metrics reflect point-in-time snapshots at scraped_at; real-time updates require continuous polling
  • Private Accounts: Extraction is limited to public profiles; private accounts return null datasets unless authenticated access is granted
  • Historical Depth: Post history retrieval is bounded by Instagram's pagination limits (typically 2,000-5,000 posts per profile)
  • Demographic Accuracy: audience_demographics represents aggregated estimates and may not reflect individual post viewer composition
  • Deleted Content: Posts removed after initial scraping will not appear in subsequent extractions; maintain versioned datasets for completeness
  • Verification Status: is_verified reflects account status at extraction time and may change without historical tracking
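
The data freshness and deleted content points suggest keeping every extraction as a dated snapshot. A minimal sketch, assuming each snapshot is a list of records with post_id and scraped_at, keeps the most recent record per post and reports posts that are absent from a later run. The function names and snapshot values are hypothetical.

```python
from datetime import datetime


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp such as 2025-12-19T15:30:22Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def latest_by_post(records: list[dict]) -> dict[str, dict]:
    """Keep the most recently scraped record for each post_id."""
    latest: dict[str, dict] = {}
    for record in records:
        current = latest.get(record["post_id"])
        if current is None or parse_ts(record["scraped_at"]) > parse_ts(current["scraped_at"]):
            latest[record["post_id"]] = record
    return latest


def missing_post_ids(previous: list[dict], current: list[dict]) -> set[str]:
    """Post IDs present in the earlier snapshot but absent from the later one."""
    return {r["post_id"] for r in previous} - {r["post_id"] for r in current}


# Two illustrative snapshots of the same profile (invented values).
first_run = [
    {"post_id": "1", "scraped_at": "2025-12-19T15:30:22Z", "likes": 100},
    {"post_id": "2", "scraped_at": "2025-12-19T15:30:22Z", "likes": 50},
]
second_run = [
    {"post_id": "1", "scraped_at": "2025-12-20T15:30:22Z", "likes": 140},
]

merged = latest_by_post(first_run + second_run)
gone = missing_post_ids(first_run, second_run)
print(merged["1"]["likes"], sorted(gone))
```

A post missing from a later snapshot may have been deleted, but it could also fall outside pagination limits or a throttled run, so absence alone is not proof of deletion.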
