Sora Scraper

Pricing

Pay per event

Developed by

Lexis Solutions

Maintained by Community

Extract posts, media, user profiles, comments, and engagement metrics from OpenAI’s Sora 2 community. Suited to trend analysis, content curation, influencer tracking, and research, with configurable result counts, comment limits, and media downloads.

Rating: 5.0 (6)

Last modified

18 days ago

Sora Scraper

The Sora Scraper is an Apify actor for OpenAI's Sora 2, the latest generation of OpenAI's AI video creation platform. It extracts posts, videos, engagement metrics, user profiles, and comments from Sora's community.


✨ Key Features

  • 🚀 First-to-Market: The only scraper built specifically for Sora 2.
  • 🎬 Media Downloads: Download videos, thumbnails, and GIFs directly from posts.
  • 💬 Comment Extraction: Capture detailed comment threads with engagement data.
  • 👤 Rich Profile Data: Extract complete user profiles including followers, verified status, and more.
  • 📊 Engagement Metrics: Views, unique views, likes, remixes, replies, and recursive replies.
  • 🔍 Flexible Search: Query any topic and discover Sora's creative community.
  • ⚙️ Granular Control: Configure comment limits, media downloads, and result counts.
  • 📦 Structured Output: Normalized JSON data ready for analysis and integration.

💡 Why It's Important

Sora represents OpenAI's breakthrough in AI-generated video content, and Sora 2 is the latest evolution. With this scraper, you can:

  • Monitor trending content in the AI video generation space.
  • Analyze user engagement patterns and viral content characteristics.
  • Archive creative works for research, inspiration, or competitive analysis.
  • Track community growth and user interactions in real time.
  • Build datasets for AI content analysis, sentiment studies, and trend forecasting.

👤 Who Is It For?

  • AI Researchers studying generative video trends and user behavior.
  • Content Creators seeking inspiration and understanding what resonates.
  • Marketing Agencies monitoring brand mentions and creative trends.
  • Data Scientists building datasets for machine learning and analytics.
  • Media Companies tracking viral content and emerging creators.
  • Developers building applications around AI-generated content.

🚀 Business Use Cases

  • Trend Analysis: Identify viral prompts, themes, and creative patterns.
  • Content Curation: Aggregate and showcase top Sora creations.
  • Competitive Intelligence: Track how competitors use AI video generation.
  • Influencer Discovery: Find trending creators and their engagement rates.
  • Brand Monitoring: Track mentions and sentiment around your brand.
  • Research & Development: Build datasets for AI content analysis.
  • Market Research: Understand user preferences in AI-generated content.

🛠 Input Schema

The actor accepts the following input:

{
  "query": "SpongeBob",
  "numOfComments": 10,
  "downloadVideo": false,
  "downloadThumbnail": false,
  "downloadGIF": false,
  "maxItems": 10,
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}

Input Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | Search query to find Sora posts (e.g., "SpongeBob", "sunset timelapse") |
| numOfComments | integer | No | Maximum number of comments to extract per post (default: 10) |
| downloadVideo | boolean | No | Download video files to key-value store. Key stored in videoStoreKey (default: false) |
| downloadThumbnail | boolean | No | Download thumbnail images to key-value store. Key stored in thumbnailStoreKey (default: false) |
| downloadGIF | boolean | No | Download GIF previews to key-value store. Key stored in gifStoreKey (default: false) |
| maxItems | integer | No | Maximum number of posts to scrape (default: 10) |
| proxyConfiguration | object | No | Apify proxy configuration for requests |

Notes:

  • Required field: query is mandatory.
  • Media Downloads: Enabling video, thumbnail, or GIF downloads will increase run time but provide direct access to media files in the key-value store.
  • Comments: Set numOfComments to control how many comments are extracted per post. Comments include full profile data and engagement metrics.
  • Performance: Higher maxItems and media downloads will consume more resources and time.
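
The parameters above can be assembled in code before starting a run. The following is a minimal sketch, not part of the actor: `build_run_input` is a hypothetical helper, and its range checks (non-negative comment limit, at least one item) are assumptions rather than documented constraints. Only the key names and defaults come from the table above.

```python
from typing import Any, Dict, Optional


def build_run_input(
    query: str,
    num_of_comments: int = 10,
    max_items: int = 10,
    download_video: bool = False,
    download_thumbnail: bool = False,
    download_gif: bool = False,
    proxy_configuration: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build an input object using the documented parameter names and defaults."""
    # query is the only required field.
    if not query or not query.strip():
        raise ValueError("query is required and must not be empty")
    # Assumption: these bounds are a sanity check, not documented actor limits.
    if num_of_comments < 0 or max_items < 1:
        raise ValueError("numOfComments must be >= 0 and maxItems must be >= 1")
    return {
        "query": query.strip(),
        "numOfComments": num_of_comments,
        "downloadVideo": download_video,
        "downloadThumbnail": download_thumbnail,
        "downloadGIF": download_gif,
        "maxItems": max_items,
        # Mirrors the example input when no proxy settings are supplied.
        "proxyConfiguration": proxy_configuration or {"useApifyProxy": False},
    }


run_input = build_run_input("SpongeBob")
```

The resulting dictionary matches the example input and can be passed to whichever Apify client or API call you use to start the actor.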

📦 Output Schema

Each dataset item contains comprehensive post data:

{
  "id": "s_68dca5d7d4ac8191987e9c6393d498d4",
  "text": "spongebob as a ww2 leader speaking about the scourge of fish ruining bikini bottom wearing axis power uniform",
  "caption": null,
  "link": "https://sora.chatgpt.com/p/s_68dca5d7d4ac8191987e9c6393d498d4",
  "coverUrl": "https://videos.openai.com/vg-assets/...",
  "gifUrl": "https://videos.openai.com/vg-assets/...",
  "postedAt": 1759290839.830908,
  "updatedAt": 1759936985.530838,
  "likes": 1289,
  "replies": 43,
  "views": 39477,
  "uniqueViews": 24713,
  "remixes": 76,
  "recursiveReplies": 70,
  "dislikeCount": 0,
  "workspaceId": null,
  "postedToPublic": true,
  "emoji": "🧽",
  "attachments": [
    {
      "id": "s_68dca5d7d4ac8191987e9c6393d498d4-attachment-0",
      "title": "New Video",
      "url": "https://sdmntprsouthcentralus.oaiusercontent.com/files/...",
      "downloadableUrl": "https://sdmntprsouthcentralus.oaiusercontent.com/files/...",
      "thumbnail": "https://videos.openai.com/vg-assets/...",
      "gif": "https://videos.openai.com/vg-assets/...",
      "width": 352,
      "height": 640,
      "generationId": "gen_01k6eyadhqezmskzd31pp2n2xm",
      "generationType": "video_gen"
    }
  ],
  "profile": {
    "id": "user-vmw00GfT7mSYdcIST7bLbwCF",
    "username": "jakeleventhal",
    "displayName": "Jake Leventhal",
    "profilePictureUrl": "https://sdmntprnorthcentralus.oaiusercontent.com/files/...",
    "coverPhotoUrl": null,
    "link": "https://sora.chatgpt.com/profile/jakeleventhal",
    "verified": false,
    "followerCount": 2664,
    "followingCount": 7,
    "postCount": 61,
    "replyCount": 0,
    "likesReceivedCount": 22854,
    "remixCount": 1606,
    "cameoCount": 33,
    "isBlocked": false,
    "followedBy": [],
    "planType": null,
    "createdAt": 1753852741.285583,
    "updatedAt": 1759951105.520806,
    "bannedAt": null,
    "calpicoIsEnabled": true,
    "soraWhoCanMessageMe": "followees_only",
    "isPublicFigure": false,
    "location": null,
    "description": null,
    "birthday": null,
    "website": null
  },
  "videoStoreKey": "s_68dca5d7d4ac8191987e9c6393d498d4_video_0.mp4",
  "thumbnailStoreKey": "s_68dca5d7d4ac8191987e9c6393d498d4_thumbnail_0.webp",
  "gifStoreKey": "s_68dca5d7d4ac8191987e9c6393d498d4_gif_0.gif",
  "comments": [
    {
      "id": "68dcb87374948191bc6c9f88b5ea723e",
      "text": "Ts gonna be the reason Viacom gonna shut this down😭😭",
      "caption": null,
      "postedAt": 1759295603.455459,
      "updatedAt": 1759530609.903473,
      "likes": 16,
      "parentPostId": "s_68dca5d7d4ac8191987e9c6393d498d4",
      "rootPostId": "s_68dca5d7d4ac8191987e9c6393d498d4",
      "postUrl": "https://sora.chatgpt.com/p/s_68dca5d7d4ac8191987e9c6393d498d4",
      "profile": {
        "id": "user-PDq6JrFlZ0qjFVKrdeAmiTnh",
        "username": "skipppz",
        "displayName": "C",
        "profilePictureUrl": "https://cdn.openai.com/sora/images/profile_placeholder_v4.png",
        "verified": false,
        "followerCount": 1,
        "followingCount": 2,
        "postCount": 9,
        "replyCount": 9,
        "likesReceivedCount": 98,
        "remixCount": 2,
        "cameoCount": 0
      }
    }
  ]
}

Output Fields Explained

Post Data

  • id: Unique post identifier
  • text: The prompt/description used to generate the video
  • caption: Optional caption text
  • link: Direct link to the post on Sora
  • coverUrl: URL to the cover image
  • gifUrl: URL to the animated GIF preview
  • emoji: Associated emoji for the post

Engagement Metrics

  • likes: Number of likes
  • replies: Direct reply count
  • views: Total view count
  • uniqueViews: Unique viewer count
  • remixes: Number of times the video was remixed
  • recursiveReplies: Total replies including nested threads
  • dislikeCount: Number of dislikes

Timestamps

  • postedAt: Unix timestamp (in seconds, with a fractional part) of when the post was created
  • updatedAt: Unix timestamp (same format) of the last update
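
The sample values (for example 1759290839.830908) look like fractional Unix seconds, so they can be turned into readable datetimes with the standard library. This is a small sketch under that assumption; `to_utc` is a hypothetical helper name, and it passes through `null` values such as `bannedAt`.

```python
from datetime import datetime, timezone
from typing import Optional


def to_utc(ts: Optional[float]) -> Optional[datetime]:
    """Convert a Unix timestamp in seconds to a timezone-aware UTC datetime."""
    if ts is None:
        # Fields such as bannedAt can be null in the output.
        return None
    return datetime.fromtimestamp(ts, tz=timezone.utc)


posted = to_utc(1759290839.830908)  # postedAt from the sample item
print(posted.isoformat())
```

For the sample post this gives a datetime on 1 October 2025 (UTC).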

Attachments

  • id: Attachment identifier
  • title: Attachment title
  • url: Direct video URL
  • downloadableUrl: URL for downloading
  • thumbnail: Thumbnail image URL
  • gif: GIF preview URL
  • width / height: Video dimensions
  • generationId: Sora generation ID
  • generationType: Type of generation (e.g., "video_gen")

Profile Data

Complete user profile including:

  • Username, display name, profile picture
  • Verification status
  • Follower/following counts
  • Post and reply counts
  • Likes received, remix count, cameo count
  • Account creation and update timestamps
  • Privacy settings and location

Downloaded Media Keys

  • videoStoreKey: Key-value store key for downloaded video (provided only when downloadVideo is enabled)
  • thumbnailStoreKey: Key-value store key for downloaded thumbnail (provided only when downloadThumbnail is enabled)
  • gifStoreKey: Key-value store key for downloaded GIF (provided only when downloadGIF is enabled)

Comments

Array of comment objects with:

  • Comment text and timestamps
  • Like counts
  • Parent and root post IDs
  • Full profile data for commenter
  • Post URL for context
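
Because comments are nested inside each post, it is often convenient to flatten them into one row per comment before loading them into a spreadsheet or dataframe. The sketch below is illustrative only: `flatten_comments` is a hypothetical helper, the choice of columns is arbitrary, and the sample post is a trimmed copy of the item shown above.

```python
def flatten_comments(post):
    """Return one flat dict per comment, tagged with the parent post's ID."""
    rows = []
    # "comments" may be missing or null, so fall back to an empty list.
    for comment in post.get("comments") or []:
        profile = comment.get("profile") or {}
        rows.append({
            "postId": post.get("id"),
            "commentId": comment.get("id"),
            "text": comment.get("text"),
            "likes": comment.get("likes"),
            "postedAt": comment.get("postedAt"),
            "username": profile.get("username"),
            "followerCount": profile.get("followerCount"),
        })
    return rows


# Trimmed copy of the sample dataset item.
sample_post = {
    "id": "s_68dca5d7d4ac8191987e9c6393d498d4",
    "comments": [
        {
            "id": "68dcb87374948191bc6c9f88b5ea723e",
            "text": "Ts gonna be the reason Viacom gonna shut this down😭😭",
            "postedAt": 1759295603.455459,
            "likes": 16,
            "profile": {"username": "skipppz", "followerCount": 1},
        }
    ],
}

rows = flatten_comments(sample_post)
```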

🎯 Advanced Features

Media Download System

When you enable media downloads (downloadVideo, downloadThumbnail, or downloadGIF), files are automatically saved to Apify's key-value store with predictable keys:

  • Videos: {postId}_video_{index}.mp4
  • Thumbnails: {postId}_thumbnail_{index}.webp
  • GIFs: {postId}_gif_{index}.gif

Access downloaded files programmatically or through the Apify console's key-value store tab.
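
The key patterns above can be reproduced in a few lines if you need to predict a key without reading the dataset item. This is a sketch, not the actor's own code: `media_store_keys` is a hypothetical helper, and it assumes `{index}` is the zero-based position of the attachment, which matches the sample item (`_0`) but is not stated explicitly. When available, the `videoStoreKey`, `thumbnailStoreKey`, and `gifStoreKey` fields on each item remain the authoritative values.

```python
def media_store_keys(post_id: str, index: int = 0) -> dict:
    """Build the key-value store keys that follow the documented patterns."""
    return {
        "video": f"{post_id}_video_{index}.mp4",
        "thumbnail": f"{post_id}_thumbnail_{index}.webp",
        "gif": f"{post_id}_gif_{index}.gif",
    }


keys = media_store_keys("s_68dca5d7d4ac8191987e9c6393d498d4")
```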

Comment Threading

Comments maintain parent-child relationships through parentPostId and rootPostId fields, allowing you to reconstruct conversation threads. Each comment includes:

  • Full commenter profile
  • Engagement metrics (likes)
  • Timestamps for tracking conversation flow
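
One way to reconstruct threads is to group comments by `parentPostId` and walk the groups from the root post downward. The sketch below relies on a loud assumption: that a nested reply's `parentPostId` is exactly equal to the `id` of the comment it answers. The sample output only shows a top-level comment, so verify this against your own data. The function names and the example IDs (`p1`, `c1`, ...) are hypothetical.

```python
from collections import defaultdict


def build_threads(comments):
    """Group comments by parentPostId, oldest first within each group."""
    children = defaultdict(list)
    for comment in comments:
        children[comment["parentPostId"]].append(comment)
    for replies in children.values():
        replies.sort(key=lambda c: c["postedAt"])
    return children


def walk(children, parent_id, depth=0):
    """Yield (depth, comment) pairs in thread order below parent_id."""
    for comment in children.get(parent_id, []):
        yield depth, comment
        # Assumption: replies point at the parent comment's id unchanged.
        yield from walk(children, comment["id"], depth + 1)


# Hypothetical comments: c1 and c3 answer post p1, c2 answers c1.
example = [
    {"id": "c3", "parentPostId": "p1", "rootPostId": "p1", "postedAt": 30.0},
    {"id": "c1", "parentPostId": "p1", "rootPostId": "p1", "postedAt": 10.0},
    {"id": "c2", "parentPostId": "c1", "rootPostId": "p1", "postedAt": 20.0},
]

thread = [(depth, c["id"]) for depth, c in walk(build_threads(example), "p1")]
```

Here `thread` lists each comment with its nesting depth, with replies placed directly under the comment they answer.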

Engagement Analytics

Track multiple engagement dimensions:

  • Virality: views and uniqueViews show reach
  • Interaction: likes, replies, and recursiveReplies measure engagement depth
  • Creativity: remixes show how content inspires others
  • Trend tracking: Compare metrics across posts to identify patterns
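
To compare posts of very different reach, the raw counts can be normalized by views. The ratios below are illustrative choices, not metrics defined by the actor or by Sora; `engagement_summary` is a hypothetical helper, and the input numbers are taken from the sample item.

```python
def engagement_summary(post):
    """Derive simple per-view ratios from a post's raw engagement counts."""
    views = post.get("views") or 0
    likes = post.get("likes") or 0
    unique_views = post.get("uniqueViews") or 0
    remixes = post.get("remixes") or 0
    replies = post.get("replies") or 0
    recursive_replies = post.get("recursiveReplies") or 0
    return {
        # Guard against division by zero for posts without views.
        "likeRate": likes / views if views else 0.0,
        "uniqueViewShare": unique_views / views if views else 0.0,
        "remixRate": remixes / views if views else 0.0,
        # Replies that sit below the top level of the thread.
        "nestedReplies": recursive_replies - replies,
    }


summary = engagement_summary({
    "views": 39477,
    "uniqueViews": 24713,
    "likes": 1289,
    "remixes": 76,
    "replies": 43,
    "recursiveReplies": 70,
})
```

For the sample post, 70 total replies minus 43 direct replies leaves 27 nested replies.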

🔧 Best Practices

  1. Start Small: Test with maxItems: 10 to understand output structure before scaling.
  2. Media Downloads: Only enable media downloads when necessary, since they significantly increase run time.
  3. Comment Limits: Adjust numOfComments based on your needs. High-engagement posts can have hundreds of comments.
  4. Proxy Configuration: Use Apify proxies for more reliable access and to reduce the chance of rate-limited or blocked requests.

🌟 Why Choose Our Sora Scraper?

  • First to Market: The only Sora 2 scraper available
  • Comprehensive Data: Posts, profiles, comments, engagement metrics
  • Media Support: Download videos, thumbnails, and GIFs
  • Production Ready: Structured output, error handling, proxy support
  • Well Maintained: Regular updates as Sora evolves
  • Expert Support: Backed by certified Apify Partners


👀 p.s.

Got feedback or need an extension?

Lexis Solutions is a certified Apify Partner. We can help you with custom solutions or data extraction projects.

Contact us by email or on LinkedIn.

Support Our Work 💝

If you're happy with our work and scrapers, you're welcome to leave us a company review and to review the scrapers you're subscribed to. It takes less than a minute and means a lot to us!

Image Credit: https://sora.chatgpt.com/