Facebook Groups Scraper
Pricing
$19.99/month + usage
Extract data from public Facebook groups, including posts, comments, reactions, and member insights. This Apify scraper helps you track discussions, analyze engagement, monitor trends, and gather data for research, marketing, and community intelligence.
Developer
ScrapeEngine
Actor stats
Bookmarked: 1
Total users: 9
Monthly active users: 2
Last modified: 4 days ago
Facebook Groups Scraper
Facebook Groups Scraper is an Apify actor that extracts structured data from public Facebook groups (posts, comments, reactions, attachments, and group metadata) for social listening, engagement analysis, and trend monitoring. It automatically discovers the required GraphQL parameters, paginates group feeds, and streams normalized results to your dataset in real time. It is built for marketers, developers, data analysts, and researchers who need repeatable data collection workflows.
What data / output can you get?
| Data field | Description | Example value |
|---|---|---|
| facebookUrl | URL of the source Facebook group | https://www.facebook.com/groups/germtheory.vs.terraintheory |
| url | Direct permalink to the post | https://www.facebook.com/groups/123456789/permalink/987654321 |
| time | Post creation timestamp (ISO8601, UTC) | 2024-03-15T10:30:00.000Z |
| id | Unique post identifier | 123456789012345 |
| legacyId | Legacy post ID (post_id) | 987654321098765 |
| feedbackId | Feedback identifier for the post | 1122334455667788 |
| user.id | Author’s user ID | 10001234567890 |
| user.name | Author’s display name | Jane Doe |
| text | Post text content | Looking for recommendations on analytics tools… |
| attachments[] | Array of attachments (photos, media sets) with thumbnails and dimensions | { "thumbnail": "https://...", "image": { "uri": "https://...", "height": 1080, "width": 1080 } } |
| likesCount | Total number of reactions (summary) | 142 |
| sharesCount | Share count | 7 |
| commentsCount | Total number of comments | 56 |
| topReactionsCount | Sum of top reactions | 140 |
| reactionLikeCount | Number of “Like” reactions | 120 |
| reactionLoveCount | Number of “Love” reactions | 20 |
| topComments[0].commentUrl | Direct URL to top comment | https://www.facebook.com/groups/123/permalink/456/?comment_id=789 |
| topComments[0].profileName | Commenter’s name | John Smith |
| facebookId | Group ID associated with the post | 123456789012345 |
| groupTitle | Group title/name | Example Community Group |
| pageAdLibrary.id | Group/page identifier inside pageAdLibrary | 123456789012345 |
| inputUrl | Original input group URL | https://www.facebook.com/groups/germtheory.vs.terraintheory |
Notes:
- Attachments can include photos with thumbnails and dimensions, media sets (albums) with mediaset_token, OCR text from images, and owner info where available.
- Data is streamed to the Apify dataset during the run. You can export results in JSON, CSV, or Excel formats from the dataset.
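Nested fields such as user.name or topComments[0].profileName usually need flattening before they fit a spreadsheet or BI tool. The following is a minimal sketch, assuming records shaped like the field table above; `flatten_post`, `posts_to_csv`, and the chosen column names are illustrative helpers, not part of the actor.

```python
import csv
import io

# Columns chosen for illustration; source field names follow the table above.
COLUMNS = ["id", "url", "time", "user_id", "user_name", "text",
           "likesCount", "sharesCount", "commentsCount", "top_comment_author"]


def flatten_post(post: dict) -> dict:
    """Flatten one post record into a single-level row (hypothetical helper)."""
    user = post.get("user") or {}
    top_comments = post.get("topComments") or []
    first_comment = top_comments[0] if top_comments else {}
    return {
        "id": post.get("id"),
        "url": post.get("url"),
        "time": post.get("time"),
        "user_id": user.get("id"),
        "user_name": user.get("name"),
        "text": post.get("text"),
        "likesCount": post.get("likesCount"),
        "sharesCount": post.get("sharesCount"),
        "commentsCount": post.get("commentsCount"),
        "top_comment_author": first_comment.get("profileName"),
    }


def posts_to_csv(posts: list[dict]) -> str:
    """Serialize flattened posts to CSV text; missing values become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    for post in posts:
        writer.writerow(flatten_post(post))
    return buffer.getvalue()
```

Only the first top comment is kept here to keep one row per post; a separate comments table would be a reasonable alternative when every comment matters.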
Key features
- 🧠 Automatic GraphQL discovery: Automatically extracts node_id, doc_id, and end_cursor from group HTML/JavaScript for resilient operation across Facebook UI changes.
- 🔁 Cursor-based pagination: Paginates through group feeds and streams normalized posts to the dataset in real time.
- 🏠 Always-on residential proxy: Forces Apify Residential proxy usage and retries until your data demand is fulfilled, improving reliability and reducing blocks.
- 🔄 Flexible post sorting: Supports CHRONOLOGICAL, RECENT_ACTIVITY, TOP_POSTS, and CHRONOLOGICAL_LISTINGS via the viewOption input.
- 🔎 Advanced filtering controls: Client-side filters for searchGroupKeyword, searchGroupYear, and onlyPostsNewerThan to refine results quickly.
- ⚡ Anti-blocking delays: Adds randomized delays and exponential backoff to reduce detection during scraping.
- 🧩 Rich, normalized output: Extracts post identifiers, author info, text, reactions (including reactionLikeCount and reactionLoveCount), comments (topComments), attachments, and group-level metadata.
- 🛠️ Fallback doc_id support: Configure fallbackDocId if Facebook’s frontend changes and automatic discovery temporarily fails.
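The randomized delays and exponential backoff mentioned above can be illustrated generically. This is a minimal sketch of exponential backoff with full jitter; the base delay, cap, retry count, and the `backoff_delay` and `retry` names are assumptions for illustration, since the actor's actual timing values are not documented here.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0,
                  rng: Optional[random.Random] = None) -> float:
    """Delay before retry `attempt` (0-based): random value under an exponential ceiling."""
    rng = rng or random.Random()
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0, ceiling)


def retry(operation: Callable[[], T], max_retries: int = 5,
          sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `operation` until it succeeds or `max_retries` retries are used up."""
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            sleep(backoff_delay(attempt))
    raise RuntimeError("unreachable")
```

Passing `sleep` as a parameter keeps the helper testable without real waiting; in production code the default `time.sleep` applies.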
How to use Facebook Groups Scraper - step by step
- Create or sign in to your Apify account.
- Open the Facebook Groups Scraper actor in the Apify Console.
- Add input:
  - Paste one or more public group URLs into startUrls (string list).
  - Optionally set resultsLimit (default 20) to control how many posts you want.
  - Choose a viewOption: CHRONOLOGICAL, RECENT_ACTIVITY, TOP_POSTS, or CHRONOLOGICAL_LISTINGS.
  - Use searchGroupKeyword and searchGroupYear to narrow by keyword/letter and year.
  - Set onlyPostsNewerThan to restrict results to a specific date or relative time window.
  - (Advanced) If you see a “Missing doc_id” error, set fallbackDocId to a known working GraphQL doc_id.
- Start the run and monitor logs:
  - The actor will automatically use Apify Residential proxy and retry until your target number of posts is reached or retries are exhausted.
- View results:
  - Go to the Dataset tab to see posts as they stream in.
  - Export your data as JSON, CSV, or Excel for downstream analysis.
- Pro Tip: Combine onlyPostsNewerThan with a one- or two-letter searchGroupKeyword for focused, high-signal datasets.
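The same steps can also be scripted instead of run from the Console. The sketch below assumes the `apify-client` Python package is installed and that `call()` returns a dictionary containing `defaultDatasetId`; the `actor_id` value is a placeholder you must look up on the actor's page, because this listing does not state it. `build_run_input` and `run_scraper` are illustrative names.

```python
from typing import Iterator


def build_run_input(group_urls: list[str], results_limit: int = 20,
                    view_option: str = "CHRONOLOGICAL",
                    only_posts_newer_than: str = "") -> dict:
    """Assemble the actor input using the parameter names from this README."""
    return {
        "startUrls": group_urls,
        "resultsLimit": results_limit,
        "viewOption": view_option,
        "onlyPostsNewerThan": only_posts_newer_than,
    }


def run_scraper(token: str, actor_id: str, run_input: dict) -> Iterator[dict]:
    """Start a run, wait for it to finish, and yield the dataset items.

    Requires `pip install apify-client`; `actor_id` is a placeholder
    (the real ID is not given in this listing).
    """
    from apify_client import ApifyClient  # third-party, imported lazily

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    yield from client.dataset(run["defaultDatasetId"]).iterate_items()
```

A typical call would build the input with `build_run_input([...], results_limit=100, view_option="RECENT_ACTIVITY")` and then iterate over `run_scraper(token, actor_id, run_input)`.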
Use cases
| Use case | Description |
|---|---|
| Social listening for marketing teams | Track discussions and engagement trends in public groups to inform campaigns and messaging. |
| Competitive intelligence | Monitor top posts and recent activity across niche communities to benchmark content performance. |
| Academic & policy research | Collect structured, time-bounded datasets for longitudinal studies and qualitative analysis. |
| Community management analytics | Measure reactions, comments, and share counts to understand community health and interests. |
| Lead generation in buy/sell verticals | Use CHRONOLOGICAL_LISTINGS to scan marketplace-like groups for timely opportunities. |
| Content strategy & benchmarking | Identify top-performing posts (TOP_POSTS) to inspire your editorial calendar. |
| Data enrichment pipelines | Export normalized JSON/CSV from the dataset for integration with BI tools or internal dashboards. |
Why choose Facebook Groups Scraper?
This actor is built for precision, automation, and resilient data collection from public Facebook groups.
- ✅ Accurate parameter discovery: Automatically finds node_id, doc_id, and end_cursor for stable scraping.
- 🌐 Residential proxy reliability: Always uses Apify Residential proxy with automatic retries to meet your resultsLimit.
- 🔎 Powerful filtering: Use keyword/letter searches, year filters, and date thresholds to target what matters.
- ⚙️ Developer-friendly output: Clean, consistent JSON structure with normalized post, reaction, comment, and group fields.
- 📦 Real-time streaming: Saves items to the dataset during the run for immediate access and incremental processing.
- 🛡️ Safer than ad-hoc tools: Avoid brittle browser extensions and manual copy-paste with a production-grade Apify actor.
- 📈 Export-ready: Download as JSON, CSV, or Excel from the dataset for instant analysis and reporting.
In short, it’s a dependable Facebook group data extractor — not a fragile workaround.
Is it legal / ethical to use Facebook Groups Scraper?
Yes, when done responsibly. This actor is designed for extracting publicly available data from public Facebook groups. It does not access authenticated or private group content.
Guidelines for responsible use:
- Collect only publicly available group data.
- Respect platform terms and applicable data protection laws (e.g., GDPR, CCPA).
- Avoid using the data for spam or abusive outreach.
- Consult your legal team for edge cases or jurisdiction-specific requirements.
Input parameters & output format
Example JSON input
{
  "startUrls": ["https://www.facebook.com/groups/germtheory.vs.terraintheory"],
  "resultsLimit": 100,
  "viewOption": "RECENT_ACTIVITY",
  "searchGroupKeyword": "a",
  "searchGroupYear": "2024",
  "onlyPostsNewerThan": "2 months",
  "fallbackDocId": "",
  "proxyConfiguration": {"useApifyProxy": false}
}
Input parameters
- startUrls (array, required)
  - Description: Add one or more public Facebook group URLs. Only public groups are supported.
  - Default: none
  - Required: Yes
- resultsLimit (integer)
  - Description: Maximum number of posts to scrape from each group.
  - Default: 20 (minimum: 1)
  - Required: No
- viewOption (string)
  - Description: Post sorting strategy. Options: CHRONOLOGICAL, RECENT_ACTIVITY, TOP_POSTS, CHRONOLOGICAL_LISTINGS. Note: Post limit applies to 'New posts' sorting only.
  - Default: CHRONOLOGICAL
  - Required: No
- searchGroupKeyword (string)
  - Description: Search posts by keyword or letter. Without login, search is limited; 1–2 letters recommended.
  - Default: "" (empty string)
  - Required: No
- searchGroupYear (string)
  - Description: Filter posts from a specific year (e.g., 2024). Best used together with the Search Keyword field.
  - Default: "" (empty string)
  - Required: No
- onlyPostsNewerThan (string)
  - Description: Stop scraping when posts are older than this date. Supports absolute (YYYY-MM-DD) or relative (“2 weeks”, “7 days”, “1 month”).
  - Default: "" (empty string)
  - Required: No
- fallbackDocId (string)
  - Description: Fallback GraphQL doc_id if automatic discovery fails (e.g., after a Facebook frontend update).
  - Default: "" (empty string)
  - Required: No
- proxyConfiguration (object)
  - Description: Proxy settings editor. Note: The actor always uses Apify Residential proxy internally and retries until data demand is fulfilled.
  - Default: { "useApifyProxy": false }
  - Required: No
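Checking an input locally before starting a run can catch typos early. The sketch below validates the three constrained parameters described above and resolves onlyPostsNewerThan into a cutoff timestamp. It is an illustration only: `parse_cutoff` and `validate_input` are hypothetical helpers, and treating a month as 30 days is an assumption, since the actor's own interpretation of relative windows is not documented here.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

VIEW_OPTIONS = {"CHRONOLOGICAL", "RECENT_ACTIVITY", "TOP_POSTS", "CHRONOLOGICAL_LISTINGS"}
# Approximate unit lengths in days (assumption: "1 month" = 30 days).
UNIT_DAYS = {"day": 1, "week": 7, "month": 30}


def parse_cutoff(value: str, now: datetime) -> Optional[datetime]:
    """Turn 'YYYY-MM-DD' or a relative window like '2 weeks' into a UTC cutoff."""
    value = value.strip()
    if not value:
        return None  # empty string means no date restriction
    match = re.fullmatch(r"(\d+)\s+(day|week|month)s?", value, re.IGNORECASE)
    if match:
        amount, unit = int(match.group(1)), match.group(2).lower()
        return now - timedelta(days=amount * UNIT_DAYS[unit])
    # Raises ValueError for anything that is not YYYY-MM-DD.
    return datetime.strptime(value, "%Y-%m-%d").replace(tzinfo=timezone.utc)


def validate_input(run_input: dict) -> list[str]:
    """Return a list of problems found in the input (empty list means OK)."""
    errors = []
    if not run_input.get("startUrls"):
        errors.append("startUrls is required")
    limit = run_input.get("resultsLimit", 20)
    if not isinstance(limit, int) or limit < 1:
        errors.append("resultsLimit must be an integer >= 1")
    if run_input.get("viewOption", "CHRONOLOGICAL") not in VIEW_OPTIONS:
        errors.append("viewOption is not one of the supported values")
    return errors
```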
Example JSON output (one post)
{
  "facebookUrl": "https://www.facebook.com/groups/germtheory.vs.terraintheory",
  "url": "https://www.facebook.com/groups/123456789/permalink/987654321",
  "time": "2024-03-15T10:30:00.000Z",
  "user": {
    "id": "10001234567890",
    "name": "Jane Doe"
  },
  "text": "Looking for recommendations on analytics tools…",
  "topReactionsCount": 140,
  "feedbackId": "1122334455667788",
  "reactionLikeCount": 120,
  "reactionLoveCount": 20,
  "id": "123456789012345",
  "legacyId": "987654321098765",
  "attachments": [
    {
      "thumbnail": "https://example-cdn.com/thumb.jpg",
      "__typename": "Photo",
      "is_playable": false,
      "image": {
        "uri": "https://example-cdn.com/image.jpg",
        "height": 1080,
        "width": 1080
      },
      "id": "445566778899",
      "__isMedia": "Photo",
      "photo_cix_screen": null,
      "copyright_banner_info": null,
      "owner": {
        "__typename": "User",
        "id": "10001234567890"
      },
      "ocrText": "Sample caption text"
    }
  ],
  "likesCount": 142,
  "sharesCount": 7,
  "commentsCount": 56,
  "topComments": [
    {
      "commentUrl": "https://www.facebook.com/groups/123456789/permalink/987654321/?comment_id=555444333",
      "id": "555444333",
      "feedbackId": "9988776655",
      "date": "2024-03-15T11:00:00.000Z",
      "text": "We use Tool X and love it!",
      "profileUrl": "https://www.facebook.com/john.smith",
      "profilePicture": "https://example-cdn.com/profile.jpg",
      "profileId": "1000234567890",
      "profileName": "John Smith",
      "likesCount": "12",
      "threadingDepth": 0
    }
  ],
  "facebookId": "123456789012345",
  "groupTitle": "Example Community Group",
  "pageAdLibrary": {
    "is_business_page_active": false,
    "id": "123456789012345"
  },
  "inputUrl": "https://www.facebook.com/groups/germtheory.vs.terraintheory"
}
Notes:
- Some fields may be null or empty if not present in the source (e.g., time when timestamp is unavailable, attachments when none exist, groupTitle for certain posts).
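Because fields can be null, downstream code should not assume every count or timestamp is present. The following sketch shows one defensive way to read a record; `to_int`, `parse_time`, and `engagement_total` are hypothetical helpers. Note that in the example output the top comment's likesCount is a string ("12") while the post-level counts are numbers, so the coercion handles both.

```python
from datetime import datetime
from typing import Optional


def to_int(value) -> int:
    """Coerce a count that may be None, empty, an int, or a numeric string."""
    if value is None or value == "":
        return 0
    return int(value)


def parse_time(value: Optional[str]) -> Optional[datetime]:
    """Parse timestamps like '2024-03-15T10:30:00.000Z'; None stays None."""
    if not value:
        return None
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%f%z")


def engagement_total(post: dict) -> int:
    """Sum reactions, shares, and comments, treating missing values as zero."""
    return (to_int(post.get("likesCount"))
            + to_int(post.get("sharesCount"))
            + to_int(post.get("commentsCount")))
```

Treating a missing count as zero is a simplification; if the difference between "zero" and "unknown" matters for your analysis, keep None and handle it explicitly.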
FAQ
Does this work on private groups?
No. The actor is designed for public Facebook groups. Private groups require authentication, and the input explicitly targets public group URLs.
Do I need to log in or add cookies?
No login is required for scraping public groups. However, search is limited without login: for searchGroupKeyword, one or two letters usually work better, and full-word searches often return nothing.
What sorting modes are supported?
The viewOption input supports CHRONOLOGICAL, RECENT_ACTIVITY, TOP_POSTS, and CHRONOLOGICAL_LISTINGS. This lets you collect posts in chronological order, by recent activity, by top-performing content, or as marketplace-like chronological listings.
How many posts can I collect per run?
Use resultsLimit to set your target. The actor paginates through the group feed and will retry with residential proxies until your data demand is fulfilled or retry limits are reached. The default is 20.
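The interaction between pagination and resultsLimit can be pictured as a simple cursor loop. This is a generic sketch, not the actor's code: `fetch_page` is a hypothetical callable that takes a cursor (None for the first page) and returns a list of posts plus the next cursor (analogous to end_cursor), or None when the feed is exhausted.

```python
from typing import Callable, Iterator, Optional

# Hypothetical page fetcher: cursor in, (posts, next_cursor) out.
FetchPage = Callable[[Optional[str]], tuple[list[dict], Optional[str]]]


def paginate(fetch_page: FetchPage, results_limit: int) -> Iterator[dict]:
    """Yield posts page by page until the limit is hit or the feed ends."""
    cursor: Optional[str] = None
    yielded = 0
    while yielded < results_limit:
        posts, cursor = fetch_page(cursor)
        for post in posts:
            yield post
            yielded += 1
            if yielded >= results_limit:
                return  # target reached mid-page
        if not posts or cursor is None:
            return  # nothing more to fetch
```

Yielding posts one at a time mirrors the streaming behavior described in this README: items become available before the whole feed has been read.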
What data fields are included in the output?
Each post includes facebookUrl, url, time, id, legacyId, feedbackId, user (id, name), text, attachments, likesCount, sharesCount, commentsCount, topReactionsCount, reactionLikeCount, reactionLoveCount, topComments (with commenter details), facebookId, groupTitle, pageAdLibrary, and inputUrl.
How does the actor handle proxies?
It always uses Apify Residential proxy, obtained programmatically at runtime, and applies retry logic and randomized delays to reduce blocking and improve completion rates.
What if the actor reports “Missing doc_id”?
Set fallbackDocId in the input. This provides a known working GraphQL doc_id when automatic discovery fails (e.g., after a Facebook frontend update).
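Conceptually, the fallback acts as a second choice when discovery returns nothing. A minimal sketch of that precedence, assuming discovery yields either a doc_id string or nothing; `resolve_doc_id` is a hypothetical name and the actor's internal logic may differ.

```python
from typing import Optional


def resolve_doc_id(discovered: Optional[str], fallback_doc_id: str = "") -> str:
    """Prefer the automatically discovered doc_id, then the configured fallback."""
    if discovered:
        return discovered
    if fallback_doc_id.strip():
        return fallback_doc_id.strip()
    raise ValueError("Missing doc_id: set fallbackDocId in the input")
```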
Is there a free trial or pricing?
Yes. The listing is priced at $19.99 per month plus usage and provides 120 trial minutes to evaluate the actor before committing to full usage.
Final thoughts
Facebook Groups Scraper turns public Facebook group activity into clean, structured datasets. With automatic discovery of required GraphQL parameters, residential proxy usage with retries, flexible sorting and filters, and real-time dataset streaming, it suits marketers, analysts, developers, and researchers. Export results as JSON, CSV, or Excel for analytics, enrichment, or workflow automation, and set fallbackDocId if automatic discovery fails after Facebook changes its frontend.