Facebook Groups Scraper

Pricing

$19.99/month + usage

Extract data from public Facebook groups including posts, comments, reactions, and member insights. This Apify scraper helps you track discussions, analyze engagement, monitor trends, and gather valuable data for research, marketing, and community intelligence.

Rating

0.0 (0)

Developer

ScrapeEngine

Maintained by Community

Actor stats

  • Bookmarked: 1
  • Total users: 9
  • Monthly active users: 2
  • Last modified: 4 days ago

Facebook Groups Scraper

Facebook Groups Scraper is an Apify actor that extracts structured data from public Facebook groups — including posts, comments, reactions, attachments, and group metadata — to power social listening, engagement analysis, and trend monitoring at scale. It solves the challenge of collecting reliable, normalized group data by automatically discovering required GraphQL parameters, paginating feeds, and streaming results to your dataset in real time. Built for marketers, developers, data analysts, and researchers, this production-ready scraper enables repeatable workflows and robust data pipelines.

What data / output can you get?

| Data field | Description | Example value |
| --- | --- | --- |
| facebookUrl | URL of the source Facebook group | https://www.facebook.com/groups/germtheory.vs.terraintheory |
| url | Direct permalink to the post | https://www.facebook.com/groups/123456789/permalink/987654321 |
| time | Post creation timestamp (ISO8601, UTC) | 2024-03-15T10:30:00.000Z |
| id | Unique post identifier | 123456789012345 |
| legacyId | Legacy post ID (post_id) | 987654321098765 |
| feedbackId | Feedback identifier for the post | 1122334455667788 |
| user.id | Author’s user ID | 10001234567890 |
| user.name | Author’s display name | Jane Doe |
| text | Post text content | Looking for recommendations on analytics tools… |
| attachments[] | Array of attachments (photos, media sets) with thumbnails and dimensions | { "thumbnail": "https://...", "image": { "uri": "https://...", "height": 1080, "width": 1080 } } |
| likesCount | Total number of reactions (summary) | 142 |
| sharesCount | Share count | 7 |
| commentsCount | Total number of comments | 56 |
| topReactionsCount | Sum of top reactions | 140 |
| reactionLikeCount | Number of “Like” reactions | 120 |
| reactionLoveCount | Number of “Love” reactions | 20 |
| topComments[0].commentUrl | Direct URL to top comment | https://www.facebook.com/groups/123/permalink/456/?comment_id=789 |
| topComments[0].profileName | Commenter’s name | John Smith |
| facebookId | Group ID associated with the post | 123456789012345 |
| groupTitle | Group title/name | Example Community Group |
| pageAdLibrary.id | Group/page identifier inside pageAdLibrary | 123456789012345 |
| inputUrl | Original input group URL | https://www.facebook.com/groups/germtheory.vs.terraintheory |

Notes:

  • Attachments can include photos with thumbnails and dimensions, media sets (albums) with mediaset_token, OCR text from images, and owner info where available.
  • Data is streamed to the Apify dataset during the run. You can export results in JSON, CSV, or Excel formats from the dataset.

Key features

  • 🧠 Automatic GraphQL discovery
    Automatically extracts node_id, doc_id, and end_cursor from group HTML/JavaScript for resilient operation across Facebook UI changes.

  • 🔁 Cursor-based pagination
    Smartly paginates through group feeds and streams normalized posts to the dataset in real time.

  • 🏠 Always-on residential proxy
    Forces Apify Residential proxy usage and retries until your data demand is fulfilled, improving reliability and reducing blocks.

  • 🔄 Flexible post sorting
    Supports CHRONOLOGICAL, RECENT_ACTIVITY, TOP_POSTS, and CHRONOLOGICAL_LISTINGS via the viewOption input.

  • 🔎 Advanced filtering controls
    Client-side filters for searchGroupKeyword, searchGroupYear, and onlyPostsNewerThan to refine results quickly.

  • ⚡ Anti-blocking delays
    Adds randomized delays and exponential backoff to reduce detection during scraping.

  • 🧩 Rich, normalized output
    Extracts post identifiers, author info, text, reactions (including reactionLikeCount and reactionLoveCount), comments (topComments), attachments, and group-level metadata.

  • 🛠️ Fallback doc_id support
    Configure fallbackDocId if Facebook’s frontend changes and automatic discovery temporarily fails.
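
The "anti-blocking delays" feature above combines randomized delays with exponential backoff. The actor's actual timing logic is not documented here, so the following is only a minimal sketch of that general technique; the function names (`backoff_delay`, `retry`) and the default `base`, `cap`, and `max_attempts` values are illustrative assumptions, not the actor's settings.

```python
import random
import time
from typing import Callable, Optional


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0,
                  rng: Optional[random.Random] = None) -> float:
    """Return a randomized delay in seconds for a 0-based retry attempt.

    The upper bound doubles with each attempt (base * 2**attempt) and is
    capped at `cap`; the delay is drawn uniformly from [0, bound].
    """
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    if rng is None:
        rng = random.Random()
    bound = min(cap, base * (2 ** attempt))
    return rng.uniform(0.0, bound)


def retry(operation: Callable[[], object], max_attempts: int = 5,
          sleep: Optional[Callable[[float], None]] = None):
    """Call operation() until it succeeds or max_attempts is exhausted."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    if sleep is None:
        sleep = time.sleep
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            # Re-raise on the final attempt; otherwise wait and try again.
            if attempt == max_attempts - 1:
                raise
            sleep(backoff_delay(attempt))
```

Drawing the delay from the whole range below the bound, rather than sleeping for the bound itself, spreads retries out so that repeated requests do not arrive in a regular rhythm.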

How to use Facebook Groups Scraper - step by step

  1. Create or sign in to your Apify account.
  2. Open the Facebook Groups Scraper actor in the Apify Console.
  3. Add input:
    • Paste one or more public group URLs into startUrls (string list).
    • Optionally set resultsLimit (default 20) to control how many posts you want.
    • Choose a viewOption: CHRONOLOGICAL, RECENT_ACTIVITY, TOP_POSTS, or CHRONOLOGICAL_LISTINGS.
    • Use searchGroupKeyword and searchGroupYear to narrow by keyword/letter and year.
    • Set onlyPostsNewerThan to restrict results to a specific date or relative time window.
  4. (Advanced) If you see a “Missing doc_id” error, set fallbackDocId to a known working GraphQL doc_id.
  5. Start the run and monitor logs:
    • The actor will automatically use Apify Residential proxy and retry until your target number of posts is reached or retries are exhausted.
  6. View results:
    • Go to the Dataset tab to see posts as they stream in.
    • Export your data as JSON, CSV, or Excel for downstream analysis.
  7. Pro Tip: Combine onlyPostsNewerThan with a one- or two-letter searchGroupKeyword for focused, high-signal datasets.

Use cases

| Use case | Description |
| --- | --- |
| Social listening for marketing teams | Track discussions and engagement trends in public groups to inform campaigns and messaging. |
| Competitive intelligence | Monitor top posts and recent activity across niche communities to benchmark content performance. |
| Academic & policy research | Collect structured, time-bounded datasets for longitudinal studies and qualitative analysis. |
| Community management analytics | Measure reactions, comments, and share counts to understand community health and interests. |
| Lead generation in buy/sell verticals | Use CHRONOLOGICAL_LISTINGS to scan marketplace-like groups for timely opportunities. |
| Content strategy & benchmarking | Identify top-performing posts (TOP_POSTS) to inspire your editorial calendar. |
| Data enrichment pipelines | Export normalized JSON/CSV from the dataset for integration with BI tools or internal dashboards. |

Why choose Facebook Groups Scraper?

This actor is built for precision, automation, and resilient data collection from public Facebook groups.

  • ✅ Accurate parameter discovery: Automatically finds node_id, doc_id, and end_cursor for stable scraping.
  • 🌐 Residential proxy reliability: Always uses Apify Residential proxy with automatic retries to meet your resultsLimit.
  • 🔎 Powerful filtering: Use keyword/letter searches, year filters, and date thresholds to target what matters.
  • ⚙️ Developer-friendly output: Clean, consistent JSON structure with normalized post, reaction, comment, and group fields.
  • 📦 Real-time streaming: Saves items to the dataset during the run for immediate access and incremental processing.
  • 🛡️ Safer than ad-hoc tools: Avoid brittle browser extensions and manual copy-paste with a production-grade Apify actor.
  • 📈 Export-ready: Download as JSON, CSV, or Excel from the dataset for instant analysis and reporting.

In short, it’s a dependable Facebook group data extractor — not a fragile workaround.

Is it legal to scrape Facebook groups?

Yes, when done responsibly. This actor is designed for extracting publicly available data from public Facebook groups. It does not access authenticated or private group content.

Guidelines for responsible use:

  • Collect only publicly available group data.
  • Respect platform terms and applicable data protection laws (e.g., GDPR, CCPA).
  • Avoid using the data for spam or abusive outreach.
  • Consult your legal team for edge cases or jurisdiction-specific requirements.

Input parameters & output format

Example JSON input

{
  "startUrls": [
    "https://www.facebook.com/groups/germtheory.vs.terraintheory"
  ],
  "resultsLimit": 100,
  "viewOption": "RECENT_ACTIVITY",
  "searchGroupKeyword": "a",
  "searchGroupYear": "2024",
  "onlyPostsNewerThan": "2 months",
  "fallbackDocId": "",
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}
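
If you assemble the input programmatically, a small helper can catch typos before a run starts. The sketch below is an illustration only: `build_input` and `VIEW_OPTIONS` are hypothetical names, the URL check is an assumed heuristic, and the validation mirrors the documented constraints rather than the actor's own input schema.

```python
import json

# Sorting modes listed for the viewOption input.
VIEW_OPTIONS = {"CHRONOLOGICAL", "RECENT_ACTIVITY", "TOP_POSTS",
                "CHRONOLOGICAL_LISTINGS"}


def build_input(start_urls, results_limit=20, view_option="CHRONOLOGICAL",
                search_keyword="", search_year="", only_newer_than="",
                fallback_doc_id=""):
    """Build an input dict using the documented field names and defaults."""
    if not start_urls:
        raise ValueError("startUrls must contain at least one group URL")
    for url in start_urls:
        # Assumed heuristic: group URLs contain this path segment.
        if "facebook.com/groups/" not in url:
            raise ValueError(f"not a Facebook group URL: {url!r}")
    if isinstance(results_limit, bool) or not isinstance(results_limit, int):
        raise ValueError("resultsLimit must be an integer")
    if results_limit < 1:
        raise ValueError("resultsLimit must be at least 1")
    if view_option not in VIEW_OPTIONS:
        raise ValueError(f"unknown viewOption: {view_option!r}")
    return {
        "startUrls": list(start_urls),
        "resultsLimit": results_limit,
        "viewOption": view_option,
        "searchGroupKeyword": search_keyword,
        "searchGroupYear": search_year,
        "onlyPostsNewerThan": only_newer_than,
        "fallbackDocId": fallback_doc_id,
        # Documented default for the proxy settings editor.
        "proxyConfiguration": {"useApifyProxy": False},
    }


if __name__ == "__main__":
    payload = build_input(
        ["https://www.facebook.com/groups/germtheory.vs.terraintheory"],
        results_limit=100,
        view_option="RECENT_ACTIVITY",
        search_keyword="a",
        search_year="2024",
        only_newer_than="2 months",
    )
    print(json.dumps(payload, indent=2))
```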

Input parameters

  • startUrls (array, required)

    • Description: Add one or more public Facebook group URLs. Only public groups are supported.
    • Default: none
    • Required: Yes
  • resultsLimit (integer)

    • Description: Maximum number of posts to scrape from each group.
    • Default: 20 (minimum: 1)
    • Required: No
  • viewOption (string)

    • Description: Post sorting strategy. Options: CHRONOLOGICAL, RECENT_ACTIVITY, TOP_POSTS, CHRONOLOGICAL_LISTINGS. Note: Post limit applies to 'New posts' sorting only.
    • Default: CHRONOLOGICAL
    • Required: No
  • searchGroupKeyword (string)

    • Description: Search posts by keyword or letter. Without login, search is limited; 1–2 letters recommended.
    • Default: "" (empty string)
    • Required: No
  • searchGroupYear (string)

    • Description: Filter posts from a specific year (e.g., 2024). Best used together with the Search Keyword field.
    • Default: "" (empty string)
    • Required: No
  • onlyPostsNewerThan (string)

    • Description: Stop scraping when posts are older than this date. Supports absolute (YYYY-MM-DD) or relative (“2 weeks”, “7 days”, “1 month”).
    • Default: "" (empty string)
    • Required: No
  • fallbackDocId (string)

    • Description: Fallback GraphQL doc_id if automatic discovery fails (e.g., after a Facebook frontend update).
    • Default: "" (empty string)
    • Required: No
  • proxyConfiguration (object)

    • Description: Proxy settings editor. Note: The actor always uses Apify Residential proxy internally and retries until data demand is fulfilled.
    • Default: { "useApifyProxy": false }
    • Required: No
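
To make the two onlyPostsNewerThan formats concrete, here is a rough sketch of how an absolute date or a relative window could be turned into a cutoff timestamp and compared against a post's `time` field. How the actor itself parses this value is not documented here; `parse_cutoff` and `is_newer` are hypothetical helpers, only the day/week/month units from the examples above are handled, and treating a month as 30 days is an assumption.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumption: a month is approximated as 30 days.
_UNIT_DAYS = {"day": 1, "week": 7, "month": 30}
_RELATIVE = re.compile(r"^\s*(\d+)\s+(day|week|month)s?\s*$", re.IGNORECASE)


def parse_cutoff(value, now=None):
    """Turn 'YYYY-MM-DD' or a phrase like '2 weeks' into a UTC datetime.

    Returns None for an empty value, meaning "no date filter".
    """
    if not value:
        return None
    if now is None:
        now = datetime.now(timezone.utc)
    match = _RELATIVE.match(value)
    if match:
        amount = int(match.group(1))
        unit = match.group(2).lower()
        return now - timedelta(days=amount * _UNIT_DAYS[unit])
    try:
        parsed = datetime.strptime(value.strip(), "%Y-%m-%d")
    except ValueError:
        raise ValueError(f"unsupported onlyPostsNewerThan value: {value!r}") from None
    return parsed.replace(tzinfo=timezone.utc)


def is_newer(post_time_iso, cutoff):
    """Return True if the post's ISO 8601 time is at or after the cutoff.

    Posts without a timestamp are kept, since `time` may be null.
    """
    if cutoff is None or not post_time_iso:
        return True
    post_time = datetime.fromisoformat(post_time_iso.replace("Z", "+00:00"))
    return post_time >= cutoff
```

The same helpers can be reused after a run to apply a stricter date filter to an exported dataset.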

Example JSON output (one post)

{
  "facebookUrl": "https://www.facebook.com/groups/germtheory.vs.terraintheory",
  "url": "https://www.facebook.com/groups/123456789/permalink/987654321",
  "time": "2024-03-15T10:30:00.000Z",
  "user": {
    "id": "10001234567890",
    "name": "Jane Doe"
  },
  "text": "Looking for recommendations on analytics tools…",
  "topReactionsCount": 140,
  "feedbackId": "1122334455667788",
  "reactionLikeCount": 120,
  "reactionLoveCount": 20,
  "id": "123456789012345",
  "legacyId": "987654321098765",
  "attachments": [
    {
      "thumbnail": "https://example-cdn.com/thumb.jpg",
      "__typename": "Photo",
      "is_playable": false,
      "image": {
        "uri": "https://example-cdn.com/image.jpg",
        "height": 1080,
        "width": 1080
      },
      "id": "445566778899",
      "__isMedia": "Photo",
      "photo_cix_screen": null,
      "copyright_banner_info": null,
      "owner": {
        "__typename": "User",
        "id": "10001234567890"
      },
      "ocrText": "Sample caption text"
    }
  ],
  "likesCount": 142,
  "sharesCount": 7,
  "commentsCount": 56,
  "topComments": [
    {
      "commentUrl": "https://www.facebook.com/groups/123456789/permalink/987654321/?comment_id=555444333",
      "id": "555444333",
      "feedbackId": "9988776655",
      "date": "2024-03-15T11:00:00.000Z",
      "text": "We use Tool X and love it!",
      "profileUrl": "https://www.facebook.com/john.smith",
      "profilePicture": "https://example-cdn.com/profile.jpg",
      "profileId": "1000234567890",
      "profileName": "John Smith",
      "likesCount": "12",
      "threadingDepth": 0
    }
  ],
  "facebookId": "123456789012345",
  "groupTitle": "Example Community Group",
  "pageAdLibrary": {
    "is_business_page_active": false,
    "id": "123456789012345"
  },
  "inputUrl": "https://www.facebook.com/groups/germtheory.vs.terraintheory"
}

Notes:

  • Some fields may be null or empty if not present in the source (e.g., time when timestamp is unavailable, attachments when none exist, groupTitle for certain posts).
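
Because fields can be null or missing, downstream code should read them defensively. The sketch below shows one possible way to reduce an exported item to a few engagement numbers. `summarize_post` and `_to_int` are hypothetical helpers, and the "engagement" total (reactions plus shares plus comments) is an illustrative metric, not something the actor outputs. Note that in the example above the comment-level likesCount is a string ("12"), so counts are coerced to integers.

```python
def _to_int(value) -> int:
    """Coerce a count that may be an int, a numeric string, or None."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return 0


def summarize_post(item: dict) -> dict:
    """Reduce one dataset item to a flat summary, tolerating null fields."""
    user = item.get("user") or {}
    comments = item.get("topComments") or []
    attachments = item.get("attachments") or []
    likes = _to_int(item.get("likesCount"))
    shares = _to_int(item.get("sharesCount"))
    comment_count = _to_int(item.get("commentsCount"))
    return {
        "url": item.get("url"),
        "author": user.get("name"),
        "time": item.get("time"),
        # Illustrative metric: reactions + shares + comments.
        "engagement": likes + shares + comment_count,
        "attachmentCount": len(attachments),
        "topCommentLikes": [_to_int(c.get("likesCount")) for c in comments],
    }
```

Applied to the example item above, the engagement total is 142 + 7 + 56 = 205.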

FAQ

Does this work on private groups?

No. The actor is designed for public Facebook groups. Private groups require authentication, and the input explicitly targets public group URLs.

Do I need to log in or add cookies?

No login is required for scraping public groups. However, search is limited without login: for searchGroupKeyword, one or two letters usually yield better results, while full-word searches often return nothing.

What sorting modes are supported?

The viewOption input supports CHRONOLOGICAL, RECENT_ACTIVITY, TOP_POSTS, and CHRONOLOGICAL_LISTINGS. This lets you collect posts by latest activity, top-performing content, or marketplace-like chronological listings.

How many posts can I collect per run?

Use resultsLimit to set your target. The actor paginates through the group feed and will retry with residential proxies until your data demand is fulfilled or retry limits are reached. The default is 20.
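
As a rough illustration of cursor-based pagination with a results limit, the generator below keeps requesting pages until the limit is reached or no further cursor is returned. It is a generic sketch, not the actor's code: `paginate` is a hypothetical name, and `fetch_page` stands in for whatever request returns a batch of posts together with the next cursor.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# A page is a batch of posts plus the cursor for the next page (or None).
Page = Tuple[List[dict], Optional[str]]


def paginate(fetch_page: Callable[[Optional[str]], Page],
             results_limit: int) -> Iterator[dict]:
    """Yield posts page by page until results_limit or the feed ends."""
    cursor: Optional[str] = None
    yielded = 0
    while yielded < results_limit:
        posts, cursor = fetch_page(cursor)
        for post in posts:
            yield post
            yielded += 1
            if yielded >= results_limit:
                return
        # Stop when the feed is exhausted; an empty page also ends the
        # loop so a repeated cursor cannot cause endless requests.
        if not cursor or not posts:
            return
```

Yielding posts one at a time matches the streaming behavior described above: each item can be stored as soon as it arrives instead of waiting for the whole run to finish.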

What data fields are included in the output?

Each post includes facebookUrl, url, time, id, legacyId, feedbackId, user (id, name), text, attachments, likesCount, sharesCount, commentsCount, topReactionsCount, reactionLikeCount, reactionLoveCount, topComments (with commenter details), facebookId, groupTitle, pageAdLibrary, and inputUrl.

How does the actor handle proxies?

It always uses Apify Residential proxy, obtained programmatically at runtime, and applies retry logic and randomized delays to reduce blocking and improve completion rates.

What if the actor reports “Missing doc_id”?

Set fallbackDocId in the input. This provides a known working GraphQL doc_id when automatic discovery fails (e.g., after a Facebook frontend update).

Is there a free trial or pricing?

Yes. The listing price is $19.99 per month plus usage, and it includes 120 trial minutes to evaluate the actor before committing to full usage.

Final thoughts

Facebook Groups Scraper is built to turn public Facebook group activity into clean, structured datasets you can trust. With automatic discovery of required GraphQL parameters, reliable residential proxy usage, flexible sorting/filters, and real-time dataset streaming, it’s ideal for marketers, analysts, developers, and researchers. Export results as JSON/CSV for analytics, enrichment, or workflow automation — and use fallbackDocId for resilience when Facebook changes its frontend. Start extracting smarter community insights today with a production-ready pipeline.