Facebook Pages Scraper

Pricing

$1.00 / 1,000 results

$1 per 1,000 results. Stop wasting your budget on slow, resource-heavy browser-based scrapers. This is the fastest, most cost-effective, and data-rich Facebook Pages scraper on Apify, designed for high-scale lead generation, reputation monitoring, and competitor research.

Rating: 5.0 (2)

Developer: VortexData (maintained by Community)

Actor stats: 1 bookmarked, 29 total users, 19 monthly active users. Last modified 4 days ago.

📘 Facebook Pages Scraper

Extract Facebook Page data — page details, posts, photos, videos, reels, reviews, and events when Facebook exposes them — from one or more pages in a single run. It works anonymously by default, and can optionally use cookies from your own Facebook session for pages your account is allowed to view.

You give the Actor a list of Facebook page URLs. It returns one readable record per page, in a one-row-per-page shape compatible with the popular community Facebook Pages Scraper, plus status, fallback, and authenticated-mode fields.


What does Facebook Pages Scraper do?

For every Facebook page you provide, the Actor returns a single dataset record. There is no input selector for individual sections: the Actor automatically attempts every supported section for each page, then keeps only the data Facebook exposes to anonymous viewers or to your supplied session. The row can contain:

  • 🏷️ Page details — title, page ID, handle, categories, intro, address, phone, email, website, hours, rating (% recommend + review count), follower / likes / following counts, profile photo, cover photo, page creation date, ad status, ad library id, linked Instagram, and more (≈40 fields).
  • 📝 Recent posts — text, URL, publish timestamp, reactions / comments / shares counts, author, attached media.
  • 🖼️ Photos — IDs and CDN URLs for the most recent uploads.
  • 🎥 Videos — description, play count, publish time, and permalink for each video (video titles are not returned).
  • 🎞️ Reels — short-form video URLs with timestamps.
  • Reviews — public recommendations (when the page has any).
  • 📅 Events — upcoming and past events with names, URLs, and venues.

Each section is a nested array inside the main page record, so the dataset stays one row per URL — easy to filter, view, and export.


Why use this Actor?

  • Anonymous by default. It works without Facebook cookies for ordinary public pages.
  • Optional authenticated mode. Paste cookies from your own logged-in Facebook browser session to collect pages that your account can view. Cookie values are treated as secret input and are never written to logs, dataset rows, or OUTPUT.
  • One row per page. Compatible with the popular community Facebook Pages Scraper's page-centric output shape, with extra status and recovery fields.
  • All data types in a single run. The input has no type filter; the Actor attempts page details, posts, photos, videos, reels, reviews, and events automatically.
  • Fast. A typical page returns in 6–12 seconds on Apify Datacenter US, even with full extraction.
  • Low-cost handling of unavailable pages. When Facebook reports a page as missing or refuses to render it, the Actor returns a clear not_available row with a reason instead of spending extra proxy traffic on external recovery sources.
  • Apify-native. Works with the standard startUrls input format, dataset views, schedules, integrations, and REST API.

Common use cases

  • Lead enrichment — pull contacts and rating signals for local-business pages.
  • Reputation monitoring — track review counts and overall rating over time.
  • Social-media tracking — collect recent posts, videos, and reels across competitor pages.
  • Event monitoring — list upcoming and past events for venues, restaurants, or brands.

How to use it

  1. Open the Actor on Apify and switch to the Input tab.
  2. Paste one or more Facebook page URLs into Start URLs (each entry is a {"url": "https://www.facebook.com/<handle>/"} object). You can also use the Page handles or IDs field for bare handles like copperkettleyqr or numeric page IDs; entries from both fields are merged into one list of pages.
  3. Pick Max items per page section (default 50) — it caps how many posts / photos / videos are kept per section. It does not choose which sections to scrape; the Actor always attempts all supported sections. Page details are collected once when Facebook exposes them.
  4. (Optional) Toggle Paginate available items (parseAllResults) to paginate older items when Facebook provides authenticated pagination tokens. In anonymous mode, the Actor keeps what the public initial HTML exposes but cannot force logged-out deep pagination.
  5. (Optional) Paste your own Facebook session cookies into Facebook cookies if you need authenticated mode.
  6. Click Save & Start.

The Actor uses Apify Datacenter US Proxy internally, so no proxy configuration is needed.

When the run finishes, open Storage → Dataset to view, filter, or export the results as JSON / CSV / Excel / XML / HTML.


Input

Field | Type | Required | Description
--- | --- | --- | ---
startUrls | array<{url}> | yes¹ | List of full Facebook Page URLs. Each entry is {"url": "https://www.facebook.com/<handle>/"}. Groups and personal profiles are not supported. Numeric profile.php?id=... URLs are accepted as possible Page IDs, but personal profiles may return unavailable.
urls | array<string> | yes¹ | Alternative input. Accepts handles (copperkettleyqr) or numeric IDs (100064027242849).
maxResults | integer | no | How many items per section to keep (default 50, max 10000).
parseAllResults | boolean | no | Paginate older section items when authenticated pagination tokens are available (default false). Anonymous runs are limited to the public initial HTML batch.
facebookCookies | string | no | Optional secret field for authenticated mode. Paste a Cookie header (c_user=...; xs=...) or a JSON cookie export. Must include c_user and xs.

¹ At least one of startUrls or urls must contain a value.

Example — minimal input

{
  "startUrls": [
    { "url": "https://www.facebook.com/copperkettleyqr/" }
  ]
}

Example — multiple pages with a tighter cap

{
  "startUrls": [
    { "url": "https://www.facebook.com/copperkettleyqr/" },
    { "url": "https://www.facebook.com/Microsoft/" }
  ],
  "urls": [
    "100064027242849"
  ],
  "maxResults": 20
}
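
If you assemble the input in code rather than in the Console, a small helper can sort mixed targets into the two documented fields. The following is a minimal sketch, not part of the Actor: the function name build_run_input is hypothetical, and the lower bound of 1 for maxResults is an assumption (only the default of 50 and the maximum of 10000 are documented above).

```python
from urllib.parse import urlparse

MAX_RESULTS_LIMIT = 10_000  # documented upper bound for maxResults


def build_run_input(targets, max_results=50, parse_all_results=False):
    """Split mixed targets into startUrls / urls and return an input dict."""
    # The lower bound of 1 is an assumption; only the maximum is documented.
    if not 1 <= max_results <= MAX_RESULTS_LIMIT:
        raise ValueError(f"maxResults must be between 1 and {MAX_RESULTS_LIMIT}")

    start_urls, handles = [], []
    for raw in targets:
        target = raw.strip()
        if not target:
            continue
        if target.startswith(("http://", "https://")):
            # Groups are documented as unsupported, so fail early.
            if urlparse(target).path.startswith("/groups/"):
                raise ValueError(f"Groups are not supported: {target}")
            start_urls.append({"url": target})
        else:
            handles.append(target)  # bare handle or numeric page ID

    if not start_urls and not handles:
        raise ValueError("Provide at least one page URL, handle, or ID")

    run_input = {"maxResults": max_results, "parseAllResults": parse_all_results}
    if start_urls:
        run_input["startUrls"] = start_urls
    if handles:
        run_input["urls"] = handles
    return run_input
```

Calling build_run_input(["https://www.facebook.com/copperkettleyqr/", "100064027242849"], max_results=20) yields a dict with one startUrls entry and one urls entry, similar in shape to the example above.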

Optional authenticated mode with cookies

Use this only with a Facebook account you control or are authorized to use. Cookies act like a temporary session key; do not paste passwords, 2FA codes, or cookies from someone else's account.

Where to paste them:

  1. Open the Actor on Apify.
  2. Go to the Input tab.
  3. Scroll to Facebook cookies (optional).
  4. Paste either a raw Cookie header value or a JSON export.
  5. Start the run.

How to copy a Cookie header from Chrome / Edge:

  1. Log in to facebook.com in your browser.
  2. Open a Facebook Page that your account can view.
  3. Press F12 and open the Network tab.
  4. Reload the page.
  5. Click the first facebook.com document request.
  6. In Headers -> Request Headers, copy the value of Cookie.
  7. Paste that value into Facebook cookies (optional).

Expected format:

c_user=123456789; xs=...; fr=...; datr=...; sb=...

JSON exports from cookie extensions are also accepted:

[
  {"domain": ".facebook.com", "name": "c_user", "value": "123456789"},
  {"domain": ".facebook.com", "name": "xs", "value": "..."}
]

If the cookies are valid, the run log says Authenticated mode enabled, and dataset rows show resultStatus: "authenticated" when the main data came from that session. Cookies are sent only to facebook.com hosts. If Facebook logs the account out, rotates the session, or the account cannot view the target page, the Actor returns a not_available row and explains the result in resultSummary, nextAction, and OUTPUT.skippedPages.
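
Before starting a run, it can help to check locally that the pasted value contains the two required cookies. The sketch below is an illustrative pre-flight check, not the Actor's own validation: the names parse_cookie_names and check_cookies are hypothetical, and it only looks at cookie names, so it cannot tell whether Facebook still accepts the session.

```python
import json

REQUIRED_COOKIES = ("c_user", "xs")  # documented as mandatory for authenticated mode


def parse_cookie_names(raw):
    """Return cookie names from a Cookie header string or a JSON cookie export."""
    text = raw.strip()
    if text.startswith("["):
        # JSON export: a list of objects carrying "name" and "value" keys.
        return [e["name"] for e in json.loads(text) if e.get("name") and e.get("value")]
    names = []
    for pair in text.split(";"):
        name, sep, value = pair.strip().partition("=")
        if sep and name and value:
            names.append(name)
    return names


def check_cookies(raw):
    """Raise if a required cookie is missing; otherwise return a value-free summary."""
    names = parse_cookie_names(raw)
    missing = [name for name in REQUIRED_COOKIES if name not in names]
    if missing:
        raise ValueError(f"Missing required cookie(s): {', '.join(missing)}")
    # Only names are reported, never values, so the summary is safe to log.
    return f"{len(names)} cookie(s): {', '.join(names)}"
```

The returned summary deliberately contains cookie names only, so it can be printed or logged without exposing the session.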


Output

One record per input URL is pushed to the default Apify dataset. Each row starts with a readable status block (resultStatus, dataQuality, resultSummary, sourceSummary, contentCounts, nextAction) before the raw page fields. Per-section content lives in nested arrays.

Sample (truncated)

{
  "resultStatus": "full_public",
  "dataQuality": "Full",
  "accessLevel": "public",
  "resultSummary": "Full public page data collected: 22 posts, 8 photos, 6 videos, 3 reels.",
  "sourceSummary": "Facebook public page",
  "contentCounts": "22 posts, 8 photos, 6 videos, 3 reels",
  "nextAction": "Ready to use.",
  "url": "https://www.facebook.com/copperkettleyqr/",
  "type": "page",
  "facebookUrl": "https://www.facebook.com/copperkettleyqr/",
  "pageUrl": "https://www.facebook.com/copperkettleyqr/",
  "pageId": "100064027242849",
  "facebookId": "100064027242849",
  "pageName": "copperkettleyqr",
  "displayName": "The Copper Kettle Restaurant",
  "title": "The Copper Kettle Restaurant | Regina SK",
  "intro": "Longstanding local restaurant. Mediterranean specialties...",
  "info": [
    "The Copper Kettle Restaurant, Regina. 3,212 likes",
    "39 talking about this",
    "1,136 were here. Longstanding local restaurant..."
  ],
  "categories": ["Page", "Pizza place"],
  "category": "Pizza place",
  "likes": 3212,
  "followers": 3212,
  "followings": 341,
  "talking_about": 39,
  "were_here": 1136,
  "rating": "94% recommend (202 Reviews)",
  "ratings": "94% recommend (202 Reviews)",
  "ratingOverall": 94,
  "ratingCount": 202,
  "phone": "+1 306-525-3545",
  "email": "copperkettle.events@gmail.com",
  "website": "http://www.thecopperkettle.online/",
  "websites": [
    "https://www.bing.com/maps/...",
    "https://www.instagram.com/copperkettleyqr",
    "http://www.thecopperkettle.online/"
  ],
  "alternativeSocialMedia": "https://www.instagram.com/copperkettleyqr",
  "instagram": [
    {"username": "copperkettleyqr", "url": "https://www.instagram.com/copperkettleyqr"}
  ],
  "address": "1953 Scarth Street, Regina, SK, Canada, S4P 2H1",
  "addressUrl": "https://www.bing.com/maps/...",
  "services": "Outdoor seating",
  "business_services": "Outdoor seating",
  "priceRange": "$$",
  "business_price": "Price Range · $$",
  "business_hours": "Open now",
  "creation_date": "October 29, 2014",
  "ad_status": "This Page is currently running ads.",
  "pageAdLibrary": {"id": "851606664870954", "pamv_comms_data": null},
  "profilePictureUrl": "https://lookaside.fbsbx.com/lookaside/crawler/media/?media_id=100064027242849",
  "profilePhoto": "https://www.facebook.com/photo/?fbid=...&set=a....",
  "coverPhotoUrl": "https://lookaside.fbsbx.com/lookaside/crawler/media/?media_id=...",
  "verified": false,
  "posts": [
    {
      "type": "post",
      "post_id": "1259602942850602",
      "url": "https://www.facebook.com/copperkettleyqr/posts/...",
      "message": "THE SASKATCHEWAN ROUGHRIDERS WIN THE 112TH GREY CUP! 🏆",
      "timestamp": 1763349499,
      "reactions_count": 10,
      "comments_count": 0,
      "reshare_count": 1,
      "author": {"id": "100064027242849", "name": "...", "url": "..."},
      "media": ["https://lookaside.fbsbx.com/..."]
    }
  ],
  "photos": [ /* … */ ],
  "videos": [ /* … */ ],
  "reels": [ /* … */ ],
  "past_events": [ /* … */ ],
  "scraped_at": "2026-05-09T08:12:34.567+00:00"
}

If a section has zero records (e.g. the page has no public reviews), that key is omitted from the record entirely — keeping the JSON tight.
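
Because empty sections are omitted rather than returned as empty arrays, consumer code should not index section keys directly. The sketch below shows one defensive way to read a row. It is an illustration only: summarize_row is a hypothetical helper, the section keys are limited to those visible in the sample above, and treating posts[*].timestamp as Unix seconds in UTC is an assumption based on the sample value.

```python
from datetime import datetime, timezone

# Section keys taken from the sample record; other sections may use other keys.
SECTION_KEYS = ("posts", "photos", "videos", "reels", "past_events")


def summarize_row(row):
    """Build a compact summary that tolerates omitted sections and fields."""
    counts = {key: len(row.get(key, [])) for key in SECTION_KEYS}

    # Assumption: post timestamps are Unix seconds, as the sample suggests.
    timestamps = [
        post["timestamp"]
        for post in row.get("posts", [])
        if post.get("timestamp") is not None
    ]
    latest = None
    if timestamps:
        latest = datetime.fromtimestamp(max(timestamps), tz=timezone.utc).isoformat()

    return {
        "name": row.get("displayName") or row.get("pageName"),
        "status": row.get("resultStatus"),
        "counts": counts,
        "latest_post_utc": latest,
    }
```

Using .get() with a default keeps the same code path working for full rows, partial rows, and rows where no section data was returned.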


Pricing and runtime

The Actor is billed per Apify compute unit + Apify Proxy traffic — there are no per-result fees declared by the Actor itself. Internally the Actor uses Apify Datacenter US Proxy, the cheapest pool, which gives sub-second per-request latency for Facebook.

Typical runtime:

Workload | Wall-clock
--- | ---
1 page, default maxResults=50 | ~6–12 seconds
100 pages, default | ~10–20 minutes (sequential per page, 8 sections concurrent within each)
parseAllResults=true on a busy page | minutes per page when authenticated pagination is available

Each request rotates through a fresh exit IP, so you don't have to pre-provision sticky sessions.
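
The 100-page figure in the table follows from the per-page range, because pages are processed one after another: 100 pages at 6 to 12 seconds each is 600 to 1,200 seconds, or 10 to 20 minutes. A rough estimator for other batch sizes might look like this. The function name is hypothetical, and the estimate assumes default settings and ignores parseAllResults pagination.

```python
def estimate_runtime_minutes(page_count, seconds_per_page=(6, 12)):
    """Return (low, high) wall-clock minutes for a sequential batch of pages."""
    low, high = seconds_per_page
    return (page_count * low / 60, page_count * high / 60)


low, high = estimate_runtime_minutes(100)
print(f"100 pages: roughly {low:.0f} to {high:.0f} minutes")
```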


How it works (under the hood)

  1. Resolve every input URL / handle / numeric id to a canonical Facebook page URL.
  2. For each page, fetch up to three Facebook endpoints concurrently: /about/ (page metadata), /about_profile_transparency/ (page creation date, ad status), and the page root / (profile photo viewer URL).
  3. For listing sections (posts, photos, videos, reels, reviews, future events, past events) fetch each section URL in parallel.
  4. Parse the embedded GraphQL JSON inside the HTML, extract structured records, and merge per-page results into one aggregate record.
  5. Strip empty arrays and null fields, then push exactly one polished record per URL to the dataset.

The Actor uses curl_cffi with Chrome TLS impersonation. In anonymous mode it uses the facebookexternalhit/1.1 user-agent — the same combination Facebook uses internally for link-preview crawling — to pull rich public page data without authentication. When facebookCookies is provided, it switches to a normal browser user-agent and sends those cookies only with Facebook requests so Facebook can render whatever that account is allowed to view.
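
Step 5 above (dropping empty arrays and null fields before the record is pushed) can be approximated in a few lines. This is a sketch under assumptions rather than the Actor's actual code: strip_empty_fields is a hypothetical name, and it cleans top-level keys only, which is a guess based on the sample output, where a nested null (pageAdLibrary.pamv_comms_data) is still present.

```python
def strip_empty_fields(record):
    """Drop top-level keys whose value is None or an empty list.

    Falsy but meaningful values such as 0 and False are kept.
    """
    return {
        key: value
        for key, value in record.items()
        if value is not None and value != []
    }
```

Keeping 0 and False matters here: fields such as comments_count: 0 and verified: false in the sample carry information and should not be treated as empty.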


Tips and best practices

  • Keep maxResults small (the default 50 is usually enough) when first testing the Actor on new pages.
  • Avoid parseAllResults=true unless you really need older items and are using cookies that expose pagination — busy pages with thousands of items can run for many minutes per page.
  • Run on a schedule for monitoring: the Apify Console lets you trigger this Actor every X hours and pipe new dataset items into webhooks, integrations, or your own infrastructure.

FAQ

Does this Actor scrape personal Facebook profiles or Groups? No. It is for Facebook Pages only. /groups/ URLs are rejected. Numeric profile.php?id=... URLs are accepted as possible Page IDs because some Pages use that shape, but personal profiles are unsupported and will usually return an unavailable row rather than profile data.

Why are some fields like past_events[*].timestamp missing? Facebook's page payload does not include every field for every record. The Actor fills in what Facebook serves and omits missing fields entirely from the output to keep records clean. Authenticated mode can expose more data when your account is allowed to view it, but it still cannot create fields Facebook does not send.

Can I provide just a page handle without the full URL? Yes. Use the Page handles or IDs input field for bare handles (copperkettleyqr) or numeric IDs (100064027242849).

What locale is the output in? Facebook localises page output by the geo of the exit IP. The Actor pins the internal proxy to a United States exit so you always receive native en-US output (Reviews capitalised, $$ price symbols, mkt=en-US in map links).

Can it scrape pages with limited access? In anonymous mode it cannot read content that Facebook only shows after login or to a specific audience. To avoid extra proxy and compute cost, the Actor does not query external fallback sources for those unavailable pages. For fuller access, paste cookies from a Facebook account that is allowed to view the page into Facebook cookies (optional). If that account cannot view the page either, the Actor returns a not_available row plus a detailed reason in OUTPUT.skippedPages.

Are cookies stored or printed anywhere? No. The raw cookie value is only used as request input. Logs and OUTPUT include only a safe summary such as 3 cookie(s): c_user, xs, fr; dataset rows only say whether cookies were used.

Is this legal? You are responsible for using this Actor in compliance with applicable laws, Facebook's terms, and the rules of your specific use case. Only collect and use data you are allowed to process.


Run programmatically

The Apify Console auto-generates ready-to-copy code samples for every language (Python, JavaScript, cURL, etc.) on the Actor's API tab. Open the Actor page, switch to API → Run Actor synchronously / asynchronously and copy the snippet pre-filled with your token and Actor name — no manual setup required.
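
For orientation, here is a minimal sketch of what such a call can look like using only the Python standard library and Apify's run-sync-get-dataset-items REST endpoint. The Actor ID and token shown in the usage note are placeholders, not values from this page, and the helper names are hypothetical. The snippet from the API tab remains the reference, since it is pre-filled with the real Actor name.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_sync_run_request(actor_id, token, run_input):
    """Prepare a POST that runs an Actor and returns its dataset items."""
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(actor_id, token, run_input, timeout=300):
    """Send the request and decode the JSON array of dataset rows."""
    request = build_sync_run_request(actor_id, token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call would look like run_actor("<username>~<actor-name>", "<YOUR_APIFY_TOKEN>", {"startUrls": [{"url": "https://www.facebook.com/copperkettleyqr/"}]}). Large batches may run longer than a synchronous request can wait, in which case the asynchronous snippet from the API tab is the safer choice.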


Support

Open an Issue on the Actor's Apify page if you spot a bug, want a new field added, or have a custom-version request.