# Facebook Page Scraper (`khadinakbar/facebook-page-scraper`) Actor

Scrape public Facebook page/profile URLs into clean page profile records plus optional recent post records. Extracts names, categories, contacts, websites, followers, likes, ratings, ad status, post text, timestamps, engagement counts, media, and links. No login required. MCP-ready.

- **URL**: https://apify.com/khadinakbar/facebook-page-scraper.md
- **Developed by:** [Khadin Akbar](https://apify.com/khadinakbar) (community)
- **Categories:** Social media, Lead generation, Automation
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

From $10.00 / 1,000 pages scraped

This Actor is paid per event and usage. You are charged both the fixed price for specific events and for Apify platform usage.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools that run on the Apify platform and cover all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server that can be used as a website, an API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see the Apify documentation as a [Markdown index](https://docs.apify.com/llms.txt) or as [Markdown full text](https://docs.apify.com/llms-full.txt).


# README

## Facebook Page Scraper

Scrape public Facebook page/profile URLs into clean JSON records. The Actor returns one `page` record per source URL and, when enabled, recent `post` records from that page.

It is built for lead enrichment, brand monitoring, competitor tracking, social listening, and AI-agent workflows that need structured public Facebook page data without managing browser sessions or cookies.

### What it extracts

Page records can include:

- `pageUrl`, `pageId`, `pageName`, and `username`
- `category`, `intro`, and `about`
- `website`, `email`, `phone`, and `address`
- `rating`, `ratingCount`, `likeCount`, and `followerCount`
- `priceRange`, `services`, `creationDate`, and `businessHours`
- `profilePictureUrl`, `coverPhotoUrl`, and ad-library status

Post records can include:

- `postUrl` and `postId`
- `pageName`, `pageUrl`, `authorName`, and `authorUrl`
- `text`
- `timestamp` and `timestampText`
- `reactionsCount`, `commentsCount`, and `sharesCount`
- `media`
- `externalLinks`
- `scrapeSource`
- `scrapedAt`

Provider mode uses public-data APIs when owner API keys are configured. HTML fallback uses mobile/basic Facebook pages when provider data is unavailable. Facebook can still hide, delete, age-gate, or login-gate content; those sources are reported in the `OUTPUT` summary instead of hard-failing the run.

### Input

```json
{
  "startUrls": [
    { "url": "https://www.facebook.com/NASA" },
    { "url": "https://www.facebook.com/meta" }
  ],
  "resultsLimit": 100,
  "includePosts": true,
  "maxPostsPerPage": 25,
  "getBusinessHours": false,
  "fallbackProvider": "auto",
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
````

### Output examples

Page record:

```json
{
  "recordType": "page",
  "sourceUrl": "https://www.facebook.com/NASA",
  "pageUrl": "https://www.facebook.com/NASA",
  "pageId": "12345",
  "pageName": "NASA",
  "username": "NASA",
  "category": "Government organization",
  "website": "https://www.nasa.gov/",
  "email": "public-info@nasa.gov",
  "followerCount": 27000000,
  "likeCount": 25000000,
  "isRunningAds": false,
  "scrapeSource": "scrapecreators",
  "scrapedAt": "2026-06-10T00:00:00.000Z"
}
```

Post record:

```json
{
  "recordType": "post",
  "sourceUrl": "https://www.facebook.com/NASA",
  "pageUrl": "https://www.facebook.com/NASA",
  "pageName": "NASA",
  "postUrl": "https://www.facebook.com/NASA/posts/123456789",
  "postId": "123456789",
  "text": "A new image from space...",
  "timestamp": "2026-06-01T00:00:00.000Z",
  "reactionsCount": 1200,
  "commentsCount": 34,
  "sharesCount": 5,
  "media": ["https://scontent.xx.fbcdn.net/image.jpg"],
  "externalLinks": ["https://www.nasa.gov/"],
  "scrapeSource": "scrapecreators",
  "scrapedAt": "2026-06-10T00:00:00.000Z"
}
```
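
Page and post records arrive in the same dataset, so a consumer usually needs to separate them. The following is a minimal sketch of one way to do that, assuming every record carries the `recordType` and `sourceUrl` fields shown in the examples above. The helper name `group_by_source` is illustrative and not part of the Actor.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable


def group_by_source(items: Iterable[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Group dataset items by sourceUrl into one page record plus its posts."""
    grouped: Dict[str, Dict[str, Any]] = defaultdict(
        lambda: {"page": None, "posts": []}
    )
    for item in items:
        source = item.get("sourceUrl")
        if source is None:
            continue  # Assumption: records without a sourceUrl are skipped.
        record_type = item.get("recordType")
        if record_type == "page":
            grouped[source]["page"] = item
        elif record_type == "post":
            grouped[source]["posts"].append(item)
    return dict(grouped)
```

The items passed in can come from `client.dataset(...).iterate_items()` in the Python client. A source whose page could not be extracted but whose posts were returned would show up with `"page": None`.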

### Notes

- Public data only. Private pages, deleted pages, age-gated pages, and login-only posts may not return results.
- Provider mode is optional. Set `SCRAPECREATORS_API_KEY`, `SOCIAVAULT_API_KEY`, or `SOCIALVAULT_API_KEY` in the Actor environment and keep `fallbackProvider` as `auto` to use public-data APIs.
- HTML fallback uses Residential Apify Proxy by default because Facebook often blocks datacenter IPs.
- Date filters apply to post records only. Page records are always emitted when a page can be extracted.
- `includeRawHtml` is for debugging only and increases dataset size.

### Pricing

Pay-per-event configuration:

- `apify-actor-start`: `$0.00005`
- `page-scraped`: `$0.01` per page/profile record
- `post-scraped`: `$0.0035` per post record

`resultsLimit` and `maxPostsPerPage` are the primary user-facing cost caps.
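
To see how these caps translate into money, the sketch below computes an upper bound on the event charges for a single run from the prices listed above. It is an estimate under stated assumptions: the start event is charged once per run, at most one page record is emitted per start URL, and Apify platform usage is not included. The function name `max_event_cost` is hypothetical.

```python
from decimal import Decimal

# Event prices copied from the pay-per-event list above (USD).
ACTOR_START = Decimal("0.00005")
PAGE_SCRAPED = Decimal("0.01")
POST_SCRAPED = Decimal("0.0035")


def max_event_cost(
    num_urls: int,
    max_posts_per_page: int,
    results_limit: int,
    include_posts: bool = True,
) -> Decimal:
    """Upper bound on event charges for one run, excluding platform usage."""
    # Page records cost more than post records, so the bound fills the
    # record budget with pages first and posts second.
    pages = min(num_urls, results_limit)
    posts = 0
    if include_posts:
        posts = min(pages * max_posts_per_page, results_limit - pages)
    # Assumption: the start event is charged once per run.
    return ACTOR_START + pages * PAGE_SCRAPED + posts * POST_SCRAPED
```

For two start URLs with `maxPostsPerPage` set to 25 and `resultsLimit` set to 100, the bound is 0.00005 + 2 × 0.01 + 50 × 0.0035 = 0.19505 USD.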

### Local development

```bash
npm install
npm test
apify run --input='{"startUrls":[{"url":"https://www.facebook.com/NASA"}],"resultsLimit":10,"includePosts":true,"maxPostsPerPage":3}'
```

# Actor input Schema

## `startUrls` (type: `array`):

One or more public Facebook page or profile URLs. Direct post, group, marketplace, and search URLs are ignored because this actor is scoped to pages/profiles. Private, deleted, age-gated, or login-only pages may return no data.
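
Since out-of-scope URLs are silently ignored, it can help to screen a URL list before starting a run. The sketch below is a rough client-side heuristic only. The Actor's real matching rules are not documented here, so the path segments checked (`groups`, `marketplace`, `search`, `posts`, `permalink.php`, `story.php`) and the function name `looks_like_page_url` are assumptions.

```python
from urllib.parse import urlparse

# Assumption: first path segments that mark group, marketplace, or search URLs.
NON_PAGE_FIRST_SEGMENTS = {"groups", "marketplace", "search"}
# Assumption: path segments that mark a direct post URL.
POST_SEGMENTS = {"posts", "permalink.php", "story.php"}


def looks_like_page_url(url: str) -> bool:
    """Heuristically decide whether a URL points at a page or profile."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host != "facebook.com" and not host.endswith(".facebook.com"):
        return False
    segments = [segment.lower() for segment in parsed.path.split("/") if segment]
    if not segments:
        return False  # The bare domain is not a page.
    if segments[0] in NON_PAGE_FIRST_SEGMENTS:
        return False
    if any(segment in POST_SEGMENTS for segment in segments):
        return False
    return True
```

A filter like this only reduces surprises in the run summary. Whether a page is private, deleted, or login-gated can only be determined by the run itself.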

## `resultsLimit` (type: `integer`):

Maximum number of dataset records to emit across page and post records. Use this as the primary cost cap.

## `includePosts` (type: `boolean`):

When enabled, the actor emits recent post records for each page after the page profile record.

## `maxPostsPerPage` (type: `integer`):

Maximum number of recent public posts to collect from each page. For page-only enrichment, set this to 0 and disable Include recent posts (`includePosts`).

## `scrapePostDetails` (type: `boolean`):

HTML fallback option. Opens each discovered post URL to enrich media, outbound links, and engagement counts. Provider-backed posts are already returned as structured records.

## `getBusinessHours` (type: `boolean`):

Provider option. Requests business hours for pages when the public provider endpoint exposes them.

## `fallbackProvider` (type: `string`):

`auto` uses `SCRAPECREATORS_API_KEY` first and `SOCIAVAULT_API_KEY` second when configured, then falls back to public HTML. `none` disables provider calls and uses public HTML only.
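
The ordering described for `auto` and `none` can be pictured as a small resolution function. This is an illustrative sketch, not the Actor's source code. The function name `resolve_source_order` and the labels `"sociavault"` and `"html"` are assumptions (`"scrapecreators"` matches the `scrapeSource` value in the output examples). The other enum values, `scrapeCreators` and `sociaVault`, are not described in this document and are deliberately left out.

```python
from typing import List, Mapping


def resolve_source_order(fallback_provider: str, env: Mapping[str, str]) -> List[str]:
    """Return the data sources to try, in order, for the given provider mode."""
    if fallback_provider == "none":
        return ["html"]  # Provider calls disabled, public HTML only.
    if fallback_provider != "auto":
        # The remaining modes are undocumented here, so the sketch refuses them.
        raise ValueError(f"mode not covered by this sketch: {fallback_provider!r}")
    order: List[str] = []
    if env.get("SCRAPECREATORS_API_KEY"):
        order.append("scrapecreators")
    if env.get("SOCIAVAULT_API_KEY"):
        order.append("sociavault")
    order.append("html")  # Public HTML is always the last resort.
    return order
```

Passing `os.environ` as `env` mirrors how the keys would be read inside a run. With no keys configured, `auto` and `none` resolve to the same HTML-only order.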

## `onlyPostsNewerThan` (type: `string`):

Optional post date filter. Accepts ISO dates such as 2026-01-01 or natural JavaScript date strings. Posts with unparseable timestamps are kept.

## `onlyPostsOlderThan` (type: `string`):

Optional upper post date filter. Accepts ISO dates such as 2026-06-01 or natural JavaScript date strings. Posts with unparseable timestamps are kept.
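
The rule that posts with unparseable timestamps are kept is easy to get backwards, so here is a minimal sketch of the same rule applied client-side to post records. It simplifies in three ways that are assumptions rather than documented behavior: only ISO 8601 strings are parsed (the Actor also accepts JavaScript date strings), both bounds are inclusive, and values without a UTC offset are read as UTC. The helper names are hypothetical.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def parse_timestamp(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 string into a UTC datetime; return None on failure."""
    if not isinstance(value, str) or not value:
        return None
    if value.endswith("Z"):
        # Older Python versions reject a trailing "Z", so spell out the offset.
        value = value[:-1] + "+00:00"
    try:
        parsed = datetime.fromisoformat(value)
    except ValueError:
        return None
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def keep_post(
    post: Dict[str, Any],
    newer_than: Optional[str] = None,
    older_than: Optional[str] = None,
) -> bool:
    """Apply both date bounds; posts with unparseable timestamps are kept."""
    timestamp = parse_timestamp(post.get("timestamp"))
    if timestamp is None:
        return True
    lower = parse_timestamp(newer_than)  # An unparseable bound is ignored.
    upper = parse_timestamp(older_than)
    if lower is not None and timestamp < lower:
        return False
    if upper is not None and timestamp > upper:
        return False
    return True
```

Keeping undated posts favors recall over precision: a post whose date could not be read is returned rather than silently dropped, and the consumer can discard it later.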

## `includeRawHtml` (type: `boolean`):

Debug option for HTML fallback. Adds a truncated rawHtml field to page/detail records. Leave disabled for normal runs because raw HTML increases dataset size.

## `proxyConfiguration` (type: `object`):

Proxy settings for HTML fallback requests. Residential Apify Proxy is the default because Facebook often blocks datacenter IPs and local network ranges.

## Actor input object example

```json
{
  "startUrls": [
    {
      "url": "https://www.facebook.com/NASA"
    },
    {
      "url": "https://www.facebook.com/meta"
    }
  ],
  "resultsLimit": 250,
  "includePosts": true,
  "maxPostsPerPage": 50,
  "scrapePostDetails": false,
  "getBusinessHours": false,
  "fallbackProvider": "auto",
  "onlyPostsNewerThan": "2026-01-01",
  "onlyPostsOlderThan": "2026-06-01",
  "includeRawHtml": false,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  }
}
```
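
When the input is assembled in code, checking the numeric bounds before the run starts avoids a rejected request. The sketch below builds an input object like the one above and enforces the ranges that appear in the OpenAPI specification at the end of this document (`resultsLimit` 1 to 10000, `maxPostsPerPage` 0 to 1000). The function name `build_input` is illustrative, and only a subset of the fields is covered.

```python
from typing import Any, Dict, Iterable

# Bounds taken from the input schema in the OpenAPI specification.
RESULTS_LIMIT_RANGE = (1, 10000)
MAX_POSTS_RANGE = (0, 1000)


def build_input(
    urls: Iterable[str],
    results_limit: int = 100,
    include_posts: bool = True,
    max_posts_per_page: int = 25,
) -> Dict[str, Any]:
    """Build an Actor input dict, validating the documented numeric bounds."""
    start_urls = [{"url": url} for url in urls]
    if not start_urls:
        raise ValueError("startUrls is required and must not be empty")
    low, high = RESULTS_LIMIT_RANGE
    if not low <= results_limit <= high:
        raise ValueError(f"resultsLimit must be between {low} and {high}")
    low, high = MAX_POSTS_RANGE
    if not low <= max_posts_per_page <= high:
        raise ValueError(f"maxPostsPerPage must be between {low} and {high}")
    return {
        "startUrls": start_urls,
        "resultsLimit": results_limit,
        "includePosts": include_posts,
        "maxPostsPerPage": max_posts_per_page,
    }
```

The returned dict can be passed as `run_input` to the Python client. Fields that are omitted fall back to the defaults declared in the schema.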

# Actor output Schema

## `dataset` (type: `string`):

No description

## `summary` (type: `string`):

No description

## `runSummary` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        {
            "url": "https://www.facebook.com/NASA"
        }
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("khadinakbar/facebook-page-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "startUrls": [{ "url": "https://www.facebook.com/NASA" }] }

# Run the Actor and wait for it to finish
run = client.actor("khadinakbar/facebook-page-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    {
      "url": "https://www.facebook.com/NASA"
    }
  ]
}' |
apify call khadinakbar/facebook-page-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=khadinakbar/facebook-page-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Facebook Page Scraper",
        "description": "Scrape public Facebook page/profile URLs into clean page profile records plus optional recent post records. Extracts names, categories, contacts, websites, followers, likes, ratings, ad status, post text, timestamps, engagement counts, media, and links. No login required. MCP-ready.",
        "version": "1.0",
        "x-build-id": "7IyYiqEMCg6Lf4OVr"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/khadinakbar~facebook-page-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-khadinakbar-facebook-page-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/khadinakbar~facebook-page-scraper/runs": {
            "post": {
                "operationId": "runs-sync-khadinakbar-facebook-page-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/khadinakbar~facebook-page-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-khadinakbar-facebook-page-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "startUrls"
                ],
                "properties": {
                    "startUrls": {
                        "title": "Facebook page URLs",
                        "type": "array",
                        "description": "One or more public Facebook page or profile URLs. Direct post, group, marketplace, and search URLs are ignored because this actor is scoped to pages/profiles. Private, deleted, age-gated, or login-only pages may return no data.",
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    },
                    "resultsLimit": {
                        "title": "Total records limit",
                        "minimum": 1,
                        "maximum": 10000,
                        "type": "integer",
                        "description": "Maximum number of dataset records to emit across page and post records. Use this as the primary cost cap.",
                        "default": 100
                    },
                    "includePosts": {
                        "title": "Include recent posts",
                        "type": "boolean",
                        "description": "When enabled, the actor emits recent post records for each page after the page profile record.",
                        "default": true
                    },
                    "maxPostsPerPage": {
                        "title": "Max posts per page",
                        "minimum": 0,
                        "maximum": 1000,
                        "type": "integer",
                        "description": "Maximum number of recent public posts to collect from each page. Set to 0 with Include recent posts disabled for pure page enrichment.",
                        "default": 25
                    },
                    "scrapePostDetails": {
                        "title": "Open post details in HTML fallback",
                        "type": "boolean",
                        "description": "HTML fallback option. Opens each discovered post URL to enrich media, outbound links, and engagement counts. Provider-backed posts are already returned as structured records.",
                        "default": false
                    },
                    "getBusinessHours": {
                        "title": "Get business hours",
                        "type": "boolean",
                        "description": "Provider option. Requests business hours for pages when the public provider endpoint exposes them.",
                        "default": false
                    },
                    "fallbackProvider": {
                        "title": "Provider mode",
                        "enum": [
                            "auto",
                            "none",
                            "scrapeCreators",
                            "sociaVault"
                        ],
                        "type": "string",
                        "description": "auto uses SCRAPECREATORS_API_KEY first and SOCIAVAULT_API_KEY second when configured, then falls back to public HTML. none disables provider calls and uses public HTML only.",
                        "default": "auto"
                    },
                    "onlyPostsNewerThan": {
                        "title": "Only posts newer than",
                        "type": "string",
                        "description": "Optional post date filter. Accepts ISO dates such as 2026-01-01 or natural JavaScript date strings. Posts with unparseable timestamps are kept."
                    },
                    "onlyPostsOlderThan": {
                        "title": "Only posts older than",
                        "type": "string",
                        "description": "Optional upper post date filter. Accepts ISO dates such as 2026-06-01 or natural JavaScript date strings. Posts with unparseable timestamps are kept."
                    },
                    "includeRawHtml": {
                        "title": "Include raw HTML",
                        "type": "boolean",
                        "description": "Debug option for HTML fallback. Adds a truncated rawHtml field to page/detail records. Leave disabled for normal runs because raw HTML increases dataset size.",
                        "default": false
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Proxy settings for HTML fallback requests. Residential Apify Proxy is the default because Facebook often blocks datacenter IPs and local network ranges.",
                        "default": {
                            "useApifyProxy": true,
                            "apifyProxyGroups": [
                                "RESIDENTIAL"
                            ]
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
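
For projects that cannot use the client libraries, the `run-sync-get-dataset-items` path in the specification above can be called directly. The following is a minimal sketch using only the Python standard library. It assumes the endpoint returns the dataset items as a JSON body, and the names `build_sync_request` and `run_sync` are hypothetical. The token is sent as the `token` query parameter, as declared in the specification.

```python
import json
from typing import Any, Dict
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "https://api.apify.com/v2"
ACTOR_ID = "khadinakbar~facebook-page-scraper"


def build_sync_request(token: str, actor_input: Dict[str, Any]) -> Request:
    """Build the POST request for the run-sync-get-dataset-items endpoint."""
    query = urlencode({"token": token})
    url = f"{BASE_URL}/acts/{ACTOR_ID}/run-sync-get-dataset-items?{query}"
    body = json.dumps(actor_input).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(token: str, actor_input: Dict[str, Any], timeout: float = 300.0) -> Any:
    """Send the request and decode the response, assuming a JSON body."""
    request = build_sync_request(token, actor_input)
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `run_sync("<YOUR_API_TOKEN>", {"startUrls": [{"url": "https://www.facebook.com/NASA"}]})` blocks until the run finishes, so it suits small runs. For longer jobs, the `/runs` path starts the run and returns immediately with the run information.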
