# Facebook Scraper (`morph_coder/facebook-scraper`) Actor

Scrape public Facebook content without login.

- **URL**: https://apify.com/morph_coder/facebook-scraper.md
- **Developed by:** [Morph Coder](https://apify.com/morph_coder) (community)
- **Categories:** Social media, Automation
- **Stats:** 3 total users, 2 monthly users, 100% of runs succeeded
- **User rating**: No ratings yet

## Pricing

From $5.00 / 1,000 posts

This Actor is billed per event and by usage: you pay a fixed price for specific events plus the cost of Apify platform usage.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

Actors can be integrated into projects on any stack, and an integration should be safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows (PowerShell)
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Facebook Post Scraper

### What is Facebook Post Scraper?

Facebook Post Scraper is a lightweight Apify Actor that extracts **public Facebook posts from direct post URLs**. Paste one or more links to individual posts (or post IDs), run the Actor, and get structured JSON with text, engagement metrics, and media metadata—without using the official Facebook API.

It is built for **known post links**: monitoring specific announcements, archiving posts you already found, or feeding URLs from search/other tools into a pipeline. It uses fast HTTP scraping (Cheerio), not a full browser, so runs are cheap and quick when Facebook serves public HTML.

**Best for:** post URLs you already have · public pages and profiles · batch export to JSON/CSV via Apify  
**Not for:** crawling an entire page feed by profile URL only · hashtag search · logged-in-only insights

---

### Why use this Actor?

- **Direct post URLs** — one row per `startUrls` entry; no Graph API app review
- **Full post text** when Facebook embeds `message.text` in the page (not truncated `og:description`)
- **Engagement** — reactions, comments, shares; `views` when present in public HTML (often `null` without login)
- **Media** — image/video URLs, thumbnails, optional **ocrText** (Facebook accessibility captions, not a separate vision model)
- **Optional video transcript** — from embedded captions when `videoTranscript: true`
- **Optional Apify Proxy** — residential proxy if datacenter IPs are blocked
- **Export anywhere** — dataset on Apify, or JSON/CSV/Excel and API clients (Python/Node)

---

### What data can you extract?

| Field | Description |
|--------|-------------|
| `text` | Post body (full text from embedded JSON when available) |
| `url` | Canonical post URL when found in HTML |
| `postId` | Numeric or `pfbid` id |
| `reactions` / `likes` | Reaction count (`likes` mirrors `reactions`) |
| `comments` | Comment count |
| `shares` | Share count |
| `views` | View/play count if embedded (often `null` on public pages) |
| `publishedAt` | ISO timestamp when available |
| `authorName` | From Open Graph / page metadata when available |
| `ocrText` | Image accessibility text (e.g. “May be an image of…”) |
| `videoTranscript` | Auto captions from HTML when enabled and present |
| `media[]` | `url`, `thumbnail`, `ocrText`, `mediaType` (`photo` / `video`) |
| `sourceUrl` | Your input URL |
| `scrapedAt` | Extraction time |

Errors are returned as `{ "type": "error", "code", "message" }` (e.g. `ACCESS_DENIED`, `NOT_FOUND`).
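
Because post records and error records land in the same dataset, downstream code usually needs to separate them first. The following is a minimal Python sketch, assuming error rows always carry `"type": "error"` and post rows do not (as in the examples in this README). The helper names `split_items` and `errors_by_code` are hypothetical and not part of the Actor.

```python
from typing import Any, Dict, List, Tuple

Item = Dict[str, Any]


def split_items(items: List[Item]) -> Tuple[List[Item], List[Item]]:
    """Separate successful post records from error records."""
    posts: List[Item] = []
    errors: List[Item] = []
    for item in items:
        # Assumption: only error rows have type == "error".
        if item.get("type") == "error":
            errors.append(item)
        else:
            posts.append(item)
    return posts, errors


def errors_by_code(errors: List[Item]) -> Dict[str, int]:
    """Count error rows per code, e.g. ACCESS_DENIED or NOT_FOUND."""
    counts: Dict[str, int] = {}
    for error in errors:
        code = error.get("code", "UNKNOWN")
        counts[code] = counts.get(code, 0) + 1
    return counts
```

A high `ACCESS_DENIED` count is a hint to rerun with a proxy, while `NOT_FOUND` usually points at the input URL itself.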

---

### How do I use Facebook Post Scraper?

1. Create a free [Apify account](https://console.apify.com/sign-up).
2. Open this Actor in Apify Console.
3. Add **post URLs** under **Start URLs** (or `postUrls` / `postIds`).
4. Set **Results limit** (max URLs per run, default 20).
5. Optionally enable **Image captions / OCR** or **Video transcript**, and **proxy** if you see login walls.
6. Click **Start**, then download the dataset (JSON, CSV, Excel, etc.).

#### Input

Primary input is a list of **public post URLs**, for example:

```json
{
  "startUrls": [
    {
      "url": "https://www.facebook.com/61559402542547/posts/breaking-scottie-thompson-has-been-traded-to-magnoliais-reportedly-being-traded-/122212876880313418/"
    }
  ],
  "resultsLimit": 20,
  "captionText": true,
  "videoTranscript": false,
  "useProxy": false
}
````

| Field | Description |
|-------|-------------|
| `startUrls` | Post links (`requestListSources` editor) |
| `postUrls` | Alias for `startUrls` |
| `postIds` | Post URL, numeric id, or `pfbid` token |
| `resultsLimit` | Max posts to process per run (1–500) |
| `captionText` | Include `ocrText` on images (default `true`) |
| `videoTranscript` | Include embedded video transcript (default `false`) |
| `useProxy` / `proxy` | Apify Proxy; `useApifyProxy` in proxy editor also enables proxy |

See the **Input** tab in Console for the full JSON schema.
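
When the input is generated by a script rather than typed into Console, it can help to validate it before starting a paid run. Below is a minimal Python sketch that builds the input object from a list of post URLs and checks `resultsLimit` against the documented 1–500 range. The function name `build_input` is hypothetical, and rejecting an empty URL list is a choice of this sketch, not documented Actor behavior.

```python
from typing import Any, Dict, List

RESULTS_LIMIT_MIN = 1    # documented lower bound for resultsLimit
RESULTS_LIMIT_MAX = 500  # documented upper bound for resultsLimit


def build_input(
    post_urls: List[str],
    results_limit: int = 20,
    caption_text: bool = True,
    video_transcript: bool = False,
    use_proxy: bool = False,
) -> Dict[str, Any]:
    """Build an Actor input dict from plain post URLs."""
    if not post_urls:
        # Sketch-level guard: avoid starting a run with nothing to scrape.
        raise ValueError("at least one post URL is required")
    if not RESULTS_LIMIT_MIN <= results_limit <= RESULTS_LIMIT_MAX:
        raise ValueError(
            f"resultsLimit must be between {RESULTS_LIMIT_MIN} "
            f"and {RESULTS_LIMIT_MAX}, got {results_limit}"
        )
    return {
        # startUrls expects objects with a "url" key, one per post.
        "startUrls": [{"url": url} for url in post_urls],
        "resultsLimit": results_limit,
        "captionText": caption_text,
        "videoTranscript": video_transcript,
        "useProxy": use_proxy,
    }
```

The returned dict can be passed as the run input to any of the API clients.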

#### Output

Results are stored in the run **Dataset**. Example post record:

```json
{
  "url": "https://www.facebook.com/permalink.php?story_fbid=…",
  "text": "🚨BREAKING: …\n\nMagnolia will get:\nScottie Thompson",
  "reactions": 56,
  "comments": 12,
  "shares": 19,
  "views": null,
  "likes": 56,
  "ocrText": "May be an image of basketball, basketball jersey and text that says '…'",
  "videoTranscript": null,
  "media": [
    {
      "url": "https://scontent.xx.fbcdn.net/…",
      "thumbnail": "https://scontent.xx.fbcdn.net/…",
      "ocrText": "May be an image of basketball…",
      "mediaType": "photo"
    }
  ],
  "postId": "122212876880313418",
  "authorName": null,
  "publishedAt": "2024-01-15T12:00:00.000Z",
  "sourceUrl": "https://www.facebook.com/…/posts/…",
  "scrapedAt": "2026-06-01T15:14:00.529Z"
}
```
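
A record like the one above can be reduced to a small engagement summary. The Python sketch below assumes the field names shown in this README. Since `likes` mirrors `reactions`, only `reactions` is counted, and `views` is passed through unchanged because it is often `null`. The function name `engagement_summary` is hypothetical.

```python
from typing import Any, Dict


def engagement_summary(post: Dict[str, Any]) -> Dict[str, Any]:
    """Summarize engagement for one post record."""

    def count(key: str) -> int:
        # Treat missing or null counters as zero.
        value = post.get(key)
        return value if isinstance(value, int) else 0

    return {
        "postId": post.get("postId"),
        # likes mirrors reactions, so it is not added a second time.
        "interactions": count("reactions") + count("comments") + count("shares"),
        # Kept as-is: None means Facebook did not expose a view count.
        "views": post.get("views"),
        "hasMedia": bool(post.get("media")),
    }
```

For the example record, the three counters 56, 12, and 19 add up to 87 interactions.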

---

### Proxy and access

Facebook may return a login wall for cloud IPs. If you see `ACCESS_DENIED`:

1. Set **Use Apify proxy** to `true`, or enable **use Apify Proxy** in the proxy editor.
2. Prefer **RESIDENTIAL** proxy groups for difficult posts.
3. Confirm the post is **public** and the URL opens in a private browser without logging in.

This Actor does **not** accept Facebook cookies or passwords (public content only).
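
The retry steps above can be sketched in Python as follows. The sketch builds a second input containing only the URLs that did not produce a post, with the proxy enabled. It assumes that `sourceUrl` on a post record matches the input URL string exactly, and that the Actor honors the standard Apify proxy field `apifyProxyGroups`; neither is confirmed by this README. The function names are hypothetical.

```python
from typing import Any, Dict, List, Optional

Item = Dict[str, Any]


def missing_urls(input_urls: List[str], items: List[Item]) -> List[str]:
    """Return input URLs that have no successful post record."""
    done = {
        item.get("sourceUrl")
        for item in items
        if item.get("type") != "error"
    }
    return [url for url in input_urls if url not in done]


def saw_access_denied(items: List[Item]) -> bool:
    """True if any error row reports ACCESS_DENIED."""
    return any(
        item.get("type") == "error" and item.get("code") == "ACCESS_DENIED"
        for item in items
    )


def build_proxy_retry_input(
    input_urls: List[str], items: List[Item]
) -> Optional[Dict[str, Any]]:
    """Build a retry input with a residential proxy, or None if not needed."""
    if not saw_access_denied(items):
        return None
    retry = missing_urls(input_urls, items)
    if not retry:
        return None
    return {
        "startUrls": [{"url": url} for url in retry],
        "useProxy": True,
        "proxy": {
            "useApifyProxy": True,
            # Assumption: the Actor passes proxy groups through to Apify Proxy.
            "apifyProxyGroups": ["RESIDENTIAL"],
        },
    }
```

Retrying only the missing URLs avoids paying the `post` event again for posts that already succeeded.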

---

### API and integrations

Run the Actor via [Apify API](https://docs.apify.com/api/v2), `apify-client` (Node.js / Python), webhooks, or integrations (Make, Zapier, Google Sheets, etc.). Use the **API** tab in Console for a ready-made code snippet.
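
For projects that cannot add a client library, the REST API can be called with the Python standard library alone. The sketch below targets the `run-sync-get-dataset-items` endpoint listed in this Actor's OpenAPI specification and passes the token as a query parameter, as that specification describes. It assumes the response body is a JSON array of dataset items (the default format). The function names and the 300-second timeout are illustrative choices.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict, List

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "morph_coder~facebook-scraper"


def build_sync_request(token: str, actor_input: Dict[str, Any]) -> urllib.request.Request:
    """Prepare a POST request that runs the Actor and returns dataset items."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(token: str, actor_input: Dict[str, Any], timeout: float = 300.0) -> List[Dict[str, Any]]:
    """Send the request and decode the JSON response (performs a network call)."""
    request = build_sync_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Splitting request construction from sending keeps the URL and body easy to inspect or log before a billed run starts. For long runs, the asynchronous `/runs` endpoint is likely a better fit than waiting on a single HTTP response.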

---

### Pricing

This Actor uses [pay-per-event](https://docs.apify.com/platform/actors/publishing/monetize/pay-per-event) billing:

| Event | When charged |
|-------|----------------|
| `apify-actor-start` | Each run (synthetic; configured in Console) |
| **`post`** | Each **successful** post written to the dataset |

Error rows (`type: "error"`) are **not** charged. Set the `post` price in **Publication → Monetization** and mark it as the primary event.

Local test with PPE: `ACTOR_TEST_PAY_PER_EVENT=1 apify run`.
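
The event charges for a run can be estimated from its dataset, since only successful posts trigger the `post` event. The Python sketch below uses the listed starting price of $5.00 per 1,000 posts as the default; the actual `post` price, the `apify-actor-start` price (defaulted to zero here), and platform usage costs may differ and are assumptions of this example. The function name `estimate_event_cost` is hypothetical.

```python
from decimal import Decimal
from typing import Any, Dict, List

# Assumption: the listed "from $5.00 / 1,000 posts" price applies.
PRICE_PER_POST = Decimal("5.00") / Decimal(1000)


def estimate_event_cost(
    items: List[Dict[str, Any]],
    price_per_post: Decimal = PRICE_PER_POST,
    start_price: Decimal = Decimal("0"),
) -> Decimal:
    """Estimate pay-per-event charges in USD, excluding platform usage."""
    # Error rows (type == "error") are not charged.
    charged_posts = sum(1 for item in items if item.get("type") != "error")
    return start_price + price_per_post * charged_posts
```

`Decimal` is used instead of `float` so that small per-event prices add up without rounding drift.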

---

### FAQ

**Can I scrape all posts from a page by pasting only `https://www.facebook.com/pagename/`?**\
No. This Actor expects **post URLs** (or post IDs). To collect many posts from a page, you need those links from another source or a feed-oriented Actor.

**How is this different from large “Facebook Posts Scraper” products on the store?**\
Those often crawl page feeds, filters, and rich page metadata in a browser. This Actor is a **focused, URL-in → post-out** tool: fast, transparent, and ideal when you already have post links.

**Is `ocrText` from ChatGPT / vision AI?**\
No. It is read from Facebook’s own accessibility / OCR fields in the page JSON when available.

**Why is `views` null?**\
Facebook usually does not expose view counts on anonymous HTML; they may appear for some videos or with a logged-in session.

**Is scraping legal?**\
Only scrape public data you are allowed to use. You are responsible for compliance with [Meta’s Terms](https://www.facebook.com/terms.php) and applicable law. Do not scrape private or personal data without a lawful basis.

---

### Develop locally

```bash
cd actors/facebook-scraper
npm install
npm run build
npm run test:unit
apify run
```

Example input: `scripts/test-input.example.json`

### Deploy

```bash
cd actors/facebook-scraper
apify login
apify push
```

CI path: `actors/facebook-scraper` (use forward slashes on Linux).

# Actor input Schema

## `startUrls` (type: `array`):

Direct links to public Facebook posts (one post per URL).

## `postUrls` (type: `array`):

Same as startUrls.

## `postIds` (type: `array`):

Numeric post IDs, pfbid tokens, or post URLs.
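
Since `postIds` accepts three kinds of values, a small pre-check can catch malformed entries before a run. The Python sketch below classifies each entry. The patterns are assumptions made for illustration: a URL starts with `http://` or `https://`, a numeric ID is ASCII digits only, and a `pfbid` token is the prefix `pfbid` followed by letters and digits. The Actor's own parsing rules may be broader or stricter.

```python
import re

_URL_RE = re.compile(r"https?://", re.IGNORECASE)
_NUMERIC_RE = re.compile(r"[0-9]+")
# Assumption: pfbid tokens consist of the prefix plus alphanumerics.
_PFBID_RE = re.compile(r"pfbid[0-9A-Za-z]+")


def classify_post_ref(value: str) -> str:
    """Return 'url', 'numeric', 'pfbid', or 'unknown' for a postIds entry."""
    ref = value.strip()
    if _URL_RE.match(ref):
        return "url"
    if _NUMERIC_RE.fullmatch(ref):
        return "numeric"
    if _PFBID_RE.fullmatch(ref):
        return "pfbid"
    return "unknown"
```

Entries classified as `unknown` are worth reviewing by hand before they are sent to the Actor.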

## `resultsLimit` (type: `integer`):

Maximum number of post URLs to process per run.

## `useProxy` (type: `boolean`):

Enable if Facebook blocks datacenter IPs. Proxy editor (useApifyProxy) also enables proxy.

## `proxy` (type: `object`):

Optional Apify proxy.

## `maxRetries` (type: `integer`):

HTTP retry attempts per failed request.

## `maxConcurrency` (type: `integer`):

Maximum parallel Cheerio requests.

## `captionText` (type: `boolean`):

Include Facebook accessibility OCR text (ocrText) on images when present in HTML. No external vision API.

## `videoTranscript` (type: `boolean`):

Include auto-generated video transcript/captions when Facebook embeds them in the page HTML.

## Actor input object example

```json
{
  "startUrls": [
    {
      "url": "https://www.facebook.com/61559402542547/posts/breaking-scottie-thompson-has-been-traded-to-magnoliais-reportedly-being-traded-/122212876880313418/"
    }
  ],
  "resultsLimit": 20,
  "useProxy": false,
  "proxy": {
    "useApifyProxy": false
  },
  "maxRetries": 3,
  "maxConcurrency": 5,
  "captionText": true,
  "videoTranscript": false
}
```

# Actor output Schema

## `results` (type: `string`):

Dataset items from this run.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        {
            "url": "https://www.facebook.com/61559402542547/posts/breaking-scottie-thompson-has-been-traded-to-magnoliais-reportedly-being-traded-/122212876880313418/"
        }
    ],
    "proxy": {
        "useApifyProxy": false
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("morph_coder/facebook-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrls": [{ "url": "https://www.facebook.com/61559402542547/posts/breaking-scottie-thompson-has-been-traded-to-magnoliais-reportedly-being-traded-/122212876880313418/" }],
    "proxy": { "useApifyProxy": False },
}

# Run the Actor and wait for it to finish
run = client.actor("morph_coder/facebook-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    {
      "url": "https://www.facebook.com/61559402542547/posts/breaking-scottie-thompson-has-been-traded-to-magnoliais-reportedly-being-traded-/122212876880313418/"
    }
  ],
  "proxy": {
    "useApifyProxy": false
  }
}' |
apify call morph_coder/facebook-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=morph_coder/facebook-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Facebook Scraper",
        "description": "Scrape public Facebook content without login.",
        "version": "0.0",
        "x-build-id": "RObkd03aGbZ7xliJC"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/morph_coder~facebook-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-morph_coder-facebook-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/morph_coder~facebook-scraper/runs": {
            "post": {
                "operationId": "runs-sync-morph_coder-facebook-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/morph_coder~facebook-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-morph_coder-facebook-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "startUrls": {
                        "title": "Start URLs",
                        "type": "array",
                        "description": "Direct links to public Facebook posts (one post per URL).",
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    },
                    "postUrls": {
                        "title": "Post URLs (alias)",
                        "type": "array",
                        "description": "Same as startUrls.",
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    },
                    "postIds": {
                        "title": "Post IDs",
                        "type": "array",
                        "description": "Numeric post IDs, pfbid tokens, or post URLs.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "resultsLimit": {
                        "title": "Results limit",
                        "minimum": 1,
                        "maximum": 500,
                        "type": "integer",
                        "description": "Maximum number of post URLs to process per run.",
                        "default": 20
                    },
                    "useProxy": {
                        "title": "Use Apify proxy",
                        "type": "boolean",
                        "description": "Enable if Facebook blocks datacenter IPs. Proxy editor (useApifyProxy) also enables proxy.",
                        "default": false
                    },
                    "proxy": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Optional Apify proxy."
                    },
                    "maxRetries": {
                        "title": "Max retries",
                        "minimum": 1,
                        "maximum": 5,
                        "type": "integer",
                        "description": "HTTP retry attempts per failed request.",
                        "default": 3
                    },
                    "maxConcurrency": {
                        "title": "Max concurrency",
                        "minimum": 1,
                        "maximum": 20,
                        "type": "integer",
                        "description": "Maximum parallel Cheerio requests.",
                        "default": 5
                    },
                    "captionText": {
                        "title": "Image captions / OCR",
                        "type": "boolean",
                        "description": "Include Facebook accessibility OCR text (ocrText) on images when present in HTML. No external vision API.",
                        "default": true
                    },
                    "videoTranscript": {
                        "title": "Video transcript",
                        "type": "boolean",
                        "description": "Include auto-generated video transcript/captions when Facebook embeds them in the page HTML.",
                        "default": false
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
