Facebook Page Post Intelligence | Public Pages & Posts Scraper
Pricing
Pay per usage
Scrape public Facebook page metadata and posts without API keys. Get page followers, post text, reactions, shares, comments, and engagement signals for analytics and research.
Developer
太郎 山田
Facebook Page Post Intelligence
Public Facebook page metadata and post scraper — no API keys, no login required.
Collect page-level intelligence (followers, likes, description, category, website) and recent public post signals (text, timestamps, reactions, comments, shares) from public Facebook Pages.
Status
Live V1 — Public page metadata extraction and recent post parsing with honest degraded/partial/blocked status reporting.
⚠️ Important: Facebook aggressively guards content behind login walls. This actor targets the subset of data that Facebook serves to logged-out visitors. Success rates vary by page, region, and Facebook's current challenge policy.
What this actor does
| Feature | V1 Status |
|---|---|
| Page name, category, description | ✅ When publicly served |
| Follower count, like count | ✅ When visible in HTML |
| Profile image URL | ✅ Via og:image |
| Website, contact info | ✅ When exposed |
| Verified badge | ✅ When embedded in page JSON |
| Recent post text | ✅ When served to logged-out visitors |
| Post timestamps | ✅ Parsed from embedded JSON |
| Post reactions/comments/shares | ✅ When exposed (often null) |
| Post media URLs | ✅ When public |
| Hashtag and mention extraction | ✅ Client-side regex from post text |
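The hashtag and mention extraction in the last row can be approximated with two regular expressions. This is a minimal sketch: the patterns and the helper name `extract_tags` are assumptions for illustration, since the actor's actual expressions are not documented here.

```python
import re

# Assumed patterns; the actor's real regexes are not published.
# The lookbehind avoids matching the "@" inside e-mail addresses.
HASHTAG_RE = re.compile(r"(?<!\w)#(\w+)")
MENTION_RE = re.compile(r"(?<!\w)@([A-Za-z0-9_.]+)")


def extract_tags(text):
    """Return (hashtags, mentioned_handles) found in a post body."""
    if not text:
        return [], []
    hashtags = HASHTAG_RE.findall(text)
    # Strip trailing dots so "@NASA." at the end of a sentence yields "NASA".
    mentions = [m.rstrip(".") for m in MENTION_RE.findall(text) if m.strip(".")]
    return hashtags, mentions
```

Because extraction runs on post text only, a post whose text is null yields empty `hashtags` and `mentionedHandles` arrays.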
Explicitly out of scope (V1)
- Private profiles or personal timelines
- Facebook Groups or Events
- Stories, Reels, Live videos
- Full comment thread contents
- Ads (use Meta Ad Library Intelligence instead)
- Pagination beyond what is initially visible
- Any data requiring Facebook login
Input
    {
      "pageUrls": ["https://www.facebook.com/NASA"],
      "maxPostsPerPage": 10,
      "includePosts": true,
      "includePageSummary": true,
      "timeoutMs": 20000,
      "delivery": "dataset",
      "dryRun": false
    }
Input fields
| Field | Type | Default | Description |
|---|---|---|---|
| pageUrls | string[] | [] | Public Facebook Page URLs or bare handles (e.g. https://www.facebook.com/NASA, NASA, @NASA) |
| maxPostsPerPage | integer | 10 | Max recent posts to extract per page (1–50) |
| includePosts | boolean | true | Whether to extract recent posts |
| includePageSummary | boolean | true | Whether to extract page metadata |
| timeoutMs | integer | 20000 | HTTP timeout per request in ms (5000–60000) |
| delivery | string | "dataset" | "dataset" or "webhook" |
| webhookUrl | string | (none) | Webhook endpoint when delivery="webhook" |
| dryRun | boolean | false | Parse and validate but do not push to dataset |
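The ranges in the table above can be checked before a run is started. The following is a sketch of a client-side validator; `build_run_input` is a hypothetical helper, and two of its rules are assumptions rather than documented behaviour: that an empty `pageUrls` list is a mistake, and that `webhookUrl` is mandatory when `delivery` is `"webhook"`.

```python
def build_run_input(page_urls, max_posts_per_page=10, timeout_ms=20000,
                    delivery="dataset", webhook_url=None, dry_run=False):
    """Validate values against the documented ranges and return the input dict."""
    # Assumption: the actor's behaviour for an empty list is not documented.
    if not page_urls:
        raise ValueError("pageUrls must contain at least one page URL or handle")
    if not 1 <= max_posts_per_page <= 50:
        raise ValueError("maxPostsPerPage must be between 1 and 50")
    if not 5000 <= timeout_ms <= 60000:
        raise ValueError("timeoutMs must be between 5000 and 60000")
    if delivery not in ("dataset", "webhook"):
        raise ValueError('delivery must be "dataset" or "webhook"')
    # Assumption: a webhook delivery without an endpoint cannot succeed.
    if delivery == "webhook" and not webhook_url:
        raise ValueError('webhookUrl is required when delivery is "webhook"')

    run_input = {
        "pageUrls": list(page_urls),
        "maxPostsPerPage": max_posts_per_page,
        "includePosts": True,
        "includePageSummary": True,
        "timeoutMs": timeout_ms,
        "delivery": delivery,
        "dryRun": dry_run,
    }
    if webhook_url:
        run_input["webhookUrl"] = webhook_url
    return run_input
```

The returned dict has the same shape as the input example and can be serialized to JSON as the run input.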
Accepted URL formats
- https://www.facebook.com/NASA
- https://fb.com/NASA
- facebook.com/NASA
- NASA (bare handle)
- @NASA (with @ prefix)
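All accepted forms reduce to the same page handle. The sketch below shows one way to normalize them; `normalize_page_ref` and its regular expression are illustrative assumptions, not the actor's own code. It also accepts an `m.` host prefix, which goes beyond the forms listed, and it does not handle numeric-ID URLs.

```python
import re

# Assumed pattern: optional scheme, optional www./m. prefix, facebook.com or fb.com.
URL_RE = re.compile(
    r"^(?:https?://)?(?:www\.|m\.)?(?:facebook|fb)\.com/([^/?#]+)",
    re.IGNORECASE,
)


def normalize_page_ref(ref):
    """Turn any accepted form into (handle, canonical_url)."""
    ref = ref.strip()
    match = URL_RE.match(ref)
    if match:
        handle = match.group(1)
    elif "/" in ref:
        # A slash without a Facebook host means some other URL was passed.
        raise ValueError(f"not a Facebook page URL: {ref}")
    else:
        handle = ref.lstrip("@")
    if not handle:
        raise ValueError("empty page reference")
    return handle, f"https://www.facebook.com/{handle}"
```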
Output
Results are written to output/result.json and pushed to the Apify dataset (one item per post, or per page if no posts are found).
Top-level structure
    {
      "meta": { ... },
      "pages": [ { ...page summary... } ],
      "posts": [ { ...post... } ]
    }
meta object
| Field | Type | Description |
|---|---|---|
| generatedAt | string | ISO 8601 run timestamp |
| implementationStatus | string | "live" |
| dataStrategy | string | "public_html" |
| totalPages | integer | Pages queued |
| succeeded | integer | Pages with at least partial data |
| failed | integer | Pages that errored |
| postsCollected | integer | Total posts extracted |
| v1Scope | string | Scope statement |
| warnings | string[] | Run-level warnings |
| notes | string[] | Methodological notes |
pages[] fields
| Field | Type | Notes |
|---|---|---|
| pageUrl | string | Canonical page URL |
| pageHandle | string\|null | URL slug / handle |
| pageId | string\|null | Facebook numeric page ID |
| name | string\|null | Page display name |
| category | string\|null | Page category |
| description | string\|null | About / description text |
| followerCount | integer\|null | Follower count |
| likeCount | integer\|null | Like count |
| websiteUrl | string\|null | External website |
| phone | string\|null | Phone number (when public) |
| email | string\|null | Email address (when public) |
| address | string\|null | Physical address (when public) |
| profileImageUrl | string\|null | Profile image URL |
| coverImageUrl | string\|null | Cover image URL |
| verifiedBadge | boolean\|null | Page verification status |
| createdDate | string\|null | Page creation date |
| extractedAt | string | Extraction timestamp |
| status | string | ok / partial / degraded / not_found / blocked / error (see Status values) |
| warnings | string[] | Page-specific warnings |
| error | string\|null | Error message if failed |
posts[] fields
| Field | Type | Notes |
|---|---|---|
| postUrl | string\|null | Canonical post URL |
| postId | string\|null | Facebook post/story ID |
| pageHandle | string\|null | Source page handle |
| pageId | string\|null | Source page ID |
| text | string\|null | Post body text |
| publishedAt | string\|null | ISO 8601 publish time |
| mediaType | string\|null | text, image, video, carousel, link |
| imageUrls | string[] | Image URLs |
| videoUrl | string\|null | Video URL |
| linkUrl | string\|null | Shared link URL |
| linkTitle | string\|null | Shared link title |
| reactionCount | integer\|null | Total reactions |
| commentCount | integer\|null | Comment count |
| shareCount | integer\|null | Share count |
| viewCount | integer\|null | View count (video/reel) |
| isSponsored | boolean\|null | Whether marked as sponsored |
| isPinned | boolean\|null | Whether post is pinned |
| hashtags | string[] | Hashtags found in text |
| mentionedHandles | string[] | @mentions found in text |
| extractedAt | string | Extraction timestamp |
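Because the three engagement counts are nullable, downstream code should distinguish "zero" from "not exposed". The sketch below sums each metric over the posts that actually reported it; `engagement_summary` is a hypothetical consumer-side helper, not part of the actor.

```python
def engagement_summary(posts):
    """Sum engagement per metric, counting how many posts exposed each one."""
    metrics = ("reactionCount", "commentCount", "shareCount")
    summary = {}
    for metric in metrics:
        # Skip posts where the metric is missing or null rather than treating it as 0.
        values = [p[metric] for p in posts if p.get(metric) is not None]
        summary[metric] = {
            "total": sum(values) if values else None,
            "postsWithValue": len(values),
        }
    return summary
```

Reporting `postsWithValue` alongside the total makes it visible when a low total reflects hidden metrics rather than low engagement.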
Status values
| Status | Meaning |
|---|---|
| ok | Full or near-full data extracted |
| partial | Some fields extracted; others not publicly available |
| degraded | JS challenge or login wall detected; extraction may be incomplete |
| not_found | HTTP 404: page does not exist |
| blocked | HTTP 429 rate limit |
| error | Network failure or unexpected error |
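A consumer of the output can use these status values to decide which pages to keep and which to try again later. The following sketch is one possible policy, stated as an assumption: `USABLE`, `RETRYABLE`, and `triage_pages` are hypothetical names, and treating `degraded` as both usable and worth retrying is a design choice, not actor behaviour.

```python
# Assumed policy: which statuses carry data, and which may succeed on a later run.
USABLE = {"ok", "partial", "degraded"}
RETRYABLE = {"blocked", "degraded", "error"}


def triage_pages(pages):
    """Split page summaries into usable results and URLs worth retrying later."""
    usable = [p for p in pages if p.get("status") in USABLE]
    retry = [p["pageUrl"] for p in pages if p.get("status") in RETRYABLE]
    return usable, retry
```

Pages with `not_found` fall into neither group, since a 404 is unlikely to change on retry.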
Warnings glossary
| Warning prefix | Trigger |
|---|---|
| challenge-required | Facebook JS challenge or login wall detected |
| pagination-not-supported | Additional posts exist beyond visible page |
| field-not-public | Expected fields not available to logged-out visitors |
| page-structure-changed | No extraction pattern matched; possible Facebook HTML change |
Pricing
This actor uses PAY_PER_EVENT pricing:
| Event | Price |
|---|---|
| Actor start (per GB memory, min 1) | $0.001 |
| Per result (Facebook page) | $0.005 |
A typical run collecting posts from 10 Facebook pages costs approximately $0.05–$0.06: 10 × $0.005 = $0.05 in result fees, plus the start fee of $0.001 per GB of memory.
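The arithmetic above can be written as a small estimator. This is a sketch using the two prices from the table; `estimate_run_cost` is a hypothetical helper, and billing fractional memory above 1 GB proportionally is an assumption.

```python
def estimate_run_cost(page_count, memory_gb=1, start_price=0.001, result_price=0.005):
    """Estimate the USD cost of one run from the published event prices."""
    # The start event is charged per GB of memory, with a minimum of 1 GB.
    billed_gb = max(1, memory_gb)
    return round(billed_gb * start_price + page_count * result_price, 6)
```

Under these assumptions, 10 pages at 1 GB come to $0.051, and the same run at 4 GB comes to $0.054.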
FAQ
Can this actor access private Facebook profiles or personal timelines?
No. V1 is scoped exclusively to public Facebook Pages. Personal profiles are protected by Facebook's login wall.
Why are reaction/comment/share counts often null?
Facebook hides engagement metrics from logged-out visitors on most pages. The actor extracts them when available but cannot guarantee their presence.
Can I scrape a Business Page, Celebrity Page, or Brand Page?
Yes. These are public Facebook Pages and are in scope, subject to Facebook's access controls.
Does this use the Facebook Graph API?
No. This actor uses public HTML only. No Facebook API credentials are required or used.
What happens if Facebook updates its page structure?
The actor uses tiered parsing (embedded JSON → meta tags → inline patterns) to be resilient to minor changes. If all tiers fail, a page-structure-changed warning is emitted. See the RUNBOOK for maintenance steps.
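The tiered fallback described above can be sketched for a single field, the page name. Everything specific here is an assumption for illustration: the three patterns (a JSON-LD script, an `og:title` meta tag, the `<title>` element) stand in for the actor's undocumented extraction patterns, and the function names are hypothetical.

```python
import json
import re


def name_from_embedded_json(html):
    # Tier 1 (assumed): a JSON-LD script block carrying a "name" key.
    m = re.search(r'<script type="application/ld\+json">(.*?)</script>', html, re.DOTALL)
    if not m:
        return None
    try:
        data = json.loads(m.group(1))
    except json.JSONDecodeError:
        return None
    return data.get("name") if isinstance(data, dict) else None


def name_from_meta_tags(html):
    # Tier 2 (assumed): the Open Graph title meta tag.
    m = re.search(r'<meta\s+property="og:title"\s+content="([^"]*)"', html)
    return m.group(1) if m else None


def name_from_inline_pattern(html):
    # Tier 3 (assumed): the plain <title> element as a last resort.
    m = re.search(r"<title>(.*?)</title>", html, re.DOTALL)
    return m.group(1).strip() if m else None


TIERS = (
    ("embedded_json", name_from_embedded_json),
    ("meta_tags", name_from_meta_tags),
    ("inline_pattern", name_from_inline_pattern),
)


def extract_page_name(html):
    """Return (name, tier_used, warnings), trying each tier in order."""
    for tier, parser in TIERS:
        value = parser(html)
        if value:
            return value, tier, []
    # No tier matched: report the documented warning prefix.
    return None, None, ["page-structure-changed"]
```

Returning the tier name alongside the value makes it easy to spot when extraction has silently fallen back to a weaker source.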
Can I run this at scale?
Yes, on Apify. Start with a small batch (5–10 pages) to verify extraction quality for your target pages before scaling. Use maxPostsPerPage to control output volume.
Support & limitations
- Public pages only — no private profiles, groups, or events
- V1 does not paginate — only posts visible in the initial page load are extracted
- Engagement metrics are often null for logged-out visitors
- Rate limiting — the actor politely spaces requests (2.5 s between pages) to reduce ban risk
- For issues, open a GitHub issue or contact the actor maintainer via Apify console
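The request spacing mentioned in the rate-limiting bullet can be illustrated with a simple loop. This is a sketch of the general technique, not the actor's code: `fetch_pages_politely` is a hypothetical name, `fetch` is any caller-supplied function, and only the 2.5 s default comes from the text above.

```python
import time


def fetch_pages_politely(page_urls, fetch, delay_s=2.5, sleep=time.sleep):
    """Call fetch(url) for each page, pausing delay_s seconds between pages."""
    results = []
    for index, url in enumerate(page_urls):
        if index > 0:
            sleep(delay_s)  # pause between pages, not before the first one
        results.append(fetch(url))
    return results
```

Passing `sleep` as a parameter keeps the function testable without real delays.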
Related actors
- Meta Ad Library Intelligence — Facebook Ads (no login)
- Instagram Profile Intelligence — Public Instagram profiles
- Threads Profile Post Scraper — Public Threads profiles and posts