Facebook Pages Scraper

pocesar/facebook-pages-scraper

Facebook scraping tool to crawl and extract data from Facebook Pages. Our fully updated FB scraper downloads posts, likes, comments, reviews, contact details, social media profiles, address, and all public data from Facebook Pages. Download data as JSON, CSV, Excel, XML, and more.

Author: Paulo Cesar
  • Used by 4,802 users
  • Used 367,454 times

Start urls

startUrls

Optional

array

Provide either existing page URLs or business listings in the format https://www.facebook.com/biz/
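As an illustration, the input for this field might be assembled as below. This is a minimal sketch: the `{"url": ...}` item shape is an assumption based on how Apify request lists are commonly written and is not confirmed by this page, `build_start_urls` is a hypothetical helper, and the page URL is a placeholder.

```python
import json


def build_start_urls(urls):
    """Wrap plain URL strings in {"url": ...} objects (assumed item shape)."""
    return [{"url": u} for u in urls]


# Placeholder page URL, not a real target.
run_input = {
    "startUrls": build_start_urls(["https://www.facebook.com/examplepage"]),
}

print(json.dumps(run_input, indent=2))
```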

Language

language

Optional

string

Provide the language. Using this setting changes the dataset output.

Options:

"af-ZA", "az-AZ", "bs-BA", "br-FR", "ca-ES", "cx-PH", "co-FR", "cs-CZ", "da-DK", "en-GB", "en-US", "eo-EO", "et-EE", "eu-ES", "tl-PH", "fo-FO", "fr-CA", "fr-FR", "fy-NL", "ff-NG", "gl-ES", "de-DE", "gn-PY", "ha-NG", "hr-HR", "id-ID", "ga-IE", "is-IS", "it-IT", "jv-ID", "rw-RW", "ht-HT", "ku-TR", "lv-LV", "lt-LT", "hu-HU", "mg-MG", "ms-MY", "mt-MT", "nl-NL", "nl-BE", "nb-NO", "nn-NO", "pl-PL", "pt-BR", "pt-PT", "ro-RO", "sc-IT", "sn-ZW", "sq-AL", "sz-PL", "sk-SK", "sl-SI", "so-SO", "es-LA", "es-ES", "fi-FI", "sw-KE", "sv-SE", "tr-TR", "uz-UZ", "vi-VN", "cy-GB", "zz-TR", "el-GR", "be-BY", "bg-BG", "ky-KG", "kk-KZ", "mk-MK", "mn-MN", "ru-RU", "sr-RS", "tt-RU", "tg-TJ", "uk-UA", "ka-GE", "hy-AM", "he-IL", "ur-PK", "ar-AR", "ps-AF", "fa-IR", "cb-IQ", "sy-SY", "tz-MA", "am-ET", "ne-NP", "mr-IN", "hi-IN", "as-IN", "bn-IN", "pa-IN", "gu-IN", "or-IN", "ta-IN", "te-IN", "kn-IN", "ml-IN", "si-LK", "th-TH", "lo-LA", "my-MM", "km-KH", "ko-KR", "zh-TW", "zh-CN", "zh-HK", "ja-JP", "ja-KS"

Search pages

searchPages

Optional

array

Search pages from the public directory

Max search results

searchLimit

Optional

integer

Maximum number of search results to return

Comments mode

commentsMode

Optional

string

Choose how the comments are sorted

Options:

"RANKED_THREADED", "RECENT_ACTIVITY", "RANKED_UNFILTERED"
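Because only these three values are listed, it may help to check the value before starting a run. The sketch below does that locally; `validate_comments_mode` is a hypothetical helper and not part of the actor.

```python
# The three sort modes listed for commentsMode.
COMMENTS_MODES = ("RANKED_THREADED", "RECENT_ACTIVITY", "RANKED_UNFILTERED")


def validate_comments_mode(value):
    """Return the value unchanged if it is a listed option, else raise ValueError."""
    if value not in COMMENTS_MODES:
        raise ValueError(
            f"commentsMode must be one of {COMMENTS_MODES}, got {value!r}"
        )
    return value


run_input = {"commentsMode": validate_comments_mode("RECENT_ACTIVITY")}
```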

Max posts

maxPosts

Optional

integer

Limit the max number of posts to return

Minimum post count

minPosts

Optional

integer

Minimum number of posts expected before the run is considered successful

Max post date

maxPostDate

Optional

string

Posts are retrieved from newest to oldest. Use this setting to limit how far into the past posts are fetched.

Min post date

minPostDate

Optional

string

Set the minimum post date. Because posts are retrieved from newest to oldest, the scraper skips the newest posts up to the specified date.
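One possible reading of the two post date settings is that maxPostDate bounds how far back the scraper goes and minPostDate bounds how recent a post may be. The sketch below models that reading only. `in_post_window` is a hypothetical helper, the inclusive bounds are an assumption, and the date format the actor actually accepts is not stated on this page.

```python
from datetime import date


def in_post_window(post_date, max_post_date=None, min_post_date=None):
    """Return True if a post falls inside the assumed date window."""
    # maxPostDate: stop once posts are older than this date (assumed inclusive).
    if max_post_date is not None and post_date < max_post_date:
        return False
    # minPostDate: skip posts newer than this date (assumed inclusive).
    if min_post_date is not None and post_date > min_post_date:
        return False
    return True


# Posts ordered newest to oldest, as the descriptions above say.
posts = [date(2021, 3, 1), date(2021, 2, 1), date(2021, 1, 1), date(2020, 12, 1)]

kept = [
    p
    for p in posts
    if in_post_window(
        p, max_post_date=date(2021, 1, 1), min_post_date=date(2021, 2, 15)
    )
]
```

Under these assumptions, `kept` holds only the posts dated 2021-02-01 and 2021-01-01.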

Max comments

maxPostComments

Optional

integer

Maximum number of comments to return per post

Minimum post comments

minPostComments

Optional

integer

Minimum number of comments expected in each post before the run is considered successful

Minimum comment date

maxCommentDate

Optional

string

Limit the minimum date for comments

Max reviews

maxReviews

Optional

integer

Limit the max number of reviews to return

Max review date

maxReviewDate

Optional

string

Limit the date of the reviews

About

scrapeAbout

Optional

boolean

Get the About page if it exists

Reviews

scrapeReviews

Optional

boolean

Get the Reviews if they exist

Posts

scrapePosts

Optional

boolean

Get the Posts

Services

scrapeServices

Optional

boolean

Get the Services if they exist

Proxy configuration

proxyConfiguration

Required

object

If you don't have access to the RESIDENTIAL proxy group, contact us at support@apify.com to get a proxy trial.
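A proxy configuration object for the RESIDENTIAL group might look like the sketch below. The `useApifyProxy` and `apifyProxyGroups` keys follow the usual Apify proxy input convention. They are not listed on this page, so treat them as an assumption and check them against the actor's input editor.

```python
import json

# Assumed key names, following the common Apify proxy input convention.
proxy_configuration = {
    "useApifyProxy": True,
    "apifyProxyGroups": ["RESIDENTIAL"],
}

# proxyConfiguration is the only field marked Required in this listing.
run_input = {"proxyConfiguration": proxy_configuration}

print(json.dumps(run_input, indent=2))
```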

Extend output function

extendOutputFunction

Optional

string

Extend the output item to contain more fields. The raw data is present in the 'data' variable.

Extend Scraper Function

extendScraperFunction

Optional

string

Advanced function that extends the default scraper functionality and lets you manually perform actions on the page

Custom data

customData

Optional

object

Any data that you want to have available inside the Extend Output/Scraper Function

Session storage

sessionStorage

Optional

string

Provide a named session storage. It should be used in conjunction with the login-session actor, and the same sessionStorage name should be supplied here. Don't use this unless you have a task that performs Facebook logins using the login-session actor.

Use stealth

useStealth

Optional

boolean

Enable stealth on headless Chrome. Enable this only if you're having problems with blocking.

Debug Log

debugLog

Optional

boolean

Enable more verbose logging to help you understand what's happening during scraping