
Facebook Latest Comments Scraper

pocesar/facebook-latest-comments-scraper

Mini-actor to scrape Facebook comments from one or multiple post URLs. Get comment text, timestamps, author ID, author name, etc. Download data in structured formats such as JSON, CSV, XML, Excel, and HTML and use it in apps, spreadsheets, and reports.


Author: Paulo Cesar
  • Users: 243
  • Runs: 238,225

.editorconfig

root = true

[*]
indent_style = space
indent_size = 4
charset = utf-8
trim_trailing_whitespace = true
insert_final_newline = true
end_of_line = lf

.eslintrc

{
    "extends": "@apify"
}

.gitignore

# This file tells Git which files shouldn't be added to source control

.idea
node_modules

Dockerfile

# First, specify the base Docker image. You can read more about
# the available images at https://sdk.apify.com/docs/guides/docker-images
# You can also use any other image from Docker Hub.
FROM apify/actor-node:16

# Second, copy just package.json and package-lock.json since they should be
# the only files that affect "npm install" in the next step, to speed up the build
COPY package*.json ./

# Install NPM packages, skip optional and development dependencies to
# keep the image small. Avoid logging too much and print the dependency
# tree for debugging
RUN npm --quiet set progress=false \
 && npm install --only=prod --no-optional \
 && echo "Installed NPM packages:" \
 && (npm list --only=prod --no-optional --all || true) \
 && echo "Node.js version:" \
 && node --version \
 && echo "NPM version:" \
 && npm --version

# Next, copy the remaining files and directories with the source code.
# Since we do this after NPM install, a rebuild will be really fast
# for most source file changes.
COPY . ./

# Optionally, specify how to launch the source code of your actor.
# By default, Apify's base Docker images define the CMD instruction
# that runs the Node.js source code using the command specified
# in the "scripts.start" section of the package.json file.
# In short, the instruction looks something like this:
#
# CMD npm start

INPUT_SCHEMA.json

{
    "title": "Input schema for the apify_project actor.",
    "type": "object",
    "schemaVersion": 1,
    "properties": {
        "startUrls": {
            "title": "Posts URL(s)",
            "type": "array",
            "description": "Insert URLs of pages that contain direct posts, like https://www.facebook.com/apifytech/posts/303001308495203",
            "editor": "requestListSources",
            "prefill": [ 
                {
                    "url": "https://www.facebook.com/apifytech/posts/306435811485086"
                }
            ]
        }
    },
    "required": ["startUrls"]
}

README.md

*(This file is 114 lines long; only the first 50 are shown.)*

## Why scrape Facebook comments? 

Our free Facebook Comments Scraper allows you to scrape Facebook posts and their comment sections. Scraping Facebook comments can help you track public sentiment towards specific topics, brands, and personas; get immediate insights into the performance of marketing campaigns; identify and react to fake news, abuse, or information of high public value; and more.

If you want to know how your business could use the comment data scraped from Facebook, check out our [industries pages](https://apify.com/industries) for ideas and use cases.

## How to scrape Facebook comments

Facebook Comments Scraper has a user-friendly structure, so there aren't too many scraping parameters to set.

1. [Create](https://console.apify.com/sign-up) a free Apify account.
2. Open [Facebook Latest Comments Scraper](https://apify.com/pocesar/facebook-latest-comments-scraper).
3. Add one or more Facebook post URLs to scrape the comments under them.
4. Click *Run* and wait for the data to be extracted.
5. Download your data in JSON, XML, CSV, Excel, or HTML.
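
The steps above can also be automated from your own code. Here is a minimal sketch assuming the `apify-client` npm package and an API token in the `APIFY_TOKEN` environment variable; the helper names (`buildInput`, `runScraper`) are illustrative, not part of the actor:

```javascript
// Sketch (not the actor's own code): run this actor programmatically.
// Converts plain URL strings into the { url } objects the input schema expects.
const buildInput = (urls) => ({
    startUrls: urls.map((url) => ({ url })),
});

const runScraper = async (urls) => {
    const { ApifyClient } = require('apify-client'); // npm install apify-client
    const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

    // Start the actor and wait for the run to finish.
    const run = await client
        .actor('pocesar/facebook-latest-comments-scraper')
        .call(buildInput(urls));

    // Fetch the scraped comments from the run's default dataset.
    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    return items;
};
```

Calling `runScraper(['https://www.facebook.com/apifytech/posts/306435811485086'])` should return the dataset items once the run completes.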

## Want to scrape the most recent Facebook posts instead?

If you need to scrape not just a single post but a whole Facebook page for recent posts and comments, consider trying this mini-scraper's twin, [Facebook Posts Scraper](https://apify.com/pocesar/facebook-latest-posts-scraper). Our mini-scrapers are designed to require just one or two settings, so they deliver scraping results quickly and effortlessly. Just enter one or more post URLs and click to start scraping.

## Need something more advanced?

Try our more advanced [Facebook Pages Scraper](https://apify.com/pocesar/facebook-pages-scraper) or [Facebook Ads Scraper](https://apify.com/tugkan/facebookads-scraper) if you need a wider array of options and are comfortable configuring scraper settings on your own.

Let us know if you need a [custom Facebook scraping solution](https://apify.com/custom-solutions).

## Cost of usage

Based on Apify's pricing at the time of writing, the Personal plan (**$49**) would allow you to **scrape comments from about 10-20k posts monthly**.

To run this actor, you will need to have access to [residential proxies](https://apify.com/proxy). If you don't already have access, contact us at [support@apify.com](mailto:support@apify.com).

For more info on how the price for scraping Facebook is calculated, see the [Cost of usage](https://apify.com/pocesar/facebook-pages-scraper#cost-of-usage) section of the main Facebook Pages Scraper.



## Input

Here's an example input for scraping comments from a Facebook post with a picture. It needs only one parameter: **startUrls**. You can check the `INPUT_SCHEMA` tab for more details.

```json
{
    "startUrls": [
        {
            "url": "https://www.facebook.com/time/photos/a.470156966490/10158856011396491"
        }
    ]
}
```
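
For reference, based on the fields mapped in `main.js` (`addComments`), each scraped comment in the output dataset should contain at least the following fields; the values below are illustrative placeholders, not real data:

```json
{
    "authorId": "100000000000000",
    "authorProfile": "https://www.facebook.com/some.user",
    "author": "Some User",
    "dateCreated": "2021-11-01T12:00:00.000Z",
    "text": "Example comment text"
}
```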

apify.json

{
    "env": { "npm_config_loglevel": "silent" }
}

main.js

*(This file is 143 lines long; only the first 50 are shown.)*

// This is the main Node.js source code file of your actor.

// Import Apify SDK. For more information, see https://sdk.apify.com/
const Apify = require('apify');

const pageFunction = async (context) => {
    const { $, request, log, body, customData } = context;
    const { includeLastComments = false } = customData;
    const { label } = request.userData;

    const lds = $('script[type*="ld"]')
        .map((_, el) => JSON.parse($(el).html()))
        .get()
        .filter(Boolean);

    if (!lds.length) {
        throw new Error('Page did not load properly, retrying');
    }

    const interactions = (interactionStatistic, type) => {
        return interactionStatistic
            .find(({ interactionType }) => interactionType.includes(type))
            ?.userInteractionCount ?? null;
    };

    const toISODate = (date) => {
        if (!date) {
            return null;
        }

        try {
            return new Date(date).toISOString();
        } catch (e) {
            return null;
        }
    }

    const addComments = (comments) => {
        if (!includeLastComments || !comments?.length) {
            return [];
        }

        return comments.map((comment) => {
            return {
                authorId: comment.author?.identifier ?? null,
                authorProfile: comment.author?.url ?? null,
                author: comment.author?.name ?? null,
                dateCreated: toISODate(comment.dateCreated),
                text: comment.text,
            };

package.json

{
    "name": "project-empty",
    "version": "0.0.1",
    "description": "This is a boilerplate of an Apify actor.",
    "dependencies": {
        "apify": "^2.1.0"
    },
    "devDependencies": {
        "@apify/eslint-config": "^0.2.2",
        "eslint": "^7.32.0"
    },
    "scripts": {
        "start": "node main.js",
        "lint": "eslint . --ext .js,.jsx",
        "lint:fix": "eslint . --ext .js,.jsx --fix",
        "test": "echo \"Error: oops, the actor has no tests yet, sad!\" && exit 1"
    },
    "author": "It's not you it's me",
    "license": "ISC"
}