$199.00/month

This Actor is under maintenance.

LinkedIn Profiles Bulk Scraper Premium

rockapi-group/linkedin-profiles-bulk-scraper-premium

TypeScript PuppeteerCrawler Actor template

This template is a production-ready boilerplate for developing with PuppeteerCrawler. PuppeteerCrawler provides a simple framework for parallel crawling of web pages using headless Chrome with Puppeteer. Because it downloads pages and extracts data through headless Chrome, it is useful for crawling websites that require JavaScript execution.

Included features

Puppeteer Crawler - simple framework for parallel crawling of web pages using headless Chrome with Puppeteer
Configurable Proxy - tool for working around IP blocking
Input schema - define and easily validate a schema for your Actor's input
Dataset - store structured data where each object stored has the same attributes
Apify SDK - toolkit for building Actors

How it works

Actor.getInput() gets the input from INPUT.json, where the start URLs are defined.
Create a configuration for proxy servers to be used during the crawling with Actor.createProxyConfiguration() to work around IP blocking. Use Apify Proxy or your own Proxy URLs provided and rotated according to the configuration. You can read more about proxy configuration here.
Create an instance of Crawlee's Puppeteer Crawler with new PuppeteerCrawler(). You can pass options to the crawler constructor, such as:
- proxyConfiguration - provide the proxy configuration to the crawler
- requestHandler - handle each request with the custom router defined in the routes.js file.
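
The first step above reads the start URLs from the Actor input. As a rough illustration, the sketch below normalizes such an input before it is handed to the crawler. It assumes the input has a `startUrls` array holding plain strings or `{ url }` objects; the `ActorInput` type and the `extractStartUrls` helper are hypothetical names, not part of the template or the Apify SDK.

```typescript
// Assumed input shape: `startUrls` may hold plain strings or { url } objects.
type StartUrlEntry = string | { url: string };

interface ActorInput {
    startUrls?: StartUrlEntry[];
}

// Hypothetical helper: returns a de-duplicated list of absolute http(s) URLs.
function extractStartUrls(input: ActorInput | null): string[] {
    const entries = input?.startUrls ?? [];
    const urls = new Set<string>();
    for (const entry of entries) {
        const raw = typeof entry === 'string' ? entry : entry.url;
        try {
            const parsed = new URL(raw);
            if (parsed.protocol === 'http:' || parsed.protocol === 'https:') {
                urls.add(parsed.href);
            }
        } catch {
            // Skip entries that cannot be parsed as absolute URLs.
        }
    }
    return Array.from(urls);
}

// In the Actor, the argument would come from `await Actor.getInput()`.
const startUrls = extractStartUrls({
    startUrls: ['https://example.com', { url: 'https://example.com/' }, 'not a url'],
});
console.log(startUrls);
```

Filtering and de-duplicating up front is one possible design choice; it keeps malformed entries from reaching the crawler, at the cost of silently dropping them.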

Handle requests with the custom router from the routes.js file. Read more about custom routing for the Cheerio Crawler here.

Create a new router instance with createPuppeteerRouter()
Define a default handler, which is called for all URLs that are not handled by other handlers, by adding router.addDefaultHandler(() => { ... })

Define additional handlers - here you can add your own handling of the page

```
router.addHandler('detail', async ({ request, page, log }) => {
    const title = await page.title();
    // You can add your own page handling here

    await Dataset.pushData({
        url: request.loadedUrl,
        title,
    });
});
```
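
To make the relationship between labelled handlers and the default handler more concrete, here is a simplified model of label-based dispatch. It is only an illustration and not Crawlee's implementation: `SketchRouter`, `SketchContext`, and `getHandler` are hypothetical names, and the context is reduced to a URL and an optional label.

```typescript
// Simplified stand-in for a crawling context; the real one also carries `page`, `log`, etc.
interface SketchContext {
    request: { url: string; label?: string };
}

type Handler = (ctx: SketchContext) => Promise<void>;

// Hypothetical mini-router that dispatches on the request label.
class SketchRouter {
    private readonly handlers = new Map<string, Handler>();
    private defaultHandler?: Handler;

    addHandler(label: string, handler: Handler): void {
        this.handlers.set(label, handler);
    }

    addDefaultHandler(handler: Handler): void {
        this.defaultHandler = handler;
    }

    // Return the handler registered for the label, else the default handler.
    getHandler(label?: string): Handler {
        const byLabel = label !== undefined ? this.handlers.get(label) : undefined;
        const handler = byLabel ?? this.defaultHandler;
        if (!handler) {
            throw new Error(`No handler registered for label: ${label}`);
        }
        return handler;
    }

    async route(ctx: SketchContext): Promise<void> {
        await this.getHandler(ctx.request.label)(ctx);
    }
}

const handled: string[] = [];
const defaultHandler: Handler = async ({ request }) => {
    handled.push(`default ${request.url}`);
};
const detailHandler: Handler = async ({ request }) => {
    handled.push(`detail ${request.url}`);
};

const sketchRouter = new SketchRouter();
sketchRouter.addDefaultHandler(defaultHandler);
sketchRouter.addHandler('detail', detailHandler);

// Route one unlabelled and one labelled request, then report what ran.
async function demo(): Promise<string[]> {
    await sketchRouter.route({ request: { url: 'https://example.com' } });
    await sketchRouter.route({ request: { url: 'https://example.com/item/1', label: 'detail' } });
    return handled;
}
```

Calling `demo()` sends the first request to the default handler and the second to the `detail` handler, mirroring how a request is matched to a handler by its label.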

crawler.run(startUrls) - start the crawler and wait for it to finish

Resources

If you're looking for examples or want to learn more visit:

Crawlee + Apify Platform guide
Documentation and examples
Node.js tutorials in Academy
How to scale Puppeteer and Playwright
Video guide on getting data using Apify API
Integration with Make, GitHub, Zapier, Google Drive, and other apps
A short guide on how to build web scrapers using code templates

Getting started

For complete information see this article. In short, you will:

Build the Actor
Run the Actor

Pull the Actor for local development

If you would like to develop locally, you can pull the existing Actor from Apify Console using the Apify CLI:

Install apify-cli

Using Homebrew

```
brew install apify-cli
```

Using NPM

```
npm -g install apify-cli
```

Pull the Actor by its unique <ActorId>, which is one of the following:
- unique name of the Actor to pull (e.g. "apify/hello-world")
- or ID of the Actor to pull (e.g. "E2jjCZBezvAZnX8Rb")
You can find both by clicking on the Actor title at the top of the page, which will open a modal containing both Actor unique name and Actor ID.

This command will copy the Actor into the current directory on your local machine.
```
apify pull <ActorId>
```

Developer

RockAPI Group

Actor Metrics

1 monthly user
1 star
>99% runs succeeded
Created in Jan 2024
Modified 3 months ago

Categories

Lead generation

Other

Automation
