HTML to PDF Converter
Loads a web page in headless Chrome using Puppeteer and prints it to PDF. The input is a JSON object and the output is a PDF...
Crawls a website using one or more sitemaps and imports the data into an Algolia search index. The text content is identified...
PDF to HTML Converter
Converts a PDF document to HTML using the pdf2htmlEX tool.
Broken Links Checker
Crawls a website and finds broken links. Unlike other similar SEO analysis tools, the actor also reports broken URL #fra...
Takes a screenshot of one or more web pages using the Chrome browser. The actor enables the setting of custom viewport size,...
A small, efficient actor that loads a web page, parses its HTML using the Cheerio library and extracts the following meta-dat...
Naked Domains Analyzer
Crawls and downloads web pages running on a list of provided naked domains (e.g. "example.com"). The actor stores an HTML...
Probe Page Resources
Sequentially loads a list of URLs in headless Chrome and analyzes HTTP resources requested by each page. Source code at ...
Send Email On Crawler Finish
Fetches information about a crawler run and sends it to the user by email. For example, this actor can be used to inform...
Webpage DOM & CSS Analyzer
Example showing how to use headless Chromium with Puppeteer to open a web page, fetch the list of DOM nodes on the pages...
Residential Proxy Probe
Finds residential proxy sessions on Apify Proxy with target IP addresses geo-located in specific postal codes or DMAs.
Selenium Custom Firefox POC
Uses Selenium to run a custom build of Firefox that is harder to detect. The actor just saves a screenshot and snapshot of...
Czech President Election
Collects voting data from the Czech statistical office about the Czech presidential election of 2018.
Example Sitemap Cheerio
An example actor that first downloads a sitemap in XML format and then crawls each page from the sitemap using the fast C...
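The sitemap step described above can be illustrated with a minimal sketch. This is not the actor's own code: it is written in Python for illustration only, the function name `extract_sitemap_urls` and the `SAMPLE` document are hypothetical, and it assumes a plain `urlset` sitemap in the standard sitemaps.org namespace (not a sitemap index). It only extracts the page URLs that a crawler would then visit.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files (sitemaps.org protocol).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_sitemap_urls(sitemap_xml):
    """Return the page URLs listed in a <urlset> sitemap document.

    Hypothetical helper for illustration; assumes the XML has already
    been downloaded and is passed in as a string.
    """
    root = ET.fromstring(sitemap_xml)
    urls = []
    # Each page is listed as <url><loc>...</loc></url> under the root.
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        if loc.text and loc.text.strip():
            urls.append(loc.text.strip())
    return urls


# Made-up sample sitemap, used only to exercise the helper.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in extract_sitemap_urls(SAMPLE):
        print(url)
```

Each returned URL would then be fetched and parsed in turn; that crawling step is left out of the sketch.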