Jan Čurn

jancurn

Founder and CEO of Apify. I still enjoy coding.

ACTOR METRICS

13 public Actors

117 monthly users

98.6% runs succeeded

39 days response time

Hello world 👋🏻

This is a test.

Public Actors

Broken Link Checker

jancurn/find-broken-links

Crawls a website and finds broken links. Unlike other similar SEO analysis tools, the actor also reports broken URL #fragments. The results are stored as JSON and HTML reports.

Jan Čurn

548

Metadata Extractor

jancurn/extract-metadata

A small, efficient actor that loads a web page, parses its HTML using the Cheerio library, and extracts metadata from the <HEAD> tag, such as the page title, description, and author.

Jan Čurn

1.1k

Residential Proxy Probe

jancurn/residential-proxy-probe

Finds residential proxy sessions on Apify Proxy with target IP addresses geo-located in specific postal codes or DMAs.

Jan Čurn

538

Screenshot Taker

jancurn/screenshot-taker

Takes a screenshot of one or more web pages using the Chrome browser. The actor lets you set a custom viewport size, page load timeout, delay, proxies, and output image format.

Jan Čurn

432

PDF to HTML Converter

jancurn/pdf-to-html

Converts a PDF document to HTML using the pdf2htmlEX tool.

Jan Čurn

415

HTML to PDF Converter

jancurn/url-to-pdf

Loads a web page in headless Chrome using Puppeteer and prints it to PDF. The input is a JSON object and the output is a PDF file.

Jan Čurn

337

Naked Domains Analyzer

jancurn/analyze-domains

Crawls and downloads web pages running on a list of provided naked domains, e.g. "example.com". The actor stores an HTML snapshot, screenshot, text body, and HTTP response headers for each page. It also extracts email addresses, phone numbers, and social handles for Facebook, Twitter, LinkedIn, and Instagram.

Jan Čurn

302

Algolia Webcrawler

jancurn/algolia-webcrawler

Crawls a website using one or more sitemaps and imports the data into an Algolia search index. The text content is identified using simple CSS selectors.

Jan Čurn

67

Selenium Custom Firefox POC

jancurn/selenium-custom-firefox

Uses Selenium to run a custom build of Firefox that is harder to detect. The actor just saves a screenshot and a snapshot of the HTML.

Jan Čurn

52

Probe Page Resources

jancurn/probe-page-resources

Sequentially loads a list of URLs in headless Chrome and analyzes the HTTP resources requested by each page. The source code is available at https://github.com/jancurn/act-probe-page-resources

Jan Čurn

29

Example Sitemap Cheerio

jancurn/example-sitemap-cheerio

An example actor that first downloads a sitemap in XML format and then crawls each page from the sitemap using the fast CheerioCrawler from the Apify SDK.

Jan Čurn

24

Public Actors Lister

jancurn/public-actors-fetcher

Downloads a list of all Actors published in Apify Store, with all properties such as URL, title, description, etc. This is useful for creating a knowledge file for a GPT, so that it knows which Actors it can use.

Jan Čurn

8

Probe Resources Plus Webhook

jancurn/probe-resources-plus-webhook

Calls jancurn/probe-page-resources and then invokes a hard-coded webhook. The actor takes the same input as jancurn/probe-page-resources.

Jan Čurn

3