Keywords Extractor

lukaskrivka/keywords-extractor

Use our free website keyword extractor to crawl any website and extract keyword counts on each page.

Keyword Extractor

Deeply crawls a website and counts how many times each of the provided keywords appears on each page.

How to use

  • You can pass in any number of keywords that you want to count.
  • You can combine Start URLs, Pseudo URLs and a link selector to traverse any number of pages across websites. Check our scraping tutorial on how to use these.
  • You can specify maxDepth and maxPagesPerCrawl to limit the scope of the scrape. Start URLs have depth 0, so to scrape only the start URLs, set maxDepth to 0; to also scrape the pages they link to, set it to 1, and so on.
  • You can enable case-sensitive search and searching through scripts.
  • You can choose to scrape with or without a browser. A browser is more expensive but allows JavaScript rendering and waiting.
  • With a browser, you can use many additional features.
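
The depth rule above can be illustrated with a small breadth-first sketch. This is not the Actor's code: the function plan_crawl, the in-memory link map and the example.com URLs are hypothetical stand-ins for real page fetching, and only the names maxDepth and maxPagesPerCrawl come from the description above.

```python
from collections import deque


def plan_crawl(start_urls, links, max_depth, max_pages_per_crawl):
    """Return (url, depth) pairs in breadth-first order.

    `links` is a hypothetical map of url -> list of linked urls that
    stands in for fetching pages and extracting their links.
    """
    start = list(dict.fromkeys(start_urls))  # drop duplicate start URLs
    queue = deque((url, 0) for url in start)  # start URLs have depth 0
    seen = set(start)
    visited = []
    while queue and len(visited) < max_pages_per_crawl:
        url, depth = queue.popleft()
        visited.append((url, depth))
        if depth >= max_depth:
            continue  # links on this page would exceed max_depth
        for linked in links.get(url, []):
            if linked not in seen:
                seen.add(linked)
                queue.append((linked, depth + 1))
    return visited


# Hypothetical three-level site used only for illustration.
site = {
    "https://example.com/": ["https://example.com/a", "https://example.com/b"],
    "https://example.com/a": ["https://example.com/c"],
}
only_start = plan_crawl(["https://example.com/"], site, max_depth=0, max_pages_per_crawl=10)
one_level = plan_crawl(["https://example.com/"], site, max_depth=1, max_pages_per_crawl=10)
print(only_start)
print(one_level)
```

With max_depth=0 only the start URL is visited; with max_depth=1 the pages it links to are visited as well, and max_pages_per_crawl caps the total regardless of depth.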

How keywords are determined

The text is split into words by word boundaries. Each word is then compared with each keyword. In the future, we may add other types of boundaries to choose from.
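
A minimal sketch of this kind of counting, assuming that words are runs of word characters and that a word matches a keyword only when the two are exactly equal. The function count_keywords, its case_sensitive parameter and the sample sentence are made up for illustration; the Actor's actual implementation may differ.

```python
import re
from collections import Counter


def count_keywords(text, keywords, case_sensitive=False):
    """Count whole-word occurrences of each keyword in `text`."""
    # Split on word boundaries: every run of word characters is one word.
    words = re.findall(r"\w+", text)
    if not case_sensitive:
        words = [word.lower() for word in words]
    counts = Counter(words)
    result = {}
    for keyword in keywords:
        key = keyword if case_sensitive else keyword.lower()
        result[keyword] = counts[key]  # Counter returns 0 for missing words
    return result


sample = "Rolex watches: the best watch for every watch lover. Watches by ROLEX."
print(count_keywords(sample, ["watch", "watches", "rolex"]))
print(count_keywords(sample, ["watch", "watches", "rolex"], case_sensitive=True))
```

Under these assumptions "watch" is not counted inside "watches", because whole words are compared rather than substrings.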

Example Output

For keywords:

["watch", "watches", "rolex"]

starting on https://www.chrono24.com/watches/mens-watches--62.htm

[
    {
        "url": "https://www.chrono24.com/watches/mens-watches--62.htm",
        "depth": 0,
        "result": {
            "watch": 63,
            "watches": 81,
            "rolex": 57
        }
    },
    {
        "url": "https://www.chrono24.com/user/index.htm",
        "depth": 1,
        "result": {
            "watch": 9,
            "watches": 13,
            "rolex": 1
        }
    },
    {
        "url": "https://www.chrono24.com/info/watch-collection.htm",
        "depth": 1,
        "result": {
            "watch": 56,
            "watches": 23,
            "rolex": 1
        }
    },
    ...
]
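
Output in this shape is easy to post-process. The sketch below sums the per-page counts into site-wide totals, using only the three records shown in the example (the elided records are not included). The helper total_counts is a hypothetical name and not part of the Actor.

```python
from collections import Counter

# The three records shown in the example output; elided records are omitted.
pages = [
    {
        "url": "https://www.chrono24.com/watches/mens-watches--62.htm",
        "depth": 0,
        "result": {"watch": 63, "watches": 81, "rolex": 57},
    },
    {
        "url": "https://www.chrono24.com/user/index.htm",
        "depth": 1,
        "result": {"watch": 9, "watches": 13, "rolex": 1},
    },
    {
        "url": "https://www.chrono24.com/info/watch-collection.htm",
        "depth": 1,
        "result": {"watch": 56, "watches": 23, "rolex": 1},
    },
]


def total_counts(items):
    """Sum the per-page keyword counts across all items."""
    totals = Counter()
    for item in items:
        totals.update(item["result"])  # adds counts key by key
    return dict(totals)


totals = total_counts(pages)
print(totals)  # {'watch': 128, 'watches': 117, 'rolex': 59}
```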
Developer
Maintained by Community
Actor metrics
  • 26 monthly users
  • 99.6% runs succeeded
  • 0.0 days response time
  • Created in Mar 2020
  • Modified about 3 years ago