Algolia Website Indexer

The Indexer recursively crawls a website using the Puppeteer browser (headless Chrome) and indexes the selected pages into an Algolia index. It was designed to run as an Apify actor.


You can find instructions on how to run it in the Apify cloud on its Apify Store page. If you want to run it in your environment, you can use the Apify CLI.


The input of the actor is JSON with the following parameters.

  • algoliaAppId (String): Your Algolia Application ID.
  • algoliaApiKey (String): Your Algolia API key.
  • algoliaIndexName (String): Your Algolia index name.
  • crawlerName (String): Crawler name. Pages in the index are added, updated, and removed based on this name, so a single index can hold pages from more than one website.
  • startUrls (Array): URLs where the crawler starts crawling.
  • selectors (Array): Selectors whose text content you want to index. The key is the name of the attribute and the value is the CSS selector.
  • waitForElement (String): Selector of an element to wait for on each page.
  • additionalPageAttrs (Object): Additional attributes you want to attach to each record in the index.
  • skipIndexUpdate (Boolean): Option to switch off updating the Algolia index.
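
As a rough illustration of how these parameters fit together, the following sketch builds an input object in Python and serializes it to JSON. All values are placeholders. Two shapes are assumptions rather than something the description above confirms: that each entry in startUrls is an object with a "url" key, and that selectors is a mapping from attribute name to CSS selector.

```python
import json

# All values below are placeholders; replace them with your own.
actor_input = {
    "algoliaAppId": "YOUR_ALGOLIA_APP_ID",
    "algoliaApiKey": "YOUR_ALGOLIA_API_KEY",
    "algoliaIndexName": "docs",
    "crawlerName": "example-docs",
    # Assumption: each start URL is given as an object with a "url" key.
    "startUrls": [{"url": "https://example.com/docs"}],
    # Assumption: attribute name -> CSS selector whose text is indexed.
    "selectors": {"title": "h1", "content": "main"},
    "waitForElement": "main",
    "additionalPageAttrs": {"site": "example-docs"},
    # True switches off the Algolia index update, which suits a first test run.
    "skipIndexUpdate": True,
}

# Serialize to the JSON document that is passed to the actor as input.
input_json = json.dumps(actor_input, indent=2)
print(input_json)
```

Setting skipIndexUpdate to true for a first run lets you check what the crawler collects before anything is written to Algolia.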


There are a few parameters not shown in the UI. They change the crawling behaviour, and you can set them through the API or when running the actor locally.

  • pageFunction (String): Overrides the default pageFunction.
  • pseudoUrls (Array): Overrides the default pseudoUrls.
  • clickableElements (String): Overrides the default clickableElements.
  • keepUrlFragment (Boolean): Option to switch on enqueueing URLs with URL fragments.
  • omitSearchParamsFromUrl (Boolean): Option to switch off enqueueing URLs with search params.
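
To make the last two options more concrete, here is a minimal sketch of the effect they are described as having on an enqueued URL. It is not the actor's implementation: the function normalize_url is hypothetical, and the defaults (fragments dropped, search params kept) are an assumption inferred from the wording "switch on" and "switch off" above.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url, keep_url_fragment=False, omit_search_params_from_url=False):
    """Hypothetical helper showing the described effect of the two options."""
    parts = urlsplit(url)
    # omitSearchParamsFromUrl: drop everything after "?" when enabled.
    query = "" if omit_search_params_from_url else parts.query
    # keepUrlFragment: keep the "#..." part only when enabled.
    fragment = parts.fragment if keep_url_fragment else ""
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, fragment))


url = "https://example.com/docs?page=2#intro"
print(normalize_url(url))
print(normalize_url(url, keep_url_fragment=True))
print(normalize_url(url, omit_search_params_from_url=True))
```

Under these assumptions, pages that differ only in their fragment collapse into one URL unless keepUrlFragment is set, and pages that differ only in their search params collapse into one URL when omitSearchParamsFromUrl is set.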

Debug indexed pages

You can find all the pages that will be indexed in the default dataset for a specific actor run.
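
One way to inspect that dataset outside the Apify Console is the Apify API's dataset items endpoint. The sketch below uses only the Python standard library. The helper names are hypothetical, and the dataset ID and API token are values you supply yourself (the dataset ID is assumed to be the default dataset ID reported for the run).

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id, limit=None):
    """Build the URL for reading items from a dataset as JSON."""
    params = {"format": "json"}
    if limit is not None:
        params["limit"] = str(limit)
    return f"{API_BASE}/datasets/{quote(dataset_id, safe='')}/items?{urlencode(params)}"


def fetch_dataset_items(dataset_id, token, limit=None):
    """Download the items (the pages to be indexed) as a list of dicts."""
    request = Request(
        dataset_items_url(dataset_id, limit),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urlopen(request) as response:
        return json.load(response)
```

Calling fetch_dataset_items with the run's dataset ID and your token, for example with limit=10, should return the first few records so you can check that the selectors picked up the expected text.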

Maintained by Apify
  • Created in Jul 2019
  • Modified 3 months ago