Back to all change logs

Apr 5, 2024

Performance improvements, API updates, and the new Adaptive Playwright Crawler

New

Update

Actor

API

Performance improvements

As part of our continuous performance improvement initiative, we're happy to announce that we successfully improved the Apify API response time by 50% on average and the 90th-percentile startup time of Actors by about 20%. We will continue improving Apify in this direction.

API updates

User limits endpoint now returns maxConcurrentActorJobs and activeActorJobCount properties enabling users to keep an eye on the concurrency limit.

We also added the missing endpoint /actor-builds/:build-id/log, allowing you to quickly access the log of certain builds without a need for an Actor run ID.

Adaptive Playwright Crawler

Try out Crawlee's new AdaptivePlaywrightCrawler class abstraction, which is an extension of PlaywrightCrawler that uses a more limited request handler interface so that it's able to switch to HTTP-only crawling when it detects that it may be possible. This way, you can achieve lower costs when crawling multiple websites.

1const crawler = new AdaptivePlaywrightCrawler({
2    renderingTypeDetectionRatio: 0.1,
3    async requestHandler({ querySelector, pushData, enqueueLinks, request, log }) {
4        // This function is called to extract data from a single web page
5        const $prices = await querySelector('span.price')
6
7        await pushData({
8            url: request.url,
9            price: $prices.filter(':contains("$")').first().text(),
10        })
11
12        await enqueueLinks({ selector: '.pagination a' })
13    },
14});
15
16await crawler.run([
17    'http://www.example.com/page-1',
18    'http://www.example.com/page-2',
19]);

Marek Trunkát

CTO