
Change log

Stay up-to-date with what's new on Apify

July 2019

  • Web

    Apify website apify.com has been completely redesigned.

  • Web

    Page analyzer improved. Check it out!

  • App

    A finished (failed) actor run can now be resurrected back to the RUNNING state. Check out the documentation for more information.

  • App

    Scheduler now displays a log of both successful and failed invocations.

  • General

    Starting on Wednesday 31st July 2019, user accounts with disposable (temporary) email addresses will no longer be able to run actors, use Apify Proxy or call the Apify API. To retain your access to Apify services, please change your email address on the Account - Settings page to one from a legitimate email provider.

  • Actors

    Actor and Task inputs specified through INPUT_SCHEMA.json can now be split into collapsible sections. Find out how to configure them here.

June 2019

  • Crawler

    Apify Crawler is being phased out. Please read this blog post to find out why we are retiring the Crawler product, what it means for you and how you can migrate your crawlers to a new actor, including the integrations.

  • API

    Actor task input can now be retrieved and modified using a special API endpoint.

  • App

    Users can now upload a custom profile picture in account settings.

  • App

    Users can upload an image to a published actor. This image will soon appear on its public page in Apify Store.

  • Actors

    Startup times of actors were optimized using CPU boost during the first 10 seconds of run.

  • Actors

    An actor run, along with its data, can now be shared using a public link that is available under the "info" tab.

  • Actors

    Tasks now support only JSON-encoded input. This also affects the API, which returns actor task input directly as an object under the input property instead of a JSON-encoded pair of body and contentType. See the API documentation of the get actor task endpoint.

  • General

    Apify Forum was moved to Stack Overflow.

May 2019

  • Actors

    Actor task input can be overloaded in scheduler.

  • Actors

    Added limit of 300 characters for description.

  • Actors

    New Dockerfile templates for multifile allow faster builds.

  • API

    Rate limit for dataset push items endpoint increased to 300 req/s per store.

  • API

    Added actor author username to list actor tasks.

  • API

    Added input schema to build detail.

  • Scheduler

    Schedules that use a predefined CRON expression such as @monthly, @weekly, @daily or @hourly randomly change the base times to ensure that schedules with the same expression will not all start at the same time. This measure is aimed at improving startup times and the performance of your actors and crawlers.

  • Actors

    Tasks can now be easily published as actors. Check out the knowledge base article to learn more.

  • Webhooks

    Request payload can now be modified in webhook configuration. Check out the webhooks documentation to learn more.

  • Webhooks

    Ad hoc webhooks now support an idempotency key to ensure that duplicate webhooks won't get created when an actor gets restarted. Check out the webhooks documentation to learn more.

  • Actors

    A web server running in an actor is not required to start within 120 s but can start at any time during the lifespan of its container.

  • Actors

    Git deployment key is now available via API (get actor endpoint).

  • Actors

    "Use spare CPU capacity" configuration was removed.
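
The randomization of predefined CRON expressions described in the Scheduler entry above can be pictured with a minimal sketch. The function name randomize_predefined_cron and the choice of which fields to randomize are assumptions made for illustration only; the changelog does not say how the scheduler actually picks the new base times.

```python
import random


def randomize_predefined_cron(expression: str, rng: random.Random) -> str:
    """Expand a predefined CRON shortcut into a five-field expression
    with a randomly chosen base time (hypothetical helper)."""
    minute = rng.randrange(60)  # 0-59
    hour = rng.randrange(24)    # 0-23
    if expression == "@hourly":
        return f"{minute} * * * *"
    if expression == "@daily":
        return f"{minute} {hour} * * *"
    if expression == "@weekly":
        # Assumption: the day of week (0-6) is randomized as well.
        return f"{minute} {hour} * * {rng.randrange(7)}"
    if expression == "@monthly":
        # Assumption: days 1-28 are used because they exist in every month.
        return f"{minute} {hour} {rng.randrange(1, 29)} * *"
    # Explicit expressions are left untouched.
    return expression


if __name__ == "__main__":
    rng = random.Random()
    for shortcut in ("@hourly", "@daily", "@weekly", "@monthly"):
        print(shortcut, "->", randomize_predefined_cron(shortcut, rng))
```

Two schedules that both use @daily would most likely end up with different base times under this scheme, which is the effect the entry describes.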

April 2019

March 2019

  • Actors

    Increased maximum memory for actor runs to 32 GB.

  • Actors

    Input UI for actor now validates proxy configuration.

  • API

    Added set of API endpoints to manage webhooks and retrieve webhook dispatches.

  • Actors

    New validation options were added to actor input schema field definitions, for example the maximum and minimum length of string and array fields, or a regular expression pattern for values of a string list field.

  • Proxy

    The URL of Apify Proxy now supports a new parameter, country, that restricts the selection of proxy IPs to the given country.

  • Actors

    A run can now metamorph into a run of another actor.

  • General

    Original Apify crawler has been open sourced as actor apify/legacy-phantomjs-crawler. This actor has the same input as original Apify crawler and also the same output format.

February 2019

  • API

    New set of API endpoints to retrieve and manage the last actor (task) run and its default storages. Check API documentation for more information.

  • Actors

    Source code editor extended with multifile support. Read more in the documentation.

  • Actors

    Runs with READY and RUNNING state are now pinned to the top of the actor runs list.

  • Actors

    New input UI fields added (key-value pairs, string list, hidden fields). All the field types now support nullable option. See documentation page for more information.

  • Actors

    Improved actor publication page.

January 2019

  • Actors

    New webhooks component enables integration of actors with external services and orchestration of multiple actors into single pipeline.

  • Actors

    Run console was improved and now provides a quick overview of actor run storages.

  • Actors

    Published actors now have a title that is displayed on their public library page.

  • Dataset

    Added support for hidden fields (i.e. fields starting with the # character). These fields may be used to store debug information such as errors, response codes, etc. that might be easily omitted from output.

  • Dataset

    Added new parameters to API endpoints returning dataset items: skipEmpty=true to omit empty items, skipHidden=1 to omit hidden fields, and clean=true as a shortcut for both skipEmpty and skipHidden.

  • API

    All endpoints with [username]~[resourceName] parameter in URL now support also [userId]~[resourceName] format.

  • App

    Code editor used at Apify app was replaced with modern Monaco editor that supports all ES6 features.

  • Actors

    Memory limit for free accounts increased to 8GB.

  • Actors

    Input UI for request list now supports web hosted or uploaded file with a list of URLs. Try out Crawler - cheerio to see it in action (Start URLs field).

  • Actors

    A published actor can now be marked as deprecated. A deprecated actor will be omitted from public library search and flagged as deprecated. Use this feature to tell people your actor is no longer being developed, since removing it might break integrations that depend on the actor.

  • App

    Replaced code editor with Monaco Editor which supports modern JavaScript features and provides better coding experience.
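
The effect of the skipHidden, skipEmpty and clean parameters from the Dataset entries above can be illustrated with a small client-side sketch. It only mimics the described behaviour on a list of dictionaries. The function filter_items is hypothetical, and the assumption that an item left with no fields after hidden ones are removed counts as empty is ours, not something the changelog states.

```python
def filter_items(items, skip_hidden=False, skip_empty=False, clean=False):
    """Mimic the dataset item parameters on an in-memory list of dicts."""
    if clean:
        # clean is described as a shortcut for both options.
        skip_hidden = skip_empty = True
    result = []
    for item in items:
        if skip_hidden:
            # Hidden fields are those whose name starts with "#".
            item = {k: v for k, v in item.items() if not k.startswith("#")}
        if skip_empty and not item:
            # Assumption: items that end up with no fields are treated as empty.
            continue
        result.append(item)
    return result


if __name__ == "__main__":
    items = [
        {"url": "https://example.com", "#debug": {"statusCode": 200}},
        {},
        {"#error": "timeout"},
    ]
    print(filter_items(items, clean=True))
```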

December 2018

  • API

    Removed the meta.clientIp field from several API endpoints due to privacy concerns

  • Actors

    Updated base Apify Docker images to use CMD rather than ENTRYPOINT instruction to launch the code. If you're using a custom Dockerfile that is based on Apify base images, make sure your CMD instruction is correct. See Dockerfile example for more information.

  • Web

    Added featured actors and crawlers to library. Added input schema and example run to actor detail page.

  • App

    Added new section with third-party login services to Account page

  • General

    Numerous performance and stability improvements, bugfixes

November 2018

  • App

    Dataset detail page now shows preview of the data.

  • CLI

    Added new commands to manage secret environment variables. Check apify secrets help for more details.

  • CLI

    Simplified apify.json file structure. It will be updated automatically before the apify run and apify push commands are executed. Read more in the documentation.

  • App

    Added new Orders section to enable customers to keep track of their custom projects. Read more in a blog post.

  • App

    A large number of user interface and performance improvements.

  • App

    Now you can set an additional billing email address that will receive a copy of all invoices. To set it, just go to your Subscription page, click Edit, set Billing email and click Update subscription.

  • API

    Apify Storage API endpoints (i.e. key-value stores, datasets, and request queues) that use an HTTP method other than GET are now authorized using the user's API token. Please see API documentation for more information. Note that we made a special exception in the system that will ensure that affected users will be able to continue using the API the old way. We'll send additional information to these users.

  • API

    Added new endpoints providing access to a particular version of an actor.

  • API

    Actor task input can now be overloaded via API. See documentation for more information.

October 2018

  • Actors

    Private Git repositories are now supported. Check documentation for more information.

  • Actors

    Improved actor UI - run console and source page have been redesigned for better developer experience.

  • Web

    Improved search in library.

  • Web

    A new page with awesome case studies was published.

  • Web

    Actors and crawlers in library are now organized by categories.

  • Actors

    "Is exclusive" functionality of the scheduler now also supports actors. If this option is checked, the scheduler won't start another run as long as the previous one is still running.

  • SDK

    New documentation for Apify SDK is now available at https://sdk.apify.com.

  • Actors

    Input of an actor and its input UI can be now described in input schema.

  • Actors

    Many new public actors with UI for input released in library: petr_cermak/booking-hotels, lukaskrivka/google-spreadsheet, jakubbalada/content-checker, apify/image-diff, ... Check out the library for more.

  • Tasks

    Released Apify actor tasks. Using them, you can create multiple configurations of a single actor and then run the selected configuration directly from Apify Platform, schedule or API.

  • Proxy

    New documentation of Apify proxy released. Contains examples in multiple languages and detailed description of all provided proxies - datacenter, residential, and Google SERP.

September 2018

  • SDK

    Released new major version v0.7 of apify NPM package. Check changelog for more information.

  • CLI

    Changed behaviour of apify run command and apify local storage directory name. Check migration guide if you are updating from version v0.1.*.

August 2018

  • Actors

    Added actor live view that enables connecting to running containers - read more on Apify Blog

  • App

    Major internal code consolidation and performance improvements

  • API

    Various bugfixes and improvements in code and documentation

  • Proxy

    Improvements in Google SERP proxies, adding additional providers

  • Integrations

    Added support for input file from other steps.

July 2018

  • Actors

    The memory option for actor runs now supports only values that are powers of 2 (i.e. 128MB, 256MB, 512MB, 1024MB, 2048MB, ...)!

  • Crawler

    Proxy configuration of crawler now offers "automatic" mode that rotates all the proxies available for a user.

  • Actors

    Each actor run can now start a web server accessible at a certain unique URL. This enables you to run a web server inside the actor to provide real-time snapshots or receive tasks on the fly. See documentation for more details.
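
The power-of-2 rule for the memory option above is easy to check before starting a run. The following is a minimal sketch; the function name is_valid_memory_mbytes is hypothetical, and the default minimum of 128 MB is taken from the June 2018 entry. Any upper limit that applies to your plan is not checked here.

```python
def is_valid_memory_mbytes(memory_mbytes: int, minimum: int = 128) -> bool:
    """Return True if the value is a power of 2 and at least `minimum` MB."""
    if memory_mbytes < minimum:
        return False
    # A power of 2 has exactly one bit set, so clearing its lowest set bit
    # with n & (n - 1) must yield zero.
    return (memory_mbytes & (memory_mbytes - 1)) == 0


if __name__ == "__main__":
    for value in (128, 1000, 1024, 2048):
        print(value, is_valid_memory_mbytes(value))
```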

June 2018

  • API

    Added API endpoints to abort Actor run and build.

  • Proxy

    New Apify Proxy service launched!

  • Integrations

    Added support for running actors in Keboola integration. Check knowledge article for more information.

  • Actors

    Minimum memory for actor runs is now 128MB.

  • CLI

    Added log streaming for apify push and apify call commands.

  • CLI

    Added a parameter to clean stores before running an actor locally. Check the docs for more information.

May 2018

  • SDK

    Bunch of improvements and new features. Check the changelog.

  • Crawler

    It is no longer possible to combine custom proxies and Apify proxy groups.

  • Actors

    Run console now shows information about current/max/average CPU and memory.

  • Actors

    Actors are now notified 120s before migration to another worker machine. Check documentation for more information.

April 2018

  • API

    Added a new API endpoint to obtain information about a user account

  • API

    Storage API now also supports use of [username]~[storage-name] instead of Dataset ID and Key-value store ID.

  • CLI

    We have just released an Apify CLI (command line tool) to simplify local development, debugging and deployment to Apify.

  • Storage

    New storage type for Actor platform that helps to manage dynamic queue of URLs to be processed. Check storage documentation for more information.

  • SDK

    apify NPM package contains a lot of new features. Check its changelog for details.

  • Actors

    The limit on the number of processes per actor run was increased to 2 x [memory megabytes], so with 2 GB of memory your limit is roughly 4000 processes.

  • Actors

    The host machine now sends a migrating event to the actor process in case of an upcoming restart or shutdown. Check the documentation.

March 2018

  • Actors

    Actor runs now have a fixed amount of CPU capacity reserved, and therefore each run should take about the same time. We also added a new checkbox "Use spare CPU capacity" in actor settings, allowing actors to use spare CPU capacity of the host machine as a free boost.

  • SDK

    We released a new version of our open source apify npm package containing a lot of new stuff to help you with your web scraping and automation projects. Check its npm page, source code at GitHub repository and the documentation.

  • Actors

    apify/actor-node-puppeteer Docker image is now deprecated. Use apify/actor-node-chrome image instead.

  • Actors

    We have added apify/actor-node-chrome-xvfb image that supports non-headless Chrome. If you choose this image then Apify.launchPuppeteer() opens Puppeteer with non-headless Chrome by default.

  • API client for JavaScript v0.2.0

    Method datasets.getItems() now returns object PaginationList with items wrapped inside instead of plain items array. This helps to iterate through all the items using pagination. This change is not backward compatible!

  • Actors

    We improved our infrastructure to speed up actor starts and improve overall performance.

  • Actors

    Logs are now rate-limited. Each actor run and build has 10 000 lines log credit with 10 lines added each second. Log lines over the limit won't be available in either the UI or the API.
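
The log credit described above behaves like a token bucket. Here is a minimal sketch of that idea, using the numbers from the entry (10 000 lines of initial credit, 10 lines added each second). The class name LogCredit and the assumption that the credit never grows beyond its initial value are ours; the changelog does not describe the actual implementation.

```python
from dataclasses import dataclass


@dataclass
class LogCredit:
    """Token-bucket style log credit that starts full and refills over time."""

    capacity: float = 10_000.0  # initial credit, from the changelog entry
    refill_per_s: float = 10.0  # lines added each second, from the entry
    credit: float = 10_000.0
    last_ts: float = 0.0

    def allow(self, now: float) -> bool:
        """Return True if a log line emitted at time `now` (seconds) is kept."""
        elapsed = max(0.0, now - self.last_ts)
        # Assumption: the credit is capped at the initial capacity.
        self.credit = min(self.capacity, self.credit + elapsed * self.refill_per_s)
        self.last_ts = now
        if self.credit >= 1.0:
            self.credit -= 1.0
            return True
        return False


if __name__ == "__main__":
    bucket = LogCredit()
    kept = sum(bucket.allow(0.0) for _ in range(10_050))
    print("kept lines in the initial burst:", kept)
```

In this model a run that bursts past its credit loses the extra lines, and afterwards can log at a sustained rate of about 10 lines per second.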

February 2018

  • Web

    Launched Page Analyzer tool to enable setting up crawlers with fewer manual steps. Read more on Apify blog.

  • Internal

    Major improvements to our Linux server configuration to improve stability and performance of the system

  • Actors

    Actors can now run with 16GB memory (available for users with Medium and Large plans, see https://apify.com/docs/actor#limits).

  • Actors

    Actor runs and their default key-value stores and datasets are now being deleted after data retention period.

January 2018

  • App

    We've added support for PayPal payments for all subscription plans

  • Actors

    The actor source code can now come from a GitHub Gist, which is much simpler than having a full Git repository (read the docs)

  • Support

    We have re-launched the Knowledge base with a new design and much better search options.

  • API

    Added API endpoint to run an actor and get its output in a single HTTP request.

  • Actors

    We've added a new storage type Dataset. This enables you to store results in a way similar to Apify Crawler.

  • Actors

    Actor usage statistics are now available in user account.

December 2017

  • Community

  • Actors

    Smarter allocation of tasks to servers to improve performance

  • Actors

    Environment variables can now also be passed to actor builds (as docker --build-arg parameter)

  • Actors

    Added option to automatically restart actor runs on error

  • Crawler

    Fixed URL in the link element of RSS formatted last crawler execution result. This bug caused some RSS readers to never refresh the data

November 2017

  • Crawler

  • Proxy

    Released a new NPM package called proxy-chain to support usage of proxies with password from headless Chrome

  • API

    Added support for XLSX output format for crawler results

  • App

    Upgraded the web app to Meteor 1.6 and thus greatly improved the speed of the app

  • Internal

    Improved internal notifications, performance and infrastructure improvements

  • Actors

    Added feature to enable actor to be anonymously runnable

October 2017

  • General

    Apifier is dead, long live Apify! On 9th October we launched our biggest upgrade yet.

  • Web

    The old website at www.apifier.com was replaced with public static website www.apify.com and the app running at my.apify.com

  • Actors

    A new product called Actor was introduced. Read more in our blog

  • Actors

    Added actor support to scheduler.

  • Actors

    Git and Zip file source type added to actor.

August 2017

July 2017

  • API

    API endpoint providing results in XML format now allows setting XML tag names.

  • API

    Added support for JSONL output format

  • Web

    Created Crawler request form to help customers specify the crawlers they would like to have built

June 2017

  • Crawler

    Added finish webhook data feature that enables sending of additional info in webhook request payload. (see docs)

  • Web

    Added a feature to delete user account

May 2017

  • Internal

    Improvements in logging system

  • General

    Officially launched Zapier integration

  • Crawler

    Added a new context.actId property that enables users to fetch information about their crawler. (see docs)

  • Internal

    Consolidated logging in the web application, improvements in Admin interface

April 2017

  • Crawler

    Added proxy groups crawler setting to simplify usage of proxy servers (see docs).

  • Web

    Added Schedule button to the crawler details page to simplify scheduling of the crawlers

  • Internal

    Improvements in administration interface

  • Web

    Performance optimizations in UI

  • Web

    Added a tool to test the crawler on a single URL only (see Run console on the crawler details page)

  • Internal

    Improved reports in admin section

  • Web

    Changed Twitter handle from @ApifierInfo to @apifier.

  • Crawler

    Bugfix - cookies set in the last page function were not persisted

  • Internal

    Deployed some upgrades in data storage infrastructure to improve performance and reduce costs

March 2017

  • Web

    Added sorting to Community crawlers.

  • Web

    Bugfixes, performance and cosmetic improvements.

  • Internal

    Improvements in administration interface.

  • Web

    Extended public user profile pages in Community crawlers.

  • API

    Bugfix in exports of results in XML format.

  • Crawler

    Added a new context.actExecutionId property that enables users to stop crawler during its execution, fetch results etc. (see docs).

  • Web

    Improvements in internal administration interface.

February 2017

  • Web

    Launched an external Apifier status page to keep our users informed about system status and potential outages.

  • Web

    Numerous improvements on Community crawlers page, added user profile page, enabled anonymous sharing

  • API

    Improved sorting of columns in CSV/HTML results table - values are now sorted according to numerical indexes (e.g. "val/0", ..., "val/9", "val/10")

  • Web

  • General

    Invoices are now in the PDF format and are sent to customers by email

  • General

    We didn't launch anything today, just wishing you a happy Valentine's Day

  • Web

    New testimonials from ePojisteni.cz and Finbox.io published on our Customers page. Thanks Dušan and Andy!

  • Web

    Released a major upgrade of billing and invoicing infrastructure to support European value-added tax (VAT)
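
The column ordering mentioned in the API entry above (values sorted according to numerical indexes, e.g. "val/0", ..., "val/9", "val/10") corresponds to what is often called a natural sort. A minimal sketch follows; the function name column_sort_key is hypothetical and this is not necessarily how the export is implemented.

```python
import re


def column_sort_key(name: str) -> list:
    """Split a column name into text and integer chunks so that
    "val/10" sorts after "val/9" instead of after "val/1"."""
    # With a capturing group, re.split returns text chunks at even
    # positions and digit runs at odd positions.
    parts = re.split(r"([0-9]+)", name)
    return [int(part) if index % 2 else part for index, part in enumerate(parts)]


if __name__ == "__main__":
    columns = ["val/10", "val/9", "val/0", "val/1"]
    print(sorted(columns, key=column_sort_key))
```

A plain string sort would place "val/10" right after "val/1", which is the ordering the entry says was fixed.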

January 2017

  • Web

    Added a new Video tutorials page

  • Crawler

    Improved normalization of URLs which is used by the crawler to determine whether a page has already been visited (see Request.uniqueKey property in docs for more details)

  • Ops

    Changed CDN provider from CloudFlare to AWS CloudFront to improve performance of web and API

  • API

    Bugfix in the start execution API endpoint - synchronous wait would sometimes time out after 60 seconds

  • Internal

    further improvements in administration interface

  • Web

    improved aggregation of usage statistics, now it refreshes automatically

  • Crawler

    Request.proxy is now available even inside of the page function

  • Web

    improved Invoices page

  • Internal

    improvements in administration interface

  • Web

    displaying snapshot of the crawling queue in the Run console

December 2016

  • API

    all paginated API endpoints now support desc=1 query parameter to sort records in descending order

  • API

    added support for XML attributes in results

  • General

    added support for RSS output format to enable creating RSS feeds for any website

  • General

    launched a new discussion forum

  • Crawler

    custom proxy used by a particular request is now saved in Request.proxy field (see Custom proxies in docs)

  • Crawler

    performance improvements

  • API

    enabled rate limiting

  • API

  • API

  • API

    support for synchronous execution of crawlers

  • API

    all endpoints that return lists now support pagination

  • API

    API Reference was greatly improved

  • Web

    Added new Tag and Do not start crawler if previous still running settings to schedules

  • General

    Added new Initial cookies setting to enable users to edit cookies used by their crawlers

November 2016

  • Web

    Added a list of invoices to Account page

  • Web

    Added a new usage stats chart to Account page

  • Internal

    Large improvements in the deployment system completed

  • General

    Increased the length limit for Start URLs to 2000 characters

  • Web

    Showing more relevant statistics in crawler progress bar

  • Web

    Released a new shiny API reference

  • Internal

    Performance and usability improvements in admin interface

  • Internal

    Migrated our main database to MongoDB 3.2, deployed new integration test suite, new metrics in admin interface

October 2016

  • Web

    Showing current service limits on the Account page, various internal improvements in user handling code

  • Web

    Added new example crawlers to demonstrate how to use page's internal JavaScript variable and AJAX calls

  • App

    Released Schedules, which enable running crawlers automatically at certain times.

  • Web

    Switched to Intercom to manage communication with our users

September 2016

  • Web

    Added functionality to test finish webhooks

  • Web

    Security fix - added rel="noopener" to all external links in order to avoid exploitation of the window.opener

  • Web

    Displaying Internal ID field on crawler details page, and User ID and API token on the Account page to simplify setup of integrations

  • Web

    Added a new Jobs page, because we're hiring!

  • Web

    Deployed various performance optimizations and bugfixes

  • Internal

    Updated our Meteor application to use ES2015 modules

  • Web

    Published a new testimonial from Shopwings on our Customers page. Thanks Guillaume!

  • Crawler

    queuePosition can now also be overridden in interceptRequest function (see docs)

  • Web

    Performance improvements of results exports

  • Web

    Added new example crawler to demonstrate a basic SEO analysis tool

  • Internal

    Upgraded Meteor platform from version 1.3 to 1.4

  • Docs

    Added API property name and type next to each crawler settings (see docs)

  • Crawler

    Added a new context.stats property to pass statistics from the current crawler to user code (see docs).

  • Crawler

    Added a new signature for context.enqueuePage() function that enables placing new pages to beginning of the crawling queue and overriding uniqueKey and label fields (see docs).

  • Crawler

    Enabled users to define custom User-Agent HTTP header, updated the default value to resemble latest Chrome on Windows.

  • Web

    Implemented optimization that enables user to export even large result sets to CSV/HTML format.

  • Web

    Created this wonderful page to keep our users up-to-date with new features