- AppNew access rights system allows you to provide other users with limited access to your actors, tasks, and storages. Read more about access rights in blog post.
- DocsApify’s documentation has been completely redesigned and now has its own domain at docs.apify.com.
- MarketplaceOffers has been launched — a feature on Apify Marketplace that allows developers to prepare proposals for projects and share them directly with customers. See blog post for more information.
- StorageEach of the storage types (key-value store, dataset, and request queue) can now be named and renamed using the API or manually in the Apify app.
- APIDataset get items API endpoint now returns unlimited number of items.
- WebSee how web scraping and automation with Apify can help your business at use cases.
- HelpNew help center is available at help.apify.com. It provides search across all the resources such as knowledge base articles, blog, documentation, Stack Overflow and Apify SDK website.
- APIDataset fetched in RSS format now contains `<lastBuildDate>` based on its modification date.
- GeneralApify Marketplace is now open for developers to join and earn money by developing custom solutions for Apify customers.
- WebApify website apify.com has been completely redesigned.
- WebPage analyzer improved. Check it out!
- AppFinished (failed) actor run can now be resurrected back to a `RUNNING` state. Check out the documentation for more information.
- AppScheduler now displays a log of both successful and failed invocations.
- GeneralStarting on Wednesday 31st July 2019, user accounts with disposable (temporary) email addresses will no longer be able to run actors, use Apify Proxy or call Apify API. To retain your access to Apify services, please change your email address on the Account - Settings page to a legitimate email provider.
- ActorsActor and Task inputs specified through INPUT_SCHEMA.json can now be split into collapsible sections. Find out how to configure them here.
- CrawlerApify Crawler is being phased out. Please read this blog post to find out why we are retiring the Crawler product, what it means for you and how you can migrate your crawlers to a new actor, including the integrations.
- APIActor task input can be now retrieved and modified using a special API endpoint.
- AppUsers can now upload custom profile picture in account settings.
- AppUsers can upload image to published actor. This image will soon appear at its public page in Apify Store.
- ActorsStartup times of actors were optimized using CPU boost during the first 10 seconds of run.
- ActorsActor run along with its data can be now shared using public link that is available under the tab "info".
- ActorsTasks now support only JSON-encoded input. This also affects the API, which returns actor task input directly as an object under the `input` property instead of as a JSON-encoded pair that includes `contentType`. See the API documentation of the get actor task endpoint.
- GeneralApify Forum was moved to Stack Overflow.
- ActorsActor task input can be overloaded in scheduler.
- ActorsAdded limit of 300 characters for description.
- ActorsNew Dockerfile templates for multifile allow faster builds.
- APIRate limit for dataset push items endpoint increased to 300 req/s per store.
- APIAdded actor author username to list actor tasks.
- APIAdded input schema to build detail.
- SchedulerSchedules that use a predefined CRON expression such as `@hourly` randomly change the base times to ensure that schedules with the same expression will not all start at the same time. This measure is aimed at improving startup times and the performance of your actors and crawlers.
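The effect of randomized base times can be illustrated with a minimal sketch. It is not the scheduler's actual implementation: the `PREDEFINED` mapping and the `randomize_predefined` helper are hypothetical names, and the assumption is that a predefined expression expands to a five-field cron expression whose minute (and, for daily schedules, hour) is picked at random.

```python
import random

# Hypothetical templates: "{m}" and "{h}" are filled with a random offset.
PREDEFINED = {
    "@hourly": "{m} * * * *",
    "@daily": "{m} {h} * * *",
}


def randomize_predefined(expr, rng=None):
    """Expand a predefined expression into a cron expression with a random base time."""
    rng = rng or random.Random()
    template = PREDEFINED[expr]
    # Unused placeholders (e.g. "{h}" for @hourly) are simply ignored by format().
    return template.format(m=rng.randrange(60), h=rng.randrange(24))


# Two schedules with the same expression will usually get different minutes.
print(randomize_predefined("@hourly"))
print(randomize_predefined("@hourly"))
```

Spreading the start minute this way avoids every `@hourly` schedule firing at minute 0.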
- ActorsTasks can be now easily published as actor. Check out knowledge base article to learn more.
- WebhooksRequest payload can be now modified in webhook configuration. Check out webhooks documentation to learn more.
- WebhooksAd hoc webhooks now support idempotency key to ensure that duplicate webhooks won't get created when actor gets restarted. Check out webhooks documentation to learn more.
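To show why an idempotency key prevents duplicates, here is a small in-memory sketch. The `WebhookRegistry` class and its field names are hypothetical and only illustrate the general technique; the platform's real behavior is described in the webhooks documentation.

```python
class WebhookRegistry:
    """Illustrative store that creates at most one webhook per idempotency key."""

    def __init__(self):
        self._by_key = {}

    def create(self, idempotency_key, request_url):
        existing = self._by_key.get(idempotency_key)
        if existing is not None:
            # Same key seen before (e.g. after an actor restart): reuse, do not duplicate.
            return existing
        webhook = {"id": len(self._by_key) + 1, "requestUrl": request_url}
        self._by_key[idempotency_key] = webhook
        return webhook


registry = WebhookRegistry()
first = registry.create("run-123-finished", "https://example.com/hook")
second = registry.create("run-123-finished", "https://example.com/hook")
print(first is second)
```

A restarted actor that repeats the same call with the same key gets the original webhook back instead of creating a second one.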
- ActorsWeb server running in an actor is no longer required to start within 120 s but can start at any time during the lifespan of its container.
- ActorsGit deployment key is now available via API (get actor endpoint).
- Actors"Use spare CPU capacity" configuration was removed.
- ActorsIncreased maximum memory for actor runs to 32 GB.
- ActorsInput UI for actor now validates proxy configuration.
- ActorsNew validation options added to actor input schema field definitions. For example, the min length of string and array fields or a regular expression `pattern` for values of string list fields.
- ProxyURL of Apify proxy now supports a new parameter `country` that restricts proxy IP selection to a given country.
- ActorsRun can now metamorph into run of another actor.
- GeneralOriginal Apify crawler has been open sourced as actor apify/legacy-phantomjs-crawler. This actor has the same input as original Apify crawler and also the same output format.
- APINew set of API endpoints to retrieve and manage the last actor (task) run and its default storages. Check API documentation for more information.
- ActorsSource code editor extended with multifile support, more in documentation.
- ActorsRuns with `RUNNING` state are now pinned to the top of the actor runs list.
- ActorsNew input UI fields added (key-value pairs, string list, hidden fields). All the field types now support the `nullable` option. See documentation page for more information.
- ActorsImproved actor publication page.
- ActorsNew webhooks component enables integration of actors with external services and orchestration of multiple actors into single pipeline.
- ActorsRun console was improved and provides quick overview of actor run storages.
- ActorsPublished actors have new title that is displayed at its public library page.
- DatasetAdded support for hidden fields (i.e. fields starting with the # character). These fields may be used to store debug information such as errors, response codes, etc. that might be easily omitted from output.
- DatasetAdded new parameters to API endpoints returning dataset items: `skipEmpty=true` to omit empty items, `skipHidden=1` to omit hidden fields, and `clean=true` as a shortcut.
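The filtering these parameters describe can be sketched locally. This is an illustration only, not the API's implementation: `filter_items` is a hypothetical helper, and it assumes hidden fields are stripped before the emptiness check, so an item that contained only hidden fields is treated as empty.

```python
def filter_items(items, skip_empty=False, skip_hidden=False):
    """Mimic the described options on a list of dict items."""
    result = []
    for item in items:
        if skip_hidden:
            # Hidden fields are those whose name starts with "#".
            item = {k: v for k, v in item.items() if not k.startswith("#")}
        if skip_empty and not item:
            continue
        result.append(item)
    return result


items = [
    {"url": "https://example.com", "#debug": {"status": 200}},
    {},
    {"#error": "timeout"},
]
print(filter_items(items, skip_empty=True, skip_hidden=True))
# prints [{'url': 'https://example.com'}]
```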
- APIAll endpoints with `[username]~[resourceName]` parameter in URL now support also
- AppCode editor used at Apify app was replaced with modern Monaco editor that supports all ES6 features.
- ActorsMemory limit for free accounts increased to 8GB.
- ActorsPublished actor can now be marked as deprecated. A deprecated actor will be omitted from public library search and flagged as deprecated. Use this feature to tell people your actor is no longer being developed, since removing it might break integrations that depend on the actor.
- APIRemoved the `meta.clientIp` field from several API endpoints due to privacy concerns
- WebAdded featured actors and crawlers to library. Added input schema and example run to actor detail page.
- AppAdded new section with third-party login services to Account page
- GeneralNumerous performance and stability improvements, bugfixes
- AppDataset detail page now shows preview of the data.
- CLIAdded new commands to manage secret environment variables, check `apify secrets help` for more details.
`apify.json` file structure. It will be updated automatically before the `apify push` command is executed. Read more in the documentation.
- AppA large number of user interface and performance improvements.
- AppNow you can set an additional billing email address that will receive copy of all invoices. To set it, just go to your Subscription page, click Edit, set Billing email and click Update subscription.
- APIApify Storage API endpoints (i.e. key-value stores, datasets, and request queues) that use other than GET HTTP method are now authorized using API token of user. Please see API documentation for more information. Note that we made a special exception in the system that will ensure that affected users will be able to continue using the API the old way. We'll send additional information to these users.
- APINew endpoints providing access to particular version of actor added.
- APIActor task input can be now overloaded via API. See documentation for more information.
- ActorsPrivate Git repositories are now supported. Check documentation for more information.
- ActorsImproved actor UI - run console and source page has been redesigned for better developer experience.
- WebImproved search in library.
- WebA new page with awesome case studies was published.
- WebActors and crawlers in library are now organized by categories.
- Actors"Is exclusive" functionality of scheduler now also supports actors. If this option is checked, the scheduler won't start another run as long as the previous one is still running.
- SDKNew documentation for Apify SDK is now available at https://sdk.apify.com.
- ActorsInput of an actor and its input UI can be now described in input schema.
- TasksReleased Apify actor tasks. Using them, you can create multiple configurations of a single actor and then run the selected configuration directly from Apify Platform, schedule or API.
- ActorsAdded actor live view that enables connecting to running containers - read more on Apify Blog
- AppMajor internal code consolidation and performance improvements
- APIVarious bugfixes and improvements in code and documentation
- ProxyImprovements in Google SERP proxies, adding additional providers
- IntegrationsAdded support for input file from other steps.
- ActorsMemory option for actor runs now supports only values that are a power of 2 (i.e. 128 MB, 256 MB, 512 MB, 1024 MB, 2048 MB, ...)!
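A quick way to check whether a memory value satisfies this rule is the usual bit trick for powers of two. The helper name `is_valid_memory_mb` is hypothetical, and the sketch does not model any minimum or maximum the platform may also enforce.

```python
def is_valid_memory_mb(memory_mb):
    """Return True if memory_mb is a positive power of two."""
    # A power of two has exactly one bit set, so n & (n - 1) clears it to zero.
    return memory_mb > 0 and (memory_mb & (memory_mb - 1)) == 0


candidates = (128, 256, 300, 1024, 1536)
print([m for m in candidates if is_valid_memory_mb(m)])
# prints [128, 256, 1024]
```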
- CrawlerProxy configuration of crawler now offers "automatic" mode that rotates all the proxies available for a user.
- ActorsEach actor run can now start a web server accessible at a certain unique URL. This enables you to run a web server inside the actor to provide real-time snapshots or receive tasks on the fly. See documentation for more details.
- APIAdded API endpoints to abort Actor run and build.
- ProxyNew Apify Proxy service launched!
- IntegrationsAdded support for running actors in Keboola integration. Check knowledge article for more information.
- ActorsMinimum memory for actor runs is now 128MB.
- CLIAdded log streaming for apify push and apify call commands.
- CLIAdded a parameter to clean stores before running an actor locally. Check the docs for more information.
- SDKBunch of improvements and new features. Check the changelog.
- CrawlerIt is no longer possible to combine custom proxies and Apify proxy groups.
- ActorsRun console now shows information about current/max/average CPU and memory.
- ActorsActors are now notified 120s before migration to another worker machine. Check documentation for more information.
- APIAdded a new API end-point to obtain information about a user account
- APIStorage API now also supports use of `[username]~[storage-name]` instead of Dataset ID and Key-value store ID.
- CLIWe have just released an Apify CLI (command line tool) to simplify local development, debugging and deployment to Apify.
- StorageNew storage type for Actor platform that helps to manage dynamic queue of URLs to be processed. Check storage documentation for more information.
- SDKapify NPM package contains a lot of new features. Check its changelog for details.
- ActorsLimit for number of processes per actor run was increased to `2 x [memory megabytes]`, so with 2 GB memory your limit is 4000 processes.
- ActorsHost machine now sends `migrating` event to actor process in case of an upcoming restart or shutdown. Check documentation.
- ActorsActor runs now have a fixed amount of CPU capacity reserved and therefore each run should take about the same time. We also added a new checkbox "Use spare CPU capacity" in actor settings, allowing actors to use spare CPU capacity on the host machine as a free boost.
`apify/actor-node-puppeteer` Docker image is now deprecated. Use
- ActorsWe have added the `apify/actor-node-chrome-xvfb` image that supports non-headless Chrome. If you choose this image, then `Apify.launchPuppeteer()` opens Puppeteer with non-headless Chrome by default.
- ActorsWe improved our infrastructure to speed up actor starts and overall performance.
- ActorsLogs are now rate-limited. Each actor run and build has a log credit of 10,000 lines, with 10 lines added each second. Log lines over the limit won't be available in either the UI or the API.
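The credit scheme described for logs resembles a token bucket. The sketch below is an approximation under stated assumptions, not the platform's code: the `LogCredit` class is hypothetical, time is passed in explicitly as seconds, and the credit is assumed to be capped at its initial value, which the entry above does not state.

```python
class LogCredit:
    """Token-bucket style line budget: starts full and refills at a fixed rate."""

    def __init__(self, capacity=10_000, refill_per_second=10):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.credit = float(capacity)
        self.last = 0.0

    def allow(self, now):
        """Return True if a log line emitted at time `now` (in seconds) is kept."""
        elapsed = max(0.0, now - self.last)
        # Assumption: accumulated credit never exceeds the initial capacity.
        self.credit = min(self.capacity, self.credit + elapsed * self.refill_per_second)
        self.last = now
        if self.credit >= 1:
            self.credit -= 1
            return True
        return False


# Small numbers make the behavior easy to follow: 3 lines of credit, 1 line/second refill.
bucket = LogCredit(capacity=3, refill_per_second=1)
print([bucket.allow(now=0.0) for _ in range(5)])  # prints [True, True, True, False, False]
print(bucket.allow(now=2.0))  # prints True
```

With the defaults, a run could emit a burst of 10,000 lines at once and then sustain about 10 lines per second.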
- InternalMajor improvements to our Linux server configuration to improve stability and performance of the system
- ActorsActors can now run with 16 GB memory (available for users with Medium and Large plans, see https://apify.com/docs/actor#limits)
- ActorsActor runs and their default key-value stores and datasets are now being deleted after data retention period.
- AppWe've added support for PayPal payments for all subscription plans
- ActorsThe actor source code can now come from a GitHub Gist, which is much simpler than having a full Git repository (read the docs)
- SupportWe have re-launched the Knowledge base with a new design and much better search options.
- APIAdded API endpoint to run an actor and get its output in a single HTTP request.
- ActorsWe've added a new storage type Dataset. This enables you to store results in a way similar to Apify Crawler.
- ActorsActor usage statistics are now available in user account.
- ActorsSmarter allocation of tasks to servers to improve performance
- ActorsEnvironment variables can now also be passed to actor builds (as docker
- ActorsAdded option to automatically restart actor runs on error
- CrawlerFixed URL in the `link` element of RSS formatted last crawler execution result. This bug caused some RSS readers to never refresh the data
- CrawlerAdded support for automatic rotation of user agents
- ProxyReleased a new NPM package called proxy-chain to support usage of proxies with password from headless Chrome
- APIAdded support for XLSX output Format for crawler results
- AppUpgraded the web app to Meteor 1.6 and thus greatly improved the speed of the app
- InternalImproved internal notifications, performance and infrastructure improvements
- ActorsAdded feature to enable actor to be anonymously runnable
- GeneralApifier is dead, long live Apify! On 9th October we launched our biggest upgrade yet.
- ActorsAdded actor support to scheduler.
- ActorsGit and Zip file source type added to actor.
- APIAPI endpoint providing results in XML format now allows setting XML tag names.
- APIAdded support for JSONL output format
- WebCreated Crawler request form to help customers specify the crawlers they would like to have built
- WebAdded a feature to delete user account
- CrawlerAdded proxy groups crawler setting to simplify usage of proxy servers (see docs).
- WebAdded Schedule button to the crawler details page to simplify scheduling of the crawlers
- InternalImprovements in administration interface
- WebPerformance optimizations in UI
- WebAdded a tool to test the crawler on a single URL only (see Run console on the crawler details page)
- InternalImproved reports in admin section
- WebChanged Twitter handle from @ApifierInfo to @apifier.
- CrawlerBugfix - cookies set in the last page function were not persisted
- InternalDeployed some upgrades in data storage infrastructure to improve performance and reduce costs
- WebAdded sorting to Community crawlers.
- WebBugfixes, performance and cosmetic improvements.
- InternalImprovements in administration interface.
- WebExtended public user profile pages in Community crawlers.
- APIBugfix in exports of results in XML format.
- CrawlerAdded a new `context.actExecutionId` property that enables users to stop crawler during its execution, fetch results etc. (see docs).
- WebImprovements in internal administration interface.
- WebLaunched an external Apifier status page to keep our users informed about system status and potential outages.
- WebNumerous improvements on Community crawlers page, added user profile page, enabled anonymous sharing
- APIImproved sorting of columns in CSV/HTML results table - values are now sorted according to numerical indexes (e.g. "val/0", ..., "val/9", "val/10")
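The ordering described here is often called natural sorting. A minimal sketch of one common way to get it follows; `natural_key` is a hypothetical helper and not necessarily how the export code does it.

```python
import re


def natural_key(name):
    """Split a column name into text and integer chunks so numbers compare numerically."""
    # re.split with a capturing group alternates text (even positions) and digits (odd positions).
    return [int(part) if part.isdigit() else part for part in re.split(r"(\d+)", name)]


columns = ["val/10", "val/9", "val/0", "title"]
print(sorted(columns))                   # prints ['title', 'val/0', 'val/10', 'val/9']
print(sorted(columns, key=natural_key))  # prints ['title', 'val/0', 'val/9', 'val/10']
```

Plain string sorting puts "val/10" before "val/9" because it compares character by character; the key function compares the numeric parts as integers instead.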
- WebLaunched Apifier community page
- GeneralInvoices are now in the PDF format and are sent to customers by email
- GeneralWe didn't launch anything today, just wishing you a happy Valentine's Day
- WebReleased a major upgrade of billing and invoicing infrastructure to support European value-added tax (VAT)
- WebAdded a new Video tutorials page
- CrawlerImproved normalization of URLs which is used by the crawler to determine whether a page has already been visited (see Request.uniqueKey property in docs for more details)
- OpsChanged CDN provider from CloudFlare to AWS CloudFront to improve performance of web and API
- APIBugfix in the start execution API endpoint - synchronous wait would sometimes time out after 60 seconds
- InternalFurther improvements in administration interface
- WebImproved aggregation of usage statistics, now it refreshes automatically
- CrawlerRequest.proxy is now available even inside of the page function
- WebImproved Invoices page
- InternalImprovements in administration interface
- WebDisplaying snapshot of the crawling queue in the Run console
- APIAll paginated API endpoints now support `desc=1` query parameter to sort records in descending order
- APIAdded support for XML attributes in results
- GeneralAdded support for RSS output format to enable creating RSS feeds for any website
- GeneralLaunched a new discussion forum
- CrawlerCustom proxy used by a particular request is now saved in `Request.proxy` field (see Custom proxies in docs)
- CrawlerPerformance improvements
- APIEnabled rate limiting
- APIMajor API upgrades
- APISupport for synchronous execution of crawlers
- APIAll endpoints that return lists now support pagination
- APIAPI Reference was greatly improved
- WebAdded new Tag and Do not start crawler if previous still running settings to schedules
- GeneralAdded new Initial cookies setting to enable users to edit cookies used by their crawlers
- WebAdded a new usage stats chart to Account page
- InternalLarge improvements in the deployment system completed
- GeneralIncreased the length limit for Start URLs to 2000 characters
- WebShowing more relevant statistics in crawler progress bar
- WebReleased a new shiny API reference
- InternalPerformance and usability improvements in admin interface
- InternalMigrated our main database to MongoDB 3.2, deployed new integration test suite, new metrics in admin interface
- WebShowing current service limits on the Account page, various internal improvements in user handling code
- AppReleased Schedules that let you automatically run crawlers at certain times.
- WebSwitched to Intercom to manage communication with our users
- WebAdded functionality to test finish webhooks
- WebSecurity fix - added `rel="noopener"` to all external links in order to avoid exploitation of the
- WebDisplaying Internal ID field on crawler details page, and User ID and API token on the Account page to simplify setup of integrations
- WebAdded a new Jobs page, because we're hiring!
- WebDeployed various performance optimizations and bugfixes
- InternalUpdated our Meteor application to use ES2015 modules
`queuePosition` can now also be overridden in `interceptRequest` function (see docs)
- WebPerformance improvements of results exports
- WebAdded new example crawler to demonstrate a basic SEO analysis tool
- InternalUpgraded Meteor platform from version 1.3 to 1.4
- DocsAdded API property name and type next to each crawler settings (see docs)
- CrawlerAdded a new `context.stats` property to pass statistics from the current crawler to user code (see docs).
- CrawlerAdded a new signature for `context.enqueuePage()` function that enables placing new pages at the beginning of the crawling queue and overriding `label` fields (see docs).
- CrawlerEnabled users to define custom User-Agent HTTP header, updated the default value to resemble latest Chrome on Windows.
- WebImplemented optimization that enables user to export even large result sets to CSV/HTML format.
- WebCreated this wonderful page to keep our users up-to-date with new features