- ConsoleYou can now create multiple API tokens on the integrations page.
- ConsoleLogs from actor runs are now displayed in color.
- APIEndpoint to get a list of schedules now also returns a list of scheduled actors and actor tasks.
- ConsoleNew batch operations added for tables (resurrect runs, abort runs, schedule tasks, ...).
- Console"Is profile picture public" option replaced with "Is profile public" on the user profile page.
- DocsAdded new documentation section Web Scraping 101 as a complete source of information on web scraping techniques and anti-scraping protections.
- GeneralLead front-end developer wanted! Apply here!
- ConsolePossibility to abandon shared resource (actor, task or storage) added.
- ConsoleAdded SEO fields to actor publication settings.
- ConsoleAccount has been split into 2 pages - Account + Usage and billing.
- DocsStorage, Schedules and API sections updated.
- SDKSDK 0.21.1 released. Check details on GitHub.
- MarketplacePossibility to hide closed projects from "My jobs". Added notification to dataexpert for rejected offer.
- GeneralSuccessfully launched on Product Hunt. Make sure to upvote us!
- ConsoleUI Improvements - actors, tasks, schedules, dashboard, sign-in/up form and loading animation.
- ConsolePerformance improvements for actors and tasks.
- DocsBreadcrumbs introduced and categories added into the main navigation.
- SDKApify SDK 0.21.0 released - check details on GitHub.
- WebComplete visual redesign of this website. 🎨
- WebNew success story added - Keyword Research and Content Ideation Tool.
- ConsoleImproved overall performance of the Apify console, including load time, by reducing the size of the application bundle to about half of its original size.
- ConsoleRemaining storage tables rewritten for better performance and to remove the limit on how many items can be viewed.
- ConsoleYou can now download the complete source code of an actor as a ZIP archive directly from the multi-file code editor.
- DocsApify documentation is being restructured and extended with new sections.
- ConsoleSchedules now support time zone selection that takes daylight saving time (DST) into account.
- WebActor detail page - "How to run" section redesigned and renamed to API.
- WebIntroduced the solution providers page - Boost your revenue by adding our web scraping and RPA solutions as part of your product portfolio.
- ConsoleActor (task) runs tables rewritten for better performance and to remove the limit on how many items can be viewed.
- WebNew Affiliate program introduced - Send a customer our way and earn a recurring commission as our affiliate with no limits on what you can earn.
- WebPublished the new Custom solutions page - Marketplace or Enterprise? Select the option that suits your needs.
- WebOverview table added to the COVID-19 page.
- GeneralWe wanted to do our part in the fight against COVID-19, so we turned official pages with statistics into APIs that can be used by other apps. Use these coronavirus statistics APIs to get the latest data from multiple countries.
- DocsDocumentation is being continuously improved with new features such as a table of contents and links to previous/next articles.
- APISchedules can now be programmatically managed using an API. See the documentation for more information.
- ScrapersThe Web Scraper (apify/web-scraper) actor development experience has become much smoother with the addition of a state-of-the-art remote inspector, Chrome DevTools. In the Web Scraper input, set the Run type to DEVELOPMENT and enjoy a full DevTools experience either in the live view tab or in a separate window by visiting the container URL. See advanced configuration for breakpoint management and go kill them bugs.
- SDKRedesigned the existing examples in Apify SDK documentation page and added a bunch of new ones, addressing the most frequently asked questions on our support channels. Check them out in our examples section.
- GeneralWe’ve recently launched a referral program. Send a customer our way and earn a recurring commission as our affiliate. Interested in becoming a referral partner? Get in touch at email@example.com.
- ProxyDuring the free trial, Apify Proxy is available only from actors running on the Apify platform.
- ConsoleTables across the app are being rewritten to speed up loading and to remove the limit on how many items are retrieved. Check out the new key-value store view. Other tables will follow.
- ConsoleMemory settings of an actor (task) now show how many CPU cores will be proportionally allocated for the actor run.
- ConsoleUser can now optionally override existing actor when making a copy of an actor.
- ConsoleAccount settings including user subscription are now located in the top right of Apify console under the user icon.
- ConsoleIn-app chat is now hidden by default. If you need to contact support, you can always find it in the Help menu at the top of the page by clicking the `?` icon and choosing the "Chat with support" option.
- DocsSearch added to the documentation.
- StorageDownloaded datasets have human-friendly names.
- DocsMany improvements in docs and actor readme files such as syntax highlighting.
- ConsoleNew access rights system allows you to provide other users with limited access to your actors, tasks, and storages. Read more about access rights in blog post.
- DocsApify’s documentation has been completely redesigned and now has its own domain at docs.apify.com.
- MarketplaceOffers has been launched — a feature on Apify Marketplace that allows developers to prepare proposals for projects and share them directly with customers. See blog post for more information.
- StorageEach of the storage types (key-value store, dataset and request queue) can now be named and renamed using the API or manually in the Apify console.
- APIDataset get items API endpoint now returns unlimited number of items.
- WebSee how web scraping and automation with Apify can help your business at use cases.
- HelpNew help center is available at help.apify.com. It provides search across all the resources such as knowledge base articles, blog, documentation, Stack Overflow and Apify SDK website.
- APIDataset fetched in RSS format now contains `<lastBuildDate>` based on its modification date.
- GeneralApify Marketplace is now open for developers to join and earn money by developing custom solutions for Apify customers.
- WebApify website apify.com has been completely redesigned.
- ConsoleA finished (failed) actor run can now be resurrected back to a `RUNNING` state. Check out the documentation for more information.
- ConsoleScheduler now displays a log of both successful and failed invocations.
- GeneralStarting on Wednesday 31st July 2019, user accounts with disposable (temporary) email addresses will no longer be able to run actors, use Apify Proxy or call Apify API. To retain your access to Apify services, please change your email address on the Account - Settings page to a legitimate email provider.
- ActorActor and Task inputs specified through INPUT_SCHEMA.json can now be split into collapsible sections. Find out how to configure them here.
- CrawlerApify Crawler is being phased out. Please read this blog post to find out why we are retiring the Crawler product, what it means for you and how you can migrate your crawlers to a new actor, including the integrations.
- APIActor task input can be now retrieved and modified using a special API endpoint.
- ConsoleUsers can now upload custom profile picture in account settings.
- ConsoleUsers can upload image to published actor. This image will soon appear at its public page in Apify Store.
- ActorStartup times of actors were optimized using CPU boost during the first 10 seconds of run.
- ActorAn actor run along with its data can now be shared using a public link that is available under the "info" tab.
- ActorTasks now support only JSON-encoded input. This also affects the API, which returns actor task input directly as an object under the `input` property instead of a JSON-encoded pair of `contentType`. See the API documentation of the get actor task endpoint.
- GeneralApify Forum was moved to Stack Overflow.
- ActorActor task input can be overloaded in scheduler.
- ActorAdded limit of 300 characters for description.
- ActorNew Dockerfile templates for multifile allow faster builds.
- APIRate limit for dataset push items endpoint increased to 300 req/s per store.
- APIAdded actor author username to list actor tasks.
- APIAdded input schema to build detail.
- ConsoleSchedules that use a predefined CRON expression such as `@hourly` randomly change the base times to ensure that schedules with the same expression will not all start at the same time. This measure is aimed at improving startup times and the performance of your actors and crawlers.
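
One way such spreading could work is to derive a stable minute offset from a hash of the schedule's identifier. The sketch below is purely illustrative: `spread_hourly` and the schedule IDs are hypothetical, and it does not describe how the platform actually chooses its base times.

```python
import hashlib


def spread_hourly(schedule_id: str) -> str:
    """Expand @hourly into a cron expression with a stable per-schedule minute.

    Hypothetical helper: hashing the ID maps each schedule to a minute in
    0-59, so schedules sharing @hourly do not all fire at minute 0.
    """
    digest = hashlib.sha256(schedule_id.encode("utf-8")).digest()
    minute = int.from_bytes(digest[:4], "big") % 60
    return f"{minute} * * * *"


if __name__ == "__main__":
    for sid in ("schedule-a", "schedule-b", "schedule-c"):
        print(sid, "->", spread_hourly(sid))
```

Because the offset depends only on the ID, a given schedule keeps the same start minute from one hour to the next.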
- ActorTasks can be now easily published as actor. Check out knowledge base article to learn more.
- ConsoleRequest payload can be now modified in webhook configuration. Check out webhooks documentation to learn more.
- ConsoleAd hoc webhooks now support idempotency key to ensure that duplicate webhooks won't get created when actor gets restarted. Check out webhooks documentation to learn more.
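
The idea behind an idempotency key can be sketched with a small in-memory registry. This is an illustration of the general technique only: `WebhookRegistry`, its methods and the key format are hypothetical and are not the Apify API.

```python
from typing import Dict, List, Optional


class WebhookRegistry:
    """In-memory stand-in for a webhook store (hypothetical)."""

    def __init__(self) -> None:
        self._webhooks: List[dict] = []
        self._by_key: Dict[str, dict] = {}

    def create(self, request_url: str, idempotency_key: Optional[str] = None) -> dict:
        # A repeated key returns the webhook created the first time
        # instead of creating a duplicate.
        if idempotency_key is not None and idempotency_key in self._by_key:
            return self._by_key[idempotency_key]
        webhook = {"id": len(self._webhooks) + 1, "requestUrl": request_url}
        self._webhooks.append(webhook)
        if idempotency_key is not None:
            self._by_key[idempotency_key] = webhook
        return webhook

    def count(self) -> int:
        return len(self._webhooks)


if __name__ == "__main__":
    registry = WebhookRegistry()
    # Simulate an actor that registers its webhook again after a restart.
    first = registry.create("https://example.com/hook", idempotency_key="run-123")
    second = registry.create("https://example.com/hook", idempotency_key="run-123")
    print(first is second, registry.count())
```

Deriving the key from something stable across restarts, such as a run identifier, is what makes the second call a no-op.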
- ActorA web server running in an actor is not required to start within `120s` but can start at any time during the lifespan of its container.
- ActorGit deployment key is now available via API (get actor endpoint).
- Actor"Use spare CPU capacity" configuration was removed.
- ActorIncreased maximum memory for actor runs to 32 GB.
- ActorInput UI for actor now validates proxy configuration.
- ActorNew validation options added to actor input schema field definitions. For example, `min` length of string and array fields or a regular expression `pattern` for values of a string list field.
- ProxyURL of Apify proxy now supports a new parameter `country` that restricts proxy IP selection to a given country.
- ActorRun can now metamorph into run of another actor.
- GeneralOriginal Apify crawler has been open sourced as actor apify/legacy-phantomjs-crawler. This actor has the same input as original Apify crawler and also the same output format.
- APINew set of API endpoints to retrieve and manage the last actor (task) run and its default storages. Check API documentation for more information.
- ActorSource code editor extended with multifile support, more in documentation.
- ActorRuns with `RUNNING` state are now pinned to the top of the actor runs list.
- ActorNew input UI fields added (key-value pairs, string list, hidden fields). All the field types now support the `nullable` option. See documentation page for more information.
- ActorImproved actor publication page.
- ActorNew webhooks component enables integration of actors with external services and orchestration of multiple actors into single pipeline.
- ActorRun console was improved and provides quick overview of actor run storages.
- ActorPublished actors have a new title that is displayed on their public library page.
- StorageDataset now supports hidden fields (i.e. fields starting with the # character). These fields may be used to store debug information such as errors, response codes, etc. that can easily be omitted from the output.
- StorageAdded new parameters to API endpoints returning dataset items - `skipEmpty=true` to omit empty items, `skipHidden=1` to omit hidden fields and `clean=true` a shortcut for
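
For illustration, the effect of `skipEmpty` and `skipHidden` on a list of items can be sketched locally in a few lines. The function name `filter_items` and the order of the two steps (hidden fields are stripped before the emptiness check) are assumptions made for this sketch, not the API's implementation.

```python
from typing import Any, Dict, List


def filter_items(
    items: List[Dict[str, Any]],
    skip_empty: bool = False,
    skip_hidden: bool = False,
) -> List[Dict[str, Any]]:
    """Local approximation of the skipEmpty / skipHidden query parameters."""
    result = []
    for item in items:
        if skip_hidden:
            # Hidden fields are those whose name starts with '#'.
            item = {k: v for k, v in item.items() if not k.startswith("#")}
        if skip_empty and not item:
            continue  # drop items with no fields left
        result.append(item)
    return result


if __name__ == "__main__":
    items = [
        {"url": "https://example.com", "#debug": {"statusCode": 200}},
        {},
        {"#error": "timeout"},
    ]
    print(filter_items(items, skip_empty=True, skip_hidden=True))
```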
- APIAll endpoints with a `[username]~[resourceName]` parameter in the URL now also support
- ConsoleCode editor used at Apify console was replaced with modern Monaco editor that supports all ES6 features.
- ActorMemory limit for free accounts increased to 8GB.
- ActorA published actor can now be marked as deprecated. A deprecated actor will be omitted from public library search and flagged as deprecated. Use this feature to tell people your actor is no longer being developed, since removing it might break integrations that depend on the actor.
- APIRemoved the `meta.clientIp` field from several API endpoints due to privacy concerns.
- WebAdded featured actors and crawlers to library. Added input schema and example run to actor detail page.
- ConsoleAdded new section with third-party login services to Account page
- GeneralNumerous performance and stability improvements, bugfixes
- ConsoleDataset detail page now shows preview of the data.
- CLIAdded new commands to manage secret environment variables, check `apify secrets help` for more details.
- CLI`apify.json` file structure. It will be updated automatically before execution of the `apify push` command. Read more in the documentation.
- ConsoleA large number of user interface and performance improvements.
- ConsoleNow you can set an additional billing email address that will receive a copy of all invoices. To set it, just go to your Subscription page, click Edit, set Billing email and click Update subscription.
- APIApify Storage API endpoints (i.e. key-value stores, datasets, and request queues) that use other than GET HTTP method are now authorized using API token of user. Please see API documentation for more information. Note that we made a special exception in the system that will ensure that affected users will be able to continue using the API the old way. We'll send additional information to these users.
- APINew endpoints providing access to particular version of actor added.
- APIActor task input can be now overloaded via API. See documentation for more information.
- ActorPrivate Git repositories are now supported. Check documentation for more information.
- ActorImproved actor UI - run console and source page has been redesigned for better developer experience.
- WebImproved search in library.
- WebA new page with awesome case studies was published.
- WebActors and crawlers in library are now organized by categories.
- Actor"Is exclusive" functionality of the scheduler now also supports actors. If this option is checked, the scheduler won't start another run as long as the previous one is still running.
- SDKNew documentation for Apify SDK is now available at https://sdk.apify.com.
- ActorInput of an actor and its input UI can be now described in input schema.
- ActorReleased Apify actor tasks. Using them, you can create multiple configurations of a single actor and then run the selected configuration directly from Apify Platform, schedule or API.
- ActorAdded actor live view that enables connecting to running containers - read more on Apify Blog
- ConsoleMajor internal code consolidation and performance improvements
- APIVarious bugfixes and improvements in code and documentation
- ProxyImprovements in Google SERP proxies, adding additional providers
- IntegrationsAdded support for input file from other steps.
- ActorMemory option for actor runs now supports only values that are powers of 2 (e.g. 128MB, 256MB, 512MB, 1024MB, 2048MB, ...)!
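
A quick way to check a memory value against this rule, or to round a value up to the next allowed one, is a bit trick on the integer. The helper names below are hypothetical, and the 128 MB lower bound is taken from the examples in this entry as an assumption.

```python
def is_valid_memory_mbytes(value: int, minimum: int = 128) -> bool:
    """Return True if value is a power of 2 and at least `minimum` megabytes."""
    # A power of 2 has exactly one bit set, so value & (value - 1) is 0.
    return value >= minimum and (value & (value - 1)) == 0


def round_up_memory_mbytes(value: int, minimum: int = 128) -> int:
    """Round value up to the nearest allowed power of 2 (hypothetical helper)."""
    result = minimum
    while result < value:
        result *= 2
    return result


if __name__ == "__main__":
    for requested in (100, 128, 1000, 2048):
        print(requested, is_valid_memory_mbytes(requested), round_up_memory_mbytes(requested))
```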
- CrawlerProxy configuration of crawler now offers "automatic" mode that rotates all the proxies available for a user.
- ActorEach actor run can now start a web server accessible at a certain unique URL. This enables you to run a web server inside the actor to provide real-time snapshots or receive tasks on the fly. See documentation for more details.
- APIAdded API endpoints to abort Actor run and build.
- ProxyNew Apify Proxy service launched!
- IntegrationsAdded support for running actors in Keboola integration. Check knowledge article for more information.
- ActorMinimum memory for actor runs is now 128MB.
- CLIAdded log streaming for apify push and apify call commands.
- CLIAdded a parameter to clean stores before running an actor locally. Check the docs for more information.
- SDKBunch of improvements and new features. Check the changelog.
- CrawlerIt is no longer possible to combine custom proxies and Apify proxy groups.
- ActorRun console now shows information about current/max/average CPU and memory.
- ActorActors are now notified 120s before migration to another worker machine. Check documentation for more information.
- APIAdded a new API end-point to obtain information about a user account
- APIStorage API now also supports use of `[username]~[storage-name]` instead of Dataset ID and Key-value store ID.
- CLIWe have just released an Apify CLI (command line tool) to simplify local development, debugging and deployment to Apify.
- StorageNew storage type for Actor platform that helps to manage dynamic queue of URLs to be processed. Check storage documentation for more information.
- SDKapify NPM package contains a lot of new features. Check its changelog for details.
- ActorLimit for the number of processes per actor run was increased to `2 x [memory megabytes]`, so with 2 GB of memory your limit is 4000 processes.
- ActorHost machine now sends a `migrating` event to the actor process in case of an upcoming restart or shutdown. Check documentation.
- ActorActor runs now have a fixed amount of CPU capacity reserved and therefore each run should take about the same time. We also added a new checkbox "Use spare CPU capacity" in actor settings allowing actors to use spare CPU capacity at the host machine as a free boost.
- Actor`apify/actor-node-puppeteer` Docker image is now deprecated. Use
- ActorWe have added the `apify/actor-node-chrome-xvfb` image that supports non-headless Chrome. If you choose this image, then `Apify.launchPuppeteer()` opens Puppeteer with non-headless Chrome by default.
- ActorWe improved our infrastructure to speed up actor starts and overall performance.
- ActorLogs are now rate-limited. Each actor run and build has a log credit of 10 000 lines, with 10 lines added each second. Log lines over the limit won't be available in either the UI or the API.
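
This kind of credit scheme resembles a token bucket. The sketch below models it with the numbers from this entry; the class name `LogCredit` is hypothetical, and capping the credit at the initial 10 000 lines is an assumption, since the entry does not say whether unused credit accumulates beyond that.

```python
class LogCredit:
    """Token-bucket model of a per-run log line credit (illustrative only)."""

    def __init__(self, capacity: int = 10_000, refill_per_second: float = 10.0) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.credit = float(capacity)
        self.last_time = 0.0

    def allow_line(self, now: float) -> bool:
        """Return True if a log line emitted at time `now` (seconds) is kept."""
        elapsed = max(0.0, now - self.last_time)
        # Assumption: credit never grows past the initial capacity.
        self.credit = min(self.capacity, self.credit + elapsed * self.refill_per_second)
        self.last_time = now
        if self.credit >= 1.0:
            self.credit -= 1.0
            return True
        return False


if __name__ == "__main__":
    bucket = LogCredit()
    kept = sum(bucket.allow_line(0.0) for _ in range(12_000))
    print("lines kept from an instant burst of 12000:", kept)
```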
- WebLaunched Page Analyzer tool to enable setting up crawlers with fewer manual steps. Read more on Apify blog.
- InternalMajor improvements to our Linux server configuration to improve stability and performance of the system.
- ActorActors can now run with 16GB memory (available for users with Medium and Large plans, see https://docs.apify.com/actors/limits).
- ActorActor runs and their default key-value stores and datasets are now being deleted after data retention period.
- ConsoleWe've added support for PayPal payments for all subscription plans
- ActorThe actor source code can now come from a GitHub Gist, which is much simpler than having a full Git repository (read the docs)
- HelpWe have re-launched the Knowledge base with a new design and much better search options.
- APIAdded API endpoint to run an actor and get its output in a single HTTP request.
- ActorWe've added a new storage type Dataset. This enables you to store results in a way similar to Apify Crawler.
- ActorActor usage statistics are now available in user account.
- ActorSmarter allocation of tasks to servers to improve performance
- ActorEnvironment variables can now also be passed to actor builds (as docker
- ActorAdded option to automatically restart actor runs on error
- CrawlerFixed URL in the `link` element of RSS formatted last crawler execution result. This bug caused some RSS readers to never refresh the data.
- CrawlerAdded support for automatic rotation of user agents
- ProxyReleased a new NPM package called proxy-chain to support usage of proxies with password from headless Chrome
- APIAdded support for XLSX output format for crawler results
- ConsoleUpgraded the web app to Meteor 1.6 and thus greatly improved the speed of the app
- InternalImproved internal notifications, performance and infrastructure improvements.
- ActorAdded feature to enable actor to be anonymously runnable
- GeneralApifier is dead, long live Apify! On 9th October we launched our biggest upgrade yet.
- ActorAdded actor support to scheduler.
- ActorGit and Zip file source type added to actor.
- APIAPI endpoint providing results in XML format now allows setting XML tag names.
- APIAdded support for JSONL output format
- WebCreated Crawler request form to help customers specify the crawlers they would like to have built
- WebAdded a feature to delete user account
- CrawlerAdded proxy groups crawler setting to simplify usage of proxy servers (see docs).
- WebAdded Schedule button to the crawler details page to simplify scheduling of the crawlers
- InternalImprovements in administration interface
- WebPerformance optimizations in UI
- WebAdded a tool to test the crawler on a single URL only (see Run console on the crawler details page)
- InternalImproved reports in admin section
- WebChanged Twitter handle from @ApifierInfo to @apifier.
- CrawlerBugfix - cookies set in the last page function were not persisted
- InternalDeployed some upgrades in data storage infrastructure to improve performance and reduce costs
- WebAdded sorting to Community crawlers.
- WebBugfixes, performance and cosmetic improvements.
- Internalimprovements in administration interface.
- WebExtended public user profile pages in Community crawlers.
- APIBugfix in exports of results in XML format.
- CrawlerAdded a new `context.actExecutionId` property that enables users to stop crawler during its execution, fetch results etc. (see docs).
- WebImprovements in internal administration interface.
- WebLaunched an external Apifier status page to keep our users informed about system status and potential outages.
- WebNumerous improvements on Community crawlers page, added user profile page, enabled anonymous sharing
- APIImproved sorting of columns in CSV/HTML results table - values are now sorted according to numerical indexes (e.g. "val/0", ..., "val/9", "val/10")
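
The ordering described here is often called natural sort. A minimal sketch, assuming column names mix text with decimal indexes as in the example above (the function name `column_sort_key` is hypothetical and this is not the platform's actual code):

```python
import re
from typing import List, Union


def column_sort_key(name: str) -> List[Union[int, str]]:
    """Split a column name into text and number chunks for natural sorting."""
    parts = re.split(r"([0-9]+)", name)
    # With a capturing group, odd positions always hold the digit runs,
    # so numbers are compared with numbers and text with text.
    return [int(part) if index % 2 else part for index, part in enumerate(parts)]


if __name__ == "__main__":
    columns = ["val/10", "val/9", "url", "val/0"]
    print(sorted(columns))                       # plain string order
    print(sorted(columns, key=column_sort_key))  # numeric-aware order
```

With a plain string sort, "val/10" lands before "val/9"; the key function compares the index chunks as integers instead.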
- WebLaunched Apifier community page
- GeneralInvoices are now in the PDF format and are sent to customers by email
- GeneralWe didn't launch anything today, just wishing you a happy Valentine's Day
- WebReleased a major upgrade of billing and invoicing infrastructure to support European value-added tax (VAT)
- WebAdded a new Video tutorials page
- CrawlerImproved normalization of URLs which is used by the crawler to determine whether a page has already been visited (see Request.uniqueKey property in docs for more details)
- InternalChanged CDN provider from CloudFlare to AWS CloudFront to improve performance of web and API
- APIBugfix in the start execution API endpoint - synchronous wait would sometimes time out after 60 seconds
- Internalfurther improvements in administration interface
- Webimproved aggregation of usage statistics, now it refreshes automatically
- CrawlerRequest.proxy is now available even inside of the page function
- Webimproved Invoices page
- Internalimprovements in administration interface
- Webdisplaying snapshot of the crawling queue in the Run console
- APIall paginated API endpoints now support `desc=1` query parameter to sort records in descending order
- APIadded support for XML attributes in results
- Generaladded support for RSS output format to enable creating RSS feeds for any website
- Generallaunched a new discussion forum
- Crawlercustom proxy used by a particular request is now saved in `Request.proxy` field (see Custom proxies in docs)
- Crawlerperformance improvements
- APIenabled rate limiting
- APIMajor API upgrades
- APIadded new endpoints to update and delete crawlers
- APIsupport for synchronous execution of crawlers
- APIall endpoints that return lists now support pagination
- APIAPI Reference was greatly improved
- WebAdded new Tag and Do not start crawler if previous still running settings to schedules
- GeneralAdded new Initial cookies setting to enable users to edit cookies used by their crawlers
- WebAdded a new usage stats chart to Account page
- InternalLarge improvements in the deployment system completed
- GeneralIncreased the length limit for Start URLs to 2000 characters
- WebShowing more relevant statistics in crawler progress bar
- WebReleased a new shiny API reference
- InternalPerformance and usability improvements in admin interface
- InternalMigrated our main database to MongoDB 3.2, deployed new integration test suite, new metrics in admin interface
- WebShowing current service limits on the Account page, various internal improvements in user handling code
- ConsoleReleased Schedules that enable running crawlers automatically at certain times.
- WebSwitched to Intercom to manage communication with our users
- WebAdded functionality to test finish webhooks
- WebSecurity fix - added `rel="noopener"` to all external links in order to avoid exploitation of the
- WebDisplaying Internal ID field on crawler details page, and User ID and API token on the Account page to simplify setup of integrations
- WebAdded a new Jobs page, because we're hiring!
- WebDeployed various performance optimizations and bugfixes
- InternalUpdated our Meteor application to use ES2015 modules
- Crawler`queuePosition` can now also be overridden in the `interceptRequest` function (see docs)
- WebPerformance improvements of results exports
- WebAdded new example crawler to demonstrate a basic SEO analysis tool
- InternalUpgraded Meteor platform from version 1.3 to 1.4
- DocsAdded API property name and type next to each crawler settings (see docs)
- CrawlerAdded a new `context.stats` property to pass statistics from the current crawler to user code (see docs).
- CrawlerAdded a new signature for the `context.enqueuePage()` function that enables placing new pages at the beginning of the crawling queue and overriding `label` fields (see docs).
- CrawlerEnabled users to define custom User-Agent HTTP header, updated the default value to resemble latest Chrome on Windows.
- WebImplemented optimization that enables user to export even large result sets to CSV/HTML format.
- WebCreated this wonderful page to keep our users up-to-date with new features