Twitter Hashtag Scraper
This Twitter Hashtag Scraper scrapes and extracts all tweets for a given hashtag and provides output in JSON, XML, CSV or HTML.
Article Text Extractor
Simply extracts article text and other meta info from a given URL. Uses https://github.com/ageitgey/node-unfluff, which is a NodeJS implementation of https://github.com/grangier/python-goose.
Crawler To Spreadsheet
This crawler takes the result of the last crawler run and stores new items in a Google Docs spreadsheet.
Example Hacker News
Example crawler for news.ycombinator.com built using the Apify SDK.
Url List Download Html
This act accepts a URL list and downloads the HTML of each page. It has one input parameter, "sources" (see the sources parameter of UrlList https://www.apify.com/docs/sdk/apify-runtime-js/beta#RequestList).
Aliexpress.com - own orders
Get all your orders from aliexpress.com in a machine-readable format.
Crawl Url List 1by1
Crawls a given list of URLs with one crawler execution per URL.
Skoda-auto.cz - model variants
Get all model-engine-equipment package variants of Škoda Auto cars.
This act creates a timeline spreadsheet from crawler results. The main use case is a spreadsheet that records how a web page changes over time.
Scrapes the links with their rank from HN Show. Created for this blog post: https://medium.com/p/8cccfa25f5cb/edit
Puppeteer Promise Pool Example
Example of how to use Puppeteer in parallel using the 'es6-promise-pool' npm package.
Crawler To Sitemap
This act can be used as a crawler's finish webhook. It transforms the crawler's result into a sitemap XML file and stores it in the key-value store named "sitemaps".
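The transformation step described above can be sketched roughly as follows. This is a minimal illustration, not the act's actual code: it assumes the crawler result has already been reduced to a plain list of page URLs, and the helper name build_sitemap is hypothetical. Storing the output in a key-value store is left out.

```python
from xml.etree import ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string listing each URL once, in input order.

    `urls` is assumed to be an iterable of absolute page URLs taken
    from a crawler result (hypothetical input shape).
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    seen = set()
    for url in urls:
        if url in seen:
            continue  # skip duplicate pages
        seen.add(url)
        url_el = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" when serializing.
        ET.SubElement(url_el, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap(["https://example.com/", "https://example.com/about"]))
```

Deduplicating before writing keeps the sitemap free of repeated entries when the same page was reached through several links.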
Xmls To Dataset
This act loads a list of URLs from INPUT.sources. Each of these links should point to an XML file. It downloads all the files and saves them to its default dataset. Groups parameter in INPUT allows to choose Apify proxy groups to us...
This actor simply tests a given array of URLs against selected proxy URLs or Apify proxy groups.
24 Hour Stats
This act can be used as a synchronous API. It returns a JSON containing actor runs finished in the last 24 hours, along with information about their default datasets and request queues. Actors can be filtered via the input array "actIds".
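The selection logic described above might look something like the sketch below. It is an assumption-laden illustration rather than the act's implementation: each run is assumed to be a dict, and the field names "finishedAt" and "actId" as well as the function name are hypothetical. Only the "actIds" filter is taken from the description.

```python
from datetime import datetime, timedelta, timezone


def runs_finished_in_last_24h(runs, act_ids=None, now=None):
    """Return the runs that finished within the 24 hours before `now`.

    Each run is assumed to be a dict with a timezone-aware datetime under
    "finishedAt" (None while still running) and an actor ID under "actId".
    Both field names are hypothetical.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=24)
    selected = []
    for run in runs:
        finished = run.get("finishedAt")
        if finished is None:
            continue  # run has not finished yet
        if act_ids and run.get("actId") not in act_ids:
            continue  # actor was not requested by the caller
        if cutoff <= finished <= now:
            selected.append(run)
    return selected
```

Passing `now` explicitly makes the window easy to check against fixed timestamps.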
Delete Untitled Acts
Deletes all actors and tasks named untitled-X, my-actor-X, my-task-X from your account. In a minute. For free. With one click!