ScrapyFy
Pricing: Pay per usage
Rating: 0.0 (0)
Developer: cat
Maintained by Community
Actor stats: 1 bookmarked, 16 total users, 2 monthly active users
Last modified: 2 years ago
You can access ScrapyFy programmatically from your own applications through the Apify API. To use the Apify API, you'll need an Apify account and your API token, which you can find under Integrations settings in Apify Console. The example below calls the Actor through the Apify CLI.
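As a rough illustration of the API route described above, here is a minimal Python sketch that builds an HTTP request for the Apify API's `run-sync-get-dataset-items` endpoint using only the standard library. The helper names (`build_run_request`, `run_actor`), the shortened spider, and the timeout value are assumptions made for this example and are not part of ScrapyFy. The spider string mirrors the default input in that it references `scrapy` without importing it, which assumes the Actor provides that name.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_ID = "jupri~scrapyfy"  # API form of the jupri/scrapyfy Actor name

# Shortened, hypothetical spider used only for illustration.
SPIDER_CODE = """\
class TitleSpider(scrapy.Spider):
    name = 'title_spider'
    start_urls = ["https://apify.com"]

    def parse(self, response):
        yield {
            'url': response.url,
            'title': response.css('title::text').extract_first(),
        }
"""


def build_run_request(token, actor_input):
    """Build (but do not send) a POST that runs the Actor synchronously."""
    url = f"{API_BASE}/acts/{ACTOR_ID}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(token, actor_input, timeout=300):
    """Send the request and return the parsed dataset items (needs network access)."""
    request = build_run_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call such as `run_actor(token, {"spiders_code": SPIDER_CODE})` would then return the scraped items as a list, assuming `token` holds a valid API token. Letting `json.dumps` serialize the spider source avoids the manual escaping that the shell form requires.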
$ echo '{
  "spiders_code": "from urllib.parse import urljoin\\r\\n\\r\\n### multiple spiders can be specified\\r\\n\\r\\nclass TitleSpider(scrapy.Spider):\\r\\n\\r\\n    name = '\''title_spider'\''\\r\\n    allowed_domains = [\\"apify.com\\"]\\r\\n    start_urls = [\\"https://apify.com\\"]\\r\\n\\r\\n    custom_settings = {\\r\\n        '\''REQUEST_FINGERPRINTER_IMPLEMENTATION'\'' : '\''2.7'\'',\\r\\n        # Obey robots.txt rules\\r\\n        '\''ROBOTSTXT_OBEY'\'' : True,\\r\\n        '\''DEPTH_LIMIT'\'' : 2,\\r\\n        '\''LOG_ENABLED'\'' : False,\\r\\n        #'\''CLOSESPIDER_PAGECOUNT'\'' : 5,\\r\\n        '\''CLOSESPIDER_ITEMCOUNT'\'' : 5,\\r\\n    }\\r\\n\\r\\n    def parse(self, response):\\r\\n        yield {\\r\\n            '\''url'\'': response.url,\\r\\n            '\''title'\'': response.css('\''title::text'\'').extract_first(),\\r\\n        }\\r\\n        for link_href in response.css('\''a::attr(\\"href\\")'\''):\\r\\n            link_url = urljoin(response.url, link_href.get())\\r\\n            if link_url.startswith(('\''http://'\'', '\''https://'\'')):\\r\\n                yield scrapy.Request(link_url)",
  "DEFAULT_REQUEST_HEADERS": {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en"
  },
  "DOWNLOADER_MIDDLEWARES": {},
  "DOWNLOADER_MIDDLEWARES_BASE": {
    "scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware": 100,
    "scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware": 300,
    "scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware": 350,
    "scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware": 400,
    "scrapy.downloadermiddlewares.useragent.UserAgentMiddleware": 500,
    "scrapy.downloadermiddlewares.retry.RetryMiddleware": 550,
    "scrapy.downloadermiddlewares.ajaxcrawl.AjaxCrawlMiddleware": 560,
    "scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware": 580,
    "scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware": 590,
    "scrapy.downloadermiddlewares.redirect.RedirectMiddleware": 600,
    "scrapy.downloadermiddlewares.cookies.CookiesMiddleware": 700,
    "scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware": 750,
    "scrapy.downloadermiddlewares.stats.DownloaderStats": 850,
    "scrapy.downloadermiddlewares.httpcache.HttpCacheMiddleware": 900
  },
  "DOWNLOAD_HANDLERS": {},
  "DOWNLOAD_HANDLERS_BASE": {
    "data": "scrapy.core.downloader.handlers.datauri.DataURIDownloadHandler",
    "file": "scrapy.core.downloader.handlers.file.FileDownloadHandler",
    "http": "scrapy.core.downloader.handlers.http.HTTPDownloadHandler",
    "https": "scrapy.core.downloader.handlers.http.HTTPDownloadHandler",
    "s3": "scrapy.core.downloader.handlers.s3.S3DownloadHandler",
    "ftp": "scrapy.core.downloader.handlers.ftp.FTPDownloadHandler"
  },
  "EXTENSIONS": {},
  "EXTENSIONS_BASE": {
    "scrapy.extensions.corestats.CoreStats": 0,
    "scrapy.extensions.telnet.TelnetConsole": 0,
    "scrapy.extensions.memusage.MemoryUsage": 0,
    "scrapy.extensions.memdebug.MemoryDebugger": 0,
    "scrapy.extensions.closespider.CloseSpider": 0,
    "scrapy.extensions.feedexport.FeedExporter": 0,
    "scrapy.extensions.logstats.LogStats": 0,
    "scrapy.extensions.spiderstate.SpiderState": 0,
    "scrapy.extensions.throttle.AutoThrottle": 0
  },
  "FEEDS": {},
  "FEED_EXPORTERS": {},
  "FEED_EXPORTERS_BASE": {
    "json": "scrapy.exporters.JsonItemExporter",
    "jsonlines": "scrapy.exporters.JsonLinesItemExporter",
    "jsonl": "scrapy.exporters.JsonLinesItemExporter",
    "jl": "scrapy.exporters.JsonLinesItemExporter",
    "csv": "scrapy.exporters.CsvItemExporter",
    "xml": "scrapy.exporters.XmlItemExporter",
    "marshal": "scrapy.exporters.MarshalItemExporter",
    "pickle": "scrapy.exporters.PickleItemExporter"
  },
  "FEED_STORAGES": {},
  "FEED_STORAGES_BASE": {
    "": "scrapy.extensions.feedexport.FileFeedStorage",
    "file": "scrapy.extensions.feedexport.FileFeedStorage",
    "ftp": "scrapy.extensions.feedexport.FTPFeedStorage",
    "gs": "scrapy.extensions.feedexport.GCSFeedStorage",
    "s3": "scrapy.extensions.feedexport.S3FeedStorage",
    "stdout": "scrapy.extensions.feedexport.StdoutFeedStorage"
  },
  "HTTPCACHE_IGNORE_HTTP_CODES": [],
  "HTTPCACHE_IGNORE_RESPONSE_CACHE_CONTROLS": [],
  "HTTPCACHE_IGNORE_SCHEMES": [
    "file"
  ],
  "ITEM_PIPELINES": {},
  "ITEM_PIPELINES_BASE": {},
  "MEMDEBUG_NOTIFY": [],
  "MEMUSAGE_NOTIFY_MAIL": [],
  "METAREFRESH_IGNORE_TAGS": [],
  "RETRY_HTTP_CODES": [
    500,
    502,
    503,
    504,
    522,
    524,
    408,
    429
  ],
  "SPIDER_CONTRACTS": {},
  "SPIDER_CONTRACTS_BASE": {
    "scrapy.contracts.default.UrlContract": 1,
    "scrapy.contracts.default.CallbackKeywordArgumentsContract": 1,
    "scrapy.contracts.default.ReturnsContract": 2,
    "scrapy.contracts.default.ScrapesContract": 3
  },
  "SPIDER_MIDDLEWARES": {},
  "SPIDER_MIDDLEWARES_BASE": {
    "scrapy.spidermiddlewares.httperror.HttpErrorMiddleware": 50,
    "scrapy.spidermiddlewares.offsite.OffsiteMiddleware": 500,
    "scrapy.spidermiddlewares.referer.RefererMiddleware": 700,
    "scrapy.spidermiddlewares.urllength.UrlLengthMiddleware": 800,
    "scrapy.spidermiddlewares.depth.DepthMiddleware": 900
  },
  "SPIDER_MODULES": [],
  "STATSMAILER_RCPTS": [],
  "TELNETCONSOLE_PORT": [
    6023,
    6073
  ]
}' |
apify call jupri/scrapyfy --silent --output-dataset

The Apify CLI is the official tool that allows you to use ScrapyFy locally, providing convenience functions and automatic retries on errors.
Using installation script (macOS/Linux):
$ curl -fsSL https://apify.com/install-cli.sh | bash
Using installation script (Windows):
$ irm https://apify.com/install-cli.ps1 | iex
Using Homebrew:
$ brew install apify-cli
Using NPM:
$ npm install -g apify-cli
Other API clients include: