Cheerio Scraper

apify/cheerio-scraper

Crawls websites using raw HTTP requests, parses the HTML with the Cheerio library, and extracts data from the pages using Node.js code. Supports both recursive crawling and lists of URLs. This Actor is a high-performance alternative to apify/web-scraper for websites that do not require JavaScript.

How to return the scraped data in the response to the POST request

Closed

mr_apify opened this issue
a month ago

I want the curl request to return the scraped data directly in the terminal within a single request, but it only returns the settings of the run.

curl "https://api.apify.com/v2/acts/apify~cheerio-scraper/runs?token=" \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{ "startUrls": [ { "url": "https://example.com/" }, { "url": "https://tonytong.mystrikingly.com/" } ], "linkSelector": "a[href]", "pageFunction": "async function pageFunction(context) { const { $, request, log } = context; const pageTitle = $(\"title\").first().text(); const pageContent = $(\"body\").html(); log.info(\"Page scraped\", { url: request.url, pageTitle }); return { url: request.url, pageTitle, content: pageContent }; }", "proxyConfiguration": { "useApifyProxy": true } }'

Can you check and let me know how to change this query?

mr_apify

a month ago

I found it: the URL needs to use run-sync-get-dataset-items instead of runs.
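
For anyone landing here later, here is a rough sketch of the same call from Node.js (18 or newer, for the built-in fetch), using the run-sync-get-dataset-items endpoint so the scraped items come back in the response body. The helper names buildSyncRequest and runSync and the APIFY_TOKEN environment variable are my own assumptions for illustration, not part of the Apify API. Passing the page function through JSON.stringify also takes care of escaping the inner double quotes.

```javascript
// The page function is sent to the Actor as a string, same as in the curl request.
const pageFunction = `async function pageFunction(context) {
    const { $, request, log } = context;
    const pageTitle = $("title").first().text();
    const pageContent = $("body").html();
    log.info("Page scraped", { url: request.url, pageTitle });
    return { url: request.url, pageTitle, content: pageContent };
}`;

// Actor input, copied from the original request.
const input = {
    startUrls: [
        { url: "https://example.com/" },
        { url: "https://tonytong.mystrikingly.com/" },
    ],
    linkSelector: "a[href]",
    pageFunction,
    proxyConfiguration: { useApifyProxy: true },
};

// Hypothetical helper: builds the URL and fetch options for the synchronous endpoint.
function buildSyncRequest(token, actorInput) {
    const actorId = "apify~cheerio-scraper";
    const url = new URL(
        `https://api.apify.com/v2/acts/${actorId}/run-sync-get-dataset-items`
    );
    url.searchParams.set("token", token);
    return {
        url: url.toString(),
        options: {
            method: "POST",
            headers: { "Content-Type": "application/json" },
            // JSON.stringify escapes the quotes inside pageFunction.
            body: JSON.stringify(actorInput),
        },
    };
}

// Hypothetical helper: starts the run, waits for it to finish, returns the dataset items.
async function runSync(token) {
    const { url, options } = buildSyncRequest(token, input);
    const response = await fetch(url, options);
    if (!response.ok) {
        throw new Error(`Apify API returned HTTP ${response.status}`);
    }
    return response.json();
}
```

To try it, call runSync(process.env.APIFY_TOKEN).then((items) => console.log(items)). As far as I know the synchronous endpoints only wait a limited time for the run to finish, so a recursive crawl with linkSelector set to "a[href]" may time out on larger sites; in that case starting the run with runs and reading the dataset afterwards is probably the safer route.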

Developer
Maintained by Apify

Actor Metrics

  • 442 monthly users

  • 93 stars

  • >99% runs succeeded

  • 28 days response time

  • Created in Apr 2019

  • Modified 2 months ago