Food Panda Scraper | All In One | $5 / 1k

Scrape FoodPanda restaurants and menus across Malaysia, Taiwan, Singapore and more. Grab names, brands, ratings, reviews, prices, delivery fees, promos and more in clean JSON/CSV. Ideal for competitor research, market mapping, promo tracking and price monitoring.

- Pricing: $4.99 / 1,000 results
- Developer: Fatih Tahta
- Last modified: 12 days ago
Python HTTP Scraper Template (curl_cffi + BeautifulSoup)
A clean and efficient Apify Actor template for building browserless HTTP scrapers in Python.
It uses:
- `curl_cffi` for high-success HTTP requests with browser impersonation
- `BeautifulSoup` for HTML parsing
- Apify SDK for Actor lifecycle, datasets, and proxy management
Overview
This template is designed for straightforward scraping tasks where full browser automation is not required.
It:

- performs a warm-up request to the target’s origin to establish a session
- reuses cookies and connections via a shared `curl_cffi.requests.AsyncSession`
- supports Apify Proxy via the standard `proxyConfiguration` input field
- extracts listings using multiple extraction strategies (JSON-LD + generic DOM selectors)
- supports simple pagination (`rel="next"`, “Next” buttons, `?page=N` pattern)
- saves structured data into the Apify Dataset
Stack
- Python (async, runs on `apify/actor-python:3.13`)
- Apify SDK (Python) – Actor lifecycle, input, dataset, proxy config
- curl_cffi – fast HTTP client with browser TLS fingerprint impersonation
- BeautifulSoup – HTML parsing and selector-based extraction
Features
- Apify SDK – run scalable Actors on the Apify platform
- curl_cffi AsyncSession – shared session with cookies + connections reused
- Browser impersonation – realistic TLS fingerprints via `impersonate="chrome120"`
- Proxy support – uses Apify Proxy or custom proxies via `proxyConfiguration`
- Warm-up request – hits the origin once to prime cookies and session state
- Multi-strategy extraction:
  - JSON/JSON-LD `<script>` blocks (structured data)
  - generic CSS selectors for listings (`.listing`, `.result`, `li.product`, `article`, …)
- Pagination support – `rel="next"`, “Next/›/»” links, and `?page=N` increment
- Monetization hooks – emits an output-record charge event per pushed item (if available)
- Structured output – saves extracted data to the default Apify Dataset
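The JSON-LD pass of the multi-strategy extraction can be sketched with the standard library alone; the actual template uses BeautifulSoup, so the parser class and function names below are illustrative, not the template’s code:

```python
import json
from html.parser import HTMLParser


class JsonLdParser(HTMLParser):
    """Collects the text of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_ld = False
        self._buf = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_ld = True
            self._buf = []

    def handle_data(self, data):
        if self._in_ld:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_ld:
            self._in_ld = False
            self.blocks.append("".join(self._buf))


def extract_json_ld(html: str) -> list:
    """Decode every JSON-LD block, skipping ones that fail to parse."""
    parser = JsonLdParser()
    parser.feed(html)
    records = []
    for raw in parser.blocks:
        try:
            records.append(json.loads(raw))
        except json.JSONDecodeError:
            continue
    return records
```

When JSON-LD yields nothing, the template falls back to the generic CSS selectors listed above.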
How It Works
- Input
  - Reads the Actor input via `Actor.get_input()`:
    - `startUrls` – list of URLs to crawl
    - `queries` – free-text queries turned into search URLs via `build_search_url`
    - `limit` – max records to save
    - `proxyConfiguration` – standard Apify proxy editor object
- Seeding
  - Normalizes URLs (removes fragments, sorts query params).
  - Deduplicates seeds across `startUrls` and query-based search URLs.
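The normalization and dedup rules above can be expressed with `urllib.parse` from the standard library; the helper names are illustrative:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Drop the fragment and sort query params so equivalent URLs compare equal."""
    parts = urlsplit(url)
    query = urlencode(sorted(parse_qsl(parts.query)))
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


def dedupe_seeds(start_urls, search_urls):
    """Merge both seed sources, keeping the first occurrence of each normalized URL."""
    seen, seeds = set(), []
    for url in [*start_urls, *search_urls]:
        key = normalize_url(url)
        if key not in seen:
            seen.add(key)
            seeds.append(key)
    return seeds
```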
- Proxy & Session
  - Creates a `ProxyConfiguration` via `Actor.create_proxy_configuration(actor_proxy_input=proxyConfiguration)`.
  - Generates proxy URLs with fresh session IDs to avoid reusing blocked connections.
  - Creates a `curl_cffi.requests.AsyncSession` with:
    - Chrome-like headers
    - impersonation (`chrome120`)
    - proxies
    - timeout and redirect limits
- Warm-up
  - Derives the origin from the first seed (e.g. `https://www.example.com/`).
  - Sends a warm-up request to prime cookies and the proxy session.
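Deriving the origin is a one-liner with `urllib.parse`; a minimal sketch:

```python
from urllib.parse import urlsplit


def origin_of(url: str) -> str:
    """Return scheme://host/ of a URL, the target for the warm-up request."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}/"
```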
- Crawling & Extraction
  - For each seed, paginates using:
    - `<a rel="next">` links
    - links containing “Next”, “›”, “»”
    - an incremented `?page=N` query param
  - For each page:
    - downloads HTML via `curl_cffi`
    - parses with BeautifulSoup
    - runs extractors in order: JSON/JSON-LD script blocks, then generic listing selectors
    - cleans records to a safe public subset of keys
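The `?page=N` fallback, used when no next-link is found in the HTML, can be sketched with the standard library; the parameter name defaults to `page` as in the pattern above:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def next_page_url(url: str, param: str = "page") -> str:
    """Increment the ?page=N query param, treating a missing param as page 1."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    try:
        current = int(query.get(param, 1))
    except ValueError:
        current = 1  # non-numeric value: restart from page 1
    query[param] = str(current + 1)
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(query), parts.fragment)
    )
```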
- Saving
  - Pushes extracted objects to the default Apify Dataset via `Actor.push_data`.
  - Emits a monetization output-record charge event for each item (if monetization is enabled).
  - Stops automatically once `limit` is reached.
Actor Input
startUrls
- Type: `array<string>`
- List of URLs to crawl directly. Each one is normalized and deduplicated.
queries
- Type: `array<string>`
- Free-text queries; each is converted into a search URL using `build_search_url()`.

You should customize `build_search_url()` in `src/main.py` per target site.
limit
- Type: `integer`
- Maximum number of records to save overall.
- Default in code: `1000` (the `.actor/input_schema.json` can override this).
proxyConfiguration
- Type: `object`
- Standard Apify proxy editor object (`useApifyProxy`, `apifyProxyGroups`, etc.).
- Passed directly to `Actor.create_proxy_configuration(actor_proxy_input=...)`.
Adapting the Template for a Specific Target
- Set the search base URL

  In `src/main.py`, change:

  ```python
  base = "https://www.example.com/search"
  ```
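A customized `build_search_url()` might then look like this; the `q` parameter name and the placeholder base URL are assumptions to replace per target site:

```python
from urllib.parse import quote_plus

# Replace with the target site's real search endpoint.
BASE = "https://www.example.com/search"


def build_search_url(query: str) -> str:
    """Turn a free-text query into a search URL; 'q' is an assumed param name."""
    return f"{BASE}?q={quote_plus(query)}"
```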