Empty Python project
An empty template with a basic Actor structure built on the Apify SDK, so you can easily add your own functionality.
# Apify SDK - toolkit for building Apify Actors (Read more at https://docs.apify.com/sdk/python)
from apify import Actor
# Beautiful Soup - library for pulling data out of HTML and XML files (Read more at https://www.crummy.com/software/BeautifulSoup/bs4/doc/)
# from bs4 import BeautifulSoup

async def main():
    async with Actor:
        print('Hello from the Actor!')
        """
        Actor code
        """
Empty Python template
Start a new web scraping project quickly and easily in Python with our empty project template. It provides a basic Actor structure built on the Apify SDK and lets you easily add your own functionality.
Included features
- Apify SDK for Python - a toolkit for building Actors and scrapers in Python
- Beautiful Soup - a library for pulling data out of HTML and XML files
How it works
Insert your own code into the async with Actor: block. You can use the Apify SDK, Beautiful Soup, or any other Python library.
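As a rough illustration of what "your own code" inside that block might look like, the sketch below reads the Actor input, builds one result record, and stores it in the default dataset. The build_record helper and the name input field are hypothetical and exist only for this example. The SDK import is placed inside main() purely so the helper can be tried without the SDK installed; the template itself imports Actor at the top of the file.

```python
def build_record(actor_input: dict) -> dict:
    """Turn the Actor input into one result record (pure function, easy to test)."""
    # 'name' is a hypothetical input field used only for this sketch.
    name = actor_input.get('name', 'world')
    return {'name': name, 'message': f'Hello, {name}!'}


async def main() -> None:
    # Imported here rather than at module level so that build_record can be
    # imported and tested on a machine without the Apify SDK installed.
    from apify import Actor

    async with Actor:
        # get_input() returns None when no input was provided.
        actor_input = await Actor.get_input() or {}
        record = build_record(actor_input)
        Actor.log.info(record['message'])
        # Store the record in the Actor's default dataset.
        await Actor.push_data(record)
```

Keeping the data handling in a plain function and leaving only the input and output calls inside the Actor block is one possible layout, not a requirement of the template. The sketch assumes main() is started by the project's entry point; to try it on its own, call asyncio.run(main()).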
Resources
- Python tutorials in Academy
- Video guide on getting data using Apify API
- Integration with Make, GitHub, Zapier, Google Drive, and other apps
A short guide on how to build web scrapers using code templates: web scraper template
Scrape a single page at a provided URL with Requests and extract data from the page's HTML with Beautiful Soup.
Example of a web scraper that uses Python Requests to scrape HTML from URLs provided in the input, parses it with Beautiful Soup, and saves the results to storage.
Crawler example that uses headless Chrome driven by Playwright to scrape a website. Headless browsers render JavaScript and can help when getting blocked.
Scraper example built with Selenium and headless Chrome browser to scrape a website and save the results to storage. A popular alternative to Playwright.
This example Scrapy spider scrapes page titles from URLs defined in the input parameter. It shows how to use the Apify SDK for Python and Scrapy pipelines to save results.