
Metadata Extractor
- jancurn/extract-metadata
- Modified
- Users 952
- Runs 601.1k
- Created by Jan Čurn
A small, efficient Actor that loads a web page, parses its HTML using the Cheerio library, and extracts metadata from the <head> element, such as the page title, description, and author.
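To give a rough idea of the extraction step, here is a minimal local sketch. It is not the Actor's own code: the Actor uses Cheerio, while this sketch swaps in Python's standard `html.parser` module, and the names `HeadMetadataParser`, `extract_metadata`, and `sample` are hypothetical. It assumes metadata means the `<title>` text plus any `<meta>` tag carrying a `name` or `property` attribute and a `content` attribute. It runs locally and does not call the Apify API.

```python
from html.parser import HTMLParser


class HeadMetadataParser(HTMLParser):
    """Collects <title> text and <meta> values found inside <head>."""

    def __init__(self):
        super().__init__()
        self.metadata = {}
        self._in_head = False
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "head":
            self._in_head = True
        elif self._in_head and tag == "title":
            self._in_title = True
        elif self._in_head and tag == "meta":
            attr = dict(attrs)
            # Assumption: standard tags use name=..., Open Graph tags use property=...
            key = attr.get("name") or attr.get("property")
            if key and attr.get("content") is not None:
                self.metadata[key] = attr["content"]

    def handle_endtag(self, tag):
        if tag == "head":
            self._in_head = False
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text may arrive in several chunks, so append rather than assign.
        if self._in_title:
            self.metadata["title"] = self.metadata.get("title", "") + data


def extract_metadata(html):
    """Return a dict of metadata found in the <head> of an HTML string."""
    parser = HeadMetadataParser()
    parser.feed(html)
    parser.close()
    return parser.metadata


# Hypothetical sample page used only for illustration.
sample = """<html><head>
<title>Example page</title>
<meta name="description" content="A short description.">
<meta name="author" content="Jane Doe">
<meta property="og:title" content="Example page (OG)">
</head><body><meta name="ignored" content="x"></body></html>"""

print(extract_metadata(sample))
```

The `<meta>` tag placed in `<body>` is skipped because the parser only records tags seen between the opening and closing `<head>` tags. The output format of the real Actor may differ from this dictionary.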
To run the code example below, you need an Apify account. Replace <YOUR_API_TOKEN> in the code with your API token. For a more detailed explanation, see the section on running Actors via the API in the Apify Docs.
from apify_client import ApifyClient

# Initialize the ApifyClient with your API token
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "urls": [
        "https://www.apify.com/",
        "https://blog.apify.com",
    ],
    "proxy": {"useApifyProxy": True},
}

# Run the Actor and wait for it to finish
run = client.actor("jancurn/extract-metadata").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)