Get Metadata

Pricing

$7.00 / 1,000 results

Developed by

Maged

Maintained by Community

The actor extracts comprehensive page metadata, including image previews, titles, descriptions, author, publication time, favicon, and more.

5.0 (1)

Last modified

2 days ago

Get Metadata Actor (Rental Version)

This Apify actor extracts metadata from web pages. It can process multiple URLs in a single run and supports both Selenium and BeautifulSoup for web scraping.

Rental version: https://apify.com/maged120/get-metadata-rental

Features

  • Extract metadata from multiple URLs in a single run
  • Support for both Selenium and BeautifulSoup scraping methods
  • Extract various types of metadata:
    • Page title
    • Meta tags (name, property, charset, http-equiv, itemprop)
    • Link tags (stylesheets, icons, etc.)
    • Language information
  • Filter metadata by name
  • Limit the number of metadata entries per URL
  • Proxy support (Apify proxy or custom proxy)
  • Detailed error reporting
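
To make the extraction feature concrete, here is a minimal sketch of how the title, meta tags, link tags, and language could be pulled out of raw HTML as flat url/name/content entries. It is not the actor's actual code: it uses Python's standard-library html.parser instead of BeautifulSoup or Selenium so that it runs without dependencies, and the names MetadataParser and extract_metadata, as well as the "lang", "charset", and "link:<rel>" entry names, are assumptions made for illustration.

```python
from html.parser import HTMLParser


class MetadataParser(HTMLParser):
    """Collects (name, content) pairs from <html>, <title>, <meta>, and <link> tags."""

    # Attributes that can name a <meta> tag, checked in this order.
    META_KEYS = ("name", "property", "http-equiv", "itemprop")

    def __init__(self):
        super().__init__()
        self.entries = []
        self._in_title = False
        self._title_parts = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "html" and attrs.get("lang"):
            # Hypothetical entry name for the page language.
            self.entries.append(("lang", attrs["lang"]))
        elif tag == "title":
            self._in_title = True
        elif tag == "meta":
            if attrs.get("charset"):
                self.entries.append(("charset", attrs["charset"]))
            for key in self.META_KEYS:
                if attrs.get(key):
                    self.entries.append((attrs[key], attrs.get("content") or ""))
                    break
        elif tag == "link" and attrs.get("rel") and attrs.get("href"):
            # Hypothetical naming scheme for link tags: "link:<rel>".
            self.entries.append((f"link:{attrs['rel']}", attrs["href"]))

    def handle_data(self, data):
        # Title text may arrive in several chunks, so collect all of them.
        if self._in_title:
            self._title_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self.entries.append(("title", "".join(self._title_parts).strip()))
            self._title_parts = []


def extract_metadata(url, html):
    """Return a list of {"url", "name", "content"} entries for one page."""
    parser = MetadataParser()
    parser.feed(html)
    parser.close()
    return [
        {"url": url, "name": name, "content": content}
        for name, content in parser.entries
    ]
```

Calling extract_metadata("https://example.com", html_text) on a downloaded page returns one entry per tag found, in document order. A parser like this only sees the static HTML, which matches the "no JavaScript execution" limitation of the default scraping method.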

Input

The actor accepts the following input parameters:

{
  "target_urls": [                // Optional: Multiple URLs to process
    "https://example.com",
    "https://example.org"
  ],
  "use_selenium": false,          // Optional: Whether to use Selenium (default: false); available only in the rental version
  "use_proxy": false,             // Optional: Whether to use a proxy (default: false)
  "proxy_type": "apify",          // Optional: Type of proxy to use ("apify" or "custom")
  "custom_proxy_url": "",         // Optional: Custom proxy URL if proxy_type is "custom"
  "limit": 0,                     // Optional: Maximum number of metadata entries per URL (0 for no limit)
  "filter": [                     // Optional: List of strings to filter metadata by name
    "title",
    "description",
    "og:"
  ]
}

Note: You must provide either target_url (a single URL) or target_urls (a list of URLs), or both. If both are provided, all URLs will be processed.
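
As an illustration of that rule, the sketch below merges a single-URL field and a multi-URL field into one list before processing. It assumes the single-URL field is named target_url, as in the Basic Usage example; the function name collect_urls and the de-duplication step are illustrative assumptions, not documented behavior of the actor.

```python
def collect_urls(actor_input):
    """Merge the single-URL and multi-URL input fields into one ordered list."""
    urls = []
    single = actor_input.get("target_url")
    if single:
        urls.append(single)
    urls.extend(actor_input.get("target_urls") or [])

    # Drop blanks and duplicates while keeping the original order
    # (de-duplication is an assumption, not documented behavior).
    seen = set()
    unique = []
    for url in urls:
        url = url.strip()
        if url and url not in seen:
            seen.add(url)
            unique.append(url)

    if not unique:
        raise ValueError("Provide target_url, target_urls, or both")
    return unique
```

Validating the input up front like this lets a run fail fast with a clear message instead of starting with nothing to process.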

Output

The actor outputs an array of metadata entries, where each entry has the following structure:

{
  "url": "https://example.com",   // The URL the metadata was extracted from
  "name": "title",                // The name of the metadata (e.g., "title", "description", "og:title")
  "content": "Example Domain"     // The content of the metadata
}

If an error occurs while processing a URL, an error entry will be included in the output:

{
  "url": "https://example.com",
  "name": "error",
  "content": "Error message describing what went wrong"
}
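
Because results and errors share one flat list, a consumer usually has to separate them. The following is a minimal sketch of one way to do that on the client side, assuming the entries have already been loaded as Python dictionaries; split_results is a hypothetical helper, not part of the actor.

```python
from collections import defaultdict


def split_results(entries):
    """Return (metadata_by_url, errors_by_url) from a flat list of output entries."""
    metadata = defaultdict(dict)
    errors = {}
    for entry in entries:
        if entry["name"] == "error":
            errors[entry["url"]] = entry["content"]
        else:
            # Later entries with the same name overwrite earlier ones.
            metadata[entry["url"]][entry["name"]] = entry["content"]
    return dict(metadata), errors
```

One caveat of this approach: a page that happens to declare a meta tag literally named "error" would be indistinguishable from a failed URL, so treat the error bucket as a hint to inspect rather than a guarantee.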

Scraping Methods

The actor supports two methods for extracting metadata:

  1. simple (default, BeautifulSoup)
    • Faster and more lightweight
    • Suitable for static websites
    • Lower resource usage
    • No JavaScript execution
  2. advanced (Selenium), available only in the rental version
    • Supports JavaScript-rendered content
    • More resource-intensive
    • Slower but more comprehensive
    • Better for dynamic websites

To use Selenium, set use_selenium to true in the input. By default, the actor uses BeautifulSoup for better performance.

Examples

Basic Usage (Single URL)

{
  "target_url": "https://example.com"
}

Multiple URLs with Filtering

{
  "target_urls": [
    "https://example.com",
    "https://example.org"
  ],
  "filter": ["title", "description", "og:title"]
}
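
The exact matching rule behind filter is not spelled out here. The sketch below shows one plausible reading, in which an entry is kept when its name contains any of the filter strings (case-insensitive), followed by the limit cap where 0 means no limit. Both the substring semantics and the helper name apply_filter_and_limit are assumptions for illustration.

```python
def apply_filter_and_limit(entries, filters=None, limit=0):
    """Keep entries whose name contains any filter string, then cap the count."""
    if filters:
        lowered = [f.lower() for f in filters]
        entries = [
            e for e in entries
            if any(f in e["name"].lower() for f in lowered)
        ]
    # A limit of 0 means "no limit".
    return entries[:limit] if limit > 0 else entries
```

Under this reading, a filter such as "og:" would keep every Open Graph entry, while "title" would keep both "title" and "og:title".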

Using Proxy

{
  "target_urls": [
    "https://example.com"
  ],
  "use_proxy": true,
  "proxy_type": "apify"
}
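
The three proxy fields interact, so it can help to validate them before starting a run. This is a minimal sketch of such a check, assuming that proxy_type falls back to "apify" when omitted and that an empty custom_proxy_url is an error when proxy_type is "custom". Neither assumption is confirmed by the actor, and resolve_proxy is a hypothetical helper name.

```python
def resolve_proxy(actor_input):
    """Return None (no proxy), "apify", or a custom proxy URL string."""
    if not actor_input.get("use_proxy", False):
        return None

    # Assumption: "apify" is the fallback when proxy_type is omitted.
    proxy_type = actor_input.get("proxy_type", "apify")
    if proxy_type == "apify":
        return "apify"
    if proxy_type == "custom":
        url = (actor_input.get("custom_proxy_url") or "").strip()
        if not url:
            raise ValueError('custom_proxy_url is required when proxy_type is "custom"')
        return url
    raise ValueError(f"Unknown proxy_type: {proxy_type!r}")
```

With the example input above, this returns "apify"; with use_proxy set to false it returns None regardless of the other two fields.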