Get Metadata Rental

Pricing

$10.00/month + usage


Developed by

Maged


Maintained by Community

The get-metadata actor extracts comprehensive page metadata, including image previews and thumbnails, titles, descriptions, author, publication time, favicon, and more.

5.0 (1)


Total users: 1
Monthly users: 1
Runs succeeded: >99%
Last modified: 2 days ago

Get Metadata Actor (Rental Version)

This Apify actor extracts metadata from web pages. It can process multiple URLs in a single run and supports both Selenium and BeautifulSoup for web scraping.

Features

  • Extract metadata from multiple URLs in a single run
  • Support for both Selenium and BeautifulSoup scraping methods
  • Extract various types of metadata:
    • Page title
    • Meta tags (name, property, charset, http-equiv, itemprop)
    • Link tags (stylesheets, icons, etc.)
    • Language information
  • Filter metadata by name
  • Limit the number of metadata entries per URL
  • Proxy support (Apify proxy or custom proxy)
  • Detailed error reporting
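
The filtering and limiting features listed above could be sketched roughly as follows. This is a minimal illustration, not the actor's actual code: the function name `apply_filter_and_limit` is hypothetical, and it assumes that a filter string matches any metadata name containing it (case-insensitively) and that a limit of 0 means no limit.

```python
from typing import Dict, List

Entry = Dict[str, str]


def apply_filter_and_limit(
    entries: List[Entry], name_filter: List[str], limit: int = 0
) -> List[Entry]:
    """Keep entries whose name contains any filter string, then cap the count.

    Assumption: substring matching, case-insensitive. The real actor may
    match differently (for example by prefix or exact name).
    """
    if name_filter:
        needles = [f.lower() for f in name_filter]
        entries = [
            e for e in entries
            if any(n in e["name"].lower() for n in needles)
        ]
    # A limit of 0 (or less) is treated as "no limit".
    return entries[:limit] if limit > 0 else entries


sample = [
    {"url": "https://example.com", "name": "title", "content": "Example Domain"},
    {"url": "https://example.com", "name": "og:title", "content": "Example"},
    {"url": "https://example.com", "name": "viewport", "content": "width=device-width"},
    {"url": "https://example.com", "name": "description", "content": "Demo page"},
]
filtered = apply_filter_and_limit(sample, ["og:", "title"])
```

Under these assumptions, `filtered` keeps the `title` and `og:title` entries and drops `viewport` and `description`.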

Input

The actor accepts the following input parameters:

{
    "target_urls": [ // Optional: Multiple URLs to process
        "https://example.com",
        "https://example.org"
    ],
    "use_selenium": false, // Optional: Whether to use Selenium (default: false)
    "use_proxy": false, // Optional: Whether to use a proxy (default: false)
    "proxy_type": "apify", // Optional: Type of proxy to use ("apify" or "custom")
    "custom_proxy_url": "", // Optional: Custom proxy URL if proxy_type is "custom"
    "limit": 0, // Optional: Maximum number of metadata entries per URL (0 for no limit)
    "filter": [ // Optional: List of strings to filter metadata by name
        "title",
        "description",
        "og:"
    ]
}

Note: You must provide target_url, target_urls, or both. If both are provided, all URLs will be processed.
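
One way the URL inputs might be combined is sketched below. The helper `collect_urls` is hypothetical and not part of the actor. It assumes that `target_url` (used in the single-URL example) and `target_urls` are merged into one list, and it additionally drops duplicates, which the actor itself may or may not do.

```python
from typing import Any, Dict, List


def collect_urls(actor_input: Dict[str, Any]) -> List[str]:
    """Merge target_url and target_urls into one ordered list of URLs."""
    urls: List[str] = []
    single = actor_input.get("target_url")
    if single:
        urls.append(single)
    urls.extend(actor_input.get("target_urls") or [])

    # Strip whitespace, skip empty strings, and drop duplicates.
    # dict.fromkeys keeps first-seen order (de-duplication is an assumption).
    unique = list(dict.fromkeys(u.strip() for u in urls if u and u.strip()))
    if not unique:
        raise ValueError("Provide target_url, target_urls, or both")
    return unique


urls = collect_urls({
    "target_url": "https://example.com",
    "target_urls": ["https://example.org", "https://example.com"],
})
```

Validating the input before starting a run fails fast on an empty input instead of producing an empty result.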

Output

The actor outputs an array of metadata entries, where each entry has the following structure:

{
    "url": "https://example.com", // The URL the metadata was extracted from
    "name": "title", // The name of the metadata (e.g., "title", "description", "og:title")
    "content": "Example Domain" // The content of the metadata
}

If an error occurs while processing a URL, an error entry will be included in the output:

{
    "url": "https://example.com",
    "name": "error",
    "content": "Error message describing what went wrong"
}
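
Because the output is a flat list in which errors share the same shape as regular entries, a consumer will usually want to separate the two. The sketch below shows one way to do that. The function name `split_results` is hypothetical, and it relies only on the `url`, `name`, and `content` fields described above.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Entry = Dict[str, str]


def split_results(
    items: List[Entry],
) -> Tuple[Dict[str, Dict[str, List[str]]], Dict[str, str]]:
    """Group metadata by URL and collect error entries separately."""
    metadata: Dict[str, Dict[str, List[str]]] = defaultdict(dict)
    errors: Dict[str, str] = {}
    for item in items:
        if item["name"] == "error":
            errors[item["url"]] = item["content"]
        else:
            # A name can occur more than once per page, so keep a list.
            metadata[item["url"]].setdefault(item["name"], []).append(item["content"])
    return dict(metadata), errors


items = [
    {"url": "https://example.com", "name": "title", "content": "Example Domain"},
    {"url": "https://example.com", "name": "og:title", "content": "Example"},
    {"url": "https://example.org", "name": "error", "content": "Timed out"},
]
metadata, errors = split_results(items)
```

Here `metadata` holds the entries for `https://example.com` keyed by name, and `errors` maps `https://example.org` to its error message.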

Scraping Methods

The actor supports two methods for extracting metadata:

  1. Simple: BeautifulSoup (default)

    • Faster and more lightweight
    • Suitable for static websites
    • Lower resource usage
    • No JavaScript execution
  2. Advanced: Selenium

    • Supports JavaScript-rendered content
    • More resource-intensive
    • Slower but more comprehensive
    • Better for dynamic websites

To use Selenium, set use_selenium to true in the input. By default, the actor uses BeautifulSoup for better performance.
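
To illustrate what static (no JavaScript) extraction involves, here is a minimal sketch that uses Python's standard-library `html.parser` in place of BeautifulSoup so that it runs without extra dependencies. It is not the actor's implementation: the class and function names are hypothetical, and the entry names `lang`, `charset`, and `link:<rel>` are naming assumptions made for this example.

```python
from html.parser import HTMLParser
from typing import Dict, List


class MetadataParser(HTMLParser):
    """Collect title, meta, link, and language entries from static HTML."""

    def __init__(self, url: str) -> None:
        super().__init__()
        self.url = url
        self.entries: List[Dict[str, str]] = []
        self._in_title = False
        self._title_parts: List[str] = []

    def _add(self, name: str, content: str) -> None:
        self.entries.append({"url": self.url, "name": name, "content": content})

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "html" and attr.get("lang"):
            self._add("lang", attr["lang"])
        elif tag == "title":
            self._in_title = True
        elif tag == "meta":
            if attr.get("charset"):
                self._add("charset", attr["charset"])
            # Use the first identifying attribute as the entry name.
            for key in ("name", "property", "http-equiv", "itemprop"):
                if attr.get(key) and attr.get("content") is not None:
                    self._add(attr[key], attr["content"])
                    break
        elif tag == "link" and attr.get("rel") and attr.get("href"):
            self._add(f"link:{attr['rel']}", attr["href"])

    def handle_data(self, data):
        if self._in_title:
            self._title_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._add("title", "".join(self._title_parts).strip())


def extract_metadata(html: str, url: str) -> List[Dict[str, str]]:
    """Parse an HTML string and return entries in the actor's output shape."""
    parser = MetadataParser(url)
    parser.feed(html)
    parser.close()
    return parser.entries


page = (
    '<html lang="en"><head><meta charset="utf-8">'
    "<title>Example Domain</title>"
    '<meta name="description" content="Demo page">'
    '<meta property="og:title" content="Example">'
    '<link rel="icon" href="/favicon.ico">'
    "</head><body></body></html>"
)
entries = extract_metadata(page, "https://example.com")
```

A parser like this only sees the HTML as delivered by the server. Tags that are injected by JavaScript after load would be missed, which is the case the Selenium method is meant to cover.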

Examples

Basic Usage (Single URL)

{
    "target_url": "https://example.com"
}

Multiple URLs with Filtering

{
    "target_urls": [
        "https://example.com",
        "https://example.org"
    ],
    "filter": ["title", "description", "og:title"]
}

Using Proxy

{
    "target_urls": [
        "https://example.com"
    ],
    "use_proxy": true,
    "proxy_type": "apify"
}