Fetch Branding

Pricing

$15.00 / 1,000 results

Apify Actor for extracting branding information from websites including logos, colors, metadata, and social links.

Developer

Jotunweb

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 35
  • Monthly active users: 11
  • Last modified: 2 months ago

Website Branding Extractor

An Apify Actor that extracts comprehensive branding information from websites including logos, colors, metadata, and social media links.

Features

  • Logo Extraction: Favicon, apple-touch-icon, Open Graph images, and common logo selectors
  • Color Detection: CSS custom properties and inline style colors
  • Metadata Extraction: Meta tags, Open Graph, and Twitter Card data
  • Social Media Links: Detects links to major social platforms
  • Error Handling: Comprehensive error reporting for failed extractions
  • Configurable Output: Filter results by success status and limit processing
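
As an illustration of the color detection feature, the sketch below collects CSS custom properties whose values are hex colors. The function name `extractCssColorVariables` and the regular expression are assumptions made for this example; the Actor's actual matching rules are not documented here and may differ.

```typescript
// Hypothetical helper: collect CSS custom properties whose value is a hex color.
// The pattern is an illustrative assumption, not the Actor's actual logic.
function extractCssColorVariables(css: string): Record<string, string> {
  const colors: Record<string, string> = {};
  // Matches declarations such as "--primary: #007bff;" and captures name and value.
  const pattern =
    /--([\w-]+)\s*:\s*(#(?:[0-9a-fA-F]{3,4}|[0-9a-fA-F]{6}|[0-9a-fA-F]{8}))\s*(?:;|\})/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(css)) !== null) {
    // Normalize to lowercase so "#007BFF" and "#007bff" compare equal.
    colors[match[1]] = match[2].toLowerCase();
  }
  return colors;
}

const exampleCss = ":root { --primary: #007BFF; --secondary: #6c757d; --gap: 4px }";
console.log(extractCssColorVariables(exampleCss));
```

Non-color variables such as `--gap` are skipped because their values do not start with `#`.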

How it works

  1. Input Processing: Accepts URLs via startUrls array
  2. HTTP Fetching: Fetches each URL with Axios, applying the configured timeout, User-Agent header, and redirect limit
  3. HTML Parsing: Parses content with Cheerio for data extraction
  4. Multi-faceted Extraction: Extracts logos, colors, metadata, and social links from the parsed HTML, according to the enabled options
  5. Result Filtering: Applies configurable filters based on success status and limits
  6. Data Storage: Stores results in Apify Dataset with multiple view options

Input Configuration

The Actor accepts the following input parameters:

URLs

  • Start URLs: Array of websites to extract branding from

Extraction Options

  • Extract Logos: Enable/disable logo detection (default: true)
  • Extract Colors: Enable/disable color extraction (default: true)
  • Extract Metadata: Enable/disable metadata extraction (default: true)
  • Extract Social Links: Enable/disable social media link detection (default: true)

Request Configuration

  • Timeout: Request timeout in milliseconds (default: 30000)
  • Max Redirects: Maximum redirects to follow (default: 5)
  • User Agent: Custom user agent string

Output Options

  • Include Failed URLs: Include failed extractions in output (default: true)
  • Max Results: Maximum number of URLs to process (default: 1000)
  • Only Successful: Filter to only successful extractions (default: false)
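
The output options above can be sketched as two small functions. This is a minimal sketch under stated assumptions: the option names (`includeFailed`, `onlySuccessful`, `maxResults`) are hypothetical, Max Results is assumed to cap the URL list before processing, and Only Successful is assumed to take precedence over Include Failed URLs when both are set.

```typescript
// Hypothetical result shape; only `success` and `error` are named in this README.
interface ExtractionResult {
  url: string;
  success: boolean;
  error?: string;
}

// Hypothetical option names mirroring the three output options.
interface OutputOptions {
  includeFailed: boolean;
  onlySuccessful: boolean;
  maxResults: number;
}

// Assumption: Max Results caps the URL list before any request is made.
function limitUrls(urls: string[], maxResults: number): string[] {
  return urls.slice(0, Math.max(0, maxResults));
}

// Assumption: onlySuccessful wins when it conflicts with includeFailed.
function filterResults<T extends ExtractionResult>(results: T[], options: OutputOptions): T[] {
  const keepFailed = options.includeFailed && !options.onlySuccessful;
  return results.filter((result) => result.success || keepFailed);
}
```

With the documented defaults (Include Failed URLs on, Only Successful off), every result is kept.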

Proxy Configuration

  • Proxy Configuration: Optional Apify proxy settings
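
To make the defaults concrete, here is a hedged sketch of an input type with the documented default values filled in. Only `startUrls` is named in this README; every other key name is a guess derived from the option labels, and the `{ url }` object shape for start URLs is assumed from a common Apify convention. Check the Actor's input schema for the real keys.

```typescript
// Hypothetical input shape. Only `startUrls` is confirmed by this README.
interface BrandingInput {
  startUrls: { url: string }[]; // assumed shape, based on a common Apify convention
  extractLogos?: boolean;
  extractColors?: boolean;
  extractMetadata?: boolean;
  extractSocialLinks?: boolean;
  timeout?: number; // milliseconds
  maxRedirects?: number;
  userAgent?: string;
  includeFailed?: boolean;
  maxResults?: number;
  onlySuccessful?: boolean;
}

interface ResolvedInput {
  startUrls: { url: string }[];
  extractLogos: boolean;
  extractColors: boolean;
  extractMetadata: boolean;
  extractSocialLinks: boolean;
  timeout: number;
  maxRedirects: number;
  userAgent: string | undefined;
  includeFailed: boolean;
  maxResults: number;
  onlySuccessful: boolean;
}

// Fill in the defaults documented above for any option the caller omitted.
function withDefaults(input: BrandingInput): ResolvedInput {
  return {
    startUrls: input.startUrls,
    extractLogos: input.extractLogos ?? true,
    extractColors: input.extractColors ?? true,
    extractMetadata: input.extractMetadata ?? true,
    extractSocialLinks: input.extractSocialLinks ?? true,
    timeout: input.timeout ?? 30000,
    maxRedirects: input.maxRedirects ?? 5,
    userAgent: input.userAgent, // no documented default
    includeFailed: input.includeFailed ?? true,
    maxResults: input.maxResults ?? 1000,
    onlySuccessful: input.onlySuccessful ?? false,
  };
}

console.log(withDefaults({ startUrls: [{ url: "https://example.com" }], timeout: 10000 }));
```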

Output Data

Each extracted result contains:

{
  "url": "https://example.com",
  "success": true,
  "title": "Example Site",
  "description": "An example website description",
  "logo": [
    {
      "url": "https://example.com/favicon.ico",
      "type": "favicon",
      "sizes": "32x32"
    }
  ],
  "colors": {
    "primary": "#007bff",
    "secondary": "#6c757d"
  },
  "metadata": {
    "keywords": "example, website",
    "ogTitle": "Example Site",
    "twitterCard": "summary"
  },
  "socialLinks": [
    "https://twitter.com/example",
    "https://facebook.com/example"
  ],
  "timestamp": "2024-01-01T12:00:00.000Z"
}

For failed extractions, the result includes an error field with details about the failure.
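
When consuming the dataset from TypeScript, a type along these lines may be convenient. It is inferred from the sample record above, so which fields are optional is an assumption, and `summarize` is a hypothetical helper that is not part of the Actor.

```typescript
// Types inferred from the sample record; optionality is an assumption.
interface LogoAsset {
  url: string;
  type: string;
  sizes?: string;
}

interface BrandingResult {
  url: string;
  success: boolean;
  title?: string;
  description?: string;
  logo?: LogoAsset[];
  colors?: Record<string, string>;
  metadata?: Record<string, string>;
  socialLinks?: string[];
  timestamp: string;
  error?: string; // present on failed extractions
}

// Hypothetical helper: produce a one-line summary of a dataset item.
function summarize(result: BrandingResult): string {
  if (!result.success) {
    return `FAILED ${result.url}: ${result.error ?? "unknown error"}`;
  }
  const logos = result.logo?.length ?? 0;
  const colors = Object.keys(result.colors ?? {}).length;
  const links = result.socialLinks?.length ?? 0;
  return `OK ${result.url}: ${logos} logo(s), ${colors} color(s), ${links} social link(s)`;
}
```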

Dataset Views

The Actor provides multiple views of the extracted data:

  • Overview: Basic results without error details
  • Detailed: Complete data including errors and metadata
  • Logo Assets: Individual logo entries for analysis
  • Failed Extractions: Only URLs that failed processing

Getting Started

Local Development

# Install dependencies
npm install
# Run locally
apify run
# Build for production
npm run build

Deploy to Apify

  1. Connect Git Repository:

  2. Push from Local Machine:

    # Login to Apify
    apify login
    # Deploy Actor
    apify push

Supported Platforms

Logo Sources

  • Favicon links
  • Apple touch icons
  • Open Graph images
  • Common logo selectors (alt/src/class containing "logo")
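
The last logo source, matching on alt/src/class, could look roughly like the predicate below. It works on plain attribute values so it stays independent of any HTML parser; `ImgAttributes` and `looksLikeLogo` are hypothetical names, and the case-insensitive substring match is an assumption about how "containing" is interpreted.

```typescript
// Hypothetical attribute bag for an <img> element.
interface ImgAttributes {
  src?: string;
  alt?: string;
  className?: string;
}

// True when any of src, alt, or class contains "logo" (case-insensitive).
function looksLikeLogo(img: ImgAttributes): boolean {
  return [img.src, img.alt, img.className].some(
    (value) => value !== undefined && /logo/i.test(value),
  );
}

console.log(looksLikeLogo({ src: "/assets/site-LOGO.svg" }));
console.log(looksLikeLogo({ src: "/assets/hero.jpg", alt: "Team photo" }));
```

A substring match like this can produce false positives (for example a class named `logout-icon`), so results from this source are best treated as candidates.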

Social Media Platforms

  • Facebook
  • Twitter/X
  • Instagram
  • LinkedIn
  • YouTube
  • TikTok
  • Pinterest
  • Snapchat
  • GitHub
  • GitLab
  • Bitbucket
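
One plausible way to detect these platforms is to compare each link's hostname against a list of known domains. The domain list and the function `detectSocialPlatform` below are assumptions for illustration; the Actor may match links differently.

```typescript
// Assumed domain list for the platforms named above.
const SOCIAL_DOMAINS: [string, string][] = [
  ["facebook.com", "Facebook"],
  ["twitter.com", "Twitter/X"],
  ["x.com", "Twitter/X"],
  ["instagram.com", "Instagram"],
  ["linkedin.com", "LinkedIn"],
  ["youtube.com", "YouTube"],
  ["tiktok.com", "TikTok"],
  ["pinterest.com", "Pinterest"],
  ["snapchat.com", "Snapchat"],
  ["github.com", "GitHub"],
  ["gitlab.com", "GitLab"],
  ["bitbucket.org", "Bitbucket"],
];

// Returns the platform name for a link, or null when it is not a known platform.
function detectSocialPlatform(href: string): string | null {
  let host: string;
  try {
    host = new URL(href).hostname.toLowerCase();
  } catch {
    return null; // not an absolute URL
  }
  for (const [domain, platform] of SOCIAL_DOMAINS) {
    // Match the domain itself or any subdomain; the leading dot keeps
    // "netflix.com" from matching "x.com".
    if (host === domain || host.endsWith("." + domain)) {
      return platform;
    }
  }
  return null;
}
```

Comparing parsed hostnames rather than searching the raw string avoids matching a platform name that only appears in a path or query string.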

Metadata Standards

  • Basic HTML meta tags
  • Open Graph protocol
  • Twitter Cards
  • Viewport and charset information

Error Handling

The Actor provides detailed error messages for common scenarios:

  • Network Issues: Domain not found, connection refused
  • HTTP Errors: 403 (forbidden), 404 (not found), 429 (rate limited), 5xx (server errors)
  • Timeouts: Request timeout handling
  • Invalid URLs: URL format validation
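
The categories above could map to messages along the following lines. This is a sketch, not the Actor's code: `FetchFailure`, `describeFailure`, `isValidHttpUrl`, and the message wording are hypothetical. The error codes are the ones Node.js reports for DNS and connection failures (`ENOTFOUND`, `ECONNREFUSED`) and the ones Axios uses for timeouts (`ECONNABORTED`, `ETIMEDOUT`).

```typescript
// Hypothetical summary of a failed request: a network error code, an HTTP status, or both.
interface FetchFailure {
  code?: string;
  status?: number;
}

// Translate a failure into a readable message (wording is illustrative).
function describeFailure(failure: FetchFailure): string {
  switch (failure.code) {
    case "ENOTFOUND":
      return "Domain not found";
    case "ECONNREFUSED":
      return "Connection refused";
    case "ECONNABORTED":
    case "ETIMEDOUT":
      return "Request timed out";
  }
  const status = failure.status;
  if (status === 403) return "Access forbidden (403)";
  if (status === 404) return "Page not found (404)";
  if (status === 429) return "Rate limited (429)";
  if (status !== undefined && status >= 500 && status <= 599) return `Server error (${status})`;
  return "Unknown error";
}

// URL format validation: accept only absolute http(s) URLs.
function isValidHttpUrl(value: string): boolean {
  try {
    const protocol = new URL(value).protocol;
    return protocol === "http:" || protocol === "https:";
  } catch {
    return false;
  }
}
```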

Technical Details

  • Framework: Built with Apify SDK and TypeScript
  • HTTP Client: Axios with comprehensive error handling
  • HTML Parser: Cheerio for server-side DOM manipulation
  • Rate Limiting: 1-second delay between requests for multiple URLs
  • URL Resolution: Automatic conversion of relative URLs to absolute URLs
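
URL resolution of this kind is typically done with the standard WHATWG `URL` constructor, which accepts a base URL as its second argument. The helper name `toAbsoluteUrl` is hypothetical, and whether the Actor uses this exact approach is an assumption.

```typescript
// Resolve a possibly relative href against the page it was found on.
// Returns null when the pair cannot be parsed as a URL.
function toAbsoluteUrl(href: string, pageUrl: string): string | null {
  try {
    return new URL(href, pageUrl).href;
  } catch {
    return null;
  }
}

const page = "https://example.com/about/";
console.log(toAbsoluteUrl("/favicon.ico", page)); // root-relative
console.log(toAbsoluteUrl("img/logo.png", page)); // path-relative
console.log(toAbsoluteUrl("//cdn.example.com/logo.png", page)); // protocol-relative
```

Already absolute URLs pass through unchanged, so the same helper can be applied to every extracted link.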
