Website Services Finder

Pricing

$20.00/month + usage

Developer

Rigel Bytes

Maintained by Community

Automatically extract and analyze company services from any business website using advanced AI. Choose from 5 AI providers (OpenAI, Anthropic, Groq, Gemini, Hugging Face) and 23+ models to intelligently categorize and extract comprehensive service listings — perfect for lead generation, market research, competitive analysis, and business intelligence.

Flexible pricing based on your chosen AI provider — from free tiers to premium models.

Features

  • Multiple AI Providers

    • Groq - Fast and cost-effective (Llama 3.3, Mixtral, Gemma)
    • OpenAI - Industry-leading models (GPT-4o, GPT-4o-mini, GPT-3.5-turbo)
    • Anthropic - Advanced Claude models (Claude 3.5 Sonnet, Haiku, Opus)
    • Google Gemini - Latest Google AI (Gemini 1.5 Pro/Flash, 2.0 Flash)
    • Hugging Face - Open-source models (Llama, Mistral, Phi-3, Qwen)
  • AI-Powered Service Extraction

    • Automatic service categorization from company websites
    • Comprehensive service listings extracted via AI
    • Support for 23+ different AI models
    • Custom prompt support for tailored extraction
    • AI-powered service grouping: automatically groups companies that offer similar services, using semantic similarity
    • Handles thousands of websites efficiently with smart clustering
  • Detailed Business Information

    • Extract services from any company website
    • Flexible URL input (single or multiple websites)
    • Configurable crawl limits
    • Custom AI prompts for specific extraction needs
  • Flexible Configuration

    • Choose your preferred AI provider and model
    • Bring your own API keys for all providers
    • Support for Hugging Face inference providers (Cerebras, Together AI, Cohere, etc.)
    • Proxy configuration support
  • Fast and reliable scraping with retry mechanisms

  • Supports proxies for anonymity and bypassing restrictions

Pricing

  • Pay as you go: cost depends on your usage and the AI provider you choose
  • Groq offers a generous free tier for cost-effective scraping
  • OpenAI, Anthropic, Gemini, and Hugging Face offer competitive pricing

Input

The actor accepts the following input parameters:

Required Parameters

  • urls (array[string], required): List of company website URLs to extract services from

    • Example: ["https://www.gotomyerp.com/", "https://www.example.com/"]
  • provider (string, required): AI provider to use for service extraction

    • Options: "groq", "openai", "anthropic", "gemini", "huggingface"
    • Default: "openai"

Optional Parameters

  • custom_prompt (string, optional): Custom prompt to guide the AI in extracting services

    • Use this to tailor the extraction to your specific needs
  • model_name (string, optional): Specific AI model to use

    • Groq Models:
      • llama-3.3-70b-versatile, llama-3.1-70b-versatile, llama-3.1-8b-instant, mixtral-8x7b-32768, gemma2-9b-it
    • OpenAI Models:
      • gpt-4o (default), gpt-4o-mini, gpt-4-turbo, gpt-4, gpt-3.5-turbo
    • Anthropic Models:
      • claude-3-5-sonnet-latest, claude-3-5-haiku-latest, claude-3-opus-latest, claude-3-sonnet-20240229, claude-3-haiku-20240307
    • Gemini Models:
      • gemini-1.5-pro, gemini-1.5-flash, gemini-2.0-flash-exp
    • Hugging Face Models:
      • meta-llama/Meta-Llama-3-8B-Instruct, meta-llama/Meta-Llama-3.1-8B-Instruct, mistralai/Mistral-7B-Instruct-v0.3, microsoft/Phi-3-mini-4k-instruct, Qwen/Qwen2.5-7B-Instruct
  • provider_name (string, optional): Name of the Hugging Face inference provider to route requests through

    • Example: "nebius"
  • maxRequestsPerCrawl (integer, optional): Maximum number of pages to scrape

    • Default: 100
    • Set to 0 for unlimited
  • groupServices (boolean, optional): Group companies by similar services after scraping

    • Default: false
    • Uses AI embeddings to identify semantically similar services (e.g., "QuickBooks Hosting" = "QuickBooks Cloud Hosting")
    • Results saved to key-value store as GROUPED_SERVICES and GROUPED_SERVICES_SUMMARY
  • similarityThreshold (number, optional): Threshold for grouping similar services

    • Default: 0.85
    • Range: 0.70-0.95
    • Higher values = stricter matching
    • Recommended: 0.85 for balanced grouping
  • proxyConfiguration (object, optional): Proxy settings for web scraping

    • Default: {"useApifyProxy": false}

API Keys (Required based on provider)

  • openai_api_key (string): Your OpenAI API key (required if using OpenAI)
  • anthropic_api_key (string): Your Anthropic API key (required if using Anthropic)
  • groq_api_key (string): Your Groq API key (required if using Groq)
  • gemini_api_key (string): Your Google Gemini API key (required if using Gemini)
  • huggingface_api_key (string): Your Hugging Face API key (required if using Hugging Face)
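
Because each provider expects its key under a different field, a small helper can keep inputs consistent. The following is a minimal sketch, not part of the actor: `KEY_FIELDS` and `build_input` are hypothetical names, and only the field names themselves come from the parameter list above.

```python
# Hypothetical helper: maps each provider to its API-key input field.
# Field names are taken from the parameter list above.
KEY_FIELDS = {
    "openai": "openai_api_key",
    "anthropic": "anthropic_api_key",
    "groq": "groq_api_key",
    "gemini": "gemini_api_key",
    "huggingface": "huggingface_api_key",
}


def build_input(urls, provider, api_key, model_name=None, **extra):
    """Return an actor input dict with the API key under the matching field."""
    if provider not in KEY_FIELDS:
        raise ValueError(f"unknown provider: {provider!r}")
    if not urls:
        raise ValueError("urls must contain at least one website")
    if not api_key:
        raise ValueError(f"{KEY_FIELDS[provider]} is required for {provider}")
    run_input = {
        "urls": list(urls),
        "provider": provider,
        KEY_FIELDS[provider]: api_key,
    }
    if model_name is not None:
        run_input["model_name"] = model_name
    run_input.update(extra)  # e.g. maxRequestsPerCrawl, groupServices
    return run_input


example = build_input(
    ["https://www.gotomyerp.com/"],
    "groq",
    "gsk_your-api-key-here",
    model_name="llama-3.3-70b-versatile",
    maxRequestsPerCrawl=100,
)
print(example)
```

The resulting dictionary has the same shape as the input examples in this document and can be passed as the actor input.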

📝 Input Examples:

Using OpenAI (GPT-4o):

{
    "urls": ["https://www.gotomyerp.com/"],
    "provider": "openai",
    "model_name": "gpt-4o",
    "openai_api_key": "sk-your-api-key-here",
    "maxRequestsPerCrawl": 100,
    "proxyConfiguration": {
        "useApifyProxy": false
    }
}

Using Groq (Llama 3.3 - Fast & Free):

{
    "urls": ["https://www.gotomyerp.com/"],
    "provider": "groq",
    "model_name": "llama-3.3-70b-versatile",
    "groq_api_key": "gsk_your-api-key-here",
    "maxRequestsPerCrawl": 100
}

Using Anthropic (Claude 3.5 Sonnet):

{
    "urls": ["https://www.gotomyerp.com/"],
    "provider": "anthropic",
    "model_name": "claude-3-5-sonnet-latest",
    "anthropic_api_key": "sk-ant-your-api-key-here",
    "maxRequestsPerCrawl": 100
}

Using Hugging Face (Llama 3):

{
    "urls": ["https://www.gotomyerp.com/"],
    "provider": "huggingface",
    "model_name": "meta-llama/Meta-Llama-3-8B-Instruct",
    "huggingface_api_key": "hf_your-api-key-here",
    "provider_name": "nebius",
    "maxRequestsPerCrawl": 100
}

With Custom Prompt:

{
    "urls": ["https://www.gotomyerp.com/"],
    "provider": "openai",
    "model_name": "gpt-4o-mini",
    "openai_api_key": "sk-your-api-key-here",
    "custom_prompt": "Extract only the main service categories, not subcategories. Focus on the primary business offerings.",
    "maxRequestsPerCrawl": 50
}

Why Choose This Scraper?

  • Affordable: $20/month plus usage, with AI costs depending on the provider you choose.
  • Comprehensive: Extracts business details and AI-analyzed services.
  • Easy to Use: Simple setup and integration with the Apify platform.
  • Reliable: Built with retry mechanisms to handle network issues.
  • AI-Powered: Multiple AI providers for intelligent service extraction.

  • Shifter

    • Reliable residential proxies all over the world.
    • Cheap rates
    • Order Shifter Now
    • Get 10% off any product with coupon rigelbytes-YoBB.
  • OxyLabs

    • 100M+ Proxies
    • Fastest proxies in the market
    • Real profile, human-like Residential IPs
    • Quality assurance framework for most reliable IPs
    • Get Proxies
  • DataImpulse

Learn More About Proxies

  • Exclusive Deals: Some providers may offer special discounts or bonuses when you use our link.
  • Support Our Work: Each purchase helps us maintain and improve the tools and services we provide.
  • No Extra Cost: You pay the same price, but part of it goes to supporting our efforts.

Running via Apify Console

You can run this actor from the Apify Console by providing the necessary input parameters.

Running via API

You can trigger this actor using the Apify API, passing the required input in the request body.

API Request Example (Python)

from apify_client import ApifyClient

# Initialize the ApifyClient with your API token
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "openai_api_key": "sk-your-api-key-here",
    "urls": [
        "https://www.gotomyerp.com/"
    ],
    "provider": "openai",
    "model_name": "gpt-4o",
    "maxRequestsPerCrawl": 100,
    "proxyConfiguration": {
        "useApifyProxy": False  # Python boolean (JSON's false is not valid Python)
    },
}

# Run the Actor and wait for it to finish
run = client.actor("rigelbytes/website-services-finder").call(run_input=run_input)

JavaScript

import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your API token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "openai_api_key": "sk-your-api-key-here",
    "urls": [
        "https://www.gotomyerp.com/"
    ],
    "provider": "openai",
    "model_name": "gpt-4o",
    "maxRequestsPerCrawl": 100,
    "proxyConfiguration": {
        "useApifyProxy": false
    }
};

(async () => {
    // Run the Actor and wait for it to finish
    const run = await client.actor("rigelbytes/website-services-finder").call(input);
})();

Running with cURL

# Set API token (quoted so the shell does not treat < and > as redirections)
API_TOKEN='<YOUR_API_TOKEN>'
# Prepare Actor input
cat > input.json <<'EOF'
{
    "openai_api_key": "sk-your-api-key-here",
    "urls": [
        "https://www.gotomyerp.com/"
    ],
    "provider": "openai",
    "model_name": "gpt-4o",
    "maxRequestsPerCrawl": 100,
    "proxyConfiguration": {
        "useApifyProxy": false
    }
}
EOF
# Run the Actor (the API path uses username~actor-name, separated by a tilde)
curl "https://api.apify.com/v2/acts/rigelbytes~website-services-finder/runs?token=$API_TOKEN" \
    -X POST \
    -d @input.json \
    -H 'Content-Type: application/json'

Service Grouping Feature

This actor includes an AI-powered service grouping feature that automatically identifies and groups companies offering similar services, even when those services have different names.

How It Works

  1. Semantic Analysis: Uses AI embeddings to understand the meaning of each service
  2. Clustering: Groups semantically similar services together (e.g., "QuickBooks Hosting" and "QuickBooks Cloud Hosting")
  3. Smart Grouping: Creates clusters of companies offering the same or similar services
  4. Scalable: Efficiently handles thousands of websites and services
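
The steps above can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the actor's implementation: the embedding model and clustering algorithm the actor uses are not documented here, so the sketch substitutes made-up 3-dimensional vectors for real embeddings and a simple greedy pass for the clustering step. `cosine` and `group_services` are hypothetical names.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def group_services(embedded, threshold=0.85):
    """Greedy clustering: each service joins the first group whose
    representative (its first member) is at least `threshold` similar;
    otherwise it starts a new group."""
    groups = []  # list of (representative_vector, [(website, service), ...])
    for website, service, vector in embedded:
        for rep, members in groups:
            if cosine(rep, vector) >= threshold:
                members.append((website, service))
                break
        else:
            groups.append((vector, [(website, service)]))
    return [members for _, members in groups]


# Made-up vectors standing in for real embeddings (illustration only).
embedded = [
    ("https://gotomyerp.com", "QuickBooks Hosting", [0.90, 0.10, 0.00]),
    ("https://gotomyerp.com", "QuickBooks Cloud Hosting", [0.85, 0.20, 0.05]),
    ("https://aaeasy.com", "QuickBooks Hosting", [0.90, 0.10, 0.00]),
    ("https://12baraccounting.com", "Daily bookkeeping", [0.05, 0.10, 0.95]),
]

groups = group_services(embedded)
for members in groups:
    print(members)
```

With these toy vectors the three QuickBooks entries land in one group and the bookkeeping entry in another. A greedy pass is order-dependent, so a production system would likely use a more robust clustering method.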

Enable Service Grouping

Add these parameters to your input:

{
    "urls": ["https://www.gotomyerp.com/", "https://aaeasy.com/"],
    "provider": "openai",
    "model_name": "gpt-4o",
    "openai_api_key": "sk-your-api-key-here",
    "groupServices": true,
    "similarityThreshold": 0.85
}

Output

When grouping is enabled, results are saved to the key-value store:

GROUPED_SERVICES - Full grouping with matching services:

{
    "QuickBooks Hosting": [
        {
            "website": "https://gotomyerp.com",
            "matching_services": ["QuickBooks Hosting", "QuickBooks Cloud Hosting"]
        },
        {
            "website": "https://aaeasy.com",
            "matching_services": ["QuickBooks Hosting"]
        }
    ],
    "Bookkeeping Services": [
        {
            "website": "https://12baraccounting.com",
            "matching_services": ["Daily bookkeeping", "Weekly bookkeeping"]
        }
    ]
}

GROUPED_SERVICES_SUMMARY - Summary with company counts:

{
    "total_service_groups": 15,
    "groups": {
        "QuickBooks Hosting": {
            "company_count": 2,
            "companies": ["https://gotomyerp.com", "https://aaeasy.com"]
        },
        "Bookkeeping Services": {
            "company_count": 1,
            "companies": ["https://12baraccounting.com"]
        }
    }
}
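
The two records are closely related: the summary can be derived from the full grouping. The sketch below shows that relationship using the sample data above. `summarize` is a hypothetical helper, and this is not necessarily how the actor computes GROUPED_SERVICES_SUMMARY.

```python
def summarize(grouped):
    """Build a GROUPED_SERVICES_SUMMARY-shaped dict from GROUPED_SERVICES-shaped data."""
    groups = {}
    for service, entries in grouped.items():
        companies = [entry["website"] for entry in entries]
        groups[service] = {"company_count": len(companies), "companies": companies}
    return {"total_service_groups": len(groups), "groups": groups}


# Sample data copied from the GROUPED_SERVICES example above.
grouped = {
    "QuickBooks Hosting": [
        {
            "website": "https://gotomyerp.com",
            "matching_services": ["QuickBooks Hosting", "QuickBooks Cloud Hosting"],
        },
        {
            "website": "https://aaeasy.com",
            "matching_services": ["QuickBooks Hosting"],
        },
    ],
    "Bookkeeping Services": [
        {
            "website": "https://12baraccounting.com",
            "matching_services": ["Daily bookkeeping", "Weekly bookkeeping"],
        },
    ],
}

summary = summarize(grouped)
print(summary)
```

Because this sample contains only two groups, the computed `total_service_groups` here is 2.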

Similarity Threshold Guide

  • 0.95 - Very strict (only nearly identical services grouped)
  • 0.85 - Balanced (recommended - groups similar services)
  • 0.75 - Lenient (broader grouping, may include loosely related services)
  • 0.70 - Very lenient (groups many related services together)
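
To get a feel for the guide, it can help to ask at which settings a given pair of services would be merged. The sketch below assumes a pair is grouped when its similarity score is greater than or equal to the threshold; the exact comparison the actor uses is not documented here. `GUIDE` and `thresholds_that_group` are hypothetical names, and the scores are made up.

```python
# Threshold labels copied from the guide above.
GUIDE = {0.95: "very strict", 0.85: "balanced", 0.75: "lenient", 0.70: "very lenient"}


def thresholds_that_group(similarity):
    """Return the guide settings (strictest first) at which a pair with this
    similarity score would be merged, assuming a >= comparison."""
    return [t for t in sorted(GUIDE, reverse=True) if similarity >= t]


# Made-up similarity scores for three hypothetical service pairs.
for score in (0.97, 0.88, 0.72):
    merged_at = thresholds_that_group(score)
    print(score, [f"{t:.2f} ({GUIDE[t]})" for t in merged_at])
```

A pair scoring 0.88 is merged at 0.85 and below but kept apart at 0.95, which is why raising the threshold produces more, smaller groups.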

🚀 Other Tools by Rigel Bytes

Airbnb Images Downloader
A focused Airbnb image scraper that extracts all photos from an Airbnb listing page and packages the listing's image files into a compressed archive. The Act...

Zillow Scraper
A Zillow-focused web scraper that extracts structured property listing data for real estate analysis and monitoring. The Actor crawls Zillow listings to coll...

Zillow Detail Scraper
Zillow scraper with customizable proxy support. Extract comprehensive property data, including pricing, images, and location details, using your proxies for better control and efficiency. Check the recommended proxy providers below.

Daraz
A web scraping Actor that extracts product listings and detailed product and seller data from Daraz.pk (Pakistan ecommerce marketplace) for monitoring and an...

Airbnb Listing
A web-scraping Actor that bulk-extracts structured Airbnb listing data from listing pages: it retrieves listing metadata, descriptive content, property featu...

Google Maps Scraper
A Google Maps scraping Actor that extracts structured business profiles and local place intelligence at scale: it crawls Google Maps listings to collect busi...

Airbnb Availability Calendar
Exports Airbnb listing availability calendars by scraping listing pages and producing structured, per-date calendar data. The Actor extracts date entries and...

Instagram Profile Scraper
A web-scraping Actor that extracts detailed Instagram profile and media metadata from profile pages: it retrieves profile-level metrics (follower/following c...

Land.com Scraper
A Land.com-focused web scraper that crawls Land.com property listings and extracts structured real estate data for specified geographic areas and listing typ...

Airbnb Reviews
A web-scraping Actor that extracts unlimited reviews from Airbnb listings: it crawls listing review pages and produces structured review records containing r...

FurnishedFinder
A web-scraping Actor designed to extract large-scale furnished rental data from Furnished Finder: it crawls listing pages to collect listing metadata (proper...

Immobilienscout24
A scraper for immobilienscout24.de that extracts large-scale real estate listing data and contact information, optimized for bulk property data collection, i...

Airbnb Listing Urls
A web scraper that extracts unlimited Airbnb listings from search queries and returns structured listing metadata for analysis. The Actor crawls Airbnb searc...

Instagram Engagement Tool
Analyzes public Instagram profiles by scraping recent posts and profile metadata to compute engagement metrics for images and videos. The Actor extracts prof...

Instagram Post Scraper
An Instagram scraping Actor that extracts every post from Instagram profiles and produces structured post-level metadata and media assets for social media an...

BBB Scraper
A web-scraping Actor that extracts detailed business listings from the Better Business Bureau (BBB). It crawls BBB listings and produces structured business ...

Linkedin Company Scraper
A LinkedIn Company Scraper that extracts structured company profile data from LinkedIn company pages for lead generation and competitive intelligence. The Ac...

linkedin-company-details
A LinkedIn Company Scraper for extracting comprehensive company profiles and social content from LinkedIn: it scrapes and returns structured company profile ...

Instagram Reel Scraper
A web-scraping Actor that extracts all Instagram Reels from public profiles and produces structured reel-level metadata for content analysis, research, and m...

Rottentomatoes Reviews Scraper
A web-scraping Apify Actor that extracts user and critic reviews from Rotten Tomatoes pages, producing structured review records for analysis. It crawls unli...

Extract Furnished Finder Hosts
Scrapes Furnished Finder listings and extracts structured listing and host profile data for furnished rentals, including listing titles, property types, geol...

Trustpilot Reviews Scraper
A Trustpilot review scraper that collects structured review records and full reviewer profile data at scale. It extracts review content (titles and body text...

Furnished Finder Fast
An Apify Actor for scraping Furnished Finder rental listings and optional host profiles, extracting structured listing data such as photos, textual descripti...

Zillow Agents
A Zillow agent profile scraper that extracts structured agent data from Zillow for location-based queries (city, neighborhood, ZIP). The Actor scrapes agent ...

Bayut Scraper
A web-scraping Actor for extracting structured property listings and market intelligence from Bayut.com (UAE). It crawls Bayut property pages and returns str...

dubai-listing-scraper
A web scraping Actor for Bayut.com that extracts structured UAE property listing data for sale and rent across Dubai, Abu Dhabi and other Emirates. It progra...

Tiktok Comment Scraper
A TikTok comment scraping Actor that extracts all comments from TikTok videos given a video URL. It collects commenter metadata (usernames, display names, pr...

Tiktok Engagement Rate
A TikTok profile scraper that analyzes public TikTok accounts to extract profile metadata and recent video-level interaction metrics and to compute engagemen...

Company Service Finder
Company Service Finder scrapes business listings from Google Search and Google Maps across cities and states, extracts company websites, names, phone numbers...

Airbnb Address Finder
A web-scraping Apify Actor that bulk-extracts Airbnb listing addresses and comprehensive listing metadata from provided Airbnb listing URLs. It parses listin...

Immowelt Scraper
An Apify Actor that scrapes unlimited real estate listings from immowelt.de and returns structured property data for indexing and analysis. The Actor extract...

Propertyfinder Scraper
A scraper for propertyfinder.ae that extracts unlimited real estate listings and related metadata at scale. Implements asynchronous concurrency, automatic re...

Publix Scraper
A web scraping Actor for Publix.com that extracts grocery product data from Publix collection pages using a collection URL and a delivery/pickup location to ...

Redfin Scraper
Redfin Scraper extracts large-scale real estate listing data from Redfin search and city pages and returns structured property records for downstream analysi...

Instacart Scraper
A web-scraping Actor that extracts structured product data from instacart.com using a collection/search keyword combined with a delivery or pickup location; ...

Homedepot Scraper
A web scraper for homedepot.com that crawls collection pages and performs location-aware scraping (delivery or pickup) to extract structured product data. Th...

Doctify Scraper
A web scraping Actor that extracts structured healthcare provider and practice data from doctify.com starting from a Doctify search-results URL. The Actor ha...

Facebook Ads Scraper
Extracts structured ad data from the Facebook Ads Library using a search URL: scrapes ad metadata and creative assets (ad copy, titles, captions), destinatio...

Ticketmaster Scraper
A Ticketmaster-focused web scraper that extracts structured event metadata by location and date range for event monitoring, market research, and entertainmen...

Scrape Instagram Creators
Scrape Instagram Creators is an Instagram profile and media scraper that extracts detailed creator profile metadata and media-level information. It captures ...

Immowelt Property Scraper
A scraper for immowelt.de (Germany) that harvests large volumes of real estate listings and returns structured property records for analysis. It extracts lis...

Immobilienscout24-scraper
A web scraping Actor that extracts large volumes of real estate listings and contact information from immobilienscout24.de, focused on German property market...

Instagram Creator Stats
A scraping and analytics Actor that extracts Instagram profile metadata and per-post media details to compute engagement metrics for creator/influencer analy...

Etsy Scraper
A web-scraping Actor that extracts structured product data from Etsy.com from category pages, search results, or individual product list URLs. It crawls list...

Rightmove Scraper
Scrapes Rightmove.co.uk property listings and returns structured property records for real estate data extraction and analysis. The Actor crawls Rightmove se...

Outdoorsy Scraper
A web scraping Actor that extracts rental listing data from outdoorsy.com search result pages. It processes multiple search result URLs in a single run, issu...

Instagram Comment Scraper
A web scraping Actor that extracts all comments from Instagram posts and reels given their URLs; it captures commenter metadata (username, display name, prof...

Understanding Proxies:

When scraping data or browsing anonymously, proxies are essential. They act as intermediaries, masking your original IP address and allowing you to send requests from another location.

Why Use Proxies?

  • Avoid IP Blocks: By routing requests through proxies, you prevent the target website from recognizing your IP as a scraper or spammer.
  • Access Geo-restricted Content: Proxies let you access content or websites restricted by location.
  • Enhance Anonymity: Hide your actual IP, ensuring privacy while scraping or browsing.

Types of Proxies

  1. Residential Proxies
    • Real IP addresses provided by ISPs to home users.
    • They mimic regular users, making them harder to detect.
    • Best for: Long-term, undetectable scraping, and avoiding blocks.
  2. Data Center Proxies
    • IP addresses from servers in data centers.
    • Faster and cheaper than residential proxies but easier to detect and block.
    • Best for: High-speed scraping, but with a higher risk of detection.
  3. Mobile Proxies
    • IPs provided by mobile carriers (3G/4G/5G networks).
    • Very difficult to detect, as they appear as regular mobile users.
    • Best for: Mobile-related scraping or avoiding sophisticated blocks.

Rotating Proxies vs. Straight Proxies

  • Rotating Proxies: Every request you send goes through a different proxy, making it harder for websites to detect patterns.
  • Straight Proxies: All requests are sent through the same proxy, making it easier to track your IP.
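
The difference can be sketched in a few lines. This is an illustration only: commercial rotating proxies usually rotate on the provider's side, and client-side round-robin is just one simple way to get similar behavior. The proxy URLs are placeholders, and `rotating` and `straight` are hypothetical names.

```python
from itertools import cycle


def rotating(proxies):
    """Hand out a different proxy for each request, round-robin."""
    return cycle(proxies)


def straight(proxies):
    """Hand out the same (first) proxy for every request."""
    return cycle(proxies[:1])


# Placeholder proxy URLs; substitute your provider's endpoints.
PROXIES = [
    "http://proxy-a.example:8000",
    "http://proxy-b.example:8000",
    "http://proxy-c.example:8000",
]

rot = rotating(PROXIES)
fixed = straight(PROXIES)

# Simulate four consecutive requests under each strategy.
rotating_picks = [next(rot) for _ in range(4)]
straight_picks = [next(fixed) for _ in range(4)]

print(rotating_picks)
print(straight_picks)
```

Each simulated request under rotation leaves from a different address until the pool wraps around, while the straight strategy reuses one address throughout.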

About Rigel Bytes

Rigel Bytes specializes in web scraping, automation, and data analytics. We help businesses extract and leverage valuable data for informed decision-making.

Contact Us

Ready to unlock the power of data? Reach out to us at contact@rigelbytes.com, or book an appointment to learn how we can help you achieve your data goals.

Detailed Data

[
    {
        "website": "https://www.gotomyerp.com/",
        "services": [
            "ERP Hosting",
            "QuickBooks Hosting",
            "Sage Cloud Hosting",
            "Sage 50 Cloud Hosting",
            "Sage 100 Cloud Hosting",
            "Sage 300 Cloud Hosting",
            "GovCloud Hosting",
            "Business Apps",
            "ERP Software",
            "Sage Intacct",
            "Sage 100",
            "Sage 300",
            "Managed Server Hosting",
            "Private Server Hosting",
            "Application Hosting",
            "Custom Integrations",
            "Sage Consulting Services",
            "Sage Authorized Reseller",
            "QuickBooks Authorized Reseller",
            "Partner Program",
            "Security",
            "Compliance",
            "Continuity Plans",
            "Disaster Recovery",
            "Identity Access Management",
            "MultiFactor Authentication"
        ]
    }
]
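
For lead lists or spreadsheets, it is often handier to have one row per website and service. The sketch below flattens items shaped like the output above into CSV text. `to_rows` is a hypothetical helper, and the sample is a shortened copy of the output shown above.

```python
import csv
import io


def to_rows(items):
    """Flatten actor output items into (website, service) pairs."""
    return [
        (item["website"], service)
        for item in items
        for service in item.get("services", [])
    ]


# Shortened copy of the sample output above.
items = [
    {
        "website": "https://www.gotomyerp.com/",
        "services": ["ERP Hosting", "QuickBooks Hosting", "Sage Cloud Hosting"],
    },
]

# Write the rows to an in-memory CSV; swap io.StringIO for a file to save it.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["website", "service"])
writer.writerows(to_rows(items))
csv_text = buffer.getvalue()
print(csv_text)
```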