G2 Software Categories Scraper

Pricing

Pay per event

Extract G2 software category names, URLs, parent categories, IDs, and hierarchy depth for SaaS market maps and competitor research.

Developer

Stas Persiianenko

Maintained by Community

Map the public G2 software category directory into clean dataset rows for SaaS market research, competitor mapping, sales planning, and category monitoring.

What does G2 Software Categories Scraper do?

G2 Software Categories Scraper extracts structured records from public G2 category pages.

It is designed for the public directory at https://www.g2.com/categories.

The actor returns category names, category URLs, parent categories, hierarchy depth, G2 category IDs, source URL, and scrape timestamp.

When a supplied category listing page exposes product links in server-rendered HTML, the actor also emits best-effort product rows.

Who is it for?

SaaS founders and product leaders

Use the scraper to map where your product category sits on G2, identify adjacent software markets, and find category pages that describe buyer intent in your niche.

Sales and revenue teams

Use category URLs as seed lists for account research, territory planning, partner discovery, and outbound segmentation by software market.

Marketing, SEO, and competitive intelligence teams

Monitor G2 category names, parent categories, and emerging software segments so your landing pages, comparison pages, and campaign language match how buyers research tools.

Analysts and data teams

Create repeatable software taxonomy snapshots that can be joined with CRM, vendor, keyword, product, or BI datasets without manually copying from G2.

Why use this actor?

G2 categories change over time.

Manual copying is slow and error-prone.

This actor produces repeatable JSON, CSV, Excel, and API-ready data.

It is lightweight because it uses HTTP and HTML parsing for the public directory.

It avoids browser overhead for category-map jobs.

What data can you extract?

Field | Description
recordType | category, popular_category, or product
categoryName | Public G2 category name
categorySlug | URL slug from the G2 category URL
categoryUrl | Absolute G2 category URL
parentCategory | Parent category shown by G2
hierarchyLevel | Directory nesting depth
categoryId | G2 category ID when present in page metadata
isNested | Whether the row is nested below another category
productName | Product name for best-effort product rows
productUrl | Product profile URL for best-effort product rows
matchedSearchTerm | Keyword that matched a filtered run
sourceUrl | Page that produced the record
scrapedAt | ISO scrape timestamp

How much does it cost to scrape G2 software categories?

This actor uses pay-per-event pricing with one small run-start fee and one low per-record charge for each saved dataset item.

Current pricing:

Event | What it means | Price
Run started | Charged once per run | $0.005
Item extracted | Charged for each category, popular category, or product row | from $0.000024512 per item on BRONZE; lower on higher tiers

Cost examples:

  • 100 records: about $0.0075 on BRONZE, including the start event.
  • 500 records: about $0.0173 on BRONZE, including the start event.
  • 1,000 records: about $0.0295 on BRONZE, including the start event.

Use maxItems to cap costs during testing. A typical public directory run can produce hundreds or thousands of category rows, so start with 100 records before scheduling larger market maps.
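The cost examples above follow from the two prices in the pricing table. Here is a minimal sketch of that arithmetic, assuming BRONZE pricing and exactly one item event per saved row. The helper name estimate_cost is hypothetical and not part of the actor.

```python
# Prices copied from the pricing table above (BRONZE tier).
RUN_START_FEE = 0.005
BRONZE_ITEM_PRICE = 0.000024512


def estimate_cost(records: int, item_price: float = BRONZE_ITEM_PRICE) -> float:
    """Estimate the USD cost of one run that saves `records` dataset items."""
    if records < 0:
        raise ValueError("records must be non-negative")
    return RUN_START_FEE + records * item_price


for n in (100, 500, 1000):
    print(f"{n} records: about ${estimate_cost(n):.4f}")
# 100 records: about $0.0075
# 500 records: about $0.0173
# 1000 records: about $0.0295
```

Pass a different item_price to estimate higher tiers. Those prices are not listed here, so check the actor's pricing page for them.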

Quick start

  1. Open the actor on Apify.
  2. Keep the default start URL: https://www.g2.com/categories.
  3. Set maxItems to a small value such as 100 for the first run.
  4. Optionally add filters such as CRM, AI, Security, or HR.
  5. Run the actor.
  6. Export the dataset as JSON, CSV, Excel, XML, or via API.

Input configuration

startUrls

Use one or more G2 category URLs.

The recommended default is the full directory URL.

[{ "url": "https://www.g2.com/categories" }]

searchTerms

Use search terms to narrow the output.

For example, CRM matches categories and parent categories containing CRM.

["CRM", "Sales"]
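This page does not describe the actor's exact matching rules, such as case sensitivity or whole-word versus substring matching. If you need stricter control, you can post-filter an exported dataset locally. The sketch below is one possible approach, not the actor's implementation: it assumes case-insensitive whole-word matching, and match_term is a hypothetical helper.

```python
import re
from typing import Optional


def match_term(row: dict, terms: list[str]) -> Optional[str]:
    """Return the first term found as a whole word in the category or parent name.

    Assumption: whole-word, case-insensitive matching. This is a local
    post-filter and may differ from how the actor applies searchTerms.
    """
    haystack = " ".join(
        str(row.get(key) or "") for key in ("categoryName", "parentCategory")
    )
    for term in terms:
        if re.search(rf"\b{re.escape(term)}\b", haystack, flags=re.IGNORECASE):
            return term
    return None


row = {"categoryName": "CRM Software", "parentCategory": "Sales Tools"}
print(match_term(row, ["CRM", "Sales"]))  # CRM
```

Whole-word matching prevents a short term such as AI from matching unrelated names that merely contain those letters, for example "Retail".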

maxItems

Set the maximum number of records to save.

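Use 20 for smoke tests.

Use 100 or more for category maps. The full directory can contain hundreds or thousands of rows, so raise the limit accordingly when you need complete coverage.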

includePopularCategories

Enable this option to include featured category cards from the top of the page.

Disable it if you only want canonical table rows.

proxyConfiguration

Proxy use is optional.

The public directory often works without a proxy.

Enable Apify Proxy only if G2 throttles your environment.

Example input

{
    "startUrls": [{ "url": "https://www.g2.com/categories" }],
    "searchTerms": ["CRM", "AI"],
    "maxItems": 20,
    "includePopularCategories": true,
    "proxyConfiguration": { "useApifyProxy": false }
}

Example output

{
    "recordType": "category",
    "categoryName": "CRM Software",
    "categorySlug": "crm",
    "categoryUrl": "https://www.g2.com/categories/crm",
    "parentCategory": "Sales Tools",
    "hierarchyLevel": 0,
    "categoryId": "179",
    "isNested": false,
    "productName": null,
    "productUrl": null,
    "rating": null,
    "reviewCount": null,
    "productDescription": null,
    "matchedSearchTerm": "CRM",
    "sourceUrl": "https://www.g2.com/categories",
    "sourcePageTitle": "All Categories | G2",
    "scrapedAt": "2026-06-26T00:00:00.000Z"
}

Category mapping workflows

Export all category rows to CSV.

Load them into a spreadsheet or BI tool.

Group by parentCategory.

Use hierarchyLevel to understand nested categories.

Join the results with your own vendor, keyword, or CRM data.
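The grouping step can also be done in a few lines of code before the data reaches a spreadsheet. This is a minimal sketch using the field names from the output table. The sample rows are illustrative rather than real scrape results, and group_by_parent is a hypothetical helper.

```python
from collections import defaultdict


def group_by_parent(rows: list[dict]) -> dict[str, list[str]]:
    """Map each parentCategory to the sorted category names beneath it."""
    groups: dict[str, list[str]] = defaultdict(list)
    for row in rows:
        # Skip product and popular_category rows to avoid duplicates.
        if row.get("recordType") != "category":
            continue
        parent = row.get("parentCategory") or "(no parent)"
        groups[parent].append(row["categoryName"])
    return {parent: sorted(names) for parent, names in sorted(groups.items())}


# Illustrative rows only; real rows come from the exported dataset.
sample_rows = [
    {"recordType": "category", "categoryName": "CRM Software", "parentCategory": "Sales Tools"},
    {"recordType": "category", "categoryName": "Sales Analytics", "parentCategory": "Sales Tools"},
    {"recordType": "popular_category", "categoryName": "CRM Software", "parentCategory": "Sales Tools"},
    {"recordType": "category", "categoryName": "Sales Tools", "parentCategory": None},
]

for parent, names in group_by_parent(sample_rows).items():
    print(parent, "->", names)
```

The same function works on rows read with csv.DictReader from a CSV export, as long as the column names match the dataset fields.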

Lead generation workflows

Use category URLs as seed lists.

Prioritize categories that match your ICP.

Send selected category URLs to downstream enrichment or product-profile scrapers.

Track new category names that indicate emerging buyer intent.
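To hand selected categories to a downstream scraper, the category URLs can be reshaped into the same startUrls structure this actor accepts. This is a sketch under two assumptions: that the downstream actor takes a list of { "url": ... } objects, and that ICP fit is expressed as a set of parent category names. to_start_urls is a hypothetical helper.

```python
def to_start_urls(rows: list[dict], wanted_parents: set[str]) -> list[dict]:
    """Build a de-duplicated startUrls list from rows whose parent matches the ICP."""
    seen: set[str] = set()
    start_urls: list[dict] = []
    for row in rows:
        url = row.get("categoryUrl")
        if not url or url in seen:
            continue
        if row.get("parentCategory") not in wanted_parents:
            continue
        seen.add(url)
        start_urls.append({"url": url})  # preserves first-seen order
    return start_urls


rows = [
    {"categoryUrl": "https://www.g2.com/categories/crm", "parentCategory": "Sales Tools"},
    {"categoryUrl": "https://www.g2.com/categories/crm", "parentCategory": "Sales Tools"},
]
print(to_start_urls(rows, {"Sales Tools"}))
# [{'url': 'https://www.g2.com/categories/crm'}]
```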

Market research workflows

Run the actor on a schedule.

Compare category snapshots over time.

Detect newly added AI, security, finance, HR, or vertical software categories.

Use parent-category groupings to size adjacent markets.
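Comparing snapshots comes down to a set difference between two exports. The sketch below keys rows by categorySlug, on the assumption that slugs are stable between runs (categoryId is only present for some rows). diff_snapshots is a hypothetical helper.

```python
def diff_snapshots(old_rows: list[dict], new_rows: list[dict]) -> dict[str, list[str]]:
    """Report slugs that were added, removed, or renamed between two snapshots."""
    old = {r["categorySlug"]: r for r in old_rows if r.get("categorySlug")}
    new = {r["categorySlug"]: r for r in new_rows if r.get("categorySlug")}
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        # Same slug in both snapshots, but the display name changed.
        "renamed": sorted(
            slug
            for slug in old.keys() & new.keys()
            if old[slug].get("categoryName") != new[slug].get("categoryName")
        ),
    }


# Illustrative snapshots; real ones come from two scheduled runs.
last_month = [{"categorySlug": "crm", "categoryName": "CRM Software"}]
this_month = [
    {"categorySlug": "crm", "categoryName": "CRM Software"},
    {"categorySlug": "example-new-category", "categoryName": "Example New Category"},
]
print(diff_snapshots(last_month, this_month)["added"])
# ['example-new-category']
```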

Integrations

Send results to Google Sheets for analyst review.

Send category URLs to a CRM enrichment pipeline.

Use Apify webhooks to trigger downstream processing after each run.

Export JSON to a data warehouse.

Feed category names into keyword research tools.
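For the webhook integration, the receiving service usually only needs the dataset ID of the finished run. This sketch assumes Apify's default webhook payload, in which eventType holds the event name and resource holds the run object. Verify the payload shape against your own webhook configuration, especially if you use a custom payload template. dataset_id_from_webhook is a hypothetical helper.

```python
from typing import Optional


def dataset_id_from_webhook(payload: dict) -> Optional[str]:
    """Return the run's default dataset ID for successful runs, else None.

    Assumes the default Apify webhook payload shape:
    {"eventType": "...", "resource": {"defaultDatasetId": "...", ...}, ...}
    """
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    resource = payload.get("resource") or {}
    return resource.get("defaultDatasetId")


# Hypothetical payload for illustration.
example = {
    "eventType": "ACTOR.RUN.SUCCEEDED",
    "resource": {"defaultDatasetId": "example-dataset-id"},
}
print(dataset_id_from_webhook(example))  # example-dataset-id
```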

API usage

Node.js

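import { ApifyClient } from 'apify-client';
// Read the API token from the APIFY_TOKEN environment variable.
const client = new ApifyClient({ token: process.env.APIFY_TOKEN });
// Start the actor and wait for the run to finish.
const run = await client.actor('automation-lab/g2-software-categories-scraper').call({
    startUrls: [{ url: 'https://www.g2.com/categories' }],
    searchTerms: ['CRM'],
    maxItems: 100,
});
// The run's records are stored in its default dataset.
console.log(run.defaultDatasetId);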

Python

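import os
from apify_client import ApifyClient
# Read the API token from the APIFY_TOKEN environment variable.
client = ApifyClient(os.environ['APIFY_TOKEN'])
# Start the actor and wait for the run to finish.
run = client.actor('automation-lab/g2-software-categories-scraper').call(run_input={
    'startUrls': [{'url': 'https://www.g2.com/categories'}],
    'searchTerms': ['AI'],
    'maxItems': 100,
})
# The run's records are stored in its default dataset.
print(run['defaultDatasetId'])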
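The examples above stop at the dataset ID. Here is a hedged sketch of the next step: reading the rows back through the Python client's dataset(...).iterate_items() and keeping only category records. iter_category_rows is a hypothetical helper, and the client is passed in so that the function has no import-time dependency.

```python
from typing import Any, Iterator


def iter_category_rows(client: Any, dataset_id: str) -> Iterator[dict]:
    """Yield category and popular_category rows from a run's dataset.

    `client` is expected to be an apify_client.ApifyClient instance.
    """
    for item in client.dataset(dataset_id).iterate_items():
        if item.get("recordType") in ("category", "popular_category"):
            yield item


# Usage, continuing from the Python example above:
#   rows = list(iter_category_rows(client, run['defaultDatasetId']))
#   print(len(rows))
```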

cURL

curl -X POST "https://api.apify.com/v2/acts/automation-lab~g2-software-categories-scraper/runs?token=$APIFY_TOKEN" \
    -H 'Content-Type: application/json' \
    -d '{"startUrls":[{"url":"https://www.g2.com/categories"}],"maxItems":100}'

MCP usage

Use this actor from Apify MCP with Claude Code or Claude Desktop.

MCP URL:

https://mcp.apify.com/?tools=automation-lab/g2-software-categories-scraper

Claude Code setup:

claude mcp add apify-g2-categories --transport http "https://mcp.apify.com/?tools=automation-lab/g2-software-categories-scraper"

Claude Desktop JSON config:

{
    "mcpServers": {
        "apify-g2-categories": {
            "url": "https://mcp.apify.com/?tools=automation-lab/g2-software-categories-scraper"
        }
    }
}

Example prompts:

  • "Run the G2 Software Categories Scraper for CRM categories and summarize parent categories."
  • "Get 200 G2 category rows and list AI-related category names."
  • "Export G2 category URLs I can use for SaaS competitor research."

Tips for best results

Start with the public directory.

Keep maxItems low during the first run.

Use filters when you only need one market.

Disable popular category cards when you need only canonical directory rows.

Avoid aggressive repeated runs against G2.

Troubleshooting

Why did a category page produce no product rows?

G2 may not serve product listings in static HTML for that page or may throttle category listing pages.

The actor still works for the public directory category map.

Why did I get fewer rows than expected?

Check maxItems and searchTerms.

A restrictive filter can intentionally return a small dataset.

Should I enable proxy?

Only enable proxy if your run is throttled.

For the public directory, running without a proxy is usually cheaper and sufficient.

Data freshness

Each run fetches live public pages.

The actor does not use a cached G2 category database.

Schedule recurring runs if you need change detection.

Limitations

The actor does not log in to G2.

It does not bypass paywalls, private dashboards, or account-only data.

Product detail extraction is best effort and depends on what G2 serves in the initial HTML.

Review text scraping is out of scope for this category-directory MVP.

Legality

This actor extracts publicly available web pages.

Use the data responsibly.

Respect G2 terms, robots guidance, privacy laws, and your own compliance requirements.

Do not use scraped data for spam or unlawful profiling.

FAQ

Can I scrape all G2 categories?

Yes. Use the default directory URL and set maxItems high enough.

Can I scrape only AI categories?

Yes. Add AI to searchTerms.

Can I scrape only CRM categories?

Yes. Add CRM to searchTerms.

Does it use a browser?

No. It uses HTTP and Cheerio for a lighter, cheaper category-directory workflow.

Does it scrape reviews?

No. This actor focuses on the category taxonomy and category URLs.

Can I use the output as seeds for another scraper?

Yes. The categoryUrl field is designed for downstream workflows.

Changelog

Initial version extracts the public G2 category directory, featured category cards, hierarchy metadata, and source audit fields.