G2 Software Categories Scraper
Pricing
Pay per event
Extract G2 software category names, URLs, parent categories, IDs, and hierarchy depth for SaaS market maps and competitor research.
Developer
Stas Persiianenko
Maintained by Community
Map the public G2 software category directory into clean dataset rows for SaaS market research, competitor mapping, sales planning, and category monitoring.
What does G2 Software Categories Scraper do?
G2 Software Categories Scraper extracts structured records from public G2 category pages.
It is designed for the public directory at https://www.g2.com/categories.
The actor returns category names, category URLs, parent categories, hierarchy depth, G2 category IDs, source URL, and scrape timestamp.
When a supplied category listing page exposes product links in server-rendered HTML, the actor also emits best-effort product rows.
Who is it for?
SaaS founders and product leaders
Use the scraper to map where your product category sits on G2, identify adjacent software markets, and find category pages that describe buyer intent in your niche.
Sales and revenue teams
Use category URLs as seed lists for account research, territory planning, partner discovery, and outbound segmentation by software market.
Marketing, SEO, and competitive intelligence teams
Monitor G2 category names, parent categories, and emerging software segments so your landing pages, comparison pages, and campaign language match how buyers research tools.
Analysts and data teams
Create repeatable software taxonomy snapshots that can be joined with CRM, vendor, keyword, product, or BI datasets without manually copying from G2.
Why use this actor?
G2 categories change over time.
Manual copying is slow and error-prone.
This actor produces repeatable JSON, CSV, Excel, and API-ready data.
It is lightweight because it uses HTTP and HTML parsing for the public directory.
It avoids browser overhead for category-map jobs.
What data can you extract?
| Field | Description |
|---|---|
| recordType | category, popular_category, or product |
| categoryName | Public G2 category name |
| categorySlug | URL slug from the G2 category URL |
| categoryUrl | Absolute G2 category URL |
| parentCategory | Parent category shown by G2 |
| hierarchyLevel | Directory nesting depth |
| categoryId | G2 category ID when present in page metadata |
| isNested | Whether the row is nested below another category |
| productName | Product name for best-effort product rows |
| productUrl | Product profile URL for best-effort product rows |
| matchedSearchTerm | Keyword that matched a filtered run |
| sourceUrl | Page that produced the record |
| scrapedAt | ISO scrape timestamp |
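If you consume the dataset in Python, the fields above could be modeled roughly as follows. This is only a sketch: the `G2Record` type and the `split_by_record_type` helper are illustrative names, not part of the actor, and the field types are inferred from the table and the example output rather than from a published schema.

```python
from collections import defaultdict
from typing import Optional, TypedDict


class G2Record(TypedDict, total=False):
    """Hypothetical shape of one dataset row, inferred from the field table."""

    recordType: str                 # "category", "popular_category", or "product"
    categoryName: Optional[str]
    categorySlug: Optional[str]
    categoryUrl: Optional[str]
    parentCategory: Optional[str]
    hierarchyLevel: Optional[int]
    categoryId: Optional[str]       # a string in the example output
    isNested: Optional[bool]
    productName: Optional[str]      # only set on best-effort product rows
    productUrl: Optional[str]
    matchedSearchTerm: Optional[str]
    sourceUrl: str
    scrapedAt: str                  # ISO timestamp


def split_by_record_type(records):
    """Group rows by their recordType value."""
    groups = defaultdict(list)
    for record in records:
        groups[record.get("recordType", "unknown")].append(record)
    return dict(groups)
```

Splitting by recordType first keeps category rows and best-effort product rows in separate tables, which is usually easier to join against other data.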
How much does it cost to scrape G2 software categories?
This actor uses pay-per-event pricing with one small run-start fee and one low per-record charge for each saved dataset item.
Current pricing:
| Event | What it means | Price |
|---|---|---|
| Run started | Charged once per run | $0.005 |
| Item extracted | Charged for each category, popular category, or product row | from $0.000024512 per item on BRONZE; lower on higher tiers |
Cost examples:
- 100 records: about $0.0075 on BRONZE, including the start event.
- 500 records: about $0.0173 on BRONZE, including the start event.
- 1,000 records: about $0.0295 on BRONZE, including the start event.
Use maxItems to cap costs during testing. A typical public directory run can produce hundreds or thousands of category rows, so start with 100 records before scheduling larger market maps.
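The cost examples follow from the two prices in the table: one run-start fee plus the per-item price times the number of saved records. A small sketch of that arithmetic, using the BRONZE prices listed above (the function and constant names are illustrative):

```python
RUN_START_FEE = 0.005            # USD, charged once per run
BRONZE_ITEM_PRICE = 0.000024512  # USD per saved item on the BRONZE tier


def estimate_run_cost(items: int, item_price: float = BRONZE_ITEM_PRICE) -> float:
    """Return the estimated USD cost of one run that saves `items` rows."""
    if items < 0:
        raise ValueError("items must be non-negative")
    return RUN_START_FEE + items * item_price


for n in (100, 500, 1000):
    print(f"{n} records: ${estimate_run_cost(n):.4f}")
# 100 records: $0.0075
# 500 records: $0.0173
# 1000 records: $0.0295
```

Pass a different `item_price` to estimate other tiers. Prices can change, so check the pricing table before relying on these numbers.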
Quick start
- Open the actor on Apify.
- Keep the default start URL: https://www.g2.com/categories.
- Set maxItems to a small value such as 100 for the first run.
- Optionally add filters such as CRM, AI, Security, or HR.
- Run the actor.
- Export the dataset as JSON, CSV, Excel, XML, or via API.
Input configuration
startUrls
Use one or more G2 category URLs.
The recommended default is the full directory URL.
[{ "url": "https://www.g2.com/categories" }]
searchTerms
Use search terms to narrow the output.
For example, CRM matches categories and parent categories containing CRM.
["CRM", "Sales"]
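The actor's exact matching rules are not documented here. If you want to reproduce or tighten the filter on an exported dataset, a case-insensitive substring check over categoryName and parentCategory is one plausible reading of the description above. `match_search_term` is a hypothetical helper, not the actor's implementation.

```python
from typing import Iterable, Optional


def match_search_term(record: dict, terms: Iterable[str]) -> Optional[str]:
    """Return the first term found in categoryName or parentCategory, else None.

    Assumption: case-insensitive substring matching. The actor may differ.
    """
    haystack = " ".join(
        str(record.get(field) or "") for field in ("categoryName", "parentCategory")
    ).lower()
    for term in terms:
        if term.lower() in haystack:
            return term
    return None
```

Under plain substring matching, a short term such as AI also matches inside longer words (for example "Email"), so review the results or switch to a word-boundary check when precision matters.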
maxItems
Set the maximum number of records to save.
Use 20 for smoke tests.
Use 100 or more for larger category maps; the full public directory can produce hundreds or thousands of rows.
includePopularCategories
Enable this option to include featured category cards from the top of the page.
Disable it if you only want canonical table rows.
proxyConfiguration
Proxy use is optional.
The public directory often works without a proxy.
Enable Apify Proxy only if G2 throttles your environment.
Example input
{
  "startUrls": [{ "url": "https://www.g2.com/categories" }],
  "searchTerms": ["CRM", "AI"],
  "maxItems": 20,
  "includePopularCategories": true,
  "proxyConfiguration": { "useApifyProxy": false }
}
Example output
{
  "recordType": "category",
  "categoryName": "CRM Software",
  "categorySlug": "crm",
  "categoryUrl": "https://www.g2.com/categories/crm",
  "parentCategory": "Sales Tools",
  "hierarchyLevel": 0,
  "categoryId": "179",
  "isNested": false,
  "productName": null,
  "productUrl": null,
  "rating": null,
  "reviewCount": null,
  "productDescription": null,
  "matchedSearchTerm": "CRM",
  "sourceUrl": "https://www.g2.com/categories",
  "sourcePageTitle": "All Categories | G2",
  "scrapedAt": "2026-06-26T00:00:00.000Z"
}
Category mapping workflows
Export all category rows to CSV.
Load them into a spreadsheet or BI tool.
Group by parentCategory.
Use hierarchyLevel to understand nested categories.
Join the results with your own vendor, keyword, or CRM data.
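The grouping step above can be sketched with the Python standard library. This assumes the CSV export uses the field names from the data table as column headers; `count_by_parent` is an illustrative helper.

```python
import csv
import io
from collections import Counter


def count_by_parent(csv_text: str) -> Counter:
    """Count category rows per parentCategory in an exported CSV."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row.get("recordType") != "category":
            continue  # skip popular_category and product rows
        counts[row.get("parentCategory") or "(none)"] += 1
    return counts
```

For a file on disk, read it with `open(path, newline="", encoding="utf-8").read()` and pass the text in. `counts.most_common(10)` then lists the largest parent categories.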
Lead generation workflows
Use category URLs as seed lists.
Prioritize categories that match your ICP.
Send selected category URLs to downstream enrichment or product-profile scrapers.
Track new category names that indicate emerging buyer intent.
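One way to turn the steps above into a seed list is to keep only category rows whose names match your ICP keywords and emit them in the same `{"url": ...}` shape this actor uses for startUrls. Whether a downstream scraper accepts that shape is an assumption, so check its input schema. `build_seed_urls` is a hypothetical helper.

```python
def build_seed_urls(records, icp_keywords):
    """Return startUrls-style entries for rows whose categoryName matches a keyword."""
    keywords = [keyword.lower() for keyword in icp_keywords]
    seeds = []
    seen = set()
    for record in records:
        url = record.get("categoryUrl")
        name = (record.get("categoryName") or "").lower()
        if not url or url in seen:
            continue  # skip rows without a URL and duplicates
        if any(keyword in name for keyword in keywords):
            seen.add(url)
            seeds.append({"url": url})
    return seeds
```

Deduplicating by URL matters when popular category cards are enabled, because a featured category can also appear as a canonical directory row.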
Market research workflows
Run the actor on a schedule.
Compare category snapshots over time.
Detect newly added AI, security, finance, HR, or vertical software categories.
Use parent-category groupings to size adjacent markets.
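Comparing two snapshots can be as simple as a set difference on a stable key. The sketch below assumes categorySlug is stable between runs, which the source does not guarantee; `diff_snapshots` is an illustrative helper.

```python
def diff_snapshots(old_rows, new_rows, key="categorySlug"):
    """Return (added, removed, renamed) slugs between two dataset snapshots."""
    old = {row[key]: row for row in old_rows if row.get(key)}
    new = {row[key]: row for row in new_rows if row.get(key)}
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    # Same slug in both snapshots, but the displayed name changed.
    renamed = sorted(
        slug
        for slug in old.keys() & new.keys()
        if old[slug].get("categoryName") != new[slug].get("categoryName")
    )
    return added, removed, renamed
```

Run both snapshots with the same searchTerms and a maxItems high enough to cover the full result. Otherwise rows cut off by the cap will show up as false removals.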
Integrations
Send results to Google Sheets for analyst review.
Send category URLs to a CRM enrichment pipeline.
Use Apify webhooks to trigger downstream processing after each run.
Export JSON to a data warehouse.
Feed category names into keyword research tools.
API usage
Node.js
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

// Start the actor and wait for the run to finish.
const run = await client.actor('automation-lab/g2-software-categories-scraper').call({
    startUrls: [{ url: 'https://www.g2.com/categories' }],
    searchTerms: ['CRM'],
    maxItems: 100,
});

// The run's records are stored in this dataset.
console.log(run.defaultDatasetId);
Python
import os

from apify_client import ApifyClient

client = ApifyClient(os.environ['APIFY_TOKEN'])

# Start the actor and wait for the run to finish.
run = client.actor('automation-lab/g2-software-categories-scraper').call(run_input={
    'startUrls': [{'url': 'https://www.g2.com/categories'}],
    'searchTerms': ['AI'],
    'maxItems': 100,
})

# The run's records are stored in this dataset.
print(run['defaultDatasetId'])
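The snippet above only prints the dataset ID. To read the records, the `apify-client` package exposes `client.dataset(...).iterate_items()`. The helper below is a sketch that takes the client as a parameter so it can be tested without network access; `iter_category_rows` is an illustrative name.

```python
def iter_category_rows(client, dataset_id):
    """Yield only rows whose recordType is "category" from a run's dataset."""
    for item in client.dataset(dataset_id).iterate_items():
        if item.get("recordType") == "category":
            yield item


# Usage (requires the apify-client package and an APIFY_TOKEN):
#
#   import os
#   from apify_client import ApifyClient
#
#   client = ApifyClient(os.environ["APIFY_TOKEN"])
#   for row in iter_category_rows(client, run["defaultDatasetId"]):
#       print(row["categoryName"], row["categoryUrl"])
```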
cURL
curl -X POST "https://api.apify.com/v2/acts/automation-lab~g2-software-categories-scraper/runs?token=$APIFY_TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{"startUrls":[{"url":"https://www.g2.com/categories"}],"maxItems":100}'
MCP usage
Use this actor from Apify MCP with Claude Code or Claude Desktop.
MCP URL:
https://mcp.apify.com/?tools=automation-lab/g2-software-categories-scraper
Claude Code setup:
claude mcp add --transport http apify-g2-categories "https://mcp.apify.com/?tools=automation-lab/g2-software-categories-scraper"
Claude Desktop JSON config:
{
  "mcpServers": {
    "apify-g2-categories": {
      "url": "https://mcp.apify.com/?tools=automation-lab/g2-software-categories-scraper"
    }
  }
}
Example prompts:
- "Run the G2 Software Categories Scraper for CRM categories and summarize parent categories."
- "Get 200 G2 category rows and list AI-related category names."
- "Export G2 category URLs I can use for SaaS competitor research."
Tips for best results
Start with the public directory.
Keep maxItems low during the first run.
Use filters when you only need one market.
Disable popular category cards when you need only canonical directory rows.
Avoid aggressive repeated runs against G2.
Troubleshooting
Why did a category page produce no product rows?
G2 may not serve product listings in static HTML for that page or may throttle category listing pages.
The actor still works for the public directory category map.
Why did I get fewer rows than expected?
Check maxItems and searchTerms.
A restrictive filter can intentionally return a small dataset.
Should I enable proxy?
Enable a proxy only if your run is throttled.
For the public directory, running without a proxy is usually cheaper and sufficient.
Data freshness
Each run fetches live public pages.
The actor does not use a cached G2 category database.
Schedule recurring runs if you need change detection.
Limitations
The actor does not log in to G2.
It does not bypass paywalls, private dashboards, or account-only data.
Product detail extraction is best effort and depends on what G2 serves in the initial HTML.
Review text scraping is out of scope for this category-directory MVP.
Legality
This actor extracts data from publicly available web pages.
Use the data responsibly.
Respect G2's terms of service, robots.txt guidance, privacy laws, and your own compliance requirements.
Do not use scraped data for spam or unlawful profiling.
Related scrapers
Explore other Automation Lab actors for SaaS, review, and lead-generation workflows:
- Google Maps Lead Finder — enrich local and B2B prospect lists after you define target markets.
- Website Contact Finder — find emails and contact pages from company websites.
- Trustpilot Reviews Scraper — collect public review data for brand and competitor analysis.
- Capterra Scraper — compare G2 category maps with another software directory.
FAQ
Can I scrape all G2 categories?
Yes. Use the default directory URL and set maxItems high enough.
Can I scrape only AI categories?
Yes. Add AI to searchTerms.
Can I scrape only CRM categories?
Yes. Add CRM to searchTerms.
Does it use a browser?
No. It uses HTTP and Cheerio for a lighter, cheaper category-directory workflow.
Does it scrape reviews?
No. This actor focuses on the category taxonomy and category URLs.
Can I use the output as seeds for another scraper?
Yes. The categoryUrl field is designed for downstream workflows.
Changelog
Initial version extracts the public G2 category directory, featured category cards, hierarchy metadata, and source audit fields.