Rockwell Full Catalog Crawler — Allen-Bradley SKUs
Pricing
from $0.05 / 1,000 results
Discover Allen-Bradley catalog numbers from Rockwell family pages and the product-details API. Export deduplicated SKUs with breadcrumbs, lifecycle hints, and document counts. HTTP-only discovery for procurement and BOM workflows.
Developer
Andrej Kiva
Maintained by Community
Crawloop Rockwell Automation Suite — Structured data extraction for Rockwell Automation and Allen-Bradley hardware catalog. Built for procurement teams, system integrators, and BOM engineering workflows.
Suite hub: github.com/PLCSPS-DEV/rockwell-automation
Product site: crawloop.com/rockwell-automation
| Discovery | Enrichment | Documents | PDF parsing |
|---|---|---|---|
| Full Catalog Crawler | Product Scraper | Document Downloader | Datasheet Parser |
| Lifecycle Tracker |
Disclaimer: This is an unofficial integration developed independently of Rockwell Automation Inc. It is not affiliated with, sponsored by, or endorsed by Rockwell Automation Inc. or any of its subsidiaries.
Rockwell Automation, Allen-Bradley, and related names are trademarks of Rockwell Automation Inc. Product data is read from publicly accessible Rockwell web sources only; no proprietary databases are redistributed.
This Actor is provided for informational and research purposes only (e.g. procurement research, BOM audits, internal engineering workflows). You are solely responsible for ensuring your use complies with applicable laws, Rockwell website terms of use, and your organization's policies.
No warranty is given as to accuracy, completeness, or continued availability of third-party data. Use at your own risk.
Discover Allen-Bradley hardware catalog numbers by walking Rockwell family pages and paginating the Elasticsearch product-details servlet. Each result is fetched live from the Rockwell product API — no static SKU lists, no stale exports.
Use this Actor to build catalog number inventories for product families (ControlLogix, PowerFlex, GuardLogix), procurement databases, or as input for downstream enrichment with the Product Scraper and Lifecycle Tracker.
Unlike Siemens SiePortal, Rockwell has no global OneSearch API. This Actor replaces both a category crawler and a full-catalog keyword crawler by enumerating PIM category IDs from the sitemap.
When to use this Actor
Use the Full Catalog Crawler when you need to discover Allen-Bradley catalog numbers across the US hardware sitemap (~20k–50k unique SKUs after deduplication).
For full PDP specifications on a known catalog number list, use the Product Scraper. For bulk lifecycle screening only, use the Lifecycle Tracker.
Rockwell Automation Pipeline
```
Phase 1: Discover SKUs      Phase 2: Screen & enrich    Phase 3: Documents & specs
─────────────────────────   ─────────────────────────   ─────────────────────────────
Full Catalog Crawler ──┐  ◄── you are here
                       ├──► catalogNumber list ──► Lifecycle Tracker ──► Product Scraper
                       │
                ┌──────┴────────────────────────────────────────────────────────┐
                │                                                               │
                ▼                                                               ▼
       Document Downloader                                              Datasheet Parser
     PDFs to Key-Value Store                                           specs from TD PDFs
```
Key Features
- Full sitemap discovery: parses `sitemap.content.xml` for hardware family pages.
- PIM category API crawl: paginates `/bin/rockwell-automation/product-details` with `app=RA`.
- Unique output: each catalog number is emitted once (`deduplicateOutput`, default `true`).
- Resume checkpoints: skips completed PIM queries on interrupted runs.
- HTTP-only: no browser required; a session warmup on `products.html` avoids 403 errors.
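
As a rough illustration of the sitemap discovery feature, the sketch below filters the `<loc>` entries of a sitemap down to hardware family pages for one locale. The sample XML, the `example.com` URLs, the `/products/hardware/` path filter, and the `family_urls_from_sitemap` helper are all assumptions made for illustration; the Actor's actual filter rules are not documented here.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sample; the real sitemap.content.xml will differ.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/en-us/products/hardware/drives.html</loc></url>
  <url><loc>https://example.com/de-de/products/hardware/drives.html</loc></url>
  <url><loc>https://example.com/en-us/company/news.html</loc></url>
  <url><loc>https://example.com/en-us/products/hardware/drives.html</loc></url>
</urlset>"""


def family_urls_from_sitemap(xml_text, locale="en-us", path_hint="/products/hardware/"):
    """Return unique URLs matching the locale and the (assumed) hardware path."""
    root = ET.fromstring(xml_text)
    urls = [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]
    matches = [u for u in urls if f"/{locale}/" in u and path_hint in u]
    # dict.fromkeys drops duplicates while keeping first-seen order.
    return list(dict.fromkeys(matches))


if __name__ == "__main__":
    for url in family_urls_from_sitemap(SAMPLE_SITEMAP):
        print(url)
```

Filtering on the locale segment mirrors the `locale` input parameter; a different locale would only require changing that argument.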
How It Works
- Warm up session on `products.html`.
- Load sitemap (or use provided family URLs / PIM IDs).
- Parse `data-category-list` from each family page.
- Paginate product-details API per unique PIM query set.
- Push deduplicated catalog records to the dataset.
Input Parameters
| Parameter | Description | Default |
|---|---|---|
| `discoveryMode` | `full_sitemap`, `family_urls`, or `pim_ids`. | `full_sitemap` |
| `startUrls` | Family page URLs (for `family_urls` mode). | `[]` |
| `pimCategoryIds` | Direct PIM IDs (for `pim_ids` mode). | `[]` |
| `locale` | Locale path segment. | `en-us` |
| `countryCode` | API country code. | `us` |
| `maxFamilyPages` | Cap family pages for testing (0 = all). | `3` |
| `numResultsPerPage` | API page size. | `24` |
| `concurrencyLimit` | Parallel PIM queries. | `3` |
| `deduplicateOutput` | Emit each catalog number once. | `true` |
| `resumeFromCheckpoint` | Resume interrupted run. | `false` |
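
For readers scripting runs, the table above can be mirrored as a small typed structure that fills in the documented defaults. The field names and defaults come from the table; the `CrawlerInput` class, the `load_input` helper, and the validation rules (for example, requiring `startUrls` in `family_urls` mode) are assumptions for illustration and may not match what the Actor enforces.

```python
from dataclasses import dataclass, field
from typing import List

VALID_MODES = {"full_sitemap", "family_urls", "pim_ids"}


@dataclass
class CrawlerInput:
    # Defaults copied from the Input Parameters table.
    discoveryMode: str = "full_sitemap"
    startUrls: List[str] = field(default_factory=list)
    pimCategoryIds: List[str] = field(default_factory=list)
    locale: str = "en-us"
    countryCode: str = "us"
    maxFamilyPages: int = 3
    numResultsPerPage: int = 24
    concurrencyLimit: int = 3
    deduplicateOutput: bool = True
    resumeFromCheckpoint: bool = False

    def validate(self) -> "CrawlerInput":
        # These checks are assumed, not taken from the Actor's input schema.
        if self.discoveryMode not in VALID_MODES:
            raise ValueError(f"unknown discoveryMode: {self.discoveryMode!r}")
        if self.discoveryMode == "family_urls" and not self.startUrls:
            raise ValueError("family_urls mode needs at least one entry in startUrls")
        if self.discoveryMode == "pim_ids" and not self.pimCategoryIds:
            raise ValueError("pim_ids mode needs at least one entry in pimCategoryIds")
        if self.maxFamilyPages < 0 or self.numResultsPerPage < 1 or self.concurrencyLimit < 1:
            raise ValueError("numeric limits are out of range")
        return self


def load_input(raw: dict) -> CrawlerInput:
    """Apply defaults to a raw input dict and validate the result."""
    return CrawlerInput(**raw).validate()


if __name__ == "__main__":
    print(load_input({"discoveryMode": "pim_ids", "pimCategoryIds": ["PIM_PF525"]}))
```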
Input Example — smoke test
```json
{
  "discoveryMode": "pim_ids",
  "pimCategoryIds": ["PIM_vfd/25b-powerflex-525,PIM_PF525"],
  "maxFamilyPages": 0,
  "deduplicateOutput": true
}
```
Input Example — full catalog
```json
{
  "discoveryMode": "full_sitemap",
  "maxFamilyPages": 0,
  "concurrencyLimit": 3,
  "deduplicateOutput": true,
  "resumeFromCheckpoint": true
}
```
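
To show what `resumeFromCheckpoint` implies for a long run, here is a minimal sketch of checkpointing by completed PIM query. It assumes a local JSON file as the store, which is a simplification; the Actor's real checkpoint location and format are not described in this README, and the `Checkpoint` class and `pending_queries` helper are hypothetical names.

```python
import json
from pathlib import Path


class Checkpoint:
    """Remember which PIM queries finished, persisted as a JSON list."""

    def __init__(self, path):
        self.path = Path(path)
        self.done = set()
        if self.path.exists():
            self.done = set(json.loads(self.path.read_text()))

    def mark_done(self, query: str) -> None:
        self.done.add(query)
        # Rewrite the file after every query so an interrupted run loses little work.
        self.path.write_text(json.dumps(sorted(self.done)))


def pending_queries(all_queries, checkpoint: Checkpoint, resume: bool):
    """Return the queries still to run; with resume off, run everything."""
    if not resume:
        return list(all_queries)
    return [q for q in all_queries if q not in checkpoint.done]


if __name__ == "__main__":
    cp = Checkpoint("checkpoint.json")
    for query in pending_queries(["PIM_A", "PIM_B"], cp, resume=True):
        # ... paginate the query here, then record it as complete ...
        cp.mark_done(query)
```

Marking a query done only after its last page has been processed means a crash mid-query causes that query to be repeated on resume, which is harmless when output is deduplicated.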
Output Format
Each dataset record:
| Field | Description |
|---|---|
| `catalogNumber` | Allen-Bradley catalog number |
| `title` | Short product title |
| `description` | Product description |
| `pdpUrl` | Product detail page URL |
| `lifecycle` | Raw lifecycle status from API |
| `brand` | Brand name (typically Allen-Bradley) |
| `series` | Product series |
| `documentCount` | Number of linked documents |
| `categoryPath` | Breadcrumb trail |
| `pimCategoryIds` | PIM category IDs that surfaced this SKU |
| `source` | Always `product-details-api` |
```json
{
  "catalogNumber": "25B-E027N104",
  "title": "PowerFlex 525 AC Drive",
  "pdpUrl": "https://www.rockwellautomation.com/en-us/products/details.25B-E027N104.html",
  "lifecycle": "ACTIVE",
  "brand": "Allen-Bradley",
  "series": "PowerFlex 525",
  "documentCount": 5,
  "categoryPath": ["Vfds Variable Frequency Drives", "Powerflex 525"],
  "source": "product-details-api"
}
```
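
Because `lifecycle` is passed through as a raw status, a first screening pass can be done on the exported records before handing anything to the Lifecycle Tracker. The sketch below groups catalog numbers by that field. Only `ACTIVE` appears in this README; the `DISCONTINUED` value, the `UNKNOWN` fallback label, and the `group_by_lifecycle` helper are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_lifecycle(records: Iterable[dict]) -> Dict[str, List[str]]:
    """Map each raw lifecycle status to the catalog numbers that carry it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for rec in records:
        # Records without a lifecycle value land in an assumed UNKNOWN bucket.
        status = rec.get("lifecycle") or "UNKNOWN"
        groups[status].append(rec["catalogNumber"])
    return dict(groups)


if __name__ == "__main__":
    sample = [
        {"catalogNumber": "25B-E027N104", "lifecycle": "ACTIVE"},
        {"catalogNumber": "HYPOTHETICAL-1", "lifecycle": "DISCONTINUED"},  # made-up record
        {"catalogNumber": "HYPOTHETICAL-2"},  # made-up record with no status
    ]
    for status, numbers in group_by_lifecycle(sample).items():
        print(status, numbers)
```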
Typical Workflow
```
Full sitemap discovery
        │
        ▼
Full Catalog Crawler → deduplicated catalogNumber list with PDP URLs
        │
        ▼
Lifecycle Tracker → screen for obsolete parts
        │
        ▼
Product Scraper → full specs on selected SKUs
```
Actor Comparison
| Task | Full Catalog Crawler | Product Scraper | Lifecycle Tracker |
|---|---|---|---|
| Discover catalog numbers | Yes | No | No |
| Full specs and documents | No | Yes | No |
| Bulk lifecycle screening | Partial | No | Yes |
| Replacement catalog number | No | Partial | Yes |
| HTTP-only (no browser) | Yes | Yes | Yes |
Pricing
Pay-per-event billing. Each unique catalog number pushed to the dataset is billed automatically via the Store-configured event (no manual Actor.charge() in code).
Current Store setup (until custom event migration):
| Event | Price |
|---|---|
| Actor start | $0.05 per run |
| Dataset result (`apify-default-dataset-item`) | $0.05 per row |
Target setup (after Apify pricing unlock — configure in Publication tab):
| Event | Price |
|---|---|
| Actor start | $0.05 per run |
| Discovered SKU (`catalog-product`) | $2.00 / 1,000 ($0.002 per unique catalog number) |
When switching to catalog-product, disable apify-default-dataset-item in Publication to avoid double billing, then redeploy the Actor version that calls Actor.charge(event_name="catalog-product").
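
For budgeting, the target setup reduces to simple arithmetic: one start fee plus a per-SKU charge. The sketch below applies the target-setup prices ($0.05 per run, $0.002 per unique catalog number) to the 20k to 50k SKU range quoted earlier for a full US sitemap run. The `estimate_run_cost` helper is a hypothetical name, and the figures cover Actor events only, not any platform usage that may be billed separately.

```python
from decimal import Decimal

START_FEE = Decimal("0.05")   # Actor start, per run (target setup)
PER_SKU = Decimal("0.002")    # catalog-product event, per unique catalog number


def estimate_run_cost(unique_skus: int, runs: int = 1) -> Decimal:
    """Return the estimated event charges in USD, rounded to cents."""
    total = START_FEE * runs + PER_SKU * unique_skus
    return total.quantize(Decimal("0.01"))


if __name__ == "__main__":
    for count in (20_000, 50_000):
        print(f"{count} SKUs: ${estimate_run_cost(count)}")
```

`Decimal` is used instead of floats so that small per-item prices add up without rounding drift.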
See ../../docs/rockwell_ppe_july2026.md for the full 18 July 2026 checklist.
Full sitemap mode can take several hours; all output is scraped live from Rockwell public sources.
Learn more: Product page · Suite hub · GitHub docs
Also from Crawloop Industrial: Siemens SiePortal Suite · GitHub docs