Rockwell Full Catalog Crawler — Allen-Bradley SKUs

Pricing

from $0.05 / 1,000 results

Discover Allen-Bradley catalog numbers from Rockwell family pages and the product-details API. Export deduplicated SKUs with breadcrumbs, lifecycle hints, and document counts. HTTP-only discovery for procurement and BOM workflows.

Developer

Andrej Kiva

Maintained by Community

Crawloop Rockwell Automation Suite: structured data extraction for the Rockwell Automation and Allen-Bradley hardware catalog. Built for procurement teams, system integrators, and BOM engineering workflows.

Suite hub: github.com/PLCSPS-DEV/rockwell-automation

Product site: crawloop.com/rockwell-automation

| Discovery | Enrichment | Documents | PDF parsing |
| --- | --- | --- | --- |
| Full Catalog Crawler | Product Scraper | Document Downloader | Datasheet Parser |
| | Lifecycle Tracker | | |

Disclaimer: This is an unofficial integration developed independently of Rockwell Automation Inc. It is not affiliated with, sponsored by, or endorsed by Rockwell Automation Inc. or any of its subsidiaries.

Rockwell Automation, Allen-Bradley, and related names are trademarks of Rockwell Automation Inc. Product data is read from publicly accessible Rockwell web sources only; no proprietary databases are redistributed.

This Actor is provided for informational and research purposes only (e.g. procurement research, BOM audits, internal engineering workflows). You are solely responsible for ensuring your use complies with applicable laws, Rockwell website terms of use, and your organization's policies.

No warranty is given as to accuracy, completeness, or continued availability of third-party data. Use at your own risk.

Discover Allen-Bradley hardware catalog numbers by walking Rockwell family pages and paginating the Elasticsearch product-details servlet. Each result is fetched live from the Rockwell product API — no static SKU lists, no stale exports.

Use this Actor to build catalog number inventories for product families (ControlLogix, PowerFlex, GuardLogix), procurement databases, or as input for downstream enrichment with the Product Scraper and Lifecycle Tracker.

Unlike Siemens SiePortal, Rockwell has no global OneSearch API. This Actor replaces both a category crawler and a full-catalog keyword crawler by enumerating PIM category IDs from the sitemap.

When to use this Actor

Use the Full Catalog Crawler when you need to discover Allen-Bradley catalog numbers across the US hardware sitemap (~20k–50k unique SKUs after deduplication).

For full PDP specifications on a known catalog number list, use the Product Scraper. For bulk lifecycle screening only, use the Lifecycle Tracker.

Rockwell Automation Pipeline

Phase 1 — Discover SKUs       Phase 2 — Screen & enrich     Phase 3 — Documents & specs
─────────────────────────     ─────────────────────────     ─────────────────────────────
Full Catalog Crawler ──┐  ◄── you are here
                       ├──► catalogNumber list ──► Lifecycle Tracker ──► Product Scraper
                ┌──────┴──────────────────────────────────────────────────────────────────┐
                │                                                                         │
                ▼                                                                         ▼
       Document Downloader                                                        Datasheet Parser
     PDFs to Key-Value Store                                                     specs from TD PDFs

Key Features

  • Full sitemap discovery — Parses sitemap.content.xml for hardware family pages.
  • PIM category API crawl — Paginates /bin/rockwell-automation/product-details with app=RA.
  • Unique output — Each catalog number emitted once (deduplicateOutput, default true).
  • Resume checkpoints — Skip completed PIM queries on interrupted runs.
  • HTTP-only — No browser required; session warmup on products.html avoids 403 errors.
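
The pagination and checkpoint features above can be illustrated with a minimal sketch. It is not the Actor's actual code: `fetch_page`, the offset-based paging, the "short page means last page" stop rule, and the JSON checkpoint file are all assumptions made for illustration, and the real product-details API may use different parameters.

```python
import json
from pathlib import Path
from typing import Callable, Iterator


def paginate(fetch_page: Callable[[str, int, int], list[dict]],
             query: str, page_size: int = 24) -> Iterator[dict]:
    """Yield records for one PIM query until a short (or empty) page arrives."""
    offset = 0
    while True:
        # fetch_page is a hypothetical callable: (query, offset, size) -> records
        page = fetch_page(query, offset, page_size)
        yield from page
        if len(page) < page_size:
            break  # assumed stop rule: a short page is the last page
        offset += page_size


def crawl(queries: list[str],
          fetch_page: Callable[[str, int, int], list[dict]],
          checkpoint_path: Path,
          page_size: int = 24) -> list[dict]:
    """Crawl each PIM query once, skipping queries recorded in the checkpoint."""
    done: set[str] = set()
    if checkpoint_path.exists():
        done = set(json.loads(checkpoint_path.read_text()))

    records: list[dict] = []
    for query in queries:
        if query in done:
            continue  # completed in an earlier, interrupted run
        records.extend(paginate(fetch_page, query, page_size))
        done.add(query)
        # Persist after every finished query so an interruption loses little work.
        checkpoint_path.write_text(json.dumps(sorted(done)))
    return records
```

Writing the checkpoint only after a query finishes means a partially fetched query is repeated on resume, which keeps the logic simple at the cost of some refetching.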

How It Works

  1. Warm up session on products.html.
  2. Load sitemap (or use provided family URLs / PIM IDs).
  3. Parse data-category-list from each family page.
  4. Paginate product-details API per unique PIM query set.
  5. Push deduplicated catalog records to the dataset.
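
Steps 3 and 4 can be sketched as follows. This is a hedged illustration, not the Actor's implementation: it assumes `data-category-list` holds a comma-separated list of PIM IDs and that two lists containing the same IDs in a different order count as the same query set. The helper names are hypothetical.

```python
from html.parser import HTMLParser


class CategoryListParser(HTMLParser):
    """Collect every data-category-list attribute value found on a page."""

    def __init__(self) -> None:
        super().__init__()
        self.query_sets: list[tuple[str, ...]] = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "data-category-list" and value:
                # Assumption: the attribute is a comma-separated list of PIM IDs.
                ids = tuple(p.strip() for p in value.split(",") if p.strip())
                if ids:
                    self.query_sets.append(ids)


def unique_query_sets(pages_html: list[str]) -> list[str]:
    """Return each distinct PIM query set once, in first-seen order."""
    seen: set[frozenset[str]] = set()
    ordered: list[str] = []
    for html in pages_html:
        parser = CategoryListParser()
        parser.feed(html)
        for ids in parser.query_sets:
            key = frozenset(ids)  # order-insensitive comparison (an assumption)
            if key not in seen:
                seen.add(key)
                ordered.append(",".join(ids))
    return ordered
```

Deduplicating query sets before paginating avoids calling the API repeatedly for family pages that share the same category list.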

Input Parameters

| Parameter | Description | Default |
| --- | --- | --- |
| discoveryMode | full_sitemap, family_urls, or pim_ids. | full_sitemap |
| startUrls | Family page URLs (for family_urls mode). | [] |
| pimCategoryIds | Direct PIM IDs (for pim_ids mode). | [] |
| locale | Locale path segment. | en-us |
| countryCode | API country code. | us |
| maxFamilyPages | Cap family pages for testing (0 = all). | 3 |
| numResultsPerPage | API page size. | 24 |
| concurrencyLimit | Parallel PIM queries. | 3 |
| deduplicateOutput | Emit each catalog number once. | true |
| resumeFromCheckpoint | Resume interrupted run. | false |
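
To catch input mistakes before starting a run, the parameters above can be mirrored in a small local model. The sketch below takes field names and defaults from the table; the `CrawlerInput` class and its validation rules (for example, requiring `startUrls` in `family_urls` mode) are assumptions, not the Actor's documented behaviour.

```python
from dataclasses import dataclass, field

VALID_MODES = {"full_sitemap", "family_urls", "pim_ids"}


@dataclass
class CrawlerInput:
    """Local mirror of the Actor input; defaults follow the parameter table."""
    discoveryMode: str = "full_sitemap"
    startUrls: list[str] = field(default_factory=list)
    pimCategoryIds: list[str] = field(default_factory=list)
    locale: str = "en-us"
    countryCode: str = "us"
    maxFamilyPages: int = 3
    numResultsPerPage: int = 24
    concurrencyLimit: int = 3
    deduplicateOutput: bool = True
    resumeFromCheckpoint: bool = False

    def validate(self) -> list[str]:
        """Return human-readable problems (empty list = looks fine).

        These checks are illustrative assumptions, not the Actor's own rules.
        """
        errors: list[str] = []
        if self.discoveryMode not in VALID_MODES:
            errors.append(f"unknown discoveryMode: {self.discoveryMode!r}")
        if self.discoveryMode == "family_urls" and not self.startUrls:
            errors.append("family_urls mode needs at least one startUrls entry")
        if self.discoveryMode == "pim_ids" and not self.pimCategoryIds:
            errors.append("pim_ids mode needs at least one pimCategoryIds entry")
        if self.maxFamilyPages < 0:
            errors.append("maxFamilyPages must be 0 (all) or positive")
        if self.numResultsPerPage < 1 or self.concurrencyLimit < 1:
            errors.append("numResultsPerPage and concurrencyLimit must be >= 1")
        return errors
```

For example, `CrawlerInput(discoveryMode="pim_ids").validate()` reports the missing `pimCategoryIds` list instead of letting the run start with nothing to crawl.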

Input Example — smoke test

{
  "discoveryMode": "pim_ids",
  "pimCategoryIds": ["PIM_vfd/25b-powerflex-525,PIM_PF525"],
  "maxFamilyPages": 0,
  "deduplicateOutput": true
}

Input Example — full catalog

{
  "discoveryMode": "full_sitemap",
  "maxFamilyPages": 0,
  "concurrencyLimit": 3,
  "deduplicateOutput": true,
  "resumeFromCheckpoint": true
}

Output Format

Each dataset record:

| Field | Description |
| --- | --- |
| catalogNumber | Allen-Bradley catalog number |
| title | Short product title |
| description | Product description |
| pdpUrl | Product detail page URL |
| lifecycle | Raw lifecycle status from API |
| brand | Brand name (typically Allen-Bradley) |
| series | Product series |
| documentCount | Number of linked documents |
| categoryPath | Breadcrumb trail |
| pimCategoryIds | PIM category IDs that surfaced this SKU |
| source | Always product-details-api |

{
  "catalogNumber": "25B-E027N104",
  "title": "PowerFlex 525 AC Drive",
  "pdpUrl": "https://www.rockwellautomation.com/en-us/products/details.25B-E027N104.html",
  "lifecycle": "ACTIVE",
  "brand": "Allen-Bradley",
  "series": "PowerFlex 525",
  "documentCount": 5,
  "categoryPath": ["Vfds Variable Frequency Drives", "Powerflex 525"],
  "source": "product-details-api"
}
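
Because one SKU can surface under several PIM categories, deduplicated output has to fold repeat hits into a single record. The sketch below shows one plausible way to do that, keyed on `catalogNumber` and accumulating `pimCategoryIds`. The `deduplicate` helper and its "first record wins" rule for the other fields are assumptions, not the Actor's confirmed logic.

```python
def deduplicate(records: list[dict]) -> list[dict]:
    """Keep one record per catalogNumber and merge pimCategoryIds across hits."""
    merged: dict[str, dict] = {}
    for record in records:
        key = record["catalogNumber"]
        if key not in merged:
            # First sighting: copy the record so the caller's data is not mutated.
            merged[key] = {**record,
                           "pimCategoryIds": list(record.get("pimCategoryIds", []))}
            continue
        # Repeat sighting: keep the first record's fields (an assumption) and
        # only extend the list of categories that surfaced this SKU.
        known = merged[key]["pimCategoryIds"]
        for pim_id in record.get("pimCategoryIds", []):
            if pim_id not in known:
                known.append(pim_id)
    return list(merged.values())
```

Since Python dictionaries preserve insertion order, the output keeps the order in which catalog numbers were first seen.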

Typical Workflow

Full sitemap discovery
Full Catalog Crawler → deduplicated catalogNumber list with PDP URLs
Lifecycle Tracker → screen for obsolete parts
Product Scraper → full specs on selected SKUs

Actor Comparison

| Task | Full Catalog Crawler | Product Scraper | Lifecycle Tracker |
| --- | --- | --- | --- |
| Discover catalog numbers | Yes | No | No |
| Full specs and documents | No | Yes | No |
| Bulk lifecycle screening | Partial | No | Yes |
| Replacement catalog number | No | Partial | Yes |
| HTTP-only (no browser) | Yes | Yes | Yes |

Pricing

Pay-per-event billing. Each unique catalog number pushed to the dataset is billed automatically via the Store-configured event (no manual Actor.charge() in code).

Current Store setup (until custom event migration):

| Event | Price |
| --- | --- |
| Actor start | $0.05 per run |
| Dataset result (apify-default-dataset-item) | $0.05 per row |

Target setup (after Apify pricing unlock — configure in Publication tab):

| Event | Price |
| --- | --- |
| Actor start | $0.05 per run |
| Discovered SKU (catalog-product) | $2.00 / 1,000 ($0.002 per unique catalog number) |

When switching to catalog-product, disable apify-default-dataset-item in Publication to avoid double billing, then redeploy the Actor version that calls Actor.charge(event_name="catalog-product").
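
For budgeting, the two setups can be compared with simple arithmetic. The sketch below takes the event prices exactly as listed in this section and applies them to the ~20k–50k SKU range mentioned earlier. It assumes one billed event per unique catalog number plus one start fee per run, and ignores any platform usage charges. `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal

START_FEE = Decimal("0.05")        # Actor start, per run
CURRENT_PER_ROW = Decimal("0.05")  # apify-default-dataset-item, as listed above
TARGET_PER_SKU = Decimal("0.002")  # catalog-product, i.e. $2.00 / 1,000


def estimate_cost(unique_skus: int, price_per_sku: Decimal) -> Decimal:
    """Start fee plus one billed event per unique catalog number."""
    return START_FEE + unique_skus * price_per_sku


for skus in (20_000, 50_000):
    current = estimate_cost(skus, CURRENT_PER_ROW)
    target = estimate_cost(skus, TARGET_PER_SKU)
    print(f"{skus:>6} SKUs: current ${current:.2f}, target ${target:.2f}")
```

Under these assumptions a 20,000-SKU run comes to $40.05 with the target event, so it is worth confirming which event is active in the Publication tab before launching a full-sitemap crawl.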

See ../../docs/rockwell_ppe_july2026.md for the full 18 July 2026 checklist.

Full sitemap mode can take several hours; all output is scraped live from Rockwell public sources.


Learn more: Product page · Suite hub · GitHub docs

Also from Crawloop Industrial: Siemens SiePortal Suite · GitHub docs