Albertsons Crawler for Foodgraph

A professional-grade web scraper built with Crawlee and Playwright for extracting product data from Albertsons.com. This scraper is designed to meet Foodgraph's specific requirements for grocery product data collection.

Features

  • Full Category Coverage: Scrapes all specified product categories with inclusion/exclusion rules
  • Browser-based Scraping: Uses real browser automation for reliable data extraction
  • API Interception: Captures and reuses Albertsons API calls for efficient data collection (see the sketch after this list)
  • Session Management: Automatic session refresh and token management
  • GTIN/UPC Validation: Ensures all products have valid GTIN/UPC codes
  • Structured Output: Produces data in Foodgraph's required format with rid, sourcePdpUrl, and product fields
  • Proxy Support: Compatible with Bright Data, Apify Proxy, and custom proxy solutions
  • Health Monitoring: Built-in health checker for daily validation
  • Error Handling: Robust retry logic and exponential backoff
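
For illustration, a minimal TypeScript sketch of the interception pattern behind the API Interception feature, using Playwright's response listener. The '/abs/pub/xapi/' path filter is an assumption for illustration, not the Actor's actual endpoint list:

import { firefox } from 'playwright';

// Sketch only: collects JSON payloads from responses whose URL matches an
// assumed product-API path while a category page loads.
async function captureProductApiPayloads(categoryUrl: string): Promise<unknown[]> {
  const browser = await firefox.launch({ headless: true });
  const page = await browser.newPage();
  const payloads: unknown[] = [];

  page.on('response', async (response) => {
    // '/abs/pub/xapi/' is a placeholder pattern; inspect network traffic for the real one.
    if (response.ok() && response.url().includes('/abs/pub/xapi/')) {
      try {
        payloads.push(await response.json());
      } catch {
        // Ignore non-JSON responses.
      }
    }
  });

  await page.goto(categoryUrl, { waitUntil: 'networkidle' });
  await browser.close();
  return payloads;
}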

Quick Start

  1. Installation

     $ npm install

  2. Basic Usage

     $ npm start

  3. Development Mode

     $ npm run start:dev

Configuration

Default Categories (Foodgraph Test Project)

The scraper is pre-configured with the exact categories specified in the Foodgraph RFP:

Include All Categories:

  • Beverages
  • Breakfast & Cereal
  • Canned Goods & Soups
  • Condiments, Spice & Bake
  • Cookies, Snacks & Candy
  • Dairy, Eggs & Cheese
  • Frozen Foods
  • Fruits & Vegetables
  • Grains, Pasta & Sides
  • International Cuisine
  • Meat & Seafood

Include Specific Subcategories Only:

  • Baby Care → Formula & Baby Food only
  • Wine, Beer & Spirits → Non-Alcoholic Beer and Cocktail Mixes only

Exclude Specific Subcategories:

  • Bread & Bakery → Exclude Bakery Beverages & Snacks, Bakery Catering Trays
  • Deli → Exclude Deli Bar & Food Service, Deli Sandwiches and Wraps, Sushi

Input Parameters

{
  "startUrls": ["https://www.albertsons.com/shop/aisles/beverages.html"],
  "storeIds": [177, 154, 1680],
  "maxRequestsPerCrawl": 1000,
  "headless": true
}
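
If you run the Actor programmatically rather than from the Apify Console, the same input can be passed through the apify-client package. A minimal sketch; the Actor ID below is a placeholder, so use the one shown on this Actor's page:

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

// 'getdataforme/albertsons-product-scraper' is a guessed Actor ID; check the Actor page.
const run = await client.actor('getdataforme/albertsons-product-scraper').call({
  startUrls: ['https://www.albertsons.com/shop/aisles/beverages.html'],
  storeIds: [177, 154, 1680],
  maxRequestsPerCrawl: 1000,
  headless: true,
});

// Read the scraped products from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(`Scraped ${items.length} products`);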

Proxy Configuration

Bright Data (Recommended):

{
  "proxyConfiguration": {
    "proxyUrls": ["wss://brd-customer-<CUSTOMER_ID>-zone-<ZONE_NAME>:<PASSWORD>@brd.superproxy.io:9222"]
  }
}

Apify Proxy:

{
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
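
Inside an Actor, both input shapes are typically resolved through the Apify SDK. A minimal sketch, assuming the standard Actor.createProxyConfiguration() path (this project's real wiring lives in src/main.ts):

import { Actor } from 'apify';
import { PlaywrightCrawler } from 'crawlee';

await Actor.init();

interface Input {
  startUrls?: string[];
  proxyConfiguration?: { useApifyProxy?: boolean; proxyUrls?: string[] };
}
const input = (await Actor.getInput<Input>()) ?? {};

// createProxyConfiguration accepts both the { useApifyProxy: true } and
// { proxyUrls: [...] } shapes shown above.
const proxyConfiguration = await Actor.createProxyConfiguration(input.proxyConfiguration);

const crawler = new PlaywrightCrawler({
  proxyConfiguration,
  requestHandler: async ({ page }) => {
    // Extraction logic omitted; in this project it lives in src/routes.ts.
  },
});

await crawler.run(input.startUrls ?? []);
await Actor.exit();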

Output Format

The scraper produces data in the exact format required by Foodgraph:

{
  "rid": "550e8400-e29b-41d4-a716-446655440000",
  "sourcePdpUrl": "https://www.albertsons.com/product-detail/...",
  "product": {
    "fullCategoryTaxonomy": ["Beverages", "Water & Sparkling Water"],
    "id": "123456",
    "name": "Product Name",
    "upc": "123456789012",
    "brand": "Brand Name",
    "ingredients": "...",
    "nutrition": {...},
    "images": ["https://..."]
  }
}
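
For consumers of the dataset, the record can be described with a TypeScript interface. The field types below are inferred from the example above; the nutrition shape is elided in the sample, so it is left loosely typed:

// Type sketch inferred from the sample record; not an official schema.
interface FoodgraphRecord {
  rid: string;                          // unique record id (UUID)
  sourcePdpUrl: string;                 // product detail page URL
  product: {
    fullCategoryTaxonomy: string[];     // e.g. ["Beverages", "Water & Sparkling Water"]
    id: string;
    name: string;
    upc: string;
    brand: string;
    ingredients: string;
    nutrition: Record<string, unknown>; // raw payload; shape not shown in the sample
    images: string[];
  };
}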

Key Requirements Compliance

✅ Technology Stack

  • JavaScript: ✓ Built with Node.js and TypeScript
  • Playwright: ✓ Browser automation with Firefox support
  • Crawlee: ✓ Built on the current 3.x framework

✅ Scraping Approach

  • API First: ✓ Intercepts and uses Albertsons internal APIs
  • Browser Fallback: ✓ Uses browser automation when needed
  • Session Management: ✓ Handles token refresh and session expiry

✅ Data Requirements

  • Raw Data: ✓ No transformations, preserves original structure
  • Required Fields: ✓ Includes rid, sourcePdpUrl, product, fullCategoryTaxonomy
  • GTIN/UPC: ✓ Validates presence of product identifiers (a check-digit sketch follows this list)
  • No Deduplication: ✓ Captures all product instances
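
The README does not show how identifier validation is implemented; for reference, the standard GS1 check-digit test for 12-digit UPC-A codes looks like this minimal sketch:

// Sketch only: validates a 12-digit UPC-A code via the GS1 check-digit rule
// (odd positions weighted 3, even positions weighted 1, total must end in 0).
function isValidUpcA(upc: string): boolean {
  if (!/^\d{12}$/.test(upc)) return false;
  const digits = upc.split('').map(Number);
  const sum = digits
    .slice(0, 11)
    .reduce((acc, d, i) => acc + d * (i % 2 === 0 ? 3 : 1), 0);
  const checkDigit = (10 - (sum % 10)) % 10;
  return checkDigit === digits[11];
}

// Example: isValidUpcA('036000291452') === true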

✅ Exclusions Implemented

  • Reviews and ratings
  • Pickup/delivery options
  • Price and promotions (not required by Foodgraph, though captured incidentally)
  • Related/similar products
  • Marketplace sellers

✅ Category Management

  • Full inclusion/exclusion rule support
  • Configurable category targeting
  • Automatic subcategory discovery

Health Monitoring

Run health check manually:

$ npm run healthcheck

The health checker validates:

  • Category page navigation
  • API connection functionality
  • Product data extraction
  • GTIN validation
  • Proxy connectivity
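
As an illustration of the first item, a category-page check could look like this minimal sketch. The product-card selector is an assumption for illustration, not the Actor's actual code:

import { firefox } from 'playwright';

// Sketch only: opens a category page and confirms that at least one product
// tile rendered. '[data-testid="product-card"]' is a guessed selector.
export async function checkCategoryNavigation(url: string): Promise<boolean> {
  const browser = await firefox.launch({ headless: true });
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'domcontentloaded' });
    const productCount = await page.locator('[data-testid="product-card"]').count();
    return productCount > 0;
  } finally {
    await browser.close();
  }
}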

Development

Project Structure

src/
├── main.ts # Main entry point
├── routes.ts # Request routing logic
├── categories.ts # Category configuration
├── types.ts # TypeScript definitions
├── utils.ts # Utility functions
└── healthcheck.ts # Health monitoring

Adding New Categories

Update src/categories.ts:

export const DEFAULT_CATEGORY_CONFIG = {
  includeAll: [
    'https://www.albertsons.com/shop/aisles/new-category.html'
  ]
};
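
The rules described under Configuration also involve subcategory-only inclusion and exclusion. The fuller sketch below is an assumption about the config shape: the includeOnly and exclude field names and the category slugs are illustrative, so confirm the actual structure in src/categories.ts.

export const DEFAULT_CATEGORY_CONFIG = {
  includeAll: [
    'https://www.albertsons.com/shop/aisles/new-category.html',
  ],
  // Hypothetical shape: crawl only the listed subcategories.
  includeOnly: {
    'baby-care': ['formula-baby-food'],
  },
  // Hypothetical shape: skip the listed subcategories.
  exclude: {
    'bread-bakery': ['bakery-beverages-snacks', 'bakery-catering-trays'],
  },
};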

Debugging

Enable debug mode:

{
  "debugMode": true,
  "headless": false
}

Production Deployment

Apify Platform

  1. Upload project to Apify
  2. Configure input schema
  3. Set up scheduling (every 4-6 weeks)
  4. Monitor via health checker

Environment Variables

BRIGHT_DATA_ENDPOINT=wss://brd-customer-...
APIFY_PROXY_PASSWORD=your-password

Performance

  • Concurrency: Default 1 (recommended for stability)
  • Request Rate: ~2-3 seconds between requests
  • Session Lifetime: ~100 requests per session
  • Error Recovery: 3 retries with exponential backoff (see the sketch below)
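
For reference, a generic retry helper matching those numbers might look like this minimal sketch; the Actor's actual implementation is internal:

// Sketch only: retries a failing async operation with exponential backoff.
// maxRetries = 3 gives one initial attempt plus three retries, per the list above.
async function withRetries<T>(fn: () => Promise<T>, maxRetries = 3): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt === maxRetries) break;
      // Wait 1s, 2s, 4s, ... between attempts.
      const delayMs = 1000 * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}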

Troubleshooting

Common Issues

No products found:

  • Check store ID validity (try 177, 154, 1680)
  • Verify category URLs are accessible
  • Check if session tokens are being captured

Session expired errors:

  • Automatic session refresh is implemented
  • Monitor for rate limiting (429 errors)
  • Consider reducing concurrency

Proxy issues:

  • Verify Bright Data credentials
  • Test connection with health checker
  • Check proxy endpoint accessibility

Support

For technical issues:

  1. Check health checker output
  2. Review error logs in Actor platform
  3. Verify category URLs are current
  4. Test with single category first

License

This scraper is designed for legitimate business use in compliance with website terms of service and applicable laws.