Football Data Co Uk

Pricing

$19.99/month + usage

Football match data scraper that extracts historical match statistics and betting odds from 15+ bookmakers across 20+ leagues. It transforms raw CSV data into structured datasets for analytics, ML models, and predictive research, handles missing values, and processes roughly 1,000 matches per minute.

Developer

Peter Tomko

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 3
  • Monthly active users: 1
  • Last modified: 6 days ago

Football Data Co UK Scraper

Comprehensive football match data scraper that extracts historical match statistics and betting odds from football-data.co.uk, providing structured datasets for football analytics and predictive modeling.

Overview

This project scrapes extensive football match data including match results, team statistics, and historical betting odds from multiple bookmakers across various leagues and seasons. It transforms raw CSV data into structured datasets suitable for football analytics, machine learning, and statistical analysis.

Key Features

  • Comprehensive data extraction from 20+ football leagues across multiple seasons
  • Rich match statistics including goals, shots, cards, corners, and more
  • Historical betting odds from 15+ bookmakers (pre-closing and closing odds)
  • Batch processing (100-item batches) for efficient dataset writes
  • Data cleaning with robust handling of missing values and special characters
  • Storage management with automatic dataset clearing on each run

Data Structure

Each football match is represented as a comprehensive dataclass:

from dataclasses import dataclass

@dataclass
class HistoricalMatchStats:
    # Metadata
    refresh_timestamp: str
    generated_id: str
    # Match Information
    season: str
    league: str
    div: str
    referee: str
    date: str
    attendance: int
    home_team: str
    away_team: str
    # Match Statistics
    fulltime_home_goals: int
    fulltime_away_goals: int
    halftime_home_goals: int
    halftime_away_goals: int
    home_shots: int
    away_shots: int
    home_shots_on_target: int
    away_shots_on_target: int
    # ... and more statistics
    # Betting Odds from Multiple Bookmakers
    b365_home_win_odds__preclosing: float
    b365_draw_odds__preclosing: float
    b365_away_win_odds__preclosing: float
    # ... and odds from 15+ bookmakers
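
As an illustration of how a raw CSV row might map onto these fields, here is a minimal sketch. The raw headers (Div, Date, HomeTeam, AwayTeam, FTHG, FTAG, B365H) follow the column names used in football-data.co.uk CSV files. The COLUMN_MAP table, the to_record helper, and the SHA-1 scheme for generated_id are assumptions made for this example and are not taken from the Actor's source. The sample row is made up.

```python
import hashlib
from datetime import datetime, timezone

# Hypothetical mapping: raw CSV header -> (field name, type converter).
COLUMN_MAP = {
    "Div": ("div", str),
    "Date": ("date", str),
    "HomeTeam": ("home_team", str),
    "AwayTeam": ("away_team", str),
    "FTHG": ("fulltime_home_goals", int),
    "FTAG": ("fulltime_away_goals", int),
    "B365H": ("b365_home_win_odds__preclosing", float),
}

ID_FIELDS = ("season", "div", "date", "home_team", "away_team")


def to_record(row: dict, season: str) -> dict:
    """Rename and type-convert one raw CSV row; blank cells become None."""
    record = {
        "season": season,
        "refresh_timestamp": datetime.now(timezone.utc).isoformat(),
    }
    for raw_name, (field, cast) in COLUMN_MAP.items():
        value = (row.get(raw_name) or "").strip()
        record[field] = cast(value) if value else None
    # Assumed ID scheme: hash of the fields that identify a match.
    key = "|".join(str(record[name]) for name in ID_FIELDS)
    record["generated_id"] = hashlib.sha1(key.encode("utf-8")).hexdigest()
    return record


# Made-up sample row with a missing Bet365 home price.
row = {"Div": "E0", "Date": "01/01/2024", "HomeTeam": "Home FC",
       "AwayTeam": "Away FC", "FTHG": "0", "FTAG": "3", "B365H": ""}
record = to_record(row, season="2023/2024")
```

Because the ID is derived only from the identifying fields, re-running the conversion on the same row yields the same generated_id even though refresh_timestamp changes.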

How It Works

  1. Crawl Main Page: Discover league and season links from football-data.co.uk
  2. Download CSV Files: Fetch historical match data for each league/season combination
  3. Process & Clean: Transform raw CSV data into structured format with proper type conversion
  4. Batch Write: Collect items in batches of 100 and push to Apify dataset
  5. Storage Management: Clear existing dataset on each run to ensure fresh data
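
The batch-write step (step 4) can be sketched as follows. This is a minimal illustration, not the Actor's code: batched and write_in_batches are hypothetical helpers, and the push argument stands in for whatever call writes a list of items to the Apify dataset.

```python
from typing import Callable, Iterable, Iterator


def batched(items: Iterable, batch_size: int = 100) -> Iterator[list]:
    """Yield consecutive lists of at most batch_size items."""
    batch: list = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # final partial batch
        yield batch


def write_in_batches(items: Iterable, push: Callable[[list], None],
                     batch_size: int = 100) -> int:
    """Push items batch by batch and return the number of push calls made."""
    calls = 0
    for batch in batched(items, batch_size):
        push(batch)
        calls += 1
    return calls


# Stand-in for the dataset write: collect everything in a local list.
stored: list = []
calls = write_in_batches(({"match": i} for i in range(250)), stored.extend)
print(calls, len(stored))  # 3 250
```

With 250 items and a batch size of 100, the writer is called three times (100, 100, and 50 items) instead of 250 times.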

Setup & Usage

Prerequisites

Quick Start

# Clone and setup
git clone <repository-url>
cd football-data-co-uk
# Install dependencies with uv
uv sync
# Run locally
./local-run.sh
# or
npx apify-cli@latest run

Development Setup

# Install dependencies (creates .venv if it does not exist)
uv sync
# Activate virtual environment (optional when using uv run)
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Run tests
uv run python test_actor.py

Project Structure

football-data-co-uk/
├── src/
│   ├── __init__.py
│   ├── __main__.py
│   ├── main.py                  # Apify Actor entry point
│   └── football_data_co_uk.py   # Core scraper logic
├── .actor/                      # Apify Actor configuration
├── storage/                     # Local storage (development only)
├── pyproject.toml               # uv project configuration
├── local-run.sh                 # Local testing script
└── README.md

Data Coverage

Leagues Included

  • English Premier League (E0)
  • English Championship (E1)
  • English League One (E2)
  • English League Two (E3)
  • Scottish Premier League and Divisions 1-3 (SC0, SC1, SC2, SC3)
  • German Bundesliga 1 and 2 (D1, D2)
  • Italian Serie A and Serie B (I1, I2)
  • Spanish La Liga Primera and Segunda División (SP1, SP2)
  • French Ligue 1 and Ligue 2 (F1, F2)
  • Dutch Eredivisie (N1)
  • And many more...

Bookmakers Covered

  • Bet365 (B365)
  • Pinnacle (P)
  • Bwin, formerly Bet&Win (BW)
  • William Hill (WH)
  • Betfair (BF)
  • BetMGM (BMGM)
  • BetVictor (BV)
  • Coral (CL)
  • Ladbrokes (LB)
  • Interwetten (IW)
  • Stan James (SJ)
  • And 5+ more bookmakers

Applications

Football Analytics

  • Match outcome prediction using historical data and betting odds
  • Team performance analysis across seasons and leagues
  • Home advantage studies and statistical modeling
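
As a small example of a home advantage study, the sketch below computes the share of home wins, draws, and away wins from the fulltime_home_goals and fulltime_away_goals fields. The outcome_shares helper and the four sample matches are made up for illustration.

```python
from collections import Counter


def outcome_shares(matches: list) -> dict:
    """Return the fraction of home wins, draws, and away wins."""
    counts = Counter()
    for match in matches:
        home = match.get("fulltime_home_goals")
        away = match.get("fulltime_away_goals")
        if home is None or away is None:
            continue  # skip matches with missing scores
        if home > away:
            counts["home"] += 1
        elif away > home:
            counts["away"] += 1
        else:
            counts["draw"] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: counts[key] / total for key in ("home", "draw", "away")}


# Made-up sample data; the last match has no score and is ignored.
sample = [
    {"fulltime_home_goals": 2, "fulltime_away_goals": 1},
    {"fulltime_home_goals": 0, "fulltime_away_goals": 0},
    {"fulltime_home_goals": 1, "fulltime_away_goals": 3},
    {"fulltime_home_goals": 3, "fulltime_away_goals": 0},
    {"fulltime_home_goals": None, "fulltime_away_goals": None},
]
shares = outcome_shares(sample)
print(shares)  # {'home': 0.5, 'draw': 0.25, 'away': 0.25}
```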

Betting Analytics

  • Odds movement analysis and market efficiency studies
  • Value betting identification using historical odds vs actual results
  • Bookmaker comparison and arbitrage opportunity detection
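
For market efficiency and value betting work, decimal odds are usually converted to implied probabilities first. The sketch below does this for one home/draw/away price set and removes the bookmaker margin by proportional scaling, which is only one of several normalisation methods. The function names and the sample odds are illustrative assumptions.

```python
def implied_probabilities(home_odds: float, draw_odds: float, away_odds: float):
    """Return (margin-free probabilities, overround) for decimal odds."""
    raw = [1.0 / home_odds, 1.0 / draw_odds, 1.0 / away_odds]
    total = sum(raw)
    overround = total - 1.0              # bookmaker margin
    fair = [p / total for p in raw]      # proportional normalisation
    return fair, overround


def expected_value(probability: float, decimal_odds: float) -> float:
    """Expected profit per unit staked at the given win probability."""
    return probability * decimal_odds - 1.0


# Made-up prices: home 2.00, draw 3.50, away 4.00.
fair, overround = implied_probabilities(2.00, 3.50, 4.00)
print([round(p, 4) for p in fair])  # [0.4828, 0.2759, 0.2414]
print(round(overround, 4))          # 0.0357
```

A bet has positive expected value only when your own probability estimate exceeds the break-even probability 1 / odds; for example, expected_value(0.55, 2.00) is 0.10 per unit staked.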

Research & Academia

  • Sports economics research and market analysis
  • Machine learning datasets for football prediction models
  • Statistical analysis of football trends and patterns

Configuration

Environment Variables

  • No specific environment variables required for basic operation

Customization

  • Batch size: Modify batch_size in src/main.py (default: 100)
  • Data cleaning: Adjust convert_nan_to_none() function for different null value handling
  • League filtering: Modify crawl_main_page() to target specific leagues
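
The sketch below shows what the last two customisations could look like. Both functions are assumptions written for this example: the body of convert_nan_to_none is one plausible null-handling policy rather than the Actor's actual implementation, NULL_MARKERS is a hypothetical set of placeholder strings, and filter_league_urls assumes CSV links end in the division code followed by .csv (for example .../E0.csv).

```python
import math

# Hypothetical set of strings treated as missing values.
NULL_MARKERS = {"", "nan", "na", "n/a", "null", "-"}


def convert_nan_to_none(value):
    """Map NaN floats and placeholder strings to None; pass others through."""
    if value is None:
        return None
    if isinstance(value, float) and math.isnan(value):
        return None
    if isinstance(value, str) and value.strip().lower() in NULL_MARKERS:
        return None
    return value


def division_of(csv_url: str) -> str:
    """Extract the division code from a URL such as .../2324/E0.csv."""
    filename = csv_url.rsplit("/", 1)[-1]
    return filename[:-4] if filename.lower().endswith(".csv") else filename


def filter_league_urls(urls: list, wanted: set) -> list:
    """Keep only the CSV links whose division code is in wanted."""
    return [url for url in urls if division_of(url) in wanted]


# Illustrative links following the assumed URL layout.
urls = [
    "https://www.football-data.co.uk/mmz4281/2324/E0.csv",
    "https://www.football-data.co.uk/mmz4281/2324/D1.csv",
    "https://www.football-data.co.uk/mmz4281/2324/SP1.csv",
]
selected = filter_league_urls(urls, {"E0", "D1"})
```

Extending NULL_MARKERS is the simplest way to change what counts as missing, while the wanted set restricts a run to specific divisions.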

Deployment

Local Development

# Run with full logging
./local-run.sh
# Run with Apify CLI
npx apify-cli@latest run

Apify Platform

  1. Push repository to GitHub
  2. Connect repository in Apify Console
  3. Configure Actor settings (memory, timeout, etc.)
  4. Build and deploy

Performance

  • Processing speed: ~1000 matches per minute
  • Data volume: 500,000+ historical matches
  • Storage efficiency: Batch processing reduces API calls
  • Error handling: Robust data cleaning and validation
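
Taking the figures above at face value, a quick calculation shows what they imply for a full run. The helper names are hypothetical, and the result is arithmetic on the stated numbers rather than a measurement.

```python
import math


def push_calls(n_items: int, batch_size: int) -> int:
    """Number of dataset writes needed to store n_items."""
    return math.ceil(n_items / batch_size)


def run_minutes(n_items: int, items_per_minute: int) -> float:
    """Estimated processing time in minutes at a constant rate."""
    return n_items / items_per_minute


print(push_calls(500_000, 100))    # 5000
print(push_calls(500_000, 1))      # 500000
print(run_minutes(500_000, 1000))  # 500.0
```

At the default batch size of 100, storing 500,000 matches takes 5,000 writes instead of 500,000, and at about 1,000 matches per minute the full history would take roughly 500 minutes, or a little over eight hours.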

Resources

License

MIT


This scraper is intended for research and analytics purposes. Please respect the terms of service of football-data.co.uk and use the data responsibly.