Football Data Co Uk
Pricing
$19.99/month + usage
Comprehensive football match data scraper that extracts historical statistics and betting odds from 15+ bookmakers across 20+ leagues. It transforms raw CSV data into structured datasets for analytics, ML models, and predictive research, handles missing values, and processes roughly 1000 matches per minute.
Developer

Peter Tomko
Actor stats
- Bookmarked: 0
- Total users: 3
- Monthly active users: 1
- Last modified: 6 days ago
Football Data Co UK Scraper
Comprehensive football match data scraper that extracts historical match statistics and betting odds from football-data.co.uk, providing structured datasets for football analytics and predictive modeling.
Overview
This project scrapes extensive football match data including match results, team statistics, and historical betting odds from multiple bookmakers across various leagues and seasons. It transforms raw CSV data into structured datasets suitable for football analytics, machine learning, and statistical analysis.
Key Features
- Comprehensive data extraction from 20+ football leagues across multiple seasons
- Rich match statistics including goals, shots, cards, corners, and more
- Historical betting odds from 15+ bookmakers (pre-closing and closing odds)
- Batch processing (100-item batches) for efficient dataset writes
- Data cleaning with robust handling of missing values and special characters
- Storage management with automatic dataset clearing on each run
Data Structure
Each football match is represented as a comprehensive dataclass:
```python
from dataclasses import dataclass


@dataclass
class HistoricalMatchStats:
    # Metadata
    refresh_timestamp: str
    generated_id: str

    # Match Information
    season: str
    league: str
    div: str
    referee: str
    date: str
    attendance: int
    home_team: str
    away_team: str

    # Match Statistics
    fulltime_home_goals: int
    fulltime_away_goals: int
    halftime_home_goals: int
    halftime_away_goals: int
    home_shots: int
    away_shots: int
    home_shots_on_target: int
    away_shots_on_target: int
    # ... and more statistics

    # Betting Odds from Multiple Bookmakers
    b365_home_win_odds__preclosing: float
    b365_draw_odds__preclosing: float
    b365_away_win_odds__preclosing: float
    # ... and odds from 15+ bookmakers
```
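To illustrate how a raw CSV row might map onto fields like these, here is a minimal sketch. It uses a trimmed-down, hypothetical `MatchRecord` dataclass rather than the full `HistoricalMatchStats`, and the `COLUMN_MAP`, `CONVERTERS`, and `row_to_record` names are assumptions made for illustration, not the scraper's actual code. The header codes (`FTHG`, `FTAG`, `B365H`) follow the column naming used in football-data.co.uk CSV files; the sample row is made up.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class MatchRecord:
    """Hypothetical, trimmed-down stand-in for HistoricalMatchStats."""
    div: str
    date: str
    home_team: str
    away_team: str
    fulltime_home_goals: Optional[int]
    fulltime_away_goals: Optional[int]
    b365_home_win_odds__preclosing: Optional[float]


# Assumed mapping from CSV header codes to dataclass field names.
COLUMN_MAP = {
    "Div": "div",
    "Date": "date",
    "HomeTeam": "home_team",
    "AwayTeam": "away_team",
    "FTHG": "fulltime_home_goals",
    "FTAG": "fulltime_away_goals",
    "B365H": "b365_home_win_odds__preclosing",
}

# Numeric fields and the type each one is converted to.
CONVERTERS = {
    "fulltime_home_goals": int,
    "fulltime_away_goals": int,
    "b365_home_win_odds__preclosing": float,
}


def row_to_record(row: Dict[str, str]) -> MatchRecord:
    """Build a MatchRecord from one CSV row; empty numeric cells become None."""
    values = {}
    for csv_name, field_name in COLUMN_MAP.items():
        raw = (row.get(csv_name) or "").strip()
        if field_name in CONVERTERS:
            values[field_name] = CONVERTERS[field_name](raw) if raw else None
        else:
            values[field_name] = raw
    return MatchRecord(**values)


# Illustrative row with a missing Bet365 home price.
sample_row = {
    "Div": "E0", "Date": "01/01/2024",
    "HomeTeam": "Home FC", "AwayTeam": "Away FC",
    "FTHG": "2", "FTAG": "1", "B365H": "",
}
record = row_to_record(sample_row)
```

Keeping the header mapping in one dictionary means a new bookmaker column only needs one extra entry plus a converter.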
How It Works
- Crawl Main Page: Discover league and season links from football-data.co.uk
- Download CSV Files: Fetch historical match data for each league/season combination
- Process & Clean: Transform raw CSV data into structured format with proper type conversion
- Batch Write: Collect items in batches of 100 and push to Apify dataset
- Storage Management: Clear existing dataset on each run to ensure fresh data
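The batch-write step above can be sketched as follows. This is a minimal illustration, not the Actor's actual implementation: `batched` and `write_in_batches` are hypothetical helpers, and `push` is a synchronous stand-in for whatever call writes to the dataset (in the real Actor this would presumably be the Apify SDK's dataset push, which is asynchronous).

```python
from typing import Callable, Iterable, Iterator, List


def batched(items: Iterable[dict], batch_size: int = 100) -> Iterator[List[dict]]:
    """Yield lists of up to batch_size items, including a final partial batch."""
    batch: List[dict] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def write_in_batches(
    items: Iterable[dict],
    push: Callable[[List[dict]], None],
    batch_size: int = 100,
) -> int:
    """Send items to push() one batch at a time; return the number of push calls."""
    calls = 0
    for batch in batched(items, batch_size):
        push(batch)
        calls += 1
    return calls


# Stand-in sink: collect batches in a list instead of writing to a dataset.
pushed: List[List[dict]] = []
matches = ({"generated_id": str(i)} for i in range(250))
n_calls = write_in_batches(matches, pushed.append)
```

With 250 items and the default batch size, the sink receives two full batches and one partial batch, so the number of write calls is far lower than with per-item writes.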
Setup & Usage
Prerequisites
Quick Start
```bash
# Clone and setup
git clone <repository-url>
cd football-data-co-uk

# Install dependencies with uv
uv sync

# Run locally
./local-run.sh
# or
npx apify-cli@latest run
```
Development Setup
```bash
# Activate virtual environment
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
uv sync

# Run tests
uv run python test_actor.py
```
Project Structure
```
football-data-co-uk/
├── src/
│   ├── __init__.py
│   ├── __main__.py
│   ├── main.py                  # Apify Actor entry point
│   └── football_data_co_uk.py   # Core scraper logic
├── .actor/                      # Apify Actor configuration
├── storage/                     # Local storage (development only)
├── pyproject.toml               # uv project configuration
├── local-run.sh                 # Local testing script
└── README.md
```
Data Coverage
Leagues Included
- English Premier League (E0)
- English Championship (E1)
- English League One (E2)
- English League Two (E3)
- Scottish Premiership and lower divisions (SC0, SC1, SC2, SC3)
- German Bundesliga 1 and 2 (D1, D2)
- Italian Serie A and Serie B (I1, I2)
- Spanish La Liga Primera and Segunda (SP1, SP2)
- French Ligue 1 and Ligue 2 (F1, F2)
- Dutch Eredivisie (N1)
- And many more...
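The division codes above are what a league filter would key on. As a hedged sketch, assuming each discovered CSV link ends in `<division code>.csv` (for example `mmz4281/2324/E0.csv`; this path layout is an assumption, as is the `filter_links_by_division` helper), the links could be narrowed to selected leagues like this:

```python
from pathlib import PurePosixPath
from typing import Iterable, List, Set


def filter_links_by_division(links: Iterable[str], divisions: Set[str]) -> List[str]:
    """Keep only CSV links whose file name (without extension) is a wanted code."""
    wanted = {code.upper() for code in divisions}
    kept = []
    for link in links:
        path = PurePosixPath(link)
        if path.suffix.lower() == ".csv" and path.stem.upper() in wanted:
            kept.append(link)
    return kept


# Illustrative links; the last one is a non-CSV page and is ignored.
discovered = [
    "mmz4281/2324/E0.csv",
    "mmz4281/2324/D1.csv",
    "mmz4281/2324/SP1.csv",
    "englandm.php",
]
selected = filter_links_by_division(discovered, {"E0", "SP1"})
```

Matching on the whole file stem rather than a substring avoids accidentally selecting `SC1` when only `C1`-style fragments are requested.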
Bookmakers Covered
- Bet365 (B365)
- Pinnacle (P)
- Bet&Win (BW)
- William Hill (WH)
- Betfair (BF)
- BetMGM (BMGM)
- BetVictor (BV)
- Coral (CL)
- Ladbrokes (LB)
- Interwetten (IW)
- Stan James (SJ)
- And 5+ more bookmakers
Applications
Football Analytics
- Match outcome prediction using historical data and betting odds
- Team performance analysis across seasons and leagues
- Home advantage studies and statistical modeling
Betting Analytics
- Odds movement analysis and market efficiency studies
- Value betting identification using historical odds vs actual results
- Bookmaker comparison and arbitrage opportunity detection
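As a starting point for the odds analyses listed above, decimal odds can be converted into implied probabilities and a bookmaker margin (overround). The sketch below is illustrative only: `implied_probabilities` is a hypothetical helper, the odds are made-up values, and proportional normalization is just one of several ways to remove the margin.

```python
from typing import Dict, Tuple


def implied_probabilities(
    home_odds: float, draw_odds: float, away_odds: float
) -> Tuple[Dict[str, float], float]:
    """Return margin-free probabilities and the bookmaker margin for 1X2 decimal odds."""
    raw = {
        "home": 1.0 / home_odds,
        "draw": 1.0 / draw_odds,
        "away": 1.0 / away_odds,
    }
    total = sum(raw.values())
    margin = total - 1.0  # how far the raw probabilities exceed 100%
    fair = {outcome: p / total for outcome, p in raw.items()}
    return fair, margin


# Made-up odds for a single match.
fair, margin = implied_probabilities(2.0, 3.5, 4.0)
```

Comparing `fair` probabilities across bookmakers for the same match, or against observed result frequencies over many matches, is the basis for market efficiency and value betting studies.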
Research & Academia
- Sports economics research and market analysis
- Machine learning datasets for football prediction models
- Statistical analysis of football trends and patterns
Configuration
Environment Variables
- No specific environment variables required for basic operation
Customization
- Batch size: Modify `batch_size` in `src/main.py` (default: 100)
- Data cleaning: Adjust the `convert_nan_to_none()` function for different null value handling
- League filtering: Modify `crawl_main_page()` to target specific leagues
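The README names `convert_nan_to_none()` but does not show it, so the following is only a guess at what such a helper might look like. The name `convert_nan_to_none_sketch`, its signature, and the choice to also treat blank strings as missing are all assumptions; adjust the conditions to change which values count as null.

```python
import math
from typing import Any, Dict


def convert_nan_to_none_sketch(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of record with NaN floats and blank strings replaced by None."""
    cleaned: Dict[str, Any] = {}
    for key, value in record.items():
        if isinstance(value, float) and math.isnan(value):
            cleaned[key] = None  # NaN is not valid JSON, so store a null instead
        elif isinstance(value, str) and value.strip() == "":
            cleaned[key] = None  # assumption: blank cells also count as missing
        else:
            cleaned[key] = value
    return cleaned


# Illustrative record with one NaN and one blank field.
cleaned = convert_nan_to_none_sketch(
    {"home_team": "Home FC", "attendance": float("nan"), "referee": "  ", "home_shots": 0}
)
```

Note that a legitimate zero (such as `home_shots: 0`) is preserved, since only NaN and blank strings are treated as missing.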
Deployment
Local Development
```bash
# Run with full logging
./local-run.sh

# Run with Apify CLI
npx apify-cli@latest run
```
Apify Platform
- Push repository to GitHub
- Connect repository in Apify Console
- Configure Actor settings (memory, timeout, etc.)
- Build and deploy
Performance
- Processing speed: ~1000 matches per minute
- Data volume: 500,000+ historical matches
- Storage efficiency: Batch processing reduces API calls
- Error handling: Robust data cleaning and validation
Resources
- Football Data Co UK - Data source
- Apify SDK for Python - Actor framework
- uv Python Package Manager - Package management
- Pandas Documentation - Data processing
- Beautiful Soup - Web scraping
License
MIT
This scraper is intended for research and analytics purposes. Please respect the terms of service of football-data.co.uk and use the data responsibly.