Government Tender Scraper
Pricing
from $20.00 / 1,000 results
A powerful Apify Actor that scrapes government tender listings from multiple official portals across different countries into a single normalized dataset. Built with the Adapter/Plugin pattern for easy extensibility.
Developer: HappiTap
Supported Portals
| Region | Portal | Official URL | Notes |
|---|---|---|---|
| 🇺🇸 USA | SAM.gov | sam.gov | Federal procurement opportunities |
| 🇪🇺 EU | TED | ted.europa.eu | EU-wide tenders |
| 🇬🇧 UK | Find a Tender | gov.uk/find-tender | UK post-Brexit procurement |
| 🇮🇳 India | CPPP (GePNIC) | etenders.gov.in | High volume Indian tenders |
| 🇨🇦 Canada | CanadaBuys | canadabuys.canada.ca | Canadian government procurement |
Features
- Multi-portal support with unified output schema
- Smart deduplication across runs
- Keyword filtering with include/exclude support
- Date range filtering for published and deadline dates
- Detail page scraping with full tender information
- Document/attachment tracking
- Webhook notifications for new tenders
- Proxy support with Apify proxy integration
- Error handling with retry logic and classification
- Metrics & observability with detailed statistics
Quick Start
Example 1: US Tenders (Last 7 Days)

    {
      "country": "US",
      "keywords": ["IT", "software"],
      "publishedLastNDays": 7,
      "maxItems": 200,
      "includeDetails": true
    }

Example 2: UK Construction Tenders

    {
      "country": "UK",
      "keywords": ["construction", "road"],
      "deadlineTo": "2024-02-28",
      "maxItems": 100
    }

Example 3: India IT Tenders with Value Filter

    {
      "country": "IN",
      "keywords": ["software", "development"],
      "publishedLastNDays": 14,
      "minValue": 1000000,
      "maxItems": 500
    }

Example 4: EU Cybersecurity Tenders

    {
      "country": "EU",
      "keywords": ["cybersecurity", "security"],
      "category": "72000000",
      "maxItems": 300
    }

Example 5: Canada Healthcare Tenders

    {
      "country": "CA",
      "keywords": ["healthcare", "medical"],
      "region": "Ontario",
      "publishedLastNDays": 30,
      "maxItems": 150
    }
Input Parameters
Required
- country (string) - Country/region to scrape: US, EU, UK, IN, CA
Search Filters
- keywords (array) - Keywords to search for (OR logic)
- excludeKeywords (array) - Keywords to exclude from results
- dateFrom (string) - Published date from (YYYY-MM-DD)
- dateTo (string) - Published date to (YYYY-MM-DD)
- publishedLastNDays (number) - Only include tenders published in the last N days (overrides dateFrom/dateTo)
- deadlineFrom (string) - Deadline date from (YYYY-MM-DD)
- deadlineTo (string) - Deadline date to (YYYY-MM-DD)
- buyerName (string) - Filter by buyer/agency name
- region (string) - Filter by geographic location
- category (string) - Filter by category (CPV/NAICS codes)
- minValue (number) - Minimum tender value
- maxValue (number) - Maximum tender value
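As a rough illustration of how the keyword and date filters above could behave, here is a minimal sketch. The function names `matchesKeywords` and `resolvePublishedRange` are hypothetical, and the case-insensitive substring matching is an assumption; the actor's real filtering logic may differ.

```javascript
// Hypothetical helpers: names and matching rules are assumptions,
// not the actor's actual implementation.

// Returns true when the text contains at least one include keyword (OR logic)
// and none of the exclude keywords. Matching is case-insensitive substring.
function matchesKeywords(text, keywords = [], excludeKeywords = []) {
  const haystack = text.toLowerCase();
  const contains = (keyword) => haystack.includes(keyword.toLowerCase());
  const included = keywords.length === 0 || keywords.some(contains);
  const excluded = excludeKeywords.some(contains);
  return included && !excluded;
}

// Resolves the published-date window as YYYY-MM-DD strings (UTC).
// publishedLastNDays, when set, takes precedence over dateFrom/dateTo.
function resolvePublishedRange(input, now = new Date()) {
  if (typeof input.publishedLastNDays === 'number') {
    const msPerDay = 24 * 60 * 60 * 1000;
    const from = new Date(now.getTime() - input.publishedLastNDays * msPerDay);
    return {
      from: from.toISOString().slice(0, 10),
      to: now.toISOString().slice(0, 10),
    };
  }
  return { from: input.dateFrom ?? null, to: input.dateTo ?? null };
}
```

Note that plain substring matching means a short keyword such as "IT" also matches inside longer words; a word-boundary match would be stricter.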
Crawl Controls
- maxItems (number, default: 500) - Maximum tenders to scrape
- maxPages (number, default: 50) - Maximum list pages to scrape
- includeDetails (boolean, default: true) - Scrape full detail pages
- downloadAttachments (boolean, default: false) - Download documents (experimental)
- startUrls (array) - Override with custom start URLs
Runtime Settings
- useProxy (boolean, default: true) - Use Apify proxy
- proxyGroups (array, default: ["RESIDENTIAL"]) - Proxy groups to use
- maxConcurrency (number, default: 10) - Concurrent requests
- requestRetries (number, default: 5) - Retry attempts
- minDelayMs (number, default: 1000) - Minimum delay between requests
- maxDelayMs (number, default: 3000) - Maximum delay between requests
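One plausible reading of minDelayMs and maxDelayMs is a randomized pause between requests. The sketch below assumes a uniform random delay within the configured bounds; `randomDelayMs` and `politeSleep` are illustrative names, not part of the actor.

```javascript
// Assumption: the delay is drawn uniformly from [minDelayMs, maxDelayMs].
function randomDelayMs(minDelayMs = 1000, maxDelayMs = 3000) {
  if (minDelayMs > maxDelayMs) {
    throw new RangeError('minDelayMs must not exceed maxDelayMs');
  }
  const span = maxDelayMs - minDelayMs + 1;
  return minDelayMs + Math.floor(Math.random() * span);
}

// Resolves after a random delay; await this between requests.
function politeSleep(minDelayMs, maxDelayMs) {
  const ms = randomDelayMs(minDelayMs, maxDelayMs);
  return new Promise((resolve) => setTimeout(resolve, ms));
}
```

Randomizing the pause, rather than using a fixed interval, makes request timing less regular, which may help with portals that rate-limit aggressively.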
Integration
- webhookUrl (string) - URL for webhook notifications
- webhookSecret (string) - Secret for webhook authentication
- notifyOnlyNew (boolean, default: true) - Only notify for new tenders
Output Schema
Each tender is normalized into this unified format:

    {
      "sourceCountry": "US",
      "sourcePortal": "sam_gov",
      "sourceId": "12345678",
      "title": "IT Services Contract",
      "tenderUrl": "https://sam.gov/opp/12345678/view",
      "buyerName": "Department of Defense",
      "publishedAt": "2024-01-15T00:00:00.000Z",
      "deadlineAt": "2024-02-15T23:59:59.000Z",
      "scrapedAt": "2024-01-20T10:30:00.000Z",
      "hash": "abc123def456...",
      "summary": "Procurement for IT services...",
      "status": "open",
      "procurementType": "services",
      "categoryCodes": ["541512"],
      "locations": [{"country": "US", "state": "VA", "city": "Arlington"}],
      "estimatedValue": {"amount": 1000000, "currency": "USD"},
      "documents": [{"name": "RFP Document", "url": "https://...", "type": "attachment", "sizeBytes": 524288}],
      "contact": {"name": "John Doe", "email": "john.doe@agency.gov", "phone": "+1-555-0100"},
      "awardedTo": null,
      "raw": { }
    }
Architecture
Adapter Pattern
The actor uses a plugin-based architecture where each portal has its own adapter:

    src/
    ├── adapters/
    │   ├── us/SamGovAdapter.js        # USA (SAM.gov)
    │   ├── eu/TedAdapter.js           # EU (TED)
    │   ├── uk/FindATenderAdapter.js   # UK
    │   ├── in/CpppAdapter.js          # India (CPPP)
    │   └── ca/CanadaBuysAdapter.js    # Canada
    ├── core/
    │   ├── BaseAdapter.js             # Base adapter interface
    │   ├── AdapterRegistry.js         # Adapter registry
    │   ├── TenderCrawler.js           # Main crawler engine
    │   ├── DedupeManager.js           # Deduplication logic
    │   ├── MetricsCollector.js        # Statistics tracking
    │   ├── ErrorHandler.js            # Error classification
    │   └── WebhookNotifier.js         # Webhook integration
    └── main.js                        # Entry point
Adding New Portals
To add a new portal, create a new adapter extending BaseAdapter:

    import { BaseAdapter } from '../../core/BaseAdapter.js';

    export class NewPortalAdapter extends BaseAdapter {
        getSourceCountry() {
            return 'XX';
        }

        getSourcePortal() {
            return 'new_portal';
        }

        async buildStartRequests(normalizedQuery) {
            // Build initial search URLs
        }

        async parseListPage(context) {
            // Parse list page and extract tender stubs
            return { items: [], nextRequests: [] };
        }

        async parseDetailPage(context, stub) {
            // Parse detail page and return normalized tender
            // Placeholder: start from the list-page stub and add parsed fields
            const tender = { ...stub };
            return this.normalize(tender);
        }
    }

Then register it in AdapterRegistry.js:
this.register('XX', NewPortalAdapter);
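To make the registration step more concrete, here is a minimal sketch of what a registry keyed by country code could look like. This is an assumption about the general shape only: the real AdapterRegistry.js may differ, and the `create` method and the stand-in adapter below are hypothetical.

```javascript
// Hypothetical sketch of a registry; the real AdapterRegistry.js may differ.
class AdapterRegistry {
  constructor() {
    this.adapters = new Map(); // country code -> adapter class
  }

  register(country, AdapterClass) {
    this.adapters.set(String(country).toUpperCase(), AdapterClass);
  }

  // Instantiates the adapter for a country, or throws if none is registered.
  create(country) {
    const AdapterClass = this.adapters.get(String(country).toUpperCase());
    if (!AdapterClass) {
      throw new Error(`No adapter registered for country "${country}"`);
    }
    return new AdapterClass();
  }
}

// Stand-in adapter for illustration only (does not extend BaseAdapter).
class NewPortalAdapter {
  getSourceCountry() { return 'XX'; }
  getSourcePortal() { return 'new_portal'; }
}

const registry = new AdapterRegistry();
registry.register('XX', NewPortalAdapter);
const adapter = registry.create('XX');
```

Keying the registry by the same country codes used in the `country` input keeps the lookup from user input to adapter a single map access.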
Metrics & Monitoring
The actor provides detailed metrics:
- tendersFound - Total tenders discovered
- tendersSaved - Tenders saved to dataset
- detailsFetched - Detail pages scraped
- duplicatesSkipped - Duplicate tenders filtered
- parsingErrors - Parsing failures
- networkErrors - Network/timeout errors
- blockedCount - Blocked requests (403/401)
- captchaCount - Captcha encounters
- retryCount - Total retry attempts
Access metrics in the Key-Value Store under the STATS key.
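Once the STATS object has been read from the Key-Value Store, a few derived ratios can make a run easier to judge at a glance. The sketch below uses the field names listed above; the `summarizeStats` helper and the ratios it computes are illustrative assumptions, not something the actor provides.

```javascript
// Hypothetical helper: derives summary figures from a STATS-shaped object.
function summarizeStats(stats) {
  const value = (key) => stats[key] ?? 0; // treat missing counters as 0
  const ratio = (part, whole) => (whole > 0 ? part / whole : 0);
  return {
    saveRate: ratio(value('tendersSaved'), value('tendersFound')),
    duplicateRate: ratio(value('duplicatesSkipped'), value('tendersFound')),
    blockedOrCaptcha: value('blockedCount') + value('captchaCount'),
    totalErrors: value('parsingErrors') + value('networkErrors'),
  };
}

// Made-up numbers, for illustration only.
const exampleSummary = summarizeStats({
  tendersFound: 200,
  tendersSaved: 150,
  duplicatesSkipped: 50,
  blockedCount: 3,
  captchaCount: 1,
  parsingErrors: 2,
  networkErrors: 4,
});
```

A rising blockedOrCaptcha count across scheduled runs is a reasonable signal to revisit proxy and delay settings.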
Error Handling
Errors are classified into types:
- BLOCKED - 403/401 responses (proxy/IP issues)
- RATE_LIMITED - 429 responses (too many requests)
- CAPTCHA - Captcha detection
- PARSING_ERROR - HTML/JSON parsing failures
- NETWORK_ERROR - Timeouts, connection errors
- UNKNOWN - Other errors
Failed requests are logged to the Key-Value Store under ERROR_LOG.
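The classification above could be sketched roughly as follows. The `classifyError` function, its input shape, the captcha text check, and the list of network error codes are all assumptions made for illustration; the actor's ErrorHandler.js may use different rules.

```javascript
// Hypothetical classifier mapping a failed request to one of the error types.
function classifyError({ statusCode, body = '', error } = {}) {
  // Assumption: a captcha page is recognizable by the word "captcha" in its body.
  if (/captcha/i.test(body)) return 'CAPTCHA';
  if (statusCode === 401 || statusCode === 403) return 'BLOCKED';
  if (statusCode === 429) return 'RATE_LIMITED';
  // JSON.parse and similar parsers throw SyntaxError on malformed input.
  if (error instanceof SyntaxError) return 'PARSING_ERROR';
  // Common Node.js socket/DNS error codes; not an exhaustive list.
  const networkCodes = ['ETIMEDOUT', 'ECONNRESET', 'ECONNREFUSED', 'ENOTFOUND'];
  if (error && networkCodes.includes(error.code)) return 'NETWORK_ERROR';
  return 'UNKNOWN';
}
```

The captcha check runs first because captcha pages are often served with a 403 status, and the more specific label is usually the more useful one.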
Webhook Integration
Configure webhooks to receive notifications for new tenders:

    {
      "webhookUrl": "https://your-api.com/webhook",
      "webhookSecret": "your-secret-key",
      "notifyOnlyNew": true
    }

Webhook payload:

    {
      "event": "new_tenders",
      "timestamp": "2024-01-20T10:30:00.000Z",
      "count": 5,
      "tenders": [ ]
    }

Each webhook request includes an X-Webhook-Signature header carrying an HMAC-SHA256 signature for verification.
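On the receiving side, the signature can be checked before the payload is trusted. The sketch below assumes the header value is the hex-encoded HMAC-SHA256 of the raw request body, keyed with webhookSecret; the source does not specify the encoding or the signed content, so verify both against real requests. It uses CommonJS `require` so it runs standalone; in an ES module project, use the equivalent `import`.

```javascript
const { createHmac, timingSafeEqual } = require('node:crypto');

// Assumption: signature = hex(HMAC-SHA256(rawBody, webhookSecret)).
function signPayload(rawBody, secret) {
  return createHmac('sha256', secret).update(rawBody, 'utf8').digest('hex');
}

// Returns true only when the received signature matches the expected one.
function verifySignature(rawBody, signatureHeader, secret) {
  const expected = Buffer.from(signPayload(rawBody, secret), 'hex');
  const received = Buffer.from(String(signatureHeader), 'hex');
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

Verify against the raw request body exactly as received; re-serializing parsed JSON can change whitespace or key order and break the comparison.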
Incremental Crawling
The actor supports incremental crawling with state persistence:
- Deduplication state is saved between runs
- Previously seen tenders are automatically skipped
- Perfect for scheduled runs (daily/weekly)
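The idea behind persisted deduplication can be sketched as a set of tender hashes that is serialized at the end of one run and restored at the start of the next. Everything here is an assumption for illustration: the `DedupeState` class is hypothetical (not the actor's DedupeManager.js), and hashing `sourcePortal` plus `sourceId` is just one plausible identity for a tender. It uses CommonJS `require` so it runs standalone.

```javascript
const { createHash } = require('node:crypto');

// Assumption: a tender is uniquely identified by its portal and source ID.
function tenderHash(tender) {
  return createHash('sha256')
    .update(`${tender.sourcePortal}:${tender.sourceId}`)
    .digest('hex');
}

class DedupeState {
  // seenHashes: array of hashes restored from a previous run, if any.
  constructor(seenHashes = []) {
    this.seen = new Set(seenHashes);
  }

  // Returns true the first time a tender is seen, false afterwards.
  markIfNew(tender) {
    const hash = tenderHash(tender);
    if (this.seen.has(hash)) return false;
    this.seen.add(hash);
    return true;
  }

  // Plain array, so JSON.stringify(state) yields a storable snapshot.
  toJSON() {
    return [...this.seen];
  }
}
```

At the end of a run, the serialized array would be written to persistent storage, and the next run would pass it back into the constructor so previously seen tenders are skipped.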
Testing
Run with a small dataset first:

    {
      "country": "US",
      "keywords": ["test"],
      "maxItems": 10,
      "maxPages": 2
    }
Best Practices
- Start small - Test with maxItems: 10 first
- Use proxies - Always enable useProxy: true for production
- Set delays - Use minDelayMs and maxDelayMs to avoid rate limits
- Filter early - Use keywords and date filters to reduce load
- Monitor metrics - Check STATS and ERROR_LOG after runs
- Schedule runs - Use Apify Scheduler for daily/weekly updates
Limitations
- Some portals may require authentication (not currently supported)
- Captcha protection may block automated access
- Rate limits vary by portal
- Document downloads are experimental
- Some portals may change their HTML structure
License
Apache-2.0
Contributing
To add support for new portals:
- Create a new adapter in src/adapters/
- Implement the required methods
- Register in AdapterRegistry.js
- Test thoroughly
- Submit a pull request
Use Cases
- Tender monitoring - Track opportunities in your industry
- Market research - Analyze government spending patterns
- Competitive intelligence - Monitor competitor wins
- Lead generation - Find relevant procurement opportunities
- Data analysis - Build datasets for research
- Automated alerts - Get notified of matching tenders
Support
For issues or questions:
- Check the error logs in Key-Value Store
- Review metrics in STATS
- Enable debug logging
- Contact Apify support
Built with ❤️ using Apify SDK and Crawlee