Gvt Procurement Scraper

Scrape tender notices and contract awards from 11 government procurement databases across 9 countries. Covers the US, EU, UK, Ukraine, France, Brazil, Australia, and Canada. Filter by keyword, date, and contract value.
Government & Public Procurement Data Scraper
A powerful Apify actor for scraping government tender databases and contract awards from multiple sources including SAM.gov, TED EU, and UK Find a Tender.
Features
- Multi-Source Support: Simultaneously scrape from multiple government procurement databases
- Flexible Data Extraction: Extract tender notices, contract awards, or both
- Advanced Filtering: Filter by keywords, date range, contract value, and more
- Multiple Output Formats: Export results as JSON, CSV, or JSONL
- Proxy Support: Built-in support for Apify's residential proxies to avoid blocking
- Error Handling: Robust error handling with detailed logging
- Rate Limiting: Automatic rate limiting to respect source websites
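The built-in rate limiting can be pictured as a minimum delay enforced between consecutive requests to a source. The helper below is an illustrative sketch only; the `createRateLimiter` name and default interval are assumptions, not the actor's actual implementation.

```javascript
// Illustrative sketch of per-source rate limiting (not the actor's shipped code).
// The returned function enforces a minimum interval between consecutive requests.
function createRateLimiter(minIntervalMs = 1000) {
  let lastRequestAt = 0;
  return async function politeDelay() {
    const now = Date.now();
    const waitMs = Math.max(0, lastRequestAt + minIntervalMs - now);
    if (waitMs > 0) {
      await new Promise((resolve) => setTimeout(resolve, waitMs));
    }
    lastRequestAt = Date.now();
  };
}

// Usage: await the limiter before each request to a source, e.g.
//   const limit = createRateLimiter(1500);
//   await limit(); /* then fetch the next page */
```

This is why the Best Practices below warn against removing the delays: the interval is what keeps the scrapers polite toward the source websites.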
Supported Databases
- SAM.gov (USA Federal Procurement)
- TED EU (European Union Tenders)
- UK Find a Tender (United Kingdom Government Contracts)
Installation
- Clone or create this actor in Apify Console
- Install dependencies:
```bash
npm install
```
Usage
Input Parameters
```json
{
  "sources": ["sam.gov", "ted.eu", "uk.findatender"],
  "dataTypes": ["tender_notices", "contract_awards"],
  "searchQuery": "software development",
  "keywords": ["IT", "cloud"],
  "dateFrom": "2024-01-01",
  "dateTo": "2024-12-31",
  "maxResults": 5000,
  "minContractValue": 50000,
  "maxContractValue": "5000000",
  "outputFormat": "json",
  "useProxy": true,
  "browserTimeout": 60,
  "debugMode": false
}
```
Parameters Explained
| Parameter | Type | Description | Default |
|---|---|---|---|
| `sources` | array | Data sources to scrape | `["sam.gov"]` |
| `dataTypes` | array | Types of data to extract | `["tender_notices", "contract_awards"]` |
| `searchQuery` | string | Search keywords (e.g., "software", "consulting") | `""` |
| `keywords` | array | Additional filter keywords | `[]` |
| `dateFrom` | string | Start date (YYYY-MM-DD format) | `""` |
| `dateTo` | string | End date (YYYY-MM-DD format) | `""` |
| `maxResults` | integer | Maximum results to collect (1-10000) | `1000` |
| `minContractValue` | number | Minimum contract value in USD | `0` |
| `maxContractValue` | string | Maximum contract value in USD | `""` |
| `outputFormat` | string | Output format: `json`, `csv`, or `jsonl` | `"json"` |
| `useProxy` | boolean | Use Apify residential proxy | `false` |
| `browserTimeout` | integer | Timeout for browser operations (10-600 s) | `60` |
| `debugMode` | boolean | Enable verbose logging | `false` |
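Before a run, it can help to sanity-check the input object against the documented types, notably the date format and the fact that `maxContractValue` is a string while `minContractValue` is a number. The `validateInput` helper below is a hypothetical sketch based on the table above, not part of the actor:

```javascript
// Hypothetical validation sketch matching the documented parameter types.
function validateInput(input) {
  const errors = [];
  const datePattern = /^\d{4}-\d{2}-\d{2}$/; // YYYY-MM-DD
  if (input.dateFrom && !datePattern.test(input.dateFrom)) {
    errors.push('dateFrom must use YYYY-MM-DD format');
  }
  if (input.dateTo && !datePattern.test(input.dateTo)) {
    errors.push('dateTo must use YYYY-MM-DD format');
  }
  // maxResults is documented as an integer in the range 1-10000.
  if (input.maxResults !== undefined &&
      (!Number.isInteger(input.maxResults) || input.maxResults < 1 || input.maxResults > 10000)) {
    errors.push('maxResults must be an integer between 1 and 10000');
  }
  // maxContractValue arrives as a string; an empty string means "no upper cap".
  const maxValue = (input.maxContractValue === '' || input.maxContractValue === undefined)
    ? Infinity
    : Number(input.maxContractValue);
  if (Number.isNaN(maxValue)) {
    errors.push('maxContractValue must be numeric or empty');
  }
  return { valid: errors.length === 0, errors, maxValue };
}
```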
Output
Results are stored in Apify's Key-Value Store with the following structure:
```json
{
  "id": "notice_id",
  "source": "sam.gov",
  "type": "tender_notice",
  "title": "Tender Title",
  "description": "Full tender description",
  "url": "https://...",
  "postedDate": "2024-01-15T10:30:00Z",
  "deadline": "2024-02-15",
  "organization": "Agency Name",
  "budget": "500000",
  "metadata": {
    "country": "US",
    "category": "Software",
    "reference": "SOL-2024-001"
  },
  "scrapedAt": "2024-01-20T12:00:00Z"
}
```
Running Locally
```bash
npm start
```
Set the environment variable `DEBUG=true` for verbose logging:

```bash
DEBUG=true npm start
```
API Documentation
SamGovScraper
Scrapes tender opportunities from SAM.gov using their REST API.
```javascript
const scraper = new SamGovScraper(input);
const results = await scraper.scrape();
```
TedEuScraper
Scrapes tender notices from TED EU using browser automation.
```javascript
const scraper = new TedEuScraper(input);
const results = await scraper.scrape();
```
UkFindTenderScraper
Scrapes government contracts from UK Find a Tender using browser automation.
```javascript
const scraper = new UkFindTenderScraper(input);
const results = await scraper.scrape();
```
DataProcessor
Utility class for formatting and filtering results.
```javascript
const processor = new DataProcessor('json');
const formatted = processor.format(data);

// Helper methods
DataProcessor.normalizeTender(tender);
DataProcessor.filterByKeywords(tenders, ['IT', 'cloud']);
DataProcessor.sortByDate(tenders, 'desc');
DataProcessor.removeDuplicates(tenders);
```
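The helper method bodies are not shown in this README. The sketch below gives plausible implementations of `filterByKeywords`, `sortByDate`, and `removeDuplicates` so their contracts are concrete. The field names (`title`, `description`, `postedDate`, `id`) follow the output structure documented above; the implementations themselves are assumptions, not the shipped code:

```javascript
// Plausible implementations of the DataProcessor helpers (assumptions only).
const DataProcessorSketch = {
  // Keep tenders whose title or description mentions any keyword (case-insensitive).
  filterByKeywords(tenders, keywords) {
    const lowered = keywords.map((k) => k.toLowerCase());
    return tenders.filter((t) => {
      const text = `${t.title || ''} ${t.description || ''}`.toLowerCase();
      return lowered.some((k) => text.includes(k));
    });
  },
  // Sort by postedDate; 'desc' puts the newest records first.
  sortByDate(tenders, order = 'desc') {
    const sign = order === 'desc' ? -1 : 1;
    return [...tenders].sort(
      (a, b) => sign * (new Date(a.postedDate) - new Date(b.postedDate))
    );
  },
  // Drop records that share an id with an earlier record.
  removeDuplicates(tenders) {
    const seen = new Set();
    return tenders.filter((t) => !seen.has(t.id) && seen.add(t.id));
  },
};
```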
Best Practices
- Rate Limiting: The scrapers include built-in rate limiting. Don't remove delays between requests.
- Proxy Usage: For high-volume scraping or repeated runs, enable proxy support to avoid IP blocking.
- Search Strategy: Be specific with search queries to get more relevant results and reduce processing time.
- Date Ranges: Use date filters to limit scope and reduce unnecessary processing.
- Keywords: Use keywords to post-filter results for higher relevance.
Error Handling
The actor captures and logs errors from each source separately. If one source fails, others continue processing. A summary of errors is provided in the output.
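The per-source isolation described above can be sketched as a loop that catches each source's failure and carries on. The names (`runAllSources`, the scrapers map) are illustrative, not the actor's actual code:

```javascript
// Sketch of per-source error isolation: one failing source does not abort the others.
async function runAllSources(scrapers) {
  const results = [];
  const errors = [];
  for (const [source, scrape] of Object.entries(scrapers)) {
    try {
      results.push(...(await scrape()));
    } catch (err) {
      // Record the failure and keep going with the remaining sources.
      errors.push({ source, message: err.message });
    }
  }
  // The errors array doubles as the error summary reported in the output.
  return { results, errors };
}
```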
Performance Tips
- Limit Results: Set `maxResults` to a reasonable number to reduce processing time
- Narrow Date Range: Use `dateFrom` and `dateTo` to focus on recent tenders
- Use Keywords: Pre-filter with keywords to reduce unnecessary data collection
- Disable Debug Mode: Keep `debugMode: false` in production runs for better performance
Troubleshooting
No Results Returned
- Check if the source website is accessible
- Verify search parameters are valid
- Try with a broader search query
- Check if proxy is needed due to IP blocking
Timeout Errors
- Increase the `browserTimeout` parameter
- Try with fewer results or a narrower date range
- Enable proxy support for more reliable connections
Rate Limit Errors
- Reduce `maxResults` or split the run into smaller date ranges
- Increase delays between requests (modify the scraper code)
- Use Apify residential proxies
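When a source starts returning rate-limit responses, retrying with exponential backoff is a common mitigation if you do modify the scraper code. The schedule helper below is an illustrative sketch; the base delay and cap are assumptions:

```javascript
// Exponential backoff schedule: baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs.
function backoffDelays(attempts, baseMs = 1000, maxMs = 30000) {
  return Array.from({ length: attempts }, (_, i) =>
    Math.min(baseMs * 2 ** i, maxMs)
  );
}
```

Each retry then sleeps for the next delay in the schedule before re-issuing the request.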
License
Apache 2.0
Support
For issues or questions, please create an issue in the repository or contact the development team.
Contributing
Contributions are welcome! Please follow the existing code style and add tests for new features.
Adding a New Source
- Create a new scraper file in `src/scrapers/` (e.g., `newSourceScraper.js`)
- Implement the scraper class extending the base pattern
- Add the scraper to the scrapers map in `src/main.js`
- Update the input schema in `.actor/input_schema.json`
- Add documentation in this README
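The steps above can be sketched as a minimal scraper skeleton. The constructor and `scrape()` signatures mirror how the existing scrapers are invoked (`new XScraper(input)` / `await scraper.scrape()`); everything else, including the class name and the `source` identifier, is a placeholder:

```javascript
// Hypothetical skeleton for a new source scraper following the existing pattern.
// Extend the project's base class here if one exists.
class NewSourceScraper {
  constructor(input) {
    this.input = input;
    this.source = 'new.source'; // identifier to register in the scrapers map in src/main.js
  }

  // Should resolve to an array of records shaped like the Output example above.
  async scrape() {
    const records = [];
    // TODO: fetch listing pages, honoring this.input.searchQuery,
    // this.input.dateFrom / dateTo, and the rate-limiting conventions.
    return records.slice(0, this.input.maxResults || 1000);
  }
}
```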
Roadmap
- Add support for additional databases (France, Germany, Canada)
- Implement contract award extraction for all sources
- Add email notification on new matching tenders
- Create web dashboard for tracking results
- Add machine learning for opportunity relevance scoring
- Implement incremental scraping with state management