Daily Data Feeds Scraper
Pricing
from $0.05 / 1,000 results
Scrapes daily datasets: VC funding, domain drops, patents, crypto prices, and news.
Developer: Soft But Savage
Actor stats: 0 bookmarked, 2 total users, 1 monthly active user
Last modified: 18 days ago
Get structured daily records across VC funding, patent-related signals, crypto prices, and news in one Actor, with an optional experimental domain-drop feed. The output is normalized for downstream automation with stable record IDs, source URLs, ISO timestamps, is_new flags, and a per-run summary.
What does Daily Data Feeds Scraper do?
This Actor collects multiple daily feeds in one run and normalizes them into one dataset shape. It is built for scheduled collection, internal pipelines, and downstream filtering where traceability matters.
Datasets included:
- VC Funding — Latest startup funding rounds from TechCrunch
- Domain Drops — Experimental deleted-domain feed from a public source that is currently unstable
- Patents — Patent-related company signals, with an explicit fallback path when direct patent sources block requests
- Crypto Prices — Top 50 cryptocurrencies by market cap with 24h price changes
- News — Latest articles for configurable topics (AI, startups, tech layoffs, etc.)
Why use Daily Data Feeds Scraper?
- One run, multiple feeds — Funding, patents, crypto, and news stay in one scheduled workflow, with optional domain monitoring
- Stable IDs — Each record carries a deterministic `record_id` for dedupe and change detection
- Source traceability — Records include `source_url`, source name, and normalized timestamps
- Operational visibility — Every run stores a `RUN_SUMMARY` record in the default key-value store
- Cross-run change tracking — Records carry `is_new` and `first_seen_at` using a persistent named state store
- Transparent fallbacks — When a primary source blocks requests, records can carry a `source_type` that shows a fallback path was used
- Pipeline-friendly shape — Shared fields make downstream filtering and storage simpler
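The Actor does not document its exact hashing scheme, but the dedupe property it relies on can be sketched: hashing the same stable attributes always yields the same ID. The sketch below assumes SHA-1 over a dataset-plus-URL key (consistent with the 40-character hex IDs in the example records), with `make_record_id` as a hypothetical helper name.

```python
import hashlib

def make_record_id(dataset: str, source_url: str) -> str:
    """Hypothetical sketch of a deterministic record ID.

    The Actor's real key fields are undocumented; the point is that any
    stable combination, hashed the same way, produces the same ID across
    runs, which is what makes cross-run dedupe possible.
    """
    key = f"{dataset}|{source_url}"
    return hashlib.sha1(key.encode("utf-8")).hexdigest()

# Identical inputs map to identical IDs, so a downstream store can
# safely skip records it has already seen.
a = make_record_id("funding", "https://techcrunch.com/2026/04/09/example-round/")
b = make_record_id("funding", "https://techcrunch.com/2026/04/09/example-round/")
```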
How to use Daily Data Feeds Scraper
- Click Try for free to open the Actor
- Configure which datasets you want, or leave defaults for the most reliable feeds
- Set `max_items_per_dataset` to control volume and cost
- Optionally set custom news topics
- Click Run to get immediate results
- Set up a Schedule to run daily automatically
- Access records via the Dataset tab and run metadata via `RUN_SUMMARY`
Input
```json
{
  "datasets": ["funding", "patents", "crypto_prices", "news"],
  "max_items_per_dataset": 20,
  "news_topics": ["startup funding", "AI", "tech layoffs"]
}
```
| Field | Type | Default | Description |
|---|---|---|---|
| `datasets` | array | all | Which datasets to scrape |
| `max_items_per_dataset` | integer | 20 | Maximum records to collect from each dataset |
| `news_topics` | array | `["startup funding", "AI", "tech layoffs"]` | Topics for news scraping |
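Before launching a run, the input can be assembled and sanity-checked locally. This is a minimal sketch using only the fields from the table above; `validate` and `ALLOWED_DATASETS` are illustrative names, not part of the Actor.

```python
import json

# Dataset names accepted by the Actor, per its documentation.
ALLOWED_DATASETS = {"funding", "domain_drops", "patents", "crypto_prices", "news"}

run_input = {
    "datasets": ["funding", "patents", "crypto_prices", "news"],
    "max_items_per_dataset": 20,
    "news_topics": ["startup funding", "AI", "tech layoffs"],
}

def validate(inp: dict) -> None:
    # Catch typos in dataset names and obviously bad limits before paying
    # for a run.
    assert set(inp["datasets"]) <= ALLOWED_DATASETS, "unknown dataset name"
    assert isinstance(inp["max_items_per_dataset"], int) and inp["max_items_per_dataset"] > 0
    assert all(isinstance(t, str) for t in inp["news_topics"])

validate(run_input)
payload = json.dumps(run_input)  # ready to submit as the Actor's run input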
Output
Results are pushed to the default dataset. Each record includes normalized core fields:
`record_id`, `dataset`, `entity_name`, `entity_type`, `source_url`, `published_at`, `observed_at`, `first_seen_at`, `run_date`, `is_new`
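Those core fields are what make downstream dedupe a one-liner. A minimal sketch, using illustrative records and an in-memory `seen_ids` set standing in for whatever store your pipeline persists between runs:

```python
# IDs your pipeline has stored from previous runs (illustrative values).
seen_ids = {"abc123"}

# Two illustrative records carrying the shared core fields.
records = [
    {"record_id": "abc123", "dataset": "news", "is_new": False},
    {"record_id": "def456", "dataset": "funding", "is_new": True},
]

# Keep only records this pipeline has never processed, then remember them.
fresh = [r for r in records if r["record_id"] not in seen_ids]
seen_ids.update(r["record_id"] for r in records)
```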
Example funding record:
```json
{
  "record_id": "7f22db8b2f4f4f0d8b1bbf0abf1c6221f6e7d630",
  "dataset": "funding",
  "entity_name": "Startup raises $50M Series B for AI platform",
  "entity_type": "article",
  "source_url": "https://techcrunch.com/2026/04/09/example-round/",
  "title": "Startup raises $50M Series B for AI platform",
  "source": "TechCrunch",
  "published_at": "2026-04-09T10:00:00Z",
  "description": "The company plans to use the funding to...",
  "observed_at": "2026-04-14T16:30:00Z",
  "run_date": "2026-04-14"
}
```
Example domain drop record:
```json
{
  "record_id": "7ccf3f4ef5f9f2f9b88e6027bcb73dbb287bc790",
  "dataset": "domain_drops",
  "entity_name": "example.com",
  "entity_type": "domain",
  "source_url": "https://www.expireddomains.net/deleted-com-domains/",
  "domain": "example.com",
  "backlinks": "1240",
  "referring_domains": "87",
  "observed_at": "2026-04-14T16:30:00Z",
  "run_date": "2026-04-14"
}
```
Example crypto price record:
```json
{
  "record_id": "f6b7f0af55c3240d4fe2db85df5391d3b3fd0db5",
  "dataset": "crypto_prices",
  "entity_name": "Bitcoin",
  "entity_type": "crypto_asset",
  "source_url": "https://www.coingecko.com/en/coins/bitcoin",
  "name": "Bitcoin",
  "symbol": "btc",
  "price_usd": 82500.00,
  "change_24h_pct": -2.3,
  "volume_24h": 38000000000,
  "market_cap": 1630000000000,
  "observed_at": "2026-04-14T16:30:00Z",
  "run_date": "2026-04-14"
}
```
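Records with this shape filter cleanly downstream. For instance, flagging large 24-hour movers from a run's crypto records is a list comprehension (prices below are illustrative, not live data):

```python
# Illustrative crypto_prices records, reduced to the relevant fields.
prices = [
    {"symbol": "btc", "price_usd": 82500.00, "change_24h_pct": -2.3},
    {"symbol": "eth", "price_usd": 4100.00, "change_24h_pct": 1.8},
    {"symbol": "sol", "price_usd": 150.00, "change_24h_pct": -6.1},
]

# Flag assets that moved at least 5% in either direction over 24 hours.
movers = [p for p in prices if abs(p["change_24h_pct"]) >= 5.0]
```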
The Actor also writes a `RUN_SUMMARY` record to the default key-value store with per-dataset status, counts, and any captured error message.
Patent records may carry `source_type: "news_fallback"` when direct patent endpoints block automated access. `domain_drops` remains available as an input, but it is not part of the default run because its public source is unstable.
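The full `RUN_SUMMARY` schema is not documented here, but given per-dataset status, counts, and error messages, a scheduled-run health check could look like the sketch below. The nested shape is an assumption for illustration only:

```python
# Hypothetical RUN_SUMMARY shape: per-dataset status, counts, and errors.
run_summary = {
    "run_date": "2026-04-14",
    "datasets": {
        "funding": {"status": "ok", "count": 20, "error": None},
        "patents": {"status": "ok", "count": 18, "error": None},
        "domain_drops": {"status": "failed", "count": 0, "error": "source unavailable"},
    },
}

# Surface feeds that failed, and the total record count across datasets.
failed = [name for name, d in run_summary["datasets"].items() if d["status"] != "ok"]
total = sum(d["count"] for d in run_summary["datasets"].values())
```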
Data fields
| Field | Description |
|---|---|
| `record_id` | Stable identifier for dedupe and downstream sync |
| `dataset` | Dataset name: one of funding, domain_drops, patents, crypto_prices, news |
| `entity_name` | Normalized entity label for the record |
| `entity_type` | Normalized type such as article, domain, patent, or crypto_asset |
| `source_url` | Canonical source URL for the record |
| `source_type` | Whether the record came from the primary source or a fallback path |
| `published_at` | ISO timestamp from the source, when available |
| `observed_at` | ISO timestamp when this run captured the record |
| `first_seen_at` | First time this Actor saw the record across runs |
| `run_date` | Date of the Actor run |
| `is_new` | Whether the record is new versus previously seen |
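The timestamps use a trailing `Z` (UTC). One practical note for Python consumers: `datetime.fromisoformat` only accepts the `Z` suffix from Python 3.11, so normalizing it keeps the parse portable. A small sketch computing the lag between publication and observation, using the values from the funding example:

```python
from datetime import datetime

def parse_iso(ts: str) -> datetime:
    # fromisoformat accepts "Z" only on Python 3.11+; normalize for
    # older versions so the parse is portable.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

published = parse_iso("2026-04-09T10:00:00Z")
observed = parse_iso("2026-04-14T16:30:00Z")
lag_days = (observed - published).total_seconds() / 86400
```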
Pricing
Each Actor run costs a small amount based on compute time and results produced. Use `max_items_per_dataset` to keep full runs predictable and cheaper.
Estimated cost per run: $0.01–$0.05 depending on memory and result count.
Schedule daily runs to keep your data fresh for pennies per day.
Tips
- Schedule it — Go to Saved Tasks → Schedule to run automatically every morning
- Filter by dataset — Pass only the datasets you need to reduce compute time
- Use reliable defaults — Leave `domain_drops` off unless you specifically want to test that unstable source
- Use record IDs — Persist `record_id` in your own system to detect new vs already-seen records
- Custom news topics — Set `news_topics` to track your specific industry or competitors
- Inspect run health — Read `RUN_SUMMARY` from the default key-value store after scheduled runs
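Persisting `record_id`s on your side is what turns scheduled runs into a change feed. A minimal sketch using SQLite (in-memory here; point it at a file for real use), where `classify` is an illustrative helper that returns only first-time IDs:

```python
import sqlite3

# In-memory DB for the sketch; use a file path to persist across runs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE IF NOT EXISTS seen (record_id TEXT PRIMARY KEY)")

def classify(record_ids):
    """Return the IDs seen for the first time, remembering all of them."""
    new = []
    for rid in record_ids:
        cur = conn.execute(
            "INSERT OR IGNORE INTO seen (record_id) VALUES (?)", (rid,)
        )
        if cur.rowcount == 1:  # row actually inserted -> first sighting
            new.append(rid)
    conn.commit()
    return new

first_run = classify(["abc", "def"])   # both IDs are new
second_run = classify(["def", "ghi"])  # only "ghi" is new
```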
FAQ
Is this legal to use? This Actor scrapes publicly available data from public RSS feeds and public APIs. Always ensure your use case complies with the terms of service of the data sources and applicable laws in your jurisdiction.
How fresh is the data? As fresh as your last run. Schedule it daily for daily data.
Can I request additional datasets? Open an issue in the Issues tab and describe what data you need.