# Daily Data Feeds Scraper (`soft_but_savage/daily-datasets`) Actor

Scrapes daily datasets: VC funding, domain drops, patents, crypto prices, and news.

- **URL**: https://apify.com/soft_but_savage/daily-datasets.md
- **Developed by:** [Tahira Muhammad](https://apify.com/soft_but_savage) (community)
- **Categories:** News
- **Stats:** 1 total user, 0 monthly users, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $0.05 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows (PowerShell)
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see the Apify documentation as a [Markdown index](https://docs.apify.com/llms.txt) or as [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Daily Data Feeds Scraper

Get **fresh, structured data delivered daily** across 5 high-value datasets — VC funding, domain drops, patents, crypto prices, and news. Built for data pipelines, market research bots, and competitive intelligence systems.

### What does Daily Data Feeds Scraper do?

This Actor automatically scrapes and structures data from multiple sources every time it runs. Schedule it daily and wake up to fresh datasets ready to consume via the Apify API. No manual work, no stale data.

**Datasets included:**
- **VC Funding** — Latest startup funding rounds from TechCrunch
- **Domain Drops** — Expiring .com domains with backlink and referring domain counts
- **Patents** — New patent filings from Google, Apple, Microsoft, Amazon, Meta, and OpenAI
- **Crypto Prices** — Top 50 cryptocurrencies by market cap with 24h price changes
- **News** — Latest articles for configurable topics (AI, startups, tech layoffs, etc.)

### Why use Daily Data Feeds Scraper?

- **Market intelligence** — Track VC activity, patent filings, and company moves daily
- **Domain investing** — Find expired domains with existing authority before others do
- **Financial bots** — Feed crypto price data into trading or alert systems
- **AI agent pipelines** — Give your AI agents fresh real-world data to work with
- **Research automation** — Stop manually checking sources — let the Actor do it

### How to use Daily Data Feeds Scraper

1. Click **Try for free** to open the Actor
2. Configure which datasets you want (or leave defaults to get all)
3. Optionally set custom news topics
4. Click **Run** to get immediate results
5. Set up a **Schedule** to run daily automatically
6. Access results via the **Dataset** tab or Apify API

### Input

```json
{
  "datasets": ["funding", "domain_drops", "patents", "crypto_prices", "news"],
  "news_topics": ["startup funding", "AI", "tech layoffs"]
}
````

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `datasets` | array | all | Which datasets to scrape |
| `news_topics` | array | `["startup funding", "AI", "tech layoffs"]` | Topics for news scraping |
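
Both fields are plain arrays of strings, so a misspelled dataset name may only surface after the run has finished. The following is a minimal sketch of a client-side check, assuming the five names shown in the example input are the only valid values. `VALID_DATASETS`, `DEFAULT_NEWS_TOPICS`, and `build_input` are hypothetical helpers for illustration, not part of the Actor or the Apify client.

```python
# Assumption: these are the only dataset names the Actor accepts,
# taken from the example input above.
VALID_DATASETS = ("funding", "domain_drops", "patents", "crypto_prices", "news")
DEFAULT_NEWS_TOPICS = ("startup funding", "AI", "tech layoffs")


def build_input(datasets=None, news_topics=None):
    """Return an Actor input dict, rejecting unknown dataset names."""
    selected = list(datasets) if datasets is not None else list(VALID_DATASETS)
    unknown = [name for name in selected if name not in VALID_DATASETS]
    if unknown:
        raise ValueError(f"Unknown dataset(s): {', '.join(unknown)}")

    run_input = {"datasets": selected}
    # news_topics is only relevant when the news dataset is requested.
    if "news" in selected:
        run_input["news_topics"] = list(news_topics or DEFAULT_NEWS_TOPICS)
    return run_input
```

The returned dict can be passed as the run input in the client examples in the [API](#api) section.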

### Output

Results are pushed to the default dataset. Each record includes a `dataset` field identifying its type.

**Example funding record:**

```json
{
  "dataset": "funding",
  "title": "Startup raises $50M Series B for AI platform",
  "source": "TechCrunch",
  "published": "Thu, 09 Apr 2026 10:00:00 GMT",
  "description": "The company plans to use the funding to...",
  "date": "2026-04-09"
}
```

**Example domain drop record:**

```json
{
  "dataset": "domain_drops",
  "domain": "example.com",
  "backlinks": "1240",
  "referring_domains": "87",
  "date": "2026-04-09"
}
```

**Example crypto price record:**

```json
{
  "dataset": "crypto_prices",
  "name": "Bitcoin",
  "symbol": "btc",
  "price_usd": 82500.00,
  "change_24h_pct": -2.3,
  "volume_24h": 38000000000,
  "market_cap": 1630000000000,
  "date": "2026-04-09"
}
```

You can download the dataset in various formats such as JSON, HTML, CSV, or Excel from the Dataset tab.
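
Because every record carries a `dataset` field, a consumer can split the items from one run into per-feed lists. The sketch below assumes records shaped like the examples above, including the string-typed `backlinks` and `referring_domains` counts. `group_by_dataset` and `normalize_domain_drop` are illustrative names, not part of the Actor.

```python
from collections import defaultdict


def group_by_dataset(items):
    """Group records by their `dataset` field."""
    groups = defaultdict(list)
    for item in items:
        groups[item.get("dataset", "unknown")].append(item)
    return dict(groups)


def normalize_domain_drop(record):
    """Return a copy of a domain_drops record with counts converted to int."""
    cleaned = dict(record)
    for key in ("backlinks", "referring_domains"):
        if key in cleaned:
            cleaned[key] = int(cleaned[key])
    return cleaned


# Sample records copied from the examples above (trimmed for brevity).
items = [
    {"dataset": "funding", "title": "Startup raises $50M Series B for AI platform",
     "source": "TechCrunch", "date": "2026-04-09"},
    {"dataset": "domain_drops", "domain": "example.com",
     "backlinks": "1240", "referring_domains": "87", "date": "2026-04-09"},
    {"dataset": "crypto_prices", "name": "Bitcoin", "symbol": "btc",
     "price_usd": 82500.00, "date": "2026-04-09"},
]

groups = group_by_dataset(items)
drops = [normalize_domain_drop(r) for r in groups.get("domain_drops", [])]
print(sorted(groups))
print(drops)
```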

### Data fields

| Field | Description |
|-------|-------------|
| `dataset` | Type: funding, domain\_drops, patents, crypto\_prices, news |
| `title` | Title or name of the record |
| `date` | Date the data was scraped |
| `source` | Where the data came from |
| `published` | Original publication date (where applicable) |
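
The two date fields use different formats in the example records: `date` looks like an ISO calendar date, while `published` looks like an RFC 2822 timestamp as found in RSS feeds. A minimal sketch for parsing both with the Python standard library, assuming those formats hold for every record (the README does not state this explicitly); `parse_record_dates` is a hypothetical helper.

```python
from datetime import date, timezone
from email.utils import parsedate_to_datetime


def parse_record_dates(record):
    """Return (scraped_date, published_datetime_utc_or_None) for one record."""
    # Assumption: `date` is always YYYY-MM-DD, as in the example records.
    scraped = date.fromisoformat(record["date"])

    published = None
    if record.get("published"):
        # Assumption: `published` is an RFC 2822 date string.
        published = parsedate_to_datetime(record["published"])
        if published.tzinfo is None:
            # No usable offset in the string; treat the value as UTC.
            published = published.replace(tzinfo=timezone.utc)
        published = published.astimezone(timezone.utc)
    return scraped, published


scraped, published = parse_record_dates(
    {"date": "2026-04-09", "published": "Thu, 09 Apr 2026 10:00:00 GMT"}
)
print(scraped.isoformat(), published.isoformat())
```

Records without a `published` value, such as the domain drop and crypto price examples, yield `None` for the second element.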

### Pricing

Each Actor run costs a small amount based on compute time and results produced. A typical full run scraping all 5 datasets produces 150–300 records and completes in under 2 minutes.

**Estimated cost per run:** $0.01–$0.05 depending on memory and result count.

Schedule daily runs to keep your data fresh for pennies per day.
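
As a rough cross-check, the Store listing for this Actor quotes a price "from $0.05 / 1,000 results". The sketch below applies that rate to the 150–300 records of a typical full run. It assumes that result events are the only billed item and that the quoted rate applies to your plan; neither is confirmed here, so treat the output as a lower bound rather than a quote. `estimate_run_cost` is an illustrative helper.

```python
from decimal import Decimal

# Assumption: the "from $0.05 / 1,000 results" rate from the Store listing.
PRICE_PER_1000_RESULTS = Decimal("0.05")


def estimate_run_cost(result_count, price_per_1000=PRICE_PER_1000_RESULTS):
    """Estimate the per-result charge in USD for one run."""
    return price_per_1000 * result_count / 1000


for count in (150, 300):
    print(f"{count} results: ${estimate_run_cost(count)}")
```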

### Tips

- **Schedule it** — Go to Saved Tasks → Schedule to run automatically every morning
- **Filter by dataset** — Pass only the datasets you need to reduce compute time
- **Custom news topics** — Set `news_topics` to track your specific industry or competitors
- **Integrate with webhooks** — Trigger downstream systems when new data arrives

### FAQ

**Is this legal to use?**
This Actor scrapes publicly available data from public RSS feeds and public APIs. Always ensure your use case complies with the terms of service of the data sources and applicable laws in your jurisdiction.

**How fresh is the data?**
As fresh as your last run. Schedule it daily for daily data.

**Can I request additional datasets?**
Open an issue in the Issues tab and describe what data you need.

# Actor input Schema

## `datasets` (type: `array`):

Choose which daily datasets to scrape in this run.

## `news_topics` (type: `array`):

Topics used when scraping Google News RSS. Only used if `news` is included in datasets.

## Actor input object example

```json
{
  "datasets": [
    "funding",
    "domain_drops",
    "patents",
    "crypto_prices",
    "news"
  ],
  "news_topics": [
    "startup funding",
    "AI",
    "tech layoffs"
  ]
}
```

# Actor output Schema

## `results` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "datasets": [
        "funding",
        "domain_drops",
        "patents",
        "crypto_prices",
        "news"
    ],
    "news_topics": [
        "startup funding",
        "AI",
        "tech layoffs"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("soft_but_savage/daily-datasets").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "datasets": [
        "funding",
        "domain_drops",
        "patents",
        "crypto_prices",
        "news",
    ],
    "news_topics": [
        "startup funding",
        "AI",
        "tech layoffs",
    ],
}

# Run the Actor and wait for it to finish
run = client.actor("soft_but_savage/daily-datasets").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "datasets": [
    "funding",
    "domain_drops",
    "patents",
    "crypto_prices",
    "news"
  ],
  "news_topics": [
    "startup funding",
    "AI",
    "tech layoffs"
  ]
}' |
apify call soft_but_savage/daily-datasets --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=soft_but_savage/daily-datasets",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Daily Data Feeds Scraper",
        "description": "Scrapes daily datasets: VC funding, domain drops, patents, crypto prices, and news.",
        "version": "1.0",
        "x-build-id": "ngkef1kawJDFkNmLh"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/soft_but_savage~daily-datasets/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-soft_but_savage-daily-datasets",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/soft_but_savage~daily-datasets/runs": {
            "post": {
                "operationId": "runs-sync-soft_but_savage-daily-datasets",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/soft_but_savage~daily-datasets/run-sync": {
            "post": {
                "operationId": "run-sync-soft_but_savage-daily-datasets",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "datasets": {
                        "title": "Datasets",
                        "type": "array",
                        "description": "Choose which daily datasets to scrape in this run.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "news_topics": {
                        "title": "News topics",
                        "type": "array",
                        "description": "Topics used when scraping Google News RSS. Only used if `news` is included in datasets.",
                        "items": {
                            "type": "string"
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
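
For projects that cannot use the client libraries, the `run-sync-get-dataset-items` path in the specification above can be called directly. The following is a minimal sketch using only the Python standard library. The path, server URL, and `token` query parameter come from the specification; the assumption that the response body is a JSON array of dataset items, the 300-second timeout, and the helper names `build_request` and `fetch_items` are illustrative.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/soft_but_savage~daily-datasets/run-sync-get-dataset-items"


def build_request(token, run_input):
    """Build the POST request that runs the Actor and returns its dataset items."""
    query = urllib.parse.urlencode({"token": token})
    body = json.dumps(run_input).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}{ACTOR_PATH}?{query}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_items(token, run_input, timeout=300):
    """Send the request and decode the response.

    Assumption: the body is a JSON array of dataset items; the
    specification above only documents a bare 200 "OK" response.
    """
    request = build_request(token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example (performs a real, billable run, so it is left commented out):
# items = fetch_items("<YOUR_API_TOKEN>", {"datasets": ["crypto_prices"]})
```

Passing the token in the query string follows the specification; keep such URLs out of logs, since they contain the secret.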
