# KAPSARC Energy Data Scraper (`parseforge/kapsarc-energy-data-scraper`) Actor

Pull energy datasets straight from the KAPSARC catalog by passing a dataset id such as `india-crude-oil-indicators`. Supports full-text search, field refinements, and sort order. Returns the raw record fields published by KAPSARC. Useful for energy research and policy analysis.

- **URL**: https://apify.com/parseforge/kapsarc-energy-data-scraper.md
- **Developed by:** [ParseForge](https://apify.com/parseforge) (community)
- **Categories:** Automation, Integrations, Business
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

From $7.50 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Since this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
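
As a rough illustration of the pay-per-event arithmetic, the sketch below estimates the cost of a run from the listed starting price. It assumes that each returned result is billed as one event at $7.50 per 1,000 and that no other events are charged; the actual event list and any plan discount may differ. `estimate_cost` is a hypothetical helper, not part of any Apify library.

```python
from decimal import Decimal

# Listed starting price from this page: $7.50 per 1,000 results.
PRICE_PER_1000_RESULTS = Decimal("7.50")


def estimate_cost(results: int, price_per_1000: Decimal = PRICE_PER_1000_RESULTS) -> Decimal:
    """Estimate cost in USD, assuming one billed event per result (an assumption)."""
    if results < 0:
        raise ValueError("results must be non-negative")
    # Round to four decimal places so small runs remain visible.
    return (price_per_1000 * results / 1000).quantize(Decimal("0.0001"))


if __name__ == "__main__":
    for n in (10, 100, 50_000):
        print(f"{n:>6} results -> ${estimate_cost(n)}")
```

Under these assumptions, 100 results come to about $0.75 and 50,000 results to about $375.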

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

![ParseForge Banner](https://github.com/ParseForge/apify-assets/blob/ad35ccc13ddd068b9d6cba33f323962e39aed5b2/banner.jpg?raw=true)

## KAPSARC Energy Data Scraper

> 🚀 **Export KAPSARC energy datasets in seconds. Production, consumption, trade flows, and price indicators from the KAPSARC public catalog.**

> 🕒 **Last updated:** 2026-06-05 · **📊 6 fields** per record · Public API · Real-time

The KAPSARC Energy Data Scraper turns the [datasource.kapsarc.org](https://datasource.kapsarc.org) public REST endpoint into a clean, structured dataset.

| 🎯 Target Audience | 💡 Primary Use Cases |
|---|---|
| 📊 Energy analysts | Pull oil and gas indicators |
| 💼 Policy teams | Snapshot energy trade flows |
| 🤖 ML engineers | Build training sets on energy data |
| 📰 Journalists | Verify policy figures |

### 📋 What the KAPSARC Energy Data Scraper does

- Calls https://datasource.kapsarc.org/api/records/1.0/search with the dataset you supply.
- Flattens fields per record into a clean row.
- Supports text query, refine filters, and sort.
- Surfaces upstream errors as a clean error record.

> 💡 **Why it matters.** The source returns raw API payloads that most data tools cannot ingest directly. This Actor normalizes everything into a flat row format ready for analysis.
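
The request described in the bullets above can be sketched as follows. The query parameter names (`dataset`, `q`, `rows`, `sort`, `refine.<field>`) follow the Opendatasoft Records API v1 convention that this endpoint path suggests; how the Actor itself maps its inputs onto them is an assumption, and `build_search_url` is a hypothetical helper.

```python
from typing import Optional
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://datasource.kapsarc.org/api/records/1.0/search"


def build_search_url(
    dataset: str,
    q: Optional[str] = None,
    refine_key: Optional[str] = None,
    refine_value: Optional[str] = None,
    sort: Optional[str] = None,
    rows: int = 10,
) -> str:
    """Build a search URL; parameter names are assumed, not confirmed by this README."""
    params = {"dataset": dataset, "rows": rows}
    if q:
        params["q"] = q
    # A refine filter needs both the field name and the value to match.
    if refine_key and refine_value:
        params[f"refine.{refine_key}"] = refine_value
    if sort:
        params["sort"] = sort
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("india-crude-oil-indicators", q="refinery", rows=50))
```

The sketch only builds the URL and sends nothing, so it can be checked without touching the upstream service.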

### 🎬 Full Demo

_🚧 Coming soon._

### ⚙️ Input

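<table>
<tr><th>Field</th><th>Type</th><th>Required</th><th>Description</th></tr>
<tr><td><code>dataset</code></td><td>string</td><td>Yes</td><td>KAPSARC dataset id (e.g. <code>india-crude-oil-indicators</code>).</td></tr>
<tr><td><code>maxItems</code></td><td>integer</td><td>No</td><td>Free users are limited to 10. Paid users can request up to 1,000,000. Prefilled with 10.</td></tr>
<tr><td><code>q</code></td><td>string</td><td>No</td><td>Optional full-text search query.</td></tr>
<tr><td><code>refineKey</code></td><td>string</td><td>No</td><td>Field name to refine by (e.g. country, year).</td></tr>
<tr><td><code>refineValue</code></td><td>string</td><td>No</td><td>Value to match in the refine field.</td></tr>
<tr><td><code>sort</code></td><td>string</td><td>No</td><td>Field name to sort results by. Prefix with minus for descending order.</td></tr>
</table>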

**Example 1.**
```json
{
  "dataset": "india-crude-oil-indicators",
  "maxItems": 100
}
````

**Example 2.**

```json
{
  "dataset": "india-crude-oil-indicators",
  "q": "refinery",
  "maxItems": 50
}
```
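
For inputs that combine several options, a small helper can catch mistakes before a run is started. The sketch below builds an input object from the fields in the input schema. The `maxItems` bounds come from that schema; the rule that `refineKey` and `refineValue` must be supplied together is an assumption, `build_input` is a hypothetical name, and the refine and sort field names are borrowed from the sample record without confirmation that the upstream dataset accepts them.

```python
import json
from typing import Any, Dict, Optional


def build_input(
    dataset: str,
    max_items: int = 10,
    q: Optional[str] = None,
    refine_key: Optional[str] = None,
    refine_value: Optional[str] = None,
    sort: Optional[str] = None,
) -> Dict[str, Any]:
    """Validate arguments and return an Actor input object."""
    if not dataset:
        raise ValueError("dataset is required")
    if not 1 <= max_items <= 1_000_000:
        raise ValueError("maxItems must be between 1 and 1,000,000")
    # Assumption: a refine filter only makes sense with both key and value.
    if (refine_key is None) != (refine_value is None):
        raise ValueError("refineKey and refineValue must be given together")
    actor_input: Dict[str, Any] = {"dataset": dataset, "maxItems": max_items}
    optional = {"q": q, "refineKey": refine_key, "refineValue": refine_value, "sort": sort}
    # Leave out options that were not set so the input stays minimal.
    actor_input.update({key: value for key, value in optional.items() if value is not None})
    return actor_input


if __name__ == "__main__":
    example = build_input(
        "india-crude-oil-indicators",
        max_items=50,
        refine_key="flow_breakdown",
        refine_value="Refinery intake",
        sort="-time_period",
    )
    print(json.dumps(example, indent=2))
```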

> ⚠️ **Good to Know.** This Actor uses a public endpoint. No login or API key is required.

### 📊 Output

| Field | Type | Description |
|---|---|---|
| 📁 `datasetid` | string | KAPSARC dataset id. |
| 🆔 `recordid` | string | Unique record id. |
| 🕒 `recordTimestamp` | string | Record timestamp from upstream. |
| 📊 `fields` | object | Flattened upstream fields. |
| 🕒 `scrapedAt` | string | When this row was fetched. |
| ❌ `error` | string | Set if upstream returned an error. |

**Sample record.**

```json
{
  "datasetid": "india-crude-oil-indicators",
  "recordid": "abc123",
  "recordTimestamp": "2025-11-09T10:24:58.941Z",
  "fields": {
    "obs_value": "0.0",
    "energy_product_name": "Total",
    "flow_breakdown": "Refinery intake",
    "time_period": "2016-11-01",
    "unit_measure": "CONVBBL"
  },
  "scrapedAt": "2026-06-05T12:00:00.000Z",
  "error": null
}
```
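
Because `fields` is a nested object, spreadsheet-style tools usually need it merged into the top level. The sketch below does that for the sample record above and writes one CSV row. `to_flat_row` is a hypothetical helper, and letting upstream keys sit unprefixed next to the metadata keys is one possible design, not something the Actor prescribes.

```python
import csv
import io
from typing import Any, Dict


def to_flat_row(record: Dict[str, Any]) -> Dict[str, Any]:
    """Merge the nested `fields` object into the top-level metadata keys."""
    row = {key: value for key, value in record.items() if key != "fields"}
    # Upstream keys win on a name clash; prefix them if that is a concern.
    row.update(record.get("fields") or {})
    return row


sample = {
    "datasetid": "india-crude-oil-indicators",
    "recordid": "abc123",
    "recordTimestamp": "2025-11-09T10:24:58.941Z",
    "fields": {
        "obs_value": "0.0",
        "energy_product_name": "Total",
        "flow_breakdown": "Refinery intake",
        "time_period": "2016-11-01",
        "unit_measure": "CONVBBL",
    },
    "scrapedAt": "2026-06-05T12:00:00.000Z",
    "error": None,
}

if __name__ == "__main__":
    row = to_flat_row(sample)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(row))
    writer.writeheader()
    writer.writerow(row)
    print(buffer.getvalue())
```

Records that carry only an `error` have no `fields`, which is why the helper falls back to an empty mapping.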

### ✨ Why choose this Actor

| | Benefit |
|---|---|
| 🆓 | Public source, no API key required. |
| 🧹 | Clean snake\_case fields ready for BI tools. |
| 🔢 | Numeric values auto-cast for spreadsheets. |
| 🛟 | Errors surfaced as a clean record instead of crashing. |
| 💾 | Push to dataset for instant export in multiple formats. |

### 📈 How it compares to alternatives

| Approach | Setup time | Clean fields? | Numeric casting? |
|---|---|---|---|
| Roll your own fetch | 30 min+ | ❌ | ❌ |
| Custom Python script | 1 hr+ | partial | partial |
| **This Actor** | 5 sec, no install | ✅ | ✅ |

### 🚀 How to use

1. Click **Try for free**.
2. Set `dataset` to the dataset id you want and adjust `maxItems` if needed (defaults to 10).
3. Click **Start**. Within seconds, your dataset is ready for download or integration.

### 💼 Business use cases

**📊 Analytics dashboards.** Pipe results into BI tools.

**💼 Backtesting.** Snapshot data daily for reproducible analysis.

**📰 Reporting.** Embed live numbers in newsletters or briefings.

**🤖 ML feature engineering.** Build training sets from public data.

### 🔌 Automating KAPSARC Energy Data Scraper

- **Make / Zapier.** Trigger this Actor on a schedule, push results to Airtable, Google Sheets, or Slack.
- **Cron schedule.** Native Apify scheduler.
- **Webhooks.** POST to your endpoint when a run finishes.
- **Pipe to BigQuery, Snowflake, Postgres.** Native Apify integrations.
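
For the webhook option above, a receiving service typically needs to pull the dataset id out of the notification before fetching results. The sketch below assumes the default Apify webhook payload shape, with an `eventType` string and a `resource` object holding the run, including `defaultDatasetId`. A custom payload template would change this, and `dataset_id_from_webhook` is a hypothetical helper.

```python
import json
from typing import Optional


def dataset_id_from_webhook(body: bytes) -> Optional[str]:
    """Return the run's default dataset id for a succeeded run, else None."""
    payload = json.loads(body)
    # Assumed event name for a successfully finished run.
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    resource = payload.get("resource") or {}
    return resource.get("defaultDatasetId")


if __name__ == "__main__":
    # Illustrative body only; the id is a placeholder, not a real dataset.
    example = json.dumps({
        "eventType": "ACTOR.RUN.SUCCEEDED",
        "resource": {"defaultDatasetId": "<DATASET_ID>"},
    }).encode("utf-8")
    print(dataset_id_from_webhook(example))
```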

### 🌟 Beyond business use cases

**🎓 Education.** Use real data in finance and economics classes.

**🧪 Personal research.** Track your own metrics without coding.

**🤝 Non-profit and open data.** Build public dashboards with current numbers.

**🧰 Tinkering and prototyping.** Spin up a feed in seconds for new ideas.

### 🤖 Ask an AI assistant about this scraper

Pop this README into ChatGPT, Claude, or any AI assistant and ask it to map your workflow to the Actor inputs.

### ❓ Frequently Asked Questions

**❓ Do I need an API key?** No. The KAPSARC datasource is fully public.

**❓ Where do I find dataset ids?** Browse the KAPSARC catalog at datasource.kapsarc.org/explore/.

**❓ Can I filter by year or country?** Yes, use the `refineKey` and `refineValue` inputs.

**❓ Will the schema change?** `datasetid`, `recordid`, `recordTimestamp`, and `fields` are stable.

**❓ Can I schedule runs?** Yes, via the Apify scheduler, Make, Zapier, or cron.

**❓ Is this scraping or API?** API. The KAPSARC search endpoint is public and stable.

**❓ What if upstream returns an error?** A single record with `error` populated is pushed.

**❓ Can I run heavy backfills?** Yes, on a paid plan, up to 1,000,000 items.

**❓ Which export formats are available?** The Apify dataset UI exposes multiple structured formats.

**❓ Is rate limiting strict?** KAPSARC is generous with public traffic.
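
Since an upstream failure arrives as an ordinary dataset item with `error` populated, downstream code may want to separate those items before analysis. A minimal sketch, assuming items shaped like the sample record; `split_errors` is a hypothetical helper and the error message shown is made up for illustration.

```python
from typing import Any, Dict, Iterable, List, Tuple

Item = Dict[str, Any]


def split_errors(items: Iterable[Item]) -> Tuple[List[Item], List[Item]]:
    """Partition dataset items into (usable rows, error records)."""
    usable: List[Item] = []
    failed: List[Item] = []
    for item in items:
        # A null or missing `error` means the record is a normal row.
        (failed if item.get("error") else usable).append(item)
    return usable, failed


if __name__ == "__main__":
    items = [
        {"recordid": "abc123", "fields": {"obs_value": "0.0"}, "error": None},
        {"error": "hypothetical upstream failure"},
    ]
    usable, failed = split_errors(items)
    print(len(usable), "usable,", len(failed), "failed")
```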

### 🔌 Integrate with any app

Apify ships native integrations with Make, Zapier, Slack, Discord, Google Drive, Google Sheets, Gmail, Airbyte, Keboola, Telegram, GitHub, and any REST API or webhook endpoint.

### 🔗 Recommended Actors

| Actor | What it does |
|---|---|
| [ParseForge Alpha Vantage Scraper](https://apify.com/parseforge/alpha-vantage-public-scraper) | Stocks, FX, crypto, and indicators. |
| [ParseForge OurAirports Scraper](https://apify.com/parseforge/ourairports-scraper) | Global airport database. |
| [ParseForge NBA Stats Scraper](https://apify.com/parseforge/nba-stats-scraper) | Player and team stats from NBA.com. |
| [ParseForge CurseForge Mods Scraper](https://apify.com/parseforge/curseforge-mods-scraper) | Public mod metadata from CurseForge. |

> 💡 **Pro Tip.** Browse the complete [ParseForge collection](https://apify.com/parseforge) for 900+ production-grade scrapers.

***

**Disclaimer.** This Actor scrapes only publicly available data. ParseForge is not affiliated with, endorsed by, or sponsored by any of the third-party services referenced. Users are responsible for complying with the target site's terms of service and applicable law. [Create a free account with a $5 credit](https://console.apify.com/sign-up?fpr=vmoqkp).

# Actor input Schema

## `dataset` (type: `string`):

KAPSARC dataset id (e.g. india-crude-oil-indicators). Browse the catalog at datasource.kapsarc.org/explore.

## `maxItems` (type: `integer`):

Free users: Limited to 10 items (preview). Paid users: Optional, max 1,000,000

## `q` (type: `string`):

Optional full-text search query.

## `refineKey` (type: `string`):

Field name to refine by (e.g. country, year).

## `refineValue` (type: `string`):

Value to match in the refine field.

## `sort` (type: `string`):

Field name to sort results by. Prefix with minus for descending order.
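
To make the minus-prefix convention concrete, the sketch below applies the same rule to rows that have already been downloaded. It only mirrors the described semantics locally and is not the Actor's or KAPSARC's implementation. `sort_rows` is a hypothetical helper, the record ids are invented, and `time_period` is taken from the sample record.

```python
from typing import Any, Dict, List


def sort_rows(rows: List[Dict[str, Any]], sort: str) -> List[Dict[str, Any]]:
    """Sort rows by a key inside `fields`; a leading minus means descending."""
    descending = sort.startswith("-")
    field = sort[1:] if descending else sort
    # Raises KeyError if a row lacks the field, which keeps bad input visible.
    return sorted(rows, key=lambda row: row["fields"][field], reverse=descending)


rows = [
    {"recordid": "a", "fields": {"time_period": "2016-11-01"}},
    {"recordid": "b", "fields": {"time_period": "2017-03-01"}},
    {"recordid": "c", "fields": {"time_period": "2015-06-01"}},
]

if __name__ == "__main__":
    print([row["recordid"] for row in sort_rows(rows, "-time_period")])
    print([row["recordid"] for row in sort_rows(rows, "time_period")])
```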

## Actor input object example

```json
{
  "dataset": "india-crude-oil-indicators",
  "maxItems": 10
}
```

# Actor output Schema

## `results` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "dataset": "india-crude-oil-indicators",
    "maxItems": 10
};

// Run the Actor and wait for it to finish
const run = await client.actor("parseforge/kapsarc-energy-data-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "dataset": "india-crude-oil-indicators",
    "maxItems": 10,
}

# Run the Actor and wait for it to finish
run = client.actor("parseforge/kapsarc-energy-data-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "dataset": "india-crude-oil-indicators",
  "maxItems": 10
}' |
apify call parseforge/kapsarc-energy-data-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=parseforge/kapsarc-energy-data-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "KAPSARC Energy Data Scraper",
        "description": "Pull energy datasets straight from the KAPSARC catalog by passing a dataset id like india crude oil indicators. Supports full text search, field refinements, and sort order. Returns the raw record fields published by KAPSARC. Useful for energy research and policy analysis.",
        "version": "0.1",
        "x-build-id": "nCNn6OzeP7TLxC8Nu"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/parseforge~kapsarc-energy-data-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-parseforge-kapsarc-energy-data-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/parseforge~kapsarc-energy-data-scraper/runs": {
            "post": {
                "operationId": "runs-sync-parseforge-kapsarc-energy-data-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/parseforge~kapsarc-energy-data-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-parseforge-kapsarc-energy-data-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "dataset"
                ],
                "properties": {
                    "dataset": {
                        "title": "Dataset id",
                        "type": "string",
                        "description": "KAPSARC dataset id (e.g. india-crude-oil-indicators). Browse the catalog at datasource.kapsarc.org/explore."
                    },
                    "maxItems": {
                        "title": "Max Items",
                        "minimum": 1,
                        "maximum": 1000000,
                        "type": "integer",
                        "description": "Free users: Limited to 10 items (preview). Paid users: Optional, max 1,000,000"
                    },
                    "q": {
                        "title": "Search query",
                        "type": "string",
                        "description": "Optional full-text search query."
                    },
                    "refineKey": {
                        "title": "Refine field",
                        "type": "string",
                        "description": "Field name to refine by (e.g. country, year)."
                    },
                    "refineValue": {
                        "title": "Refine value",
                        "type": "string",
                        "description": "Value to match in the refine field."
                    },
                    "sort": {
                        "title": "Sort by field",
                        "type": "string",
                        "description": "Field name to sort results by. Prefix with minus for descending order."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
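
For projects that call the REST API directly rather than through a client library, the `run-sync-get-dataset-items` path from the specification above can be exercised with the Python standard library alone. The sketch below only constructs the request; sending it needs a real token and may incur charges. `build_run_request` is a hypothetical helper.

```python
import json
import urllib.request
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/parseforge~kapsarc-energy-data-scraper/run-sync-get-dataset-items"


def build_run_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build a POST request with the token as a query parameter, per the spec."""
    url = f"{API_BASE}{ACTOR_PATH}?{urlencode({'token': token})}"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_run_request(
        "<YOUR_API_TOKEN>",
        {"dataset": "india-crude-oil-indicators", "maxItems": 10},
    )
    print(request.get_method(), request.full_url)
    # To actually send it (uses the network and a valid token):
    # with urllib.request.urlopen(request) as response:
    #     items = json.load(response)
```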
