# Latvia Companies Registry Scraper (`parseforge/latvia-lursoft-ur-scraper`) Actor

Reach into the data.gov.lv Uznemumu Registrs dataset of Latvian companies by resource id. Add an optional free-text query to filter results. Useful for KYC checks, cross-border due diligence, sales prospecting in the Baltics, and building Latvia-focused B2B company intelligence.

- **URL**: https://apify.com/parseforge/latvia-lursoft-ur-scraper.md
- **Developed by:** [ParseForge](https://apify.com/parseforge) (community)
- **Categories:** Business, Lead generation, Integrations
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $7.50 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Since this Actor supports Apify Store discounts, the higher your subscription plan, the lower the price.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
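
As a rough budgeting aid, the per-result price can be turned into a small estimate. This is a minimal sketch that assumes the listed $7.50 per 1,000 results is the rate on your plan and that no other events are charged; `estimate_cost_usd` is a hypothetical helper, not part of the Actor.

```python
from decimal import Decimal

# Assumption: the listed "from $7.50 / 1,000 results" rate applies to your plan.
# The actual rate depends on your subscription tier and may differ.
PRICE_PER_1000_USD = Decimal("7.50")


def estimate_cost_usd(result_count: int,
                      price_per_1000: Decimal = PRICE_PER_1000_USD) -> Decimal:
    """Estimate the per-result charge for a run that returns result_count items."""
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    return (price_per_1000 * result_count / 1000).quantize(Decimal("0.0001"))


if __name__ == "__main__":
    for count in (10, 500, 100_000):
        print(count, "results ->", estimate_cost_usd(count), "USD")
```

Check the Actor's pricing tab for the events that are actually billed before relying on the estimate.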

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

![ParseForge Banner](https://github.com/ParseForge/apify-assets/blob/ad35ccc13ddd068b9d6cba33f323962e39aed5b2/banner.jpg?raw=true)

## 🇱🇻 Latvia Companies Registry Scraper

> 🚀 **Export the Latvian Uznemumu Registrs (UR) open dataset from data.gov.lv into a clean, structured table.**

> 🕒 **Last updated:** 2026-06-05 · **📊 10 fields** per record · Public REST API · No login required

The Latvia Companies Registry Scraper calls the public [https://data.gov.lv/dati/lv/api/3/action/datastore_search](https://data.gov.lv/dati/lv/api/3/action/datastore_search) endpoint, parses the response, and flattens it into a clean, structured dataset with one row per record. You can scope the run with input filters and pull only the subset you need.

| 🎯 Target Audience | 💡 Primary Use Cases |
|---|---|
| 📊 Data analysts | Pull the full public dataset into a warehouse |
| 🤖 ML engineers | Build clean training sets without writing client code |
| 📰 Journalists & researchers | Verify facts in seconds |
| 👩‍💻 Developers | Mirror the upstream data into a database |
| 🏢 Product teams | Power dashboards and internal tools |
| 🎓 Students & educators | Free, structured datasets for projects |

### 📋 What the Latvia Companies Registry Scraper does

- Calls the public https://data.gov.lv/dati/lv/api/3/action/datastore_search endpoint with the input filters you supply.
- Parses the response, locates each record, and flattens it into a row.
- Casts numeric fields and surfaces upstream errors as a single record with the `error` field populated.
- Stops cleanly at `maxItems` so you never blow past your dataset budget.
- Exports to every format the Apify dataset supports: spreadsheet, warehouse, RSS, HTML, and more.
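
The request and stopping behavior described above can be sketched roughly as follows. This is an illustration only: it assumes the endpoint accepts the standard CKAN `datastore_search` parameters (`resource_id`, `q`, `limit`, `offset`), and the page size of 100 and both helper names are hypothetical. How the Actor itself paginates is not documented here.

```python
from urllib.parse import urlencode

BASE_URL = "https://data.gov.lv/dati/lv/api/3/action/datastore_search"


def build_search_url(resource_id, query=None, limit=100, offset=0):
    """Build a datastore_search URL for one page of results."""
    params = {"resource_id": resource_id, "limit": limit, "offset": offset}
    if query:
        # Assumption: free-text search maps to CKAN's `q` parameter.
        params["q"] = query
    return f"{BASE_URL}?{urlencode(params)}"


def page_requests(max_items, page_size=100):
    """Yield (offset, limit) pairs that together cover exactly max_items rows."""
    offset = 0
    while offset < max_items:
        limit = min(page_size, max_items - offset)
        yield offset, limit
        offset += limit


if __name__ == "__main__":
    resource_id = "25e80bf3-f107-4ab4-89ef-251b5b9374e9"
    for offset, limit in page_requests(max_items=250):
        print(build_search_url(resource_id, limit=limit, offset=offset))
```

Shrinking the last page to the remaining count is what keeps a run from fetching more rows than `maxItems` allows.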

> 💡 **Why it matters:** the upstream endpoint is public but the response is verbose and not analyst-ready. This Actor normalizes it into one clean row per record so the data drops straight into BigQuery, a Google Sheet, or a pandas DataFrame.

### 🎬 Full Demo

_🚧 Coming soon._

### ⚙️ Input

<table>
<tr><th>Field</th><th>Type</th><th>Required</th><th>Description</th></tr>
<tr><td><code>resourceId</code></td><td>string</td><td>No</td><td>data.gov.lv resource id for the UR companies dataset.</td></tr>
<tr><td><code>query</code></td><td>string</td><td>No</td><td>Optional free-text search.</td></tr>
<tr><td><code>maxItems</code></td><td>integer</td><td>No</td><td>Maximum number of records to return. Free users are limited to 10; paid users can request up to 1,000,000. Prefilled with 10.</td></tr>
</table>

**Example 1, default run:**
```json
{
  "maxItems": 10
}
````

**Example 2, larger pull:**

```json
{
  "maxItems": 500
}
```

> ⚠️ **Good to Know:** all input is validated; trailing whitespace is trimmed before the request fires. Free accounts are capped at 10 items per run as a preview; upgrade for the full dataset.
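
If you build inputs programmatically, it can help to apply the same rules on your side before starting a run. The sketch below is an approximation based only on the rules stated in this README (trailing whitespace trimmed, `maxItems` between 1 and 1,000,000, free accounts capped at 10); `normalize_input` and the `is_paid` flag are hypothetical and do not reflect the Actor's internal code.

```python
FREE_PLAN_CAP = 10             # from this README: free accounts are capped at 10 items
MAX_ITEMS_CEILING = 1_000_000  # from the input schema maximum


def normalize_input(raw: dict, is_paid: bool) -> dict:
    """Return a cleaned copy of the input, raising ValueError on bad maxItems."""
    max_items = raw.get("maxItems", FREE_PLAN_CAP)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_items, bool) or not isinstance(max_items, int):
        raise ValueError("maxItems must be an integer")
    if max_items < 1:
        raise ValueError("maxItems must be at least 1")

    ceiling = MAX_ITEMS_CEILING if is_paid else FREE_PLAN_CAP
    normalized = {"maxItems": min(max_items, ceiling)}

    # Trim trailing whitespace and drop optional fields that end up empty.
    for key in ("resourceId", "query"):
        value = str(raw.get(key) or "").rstrip()
        if value:
            normalized[key] = value
    return normalized


if __name__ == "__main__":
    print(normalize_input({"query": "SIA  ", "maxItems": 500}, is_paid=False))
```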

### 📊 Output

Each record is a flat object. `imageUrl` is always first, `error` is always last.

| Field | Type | Description |
|---|---|---|
| 🖼️ `imageUrl` | string | Optional image. |
| 🆔 `regNr` | string | Registration number. |
| 🏢 `name` | string | Registered company name. |
| ⚖️ `legalForm` | string | Legal form code. |
| ⚡ `status` | string | Registration status. |
| 📮 `address` | string | Registered address. |
| 📅 `registered` | string | Registration date. |
| 📦 `raw` | object | Full upstream record. |
| 🕒 `scrapedAt` | string | When this row was fetched. |
| ❌ `error` | string | Error message. |

**Sample record:**

```json
{
  "imageUrl": "example",
  "regNr": "example",
  "name": "example",
  "legalForm": "example",
  "status": "example",
  "address": "example",
  "registered": "example",
  "raw": {},
  "scrapedAt": "2026-06-05T12:00:00.000Z",
  "error": null
}
```
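
Because `raw` is a nested object, it needs a decision when you flatten records into a spreadsheet-style file yourself. One option, sketched below with only the standard library, is to serialize `raw` as a JSON string in its own column. The column order follows the field table above; `records_to_csv` is a hypothetical helper.

```python
import csv
import io
import json

# Column order taken from the output field table in this README.
FIELDS = [
    "imageUrl", "regNr", "name", "legalForm", "status",
    "address", "registered", "raw", "scrapedAt", "error",
]


def records_to_csv(records):
    """Serialize Actor records to CSV text, storing `raw` as a JSON string."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for record in records:
        row = dict(record)  # copy so the caller's record is not modified
        row["raw"] = json.dumps(row.get("raw") or {}, ensure_ascii=False)
        writer.writerow(row)  # None values are written as empty cells
    return buffer.getvalue()


if __name__ == "__main__":
    sample = {
        "imageUrl": "example", "regNr": "example", "name": "example",
        "legalForm": "example", "status": "example", "address": "example",
        "registered": "example", "raw": {},
        "scrapedAt": "2026-06-05T12:00:00.000Z", "error": None,
    }
    print(records_to_csv([sample]))
```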

### ✨ Why choose this Actor

| | Benefit |
|---|---|
| 🆓 | Works with the free Apify tier and the public upstream endpoint. |
| 🧹 | Clean column names ready for BI tools, spreadsheets, and warehouses. |
| 🔢 | Numeric strings auto-cast to real numbers when applicable. |
| 🛟 | Surfaces upstream rate-limit and error notes as a clean `error` record instead of crashing. |
| 🚦 | Respects `maxItems` for predictable run cost. |
| 💾 | Push to dataset, instant export to every format the Apify dataset UI supports. |

### 📈 How it compares to alternatives

| Approach | Setup time | Clean keys? | Numeric casting? | Rate-limit handling? |
|---|---|---|---|---|
| Roll your own `fetch` | 30 min + | ❌ | ❌ | ❌ |
| Generic CKAN / API client | 1 hr install + script | partial | ❌ | partial |
| **This Actor** | 5 sec, no install | ✅ | ✅ | ✅ |

### 🚀 How to use

1. Click **Try for free**.
2. Pick your input filters (or leave defaults).
3. Click **Start**. Within seconds your dataset is ready to download or pipe to your warehouse.
4. (Optional) Schedule the Actor to refresh automatically.

### 💼 Business use cases

**📊 BI and reporting.** Wire the Actor to a scheduled run, push results to BigQuery or Postgres, and serve a live dashboard.

**🤖 ML and feature engineering.** Build a clean labelled dataset for training without writing client code or maintaining auth.

**📰 Newsroom and research.** Verify a fact, snapshot a public record, and embed structured tables in your story.

**🧭 Operational monitoring.** Track a public dataset over time, alert on changes, and feed downstream automation.

### 🔌 Automating Latvia Companies Registry Scraper

- **Make / Zapier**: trigger this Actor on a schedule, push results to Airtable, Google Sheets, or Slack.
- **Cron schedule**: native Apify scheduler, run hourly, daily, or weekly.
- **Webhooks**: get a POST to your endpoint the moment a run finishes.
- **Pipe to BigQuery / Snowflake / Postgres**: native Apify integrations move datasets straight into your warehouse.
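
For the webhook route, a receiver mainly needs to pull the dataset ID out of the POST body and fetch the items. The sketch below assumes Apify's default webhook payload, in which `eventType` names the event and `resource` holds the run object with its `defaultDatasetId`; if you customize the payload template, adjust accordingly. `dataset_items_url_from_webhook` is a hypothetical helper.

```python
import json
from typing import Optional

DATASET_ITEMS_URL = "https://api.apify.com/v2/datasets/{dataset_id}/items?format=json"


def dataset_items_url_from_webhook(body: bytes) -> Optional[str]:
    """Return the dataset items URL for a succeeded run, or None otherwise."""
    payload = json.loads(body)
    # Assumption: default payload shape with `eventType` and `resource`.
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    dataset_id = (payload.get("resource") or {}).get("defaultDatasetId")
    if not dataset_id:
        return None
    return DATASET_ITEMS_URL.format(dataset_id=dataset_id)


if __name__ == "__main__":
    example = json.dumps({
        "eventType": "ACTOR.RUN.SUCCEEDED",
        "resource": {"defaultDatasetId": "abc123"},
    }).encode("utf-8")
    print(dataset_items_url_from_webhook(example))
```

Fetching the returned URL typically also requires your API token unless the dataset is shared publicly.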

### 🌟 Beyond business use cases

**🎓 Education.** Free, structured datasets for students learning data analysis or statistics.

**🧪 Personal research.** Track a topic you care about, build a personal data project, share it on GitHub.

**🤝 Non-profit and open data.** Power public dashboards and civic-tech projects without writing client code.

**🧰 Tinkering and prototyping.** Spin up a clean data feed in seconds to test a new chart library or app idea.

### 🤖 Ask an AI assistant about this scraper

Pop this README into ChatGPT, Claude, or any AI assistant and ask it to map your specific workflow to the Actor's inputs. The schema, examples, and field list above contain everything an LLM needs to design a working pipeline.

### ❓ Frequently Asked Questions

**❓ Do I need an API key?** No. The upstream endpoint is fully public.

**❓ Is there a rate limit?** The upstream endpoint may rate-limit requests. This Actor surfaces any rate-limit notes as a clean `error` record.

**❓ Which formats can I export?** Every format the Apify dataset UI supports, including spreadsheet, warehouse, RSS, HTML, and feed formats.

**❓ Can I schedule runs?** Yes, use the native Apify scheduler or hook this up to Make, Zapier, or cron.

**❓ Is this scraping or API?** API. The upstream endpoint is fully public; this Actor just normalizes the response.

**❓ Will the schema change?** Core fields are stable. Source-specific fields are passed through as-is when present.

**❓ How are errors handled?** The Actor does not fail the run on upstream problems. Upstream errors and rate-limit notes are pushed as a single record with `error` populated.

**❓ Can I limit the run size?** Yes. Set `maxItems` to cap the dataset; free accounts are auto-capped at 10.

**❓ Does it work behind a proxy?** Yes, the Apify platform handles outbound networking for you.

**❓ Is the data deduplicated?** Records are pushed as the upstream returns them; downstream deduping is up to you.
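
Two of the answers above (error records and deduplication) leave some post-processing to you. A minimal sketch of that step, assuming `regNr` uniquely identifies a company and that keeping the first occurrence is acceptable; both helper names are hypothetical.

```python
def split_errors(items):
    """Separate normal rows from records whose `error` field is populated."""
    rows = [item for item in items if not item.get("error")]
    errors = [item for item in items if item.get("error")]
    return rows, errors


def dedupe_by_reg_nr(rows):
    """Keep the first row per regNr; rows without a regNr are all kept."""
    seen = set()
    unique = []
    for row in rows:
        key = row.get("regNr")
        if key is not None:
            if key in seen:
                continue
            seen.add(key)
        unique.append(row)
    return unique


if __name__ == "__main__":
    items = [
        {"regNr": "A", "name": "first", "error": None},
        {"regNr": "A", "name": "duplicate", "error": None},
        {"regNr": None, "name": None, "error": "upstream rate limit"},
    ]
    rows, errors = split_errors(items)
    print(len(dedupe_by_reg_nr(rows)), "unique rows,", len(errors), "error record(s)")
```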

### 🔌 Integrate with any app

Apify ships native integrations with Make, Zapier, Slack, Discord, Google Drive, Google Sheets, Gmail, Airbyte, Keboola, Telegram, GitHub, and any REST API or webhook endpoint. Trigger runs from a calendar event, a form submission, a cron job, or pipe results straight into BigQuery, Snowflake, or a Postgres warehouse.

### 🔗 Recommended Actors

| Actor | What it does |
|---|---|
| [ParseForge Alpha Vantage Scraper](https://apify.com/parseforge/alpha-vantage-public-scraper) | Stocks, FX, crypto, indicators. |
| [ParseForge OurAirports Scraper](https://apify.com/parseforge/ourairports-scraper) | Global airport database. |
| [ParseForge USGS Earthquake Scraper](https://apify.com/parseforge) | Real-time public earthquake feed. |
| [ParseForge NWS Weather Alerts Scraper](https://apify.com/parseforge) | Live US weather alerts. |

> 💡 **Pro Tip:** browse the complete [ParseForge collection](https://apify.com/parseforge) for 900+ production-grade scrapers across business intelligence, real estate, e-commerce, sports, finance, and public records.

***

**Disclaimer:** This Actor collects only publicly available data. ParseForge is not affiliated with, endorsed by, or sponsored by any of the third-party services referenced. Users are responsible for complying with the target site's terms of service and applicable law. [Create a free account with $5 credit](https://console.apify.com/sign-up?fpr=vmoqkp).

# Actor input Schema

## `resourceId` (type: `string`):

data.gov.lv resource id for the UR companies dataset.

## `maxItems` (type: `integer`):

Free users: Limited to 10 items (preview). Paid users: Optional, max 1,000,000

## `query` (type: `string`):

Optional free-text search.

## Actor input object example

```json
{
  "resourceId": "25e80bf3-f107-4ab4-89ef-251b5b9374e9",
  "maxItems": 10
}
```

# Actor output Schema

## `results` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "resourceId": "25e80bf3-f107-4ab4-89ef-251b5b9374e9",
    "maxItems": 10
};

// Run the Actor and wait for it to finish
const run = await client.actor("parseforge/latvia-lursoft-ur-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "resourceId": "25e80bf3-f107-4ab4-89ef-251b5b9374e9",
    "maxItems": 10,
}

# Run the Actor and wait for it to finish
run = client.actor("parseforge/latvia-lursoft-ur-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "resourceId": "25e80bf3-f107-4ab4-89ef-251b5b9374e9",
  "maxItems": 10
}' |
apify call parseforge/latvia-lursoft-ur-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=parseforge/latvia-lursoft-ur-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Latvia Companies Registry Scraper",
        "description": "Reach into the data.gov.lv Uznemumu Registrs dataset of Latvian companies by resource id. Add an optional free text query to filter results. Useful for KYC checks, cross border due diligence, sales prospecting in the Baltics, and building Latvia focused B2B company intelligence.",
        "version": "0.1",
        "x-build-id": "Cfxqd8N7p6x4fAdCv"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/parseforge~latvia-lursoft-ur-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-parseforge-latvia-lursoft-ur-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/parseforge~latvia-lursoft-ur-scraper/runs": {
            "post": {
                "operationId": "runs-sync-parseforge-latvia-lursoft-ur-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/parseforge~latvia-lursoft-ur-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-parseforge-latvia-lursoft-ur-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "resourceId": {
                        "title": "Resource ID",
                        "type": "string",
                        "description": "data.gov.lv resource id for the UR companies dataset."
                    },
                    "maxItems": {
                        "title": "Max Items",
                        "minimum": 1,
                        "maximum": 1000000,
                        "type": "integer",
                        "description": "Free users: Limited to 10 items (preview). Paid users: Optional, max 1,000,000"
                    },
                    "query": {
                        "title": "Search Query",
                        "type": "string",
                        "description": "Optional free-text search."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
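
If you prefer not to install a client library, the `run-sync-get-dataset-items` path from the specification above can be called with the Python standard library alone. This is a minimal sketch: the path and the `token` query parameter come from the specification, while the helper names, the 300-second timeout, and the assumption that the response body is a JSON array of dataset items are mine.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_BASE = "https://api.apify.com/v2"
RUN_SYNC_PATH = "/acts/parseforge~latvia-lursoft-ur-scraper/run-sync-get-dataset-items"


def build_run_sync_request(token, run_input):
    """Build the POST request that starts a run and waits for its dataset items."""
    url = f"{API_BASE}{RUN_SYNC_PATH}?{urlencode({'token': token})}"
    body = json.dumps(run_input).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def run_actor(token, run_input, timeout=300):
    """Send the request and decode the response (assumed to be a JSON array)."""
    request = build_run_sync_request(token, run_input)
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Replace '<YOUR_API_TOKEN>' with your token before running.
    items = run_actor("<YOUR_API_TOKEN>", {"maxItems": 10})
    for item in items:
        print(item)
```

Synchronous calls suit small pulls; for large `maxItems` values, starting the run through the `/runs` path and reading the dataset afterwards avoids holding a long HTTP connection open.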
