# Subito.it Scraper (`shahidirfan/subito-it-scraper`) Actor

Extract classified listings from Subito.it at scale. Scrape prices, product details, seller information, and item descriptions for market research, competitor analysis, and lead generation. Fast, reliable bulk data extraction from Italy's largest classifieds marketplace.

- **URL**: https://apify.com/shahidirfan/subito-it-scraper.md
- **Developed by:** [Shahid Irfan](https://apify.com/shahidirfan) (community)
- **Categories:** Automation, Lead generation, E-commerce
- **Stats:** 1 total user, 0 monthly users, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is billed per platform usage. The Actor itself is free to use; you pay only for the Apify platform resources it consumes, and those get cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server that can be used as a website, an API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows (PowerShell)
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see the Apify documentation as a [Markdown index](https://docs.apify.com/llms.txt) or as [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Subito.it Scraper

Collect rich Subito.it listing data from search pages, category pages, and generic marketplace URLs. This Actor is built for fast, large-scale extraction with stable pagination and structured output for analysis, monitoring, and automation.

### Features

- **Flexible input modes** — Use a Subito URL or use keyword and location inputs.
- **Category-aware extraction** — Supports category URLs and resolves category filters automatically.
- **Pagination control** — Collect exactly the amount of data you need with page and result limits.
- **Clean dataset output** — Removes null values from output records before saving.
- **Rich listing fields** — Returns listing metadata, pricing, geo info, images, advertiser info, and full feature sets.

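The "clean dataset output" behavior can be pictured with a small sketch. The Actor's own cleaning code is not shown in this README, so `strip_nulls` below is a hypothetical helper, and the choice to clean nested objects and lists recursively is an assumption rather than documented behavior. It may still be useful for applying the same rule to data merged from other sources.

```python
from typing import Any


def strip_nulls(value: Any) -> Any:
    """Recursively drop None values from dicts and lists (hypothetical helper)."""
    if isinstance(value, dict):
        return {k: strip_nulls(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [strip_nulls(v) for v in value if v is not None]
    return value


# Illustrative record; field names follow the Output Data table.
record = {
    "title": "Occhiali Moscot Lemtosh Matte Black",
    "shipping_cost": None,
    "geo": {"region": "Lazio", "town": None},
}
cleaned = strip_nulls(record)
print(cleaned)
```
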
---

### Use Cases

#### Marketplace Monitoring
Track active listings and observe how categories, pricing, and volume change over time.

#### Competitive Research
Build datasets for category-level and keyword-level intelligence across multiple markets.

#### Lead Generation
Extract filtered listing feeds for business development and sales operations.

#### Pricing Analysis
Monitor historical and current pricing trends by category, location, and listing type.

#### BI and Reporting
Export structured data to dashboards, spreadsheets, and data warehouses.

---

### Input Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `startUrl` | String | No | `"https://www.subito.it/annunci-italia/vendita/usato/"` | Subito URL to start from. |
| `keyword` | String | No | `"occhiali moscot"` | Search keyword used when URL has no query term. |
| `location` | String | No | `"Roma"` | Optional location text combined with keyword. |
| `results_wanted` | Integer | No | `20` | Maximum number of records to save. |
| `max_pages` | Integer | No | `10` | Maximum number of pages to fetch. |
| `proxyConfiguration` | Object | No | `{"useApifyProxy": false}` | Proxy settings. Apify Proxy with the residential group is recommended for reliable production runs. |

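As a rough illustration of how these parameters fit together, the sketch below assembles an input object client-side before a run is started. `build_input` is a hypothetical helper, not part of the Actor or the Apify client. The lower bound of 1 for `results_wanted` and `max_pages` comes from the input schema, while the rule that at least one of `startUrl` or `keyword` must be present is an assumption of this sketch.

```python
from typing import Any, Dict, Optional


def build_input(
    start_url: Optional[str] = None,
    keyword: Optional[str] = None,
    location: Optional[str] = None,
    results_wanted: int = 20,
    max_pages: int = 10,
) -> Dict[str, Any]:
    """Assemble an Actor input dict, omitting parameters that were not set."""
    # Assumption of this sketch: a run needs a URL or a keyword to be useful.
    if not start_url and not keyword:
        raise ValueError("Provide start_url or keyword")
    # The input schema declares a minimum of 1 for both limits.
    if results_wanted < 1 or max_pages < 1:
        raise ValueError("results_wanted and max_pages must be at least 1")
    candidate = {
        "startUrl": start_url,
        "keyword": keyword,
        "location": location,
        "results_wanted": results_wanted,
        "max_pages": max_pages,
    }
    return {key: value for key, value in candidate.items() if value is not None}


keyword_input = build_input(keyword="occhiali moscot", location="Roma")
print(keyword_input)
```
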
---

### Output Data

Each dataset item can contain the following fields:

| Field | Type | Description |
|---|---|---|
| `urn` | String | Unique listing identifier. |
| `title` | String | Listing title. |
| `description` | String | Listing description text. |
| `ad_type` | String | Listing type label (for example, `In vendita`). |
| `ad_type_key` | String | Compact type key. |
| `category` | String | Category name. |
| `category_id` | String | Category ID. |
| `category_slug` | String | Category slug. |
| `macrocategory_id` | String | Macro category ID when present. |
| `posted_at` | String | Publish timestamp. |
| `expires_at` | String | Expiration timestamp. |
| `price` | Number or String | Price parsed from listing features. |
| `shipping_cost` | Number or String | Shipping cost when available. |
| `image_count` | Integer | Number of listing images. |
| `image_urls` | Array | Listing image URLs. |
| `region` | String | Region name, also exposed at the top level for easier filtering. |
| `region_id` | String | Region ID, also exposed at the top level. |
| `city` | String | City name, also exposed at the top level for easier filtering. |
| `city_id` | String | City ID, also exposed at the top level. |
| `town` | String | Town name, also exposed at the top level. |
| `town_id` | String | Town ID, also exposed at the top level. |
| `advertiser` | Object | Advertiser data (id, name, company, type). |
| `geo` | Object | Region, city, and town information. |
| `url` | String | Listing URL. |
| `mobile_url` | String | Mobile listing URL. |
| `features` | Object | Structured listing features. |

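Because `price` and `shipping_cost` are typed as Number or String, downstream code may want to coerce them to a single numeric type before aggregating. The sketch below is one conservative way to do that. `price_as_float` is a hypothetical helper, and the string forms it accepts (a plain numeral, optionally with a euro sign) are an assumption, since the README does not specify what string values look like. Anything it cannot parse is returned as `None` rather than guessed.

```python
from typing import Any, Optional


def price_as_float(value: Any) -> Optional[float]:
    """Coerce a price-like value to float, or return None if it is not numeric."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; treat it as not a price.
        return None
    if isinstance(value, (int, float)):
        return float(value)
    if isinstance(value, str):
        text = value.replace("€", "").strip()
        try:
            return float(text)
        except ValueError:
            return None
    return None


print(price_as_float(330))
print(price_as_float(" 45.5 € "))
print(price_as_float("Trattabile"))
```
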
---

### Usage Examples

#### Marketplace URL

```json
{
	"startUrl": "https://www.subito.it/annunci-italia/vendita/usato/",
	"results_wanted": 20,
	"max_pages": 10
}
````

#### Keyword Search

```json
{
	"keyword": "occhiali moscot",
	"location": "Roma",
	"results_wanted": 50,
	"max_pages": 20
}
```

#### Category URL Search

```json
{
	"startUrl": "https://www.subito.it/annunci-italia/vendita/auto/",
	"results_wanted": 30,
	"max_pages": 15
}
```

***

### Sample Output

```json
{
	"urn": "id:ad:b7cf54df-9aca-4df7-a4b9-dfc5dd8ed058:list:639348779",
	"title": "Occhiali Moscot Lemtosh Matte Black",
	"description": "Occhiale Lemtosh Matte Black with custom made tints...",
	"ad_type": "In vendita",
	"ad_type_key": "s",
	"category": "Abbigliamento e Accessori",
	"category_id": "16",
	"posted_at": "2026-04-20T08:26:36.564+0200",
	"price": 330,
	"image_count": 4,
	"geo": {
		"region": "Lazio",
		"region_id": "11",
		"city": "Roma",
		"city_id": "4",
		"town": "Roma",
		"town_id": "058091"
	},
	"url": "https://www.subito.it/abbigliamento-accessori/occhiali-moscot-lemtosh-matte-black-roma-639348779.htm"
}
```
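
Two fields in the sample above usually need a little parsing before analysis: `posted_at` carries a UTC offset without a colon, and `urn` appears to end with the same numeric ID that is embedded in `url`. The sketch below handles both. The helper names are hypothetical, and the URN layout is inferred from this single sample, so treat it as an assumption rather than a documented format.

```python
from datetime import datetime, timedelta, timezone


def parse_posted_at(value: str) -> datetime:
    """Parse timestamps shaped like 2026-04-20T08:26:36.564+0200."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%f%z")


def listing_id_from_urn(urn: str) -> str:
    """Return the last colon-separated segment of the URN (assumed to be the listing ID)."""
    return urn.rsplit(":", 1)[-1]


sample = {
    "urn": "id:ad:b7cf54df-9aca-4df7-a4b9-dfc5dd8ed058:list:639348779",
    "posted_at": "2026-04-20T08:26:36.564+0200",
    "url": "https://www.subito.it/abbigliamento-accessori/occhiali-moscot-lemtosh-matte-black-roma-639348779.htm",
}

posted = parse_posted_at(sample["posted_at"])
posted_utc = posted.astimezone(timezone.utc)
listing_id = listing_id_from_urn(sample["urn"])

print(posted_utc.isoformat())
print(listing_id, listing_id in sample["url"])
```

Normalizing to UTC before storing makes it easier to compare listings posted across daylight-saving changes.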

***

### Tips For Best Results

#### Use Valid Subito URLs

- Prefer category and search-result URLs directly from Subito.
- Keep URL parameters when you need exact filtering behavior.

#### Start Small, Then Scale

- Start with `results_wanted: 20` for quick validation.
- Increase limits once your query pattern is confirmed.

#### Use Proxies In Production

- Residential proxies help stabilize larger runs.
- Keep retries and timeouts conservative for long schedules.

***

### Proxy Configuration

```json
{
	"proxyConfiguration": {
		"useApifyProxy": true,
		"apifyProxyGroups": ["RESIDENTIAL"]
	}
}
```
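
When inputs are built in code, the proxy block above can be merged into an existing input rather than retyped for each run. The following is a minimal sketch. `with_residential_proxy` is a hypothetical helper, and it copies the input so the caller's dictionary is left unchanged.

```python
import copy
from typing import Any, Dict

# Mirrors the JSON proxy configuration shown above.
RESIDENTIAL_PROXY: Dict[str, Any] = {
    "useApifyProxy": True,
    "apifyProxyGroups": ["RESIDENTIAL"],
}


def with_residential_proxy(run_input: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of run_input with the residential proxy settings applied."""
    merged = copy.deepcopy(run_input)
    merged["proxyConfiguration"] = copy.deepcopy(RESIDENTIAL_PROXY)
    return merged


base_input = {"keyword": "occhiali moscot", "location": "Roma"}
production_input = with_residential_proxy(base_input)
print(production_input)
```

The resulting dictionary can be passed as `run_input` in the Python example in the API section.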

***

### Integrations

- **Google Sheets** — Build quick listing trackers.
- **Airtable** — Create searchable listing databases.
- **Make** — Trigger downstream automations.
- **Zapier** — Connect listing events to business tools.
- **Webhooks** — Push fresh data to your systems.

#### Export Formats

- **JSON** — Best for APIs and custom pipelines.
- **CSV** — Best for spreadsheet analysis.
- **Excel** — Best for operational reporting.
- **XML** — Best for legacy integrations.

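The platform exports listed above are usually enough. If nested fields such as `geo` or `image_urls` need a custom layout, dataset items can also be flattened locally. The sketch below is one possible approach using only the standard library. The helper names, the dotted column names, and the `|` separator for arrays are choices made for this example, not the format the Apify CSV export produces.

```python
import csv
import io
from typing import Any, Dict, List


def flatten(item: Dict[str, Any], parent: str = "") -> Dict[str, Any]:
    """Flatten nested dicts into dotted keys and join lists with '|'."""
    flat: Dict[str, Any] = {}
    for key, value in item.items():
        name = f"{parent}.{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        elif isinstance(value, list):
            flat[name] = "|".join(str(v) for v in value)
        else:
            flat[name] = value
    return flat


def items_to_csv(items: List[Dict[str, Any]]) -> str:
    """Render items as CSV text; columns appear in first-seen order."""
    rows = [flatten(item) for item in items]
    fieldnames: List[str] = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    # restval fills columns that a given record lacks (nulls are not saved).
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


items = [
    {"title": "A", "price": 330, "geo": {"city": "Roma"}},
    {"title": "B", "image_urls": ["u1", "u2"]},
]
csv_text = items_to_csv(items)
print(csv_text)
```

Because records omit null fields, the column set is collected across all items rather than taken from the first one.
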
***

### Frequently Asked Questions

#### Can I use only keyword and location without URL?

Yes. Provide `keyword` and, optionally, `location`, and the Actor runs a keyword search.

#### Does the actor support category URLs?

Yes. Category slugs are resolved and used to query matching listing data.

#### Are empty or null fields saved?

No. Output records are cleaned so null values are removed before saving.

#### Can I collect thousands of records?

Yes. Increase `results_wanted` and `max_pages` based on your use case and runtime constraints.

#### Is proxy configuration required?

Not always, but recommended for production stability and higher-volume extraction.

***

### Support

For issues or feature requests, open the Actor in Apify Console and use the support options there.

#### Resources

- [Apify Documentation](https://docs.apify.com/)
- [Apify API Reference](https://docs.apify.com/api/v2)
- [Apify Schedules](https://docs.apify.com/platform/schedules)

***

### Legal Notice

This Actor is intended for legitimate data collection and analytics workflows. You are responsible for complying with website terms and all applicable laws in your jurisdiction.

# Actor input Schema

## `startUrl` (type: `string`):

A Subito listing URL, for example https://www.subito.it/annunci-italia/vendita/usato/

## `keyword` (type: `string`):

Search keyword used when URL has no query term.

## `location` (type: `string`):

Optional location text merged into the keyword search.

## `results_wanted` (type: `integer`):

Maximum number of records to save.

## `max_pages` (type: `integer`):

Maximum number of API pages to fetch.

## `proxyConfiguration` (type: `object`):

Use Apify Proxy for stable production runs.

## Actor input object example

```json
{
  "startUrl": "https://www.subito.it/annunci-italia/vendita/motori/",
  "keyword": "occhiali moscot",
  "location": "Roma",
  "results_wanted": 20,
  "max_pages": 10,
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}
```

# Actor output Schema

## `overview` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrl": "https://www.subito.it/annunci-italia/vendita/motori/",
    "keyword": "occhiali moscot",
    "location": "Roma",
    "results_wanted": 20,
    "max_pages": 10
};

// Run the Actor and wait for it to finish
const run = await client.actor("shahidirfan/subito-it-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrl": "https://www.subito.it/annunci-italia/vendita/motori/",
    "keyword": "occhiali moscot",
    "location": "Roma",
    "results_wanted": 20,
    "max_pages": 10,
}

# Run the Actor and wait for it to finish
run = client.actor("shahidirfan/subito-it-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrl": "https://www.subito.it/annunci-italia/vendita/motori/",
  "keyword": "occhiali moscot",
  "location": "Roma",
  "results_wanted": 20,
  "max_pages": 10
}' |
apify call shahidirfan/subito-it-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=shahidirfan/subito-it-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Subito.it Scraper",
        "description": "Extract classified listings from Subito.it at scale. Scrape prices, product details, seller information, and item descriptions for market research, competitor analysis, and lead generation. Fast, reliable bulk data extraction from Italy's largest classifieds marketplace.",
        "version": "0.0",
        "x-build-id": "ZlPjctogN8MVZrKX2"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/shahidirfan~subito-it-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-shahidirfan-subito-it-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/shahidirfan~subito-it-scraper/runs": {
            "post": {
                "operationId": "runs-sync-shahidirfan-subito-it-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/shahidirfan~subito-it-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-shahidirfan-subito-it-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "startUrl": {
                        "title": "URL",
                        "type": "string",
                        "description": "A Subito listing URL, for example https://www.subito.it/annunci-italia/vendita/usato/"
                    },
                    "keyword": {
                        "title": "Keyword",
                        "type": "string",
                        "description": "Search keyword used when URL has no query term."
                    },
                    "location": {
                        "title": "Location",
                        "type": "string",
                        "description": "Optional location text merged into the keyword search."
                    },
                    "results_wanted": {
                        "title": "Results wanted",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Maximum number of records to save.",
                        "default": 20
                    },
                    "max_pages": {
                        "title": "Max pages",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Maximum number of API pages to fetch.",
                        "default": 10
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Use Apify Proxy for stable production runs.",
                        "default": {
                            "useApifyProxy": false
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
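
For projects that cannot use the client libraries, the `run-sync-get-dataset-items` path from the specification above can be called directly. The sketch below uses only the Python standard library. The helper names and the 300-second timeout are assumptions made for this example. The response is expected to be a JSON array of dataset items, but check the Apify API reference for format options and run-time limits before relying on it.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict, List

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "shahidirfan~subito-it-scraper"


def build_request(token: str, run_input: Dict[str, Any]) -> urllib.request.Request:
    """Build the POST request; the token is sent as a query parameter per the spec."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    body = json.dumps(run_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync_get_items(token: str, run_input: Dict[str, Any], timeout: float = 300) -> List[Dict[str, Any]]:
    """Run the Actor and return its dataset items (performs a network call)."""
    with urllib.request.urlopen(build_request(token, run_input), timeout=timeout) as response:
        return json.load(response)


# Building the request does not contact the API.
request = build_request("<YOUR_API_TOKEN>", {"keyword": "occhiali moscot", "results_wanted": 20})
print(request.get_method(), request.full_url)
```

Calling `run_sync_get_items("<YOUR_API_TOKEN>", {...})` with a real token would start a run and block until it finishes, so it suits small jobs. For longer runs, the `/runs` endpoint returns immediately with run metadata instead.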
