# Gupy.io Jobs Scraper (`shahidirfan/gupy-io-jobs-scraper`) Actor

Extract job listings from Gupy, Brazil's leading recruitment platform. Scrape job titles, company details, salary ranges, and application links in seconds. Perfect for job boards, data analysis, and recruitment automation. Get structured datasets with zero coding required.

- **URL**: https://apify.com/shahidirfan/gupy-io-jobs-scraper.md
- **Developed by:** [Shahid Irfan](https://apify.com/shahidirfan) (community)
- **Categories:** Jobs, Automation, Developer tools
- **Stats:** 3 total users, 2 monthly users, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is billed per platform usage. The Actor itself is free to use; you pay only for the Apify platform usage it consumes, which gets cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows (PowerShell)
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Gupy.io Jobs Scraper

Collect job listings from the Gupy jobs portal in a clean, analysis-ready format. Capture job titles, companies, locations, work models, publish dates, application deadlines, descriptions, skills, and direct job links at scale.

### Features

- **Portal URL support** - Use a `portal.gupy.io/job-search` URL to preserve the exact filters you see on the site
- **Keyword and location search** - Build a search without copying a full URL
- **Pagination controls** - Limit both the number of results and the number of pages
- **Normalized output** - Get consistent field names with empty values omitted from the dataset
- **Store-ready dataset** - Save structured job records for analysis, monitoring, or export

### Use Cases

#### Job Market Research

Track hiring volume, job titles, and location patterns across the Gupy ecosystem. Build datasets for recruiting research, salary benchmarking support, or trend analysis.

#### Lead Generation

Identify companies hiring for specific roles, regions, or work models. Use the dataset to monitor employers and career pages relevant to your niche.

#### Competitive Intelligence

Compare open roles, publication cadence, and workplace models across employers. Spot which companies are growing, hiring remotely, or expanding into new locations.

#### Recruitment Automation

Feed job results into spreadsheets, internal dashboards, or workflow tools. Use recurring runs to keep job pipelines fresh without manual searching.
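
One way to keep a recurring pipeline fresh is to compare each run's items against the job IDs collected by earlier runs. The sketch below assumes items are dictionaries shaped like this Actor's dataset records and that `jobId` stays stable across runs; `new_jobs` and `seen` are illustrative names, not part of the Actor.

```python
def new_jobs(items, seen_ids):
    """Return items whose jobId has not been seen yet, updating seen_ids in place."""
    fresh = []
    for item in items:
        job_id = item.get("jobId")
        # Records without a jobId cannot be deduplicated, so they are skipped.
        if job_id is None or job_id in seen_ids:
            continue
        seen_ids.add(job_id)
        fresh.append(item)
    return fresh


seen = {11331333}  # IDs collected by earlier runs (hypothetical store)
latest = [
    {"jobId": 11331333, "title": "Already seen"},
    {"jobId": 11331334, "title": "New vacancy"},
]
fresh_jobs = new_jobs(latest, seen)
print([job["title"] for job in fresh_jobs])
```

In practice the set of seen IDs would be persisted between runs, for example in a file or database of your choice.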

### Input Parameters

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `url` | String | No | `https://portal.gupy.io/job-search/sortBy=publishedDate` | Gupy search URL from `portal.gupy.io/job-search`. If provided, its filters are used first. |
| `keyword` | String | No | — | Optional job keyword when you do not want to use a full URL. |
| `location` | String | No | — | Optional city or state such as `Sao Paulo`, `Sao Paulo - SP`, or `Pernambuco`. |
| `sortBy` | String | No | `publishedDate` | Sort order used when searching with keyword and location. |
| `results_wanted` | Integer | No | `20` | Maximum number of jobs to collect. |
| `max_pages` | Integer | No | `1` | Safety cap for pagination. |
| `proxyConfiguration` | Object | No | `{"useApifyProxy": false}` | Optional Apify proxy settings. |
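
When building inputs in code, it can help to check values before starting a run. The following is a minimal sketch, not the Actor's own validation: `build_input` is a hypothetical helper, and its checks mirror the minimums and the single `sortBy` value listed in the OpenAPI input schema later in this document.

```python
ALLOWED_SORT = {"publishedDate"}  # the only sortBy value listed in the input schema


def build_input(url=None, keyword=None, location=None,
                sort_by="publishedDate", results_wanted=20, max_pages=1,
                use_apify_proxy=False):
    """Build an Actor input dict, rejecting values the schema would not accept."""
    if results_wanted < 1 or max_pages < 1:
        raise ValueError("results_wanted and max_pages must be at least 1")
    if sort_by not in ALLOWED_SORT:
        raise ValueError(f"unsupported sortBy: {sort_by!r}")
    actor_input = {
        "sortBy": sort_by,
        "results_wanted": results_wanted,
        "max_pages": max_pages,
        "proxyConfiguration": {"useApifyProxy": use_apify_proxy},
    }
    # Only include the optional search fields that were actually supplied.
    for key, value in (("url", url), ("keyword", keyword), ("location", location)):
        if value:
            actor_input[key] = value
    return actor_input


example_input = build_input(keyword="Social Media", results_wanted=30, max_pages=2)
print(example_input)
```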

### Output Data

Each dataset item can contain:

| Field | Type | Description |
|-------|------|-------------|
| `jobId` | Integer | Unique Gupy job identifier. |
| `title` | String | Job title. |
| `company` | String | Company or career page name. |
| `companyId` | Integer | Company identifier. |
| `careerPageId` | Integer | Career page identifier. |
| `careerPageName` | String | Career page name from Gupy. |
| `careerPageLogoUrl` | String | Company logo URL. |
| `careerPageUrl` | String | Career page URL. |
| `descriptionHtml` | String | Rich job description content. |
| `descriptionText` | String | Plain-text job description. |
| `jobType` | String | Raw job type code. |
| `jobTypeLabel` | String | Readable job type label. |
| `publishedDate` | String | Publish timestamp. |
| `applicationDeadline` | String | Application deadline when available. |
| `isRemoteWork` | Boolean | Whether the role is marked as remote. |
| `workplaceType` | String | Raw workplace type code. |
| `workplaceTypeLabel` | String | Readable work model label. |
| `city` | String | Job city. |
| `state` | String | Job state. |
| `country` | String | Job country. |
| `location` | String | Combined location string. |
| `jobUrl` | String | Direct link to the vacancy. |
| `acceptsDisabilities` | Boolean | Whether the vacancy is flagged as open to people with disabilities (PwD). |
| `skills` | Array | Skills when present. |
| `sourceUrl` | String | Source search URL used for the run. |

Empty values are not stored, so records stay compact and easier to work with.
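
Because empty values are omitted, downstream code should not assume every key is present. The sketch below shows one way to flatten items into a fixed set of CSV columns using `dict.get`; the column selection and the helper names `to_row` and `to_csv` are illustrative, and `skills` is assumed to be a list of simple values.

```python
import csv
import io

COLUMNS = ["jobId", "title", "company", "location",
           "workplaceTypeLabel", "applicationDeadline", "skills"]


def to_row(item):
    """Flatten one dataset item into fixed columns, tolerating missing keys."""
    row = {column: item.get(column, "") for column in COLUMNS}
    # skills is an array when present; join it so it fits into one CSV cell.
    row["skills"] = "; ".join(str(skill) for skill in item.get("skills", []))
    return row


def to_csv(items):
    """Render items as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(to_row(item) for item in items)
    return buffer.getvalue()


sample = [{"jobId": 11331333, "title": "Atendente",
           "location": "Belford Roxo, Rio de Janeiro, Brasil"}]
print(to_csv(sample))
```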

### Usage Examples

#### Latest Jobs

```json
{
  "url": "https://portal.gupy.io/job-search/sortBy=publishedDate",
  "results_wanted": 20,
  "max_pages": 1
}
````

#### Keyword Search

```json
{
  "keyword": "Social Media",
  "results_wanted": 30,
  "max_pages": 2
}
```

#### Location Search

```json
{
  "location": "Sao Paulo - SP",
  "results_wanted": 25,
  "max_pages": 2
}
```

#### Filtered Portal URL

```json
{
  "url": "https://portal.gupy.io/job-search/workplaceTypes[]=remote",
  "results_wanted": 40,
  "max_pages": 3
}
```

### Sample Output

```json
{
  "jobId": 11331333,
  "title": "ATENDENTE RESTAURANTE 12X36 ( CENTRO - BELFORD ROXO/RJ)",
  "company": "McDonald's Restaurante - Arcos Dorados",
  "companyId": 68123,
  "careerPageId": 164080,
  "careerPageName": "McDonald's Restaurante - Arcos Dorados",
  "careerPageLogoUrl": "https://attachments.gupy.io/production/companies/68123/career/164080/images/2023-07-20_22-47_companyLogoUrl.png",
  "careerPageUrl": "https://restaurantemc.gupy.io/eyJzb3VyY2UiOiJndXB5X3BvcnRhbCJ9",
  "descriptionHtml": "#A gente vai amar muito se voce...",
  "descriptionText": "#A gente vai amar muito se voce... Responsabilidades e atribuicoes...",
  "jobType": "vacancy_type_effective",
  "jobTypeLabel": "Effective",
  "publishedDate": "2026-05-21T03:00:25.306Z",
  "applicationDeadline": "2026-07-20",
  "isRemoteWork": false,
  "workplaceType": "on-site",
  "workplaceTypeLabel": "On-site",
  "city": "Belford Roxo",
  "state": "Rio de Janeiro",
  "country": "Brasil",
  "location": "Belford Roxo, Rio de Janeiro, Brasil",
  "jobUrl": "https://restaurantemc.gupy.io/job/eyJqb2JJZCI6MTEzMzEzMzMsInNvdXJjZSI6Imd1cHlfcG9ydGFsIn0=?jobBoardSource=gupy_portal",
  "acceptsDisabilities": true,
  "sourceUrl": "https://portal.gupy.io/job-search/sortBy=publishedDate"
}
```
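
The date fields in the sample above can be parsed with the standard library. This sketch assumes `publishedDate` is always an ISO 8601 UTC timestamp and `applicationDeadline` a plain `YYYY-MM-DD` date, as in the sample record; other records may differ. `parse_published` and `days_open` are illustrative helper names.

```python
from datetime import date, datetime


def parse_published(value):
    """Parse a publishedDate string such as '2026-05-21T03:00:25.306Z'."""
    # Older Python versions reject a trailing 'Z', so convert it to an explicit UTC offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def days_open(item):
    """Days between publication and the application deadline, or None if either is missing."""
    published = item.get("publishedDate")
    deadline = item.get("applicationDeadline")
    if not published or not deadline:
        return None
    return (date.fromisoformat(deadline) - parse_published(published).date()).days


job = {"publishedDate": "2026-05-21T03:00:25.306Z", "applicationDeadline": "2026-07-20"}
print(days_open(job))  # 60
```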

### Tips for Best Results

#### Use Real Portal URLs for Complex Filters

When you need exact work model or job type filters, copy the full Gupy search URL from your browser. This is the easiest way to mirror what you see on the portal.

#### Start Small

Use `results_wanted: 20` and `max_pages: 1` or `2` while validating a new search. Increase limits once you confirm the search returns the right kind of jobs.

#### Use Clear Location Inputs

For better filtering, use values such as `Sao Paulo`, `Sao Paulo - SP`, or full state names like `Pernambuco`.

#### Watch Large Searches

Broad searches can return thousands of jobs. Set `max_pages` intentionally to control runtime and output size.
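
To reason about how the two limits interact, the arithmetic can be sketched as below. This assumes a run stops as soon as either limit is reached, and `page_size` is a placeholder: this README does not state how many jobs the portal returns per page, so measure it on a small run before relying on the numbers.

```python
def max_items(results_wanted, max_pages, page_size):
    """Upper bound on the items a run can return under both limits."""
    return min(results_wanted, max_pages * page_size)


def pages_needed(results_wanted, page_size):
    """Smallest max_pages that does not cut off results_wanted."""
    return -(-results_wanted // page_size)  # ceiling division


# page_size=10 is a placeholder value, not a documented property of the portal.
print(max_items(results_wanted=40, max_pages=3, page_size=10))  # 30
print(pages_needed(results_wanted=40, page_size=10))  # 4
```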

### Proxy Configuration

If you want to route requests through Apify Proxy:

```json
{
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
```

### Integrations

- **Google Sheets** - Export job data for reporting
- **Airtable** - Build searchable hiring databases
- **Zapier** - Trigger follow-up workflows
- **Make** - Connect recurring job runs to other tools
- **Webhooks** - Send results into your own systems

### Export Formats

- **JSON** - Ideal for APIs and downstream processing
- **CSV** - Easy to review in spreadsheets
- **Excel** - Business reporting and sharing
- **XML** - Legacy system integrations
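
Exports can also be fetched over the Apify API by requesting a run's dataset items in a given format. The sketch below only builds the URL; `dataset_export_url` is a hypothetical helper, and the `format` values shown (`json`, `csv`, `xlsx`, `xml`) should be checked against the Apify dataset API reference before use.

```python
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"


def dataset_export_url(dataset_id, fmt="json", token=None):
    """Build a dataset items URL for the given export format."""
    params = {"format": fmt}
    if token:
        # Private datasets need a token; omit it for publicly shared ones.
        params["token"] = token
    return f"{API_BASE}/datasets/{dataset_id}/items?{urlencode(params)}"


print(dataset_export_url("<DATASET_ID>", "csv"))
```

The dataset ID is the `defaultDatasetId` value of a finished run.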

# Actor input Schema

## `url` (type: `string`):

A URL from portal.gupy.io/job-search. If provided, the Actor uses the filters from this URL first.

## `keyword` (type: `string`):

Optional job keyword when you do not want to use a full Gupy URL.

## `location` (type: `string`):

Optional city or state, for example Sao Paulo, Sao Paulo - SP, or Pernambuco.

## `sortBy` (type: `string`):

Sort order used when you build the search from keyword and location.

## `results_wanted` (type: `integer`):

Maximum number of jobs to collect.

## `max_pages` (type: `integer`):

Safety limit for paginated API requests.

## `proxyConfiguration` (type: `object`):

Optional Apify proxy settings.

## Actor input object example

```json
{
  "url": "https://portal.gupy.io/job-search/sortBy=publishedDate",
  "sortBy": "publishedDate",
  "results_wanted": 20,
  "max_pages": 1,
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}
```

# Actor output Schema

## `overview` (type: `string`):

No description

# API

You can run this Actor programmatically using the Apify API. Below are code examples for JavaScript, Python, and the CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "url": "https://portal.gupy.io/job-search/sortBy=publishedDate",
    "results_wanted": 20,
    "max_pages": 1
};

// Run the Actor and wait for it to finish
const run = await client.actor("shahidirfan/gupy-io-jobs-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "url": "https://portal.gupy.io/job-search/sortBy=publishedDate",
    "results_wanted": 20,
    "max_pages": 1,
}

# Run the Actor and wait for it to finish
run = client.actor("shahidirfan/gupy-io-jobs-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
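
For production use, it is usually worth checking the run status before reading the dataset. The following is a minimal sketch that takes an already constructed `ApifyClient` as an argument; `run_and_collect` is a hypothetical helper, and it assumes the run is returned as a dictionary with `status` and `defaultDatasetId` keys, as in the example above.

```python
def run_and_collect(client, run_input, actor_id="shahidirfan/gupy-io-jobs-scraper"):
    """Run the Actor, fail loudly on an unsuccessful run, and return its dataset items."""
    run = client.actor(actor_id).call(run_input=run_input)
    # Treat a missing run object or any non-SUCCEEDED status as a failure.
    if run is None or run.get("status") != "SUCCEEDED":
        status = None if run is None else run.get("status")
        raise RuntimeError(f"Actor run did not succeed (status: {status})")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Usage, with the client from the example above:
# items = run_and_collect(ApifyClient("<YOUR_API_TOKEN>"), {"keyword": "Social Media"})
```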

## CLI example

```bash
echo '{
  "url": "https://portal.gupy.io/job-search/sortBy=publishedDate",
  "results_wanted": 20,
  "max_pages": 1
}' |
apify call shahidirfan/gupy-io-jobs-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=shahidirfan/gupy-io-jobs-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Gupy.io Jobs Scraper",
        "description": "Extract job listings from Gupy, Brazil's leading recruitment platform. Scrape job titles, company details, salary ranges, and application links in seconds. Perfect for job boards, data analysis, and recruitment automation. Get structured datasets with zero coding required.",
        "version": "0.0",
        "x-build-id": "jJW7m4eFT85luzro3"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/shahidirfan~gupy-io-jobs-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-shahidirfan-gupy-io-jobs-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/shahidirfan~gupy-io-jobs-scraper/runs": {
            "post": {
                "operationId": "runs-sync-shahidirfan-gupy-io-jobs-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/shahidirfan~gupy-io-jobs-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-shahidirfan-gupy-io-jobs-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "url": {
                        "title": "Gupy search URL",
                        "type": "string",
                        "description": "A URL from portal.gupy.io/job-search. If provided, the actor uses the filters from this URL first."
                    },
                    "keyword": {
                        "title": "Keyword",
                        "type": "string",
                        "description": "Optional job keyword when you do not want to use a full Gupy URL."
                    },
                    "location": {
                        "title": "Location",
                        "type": "string",
                        "description": "Optional city or state, for example Sao Paulo, Sao Paulo - SP, or Pernambuco."
                    },
                    "sortBy": {
                        "title": "Sort by",
                        "enum": [
                            "publishedDate"
                        ],
                        "type": "string",
                        "description": "Sort order used when you build the search from keyword and location.",
                        "default": "publishedDate"
                    },
                    "results_wanted": {
                        "title": "Results wanted",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Maximum number of jobs to collect.",
                        "default": 20
                    },
                    "max_pages": {
                        "title": "Max pages",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Safety limit for paginated API requests.",
                        "default": 1
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Optional Apify proxy settings.",
                        "default": {
                            "useApifyProxy": false
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
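
For projects without a client library, the `run-sync-get-dataset-items` path from the specification above can be called directly. The sketch below only prepares the request with the standard library and does not send it; `build_sync_request` is a hypothetical helper, and the response is assumed to be the JSON array of dataset items described in the endpoint summary.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/shahidirfan~gupy-io-jobs-scraper/run-sync-get-dataset-items"


def build_sync_request(token, actor_input):
    """Prepare (but do not send) a POST request for the synchronous dataset endpoint."""
    url = f"{BASE_URL}{ACTOR_PATH}?{urlencode({'token': token})}"
    body = json.dumps(actor_input).encode("utf-8")
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/json"})


request = build_sync_request("<YOUR_API_TOKEN>", {"keyword": "Social Media", "results_wanted": 5})
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request), then json.load() the response body.
```

Synchronous endpoints hold the connection open until the run finishes, so for long searches the asynchronous `/runs` path may be the safer choice.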
