# GoWork Scraper (FR, DE & ES) (`memo23/gowork-scraper`) Actor

Structured employer data (FR, DE, ES): one record per company with page and Open Graph metadata, Schema.org JSON-LD, flattened organization attributes, enriched firmographics and rating distribution, threaded reviews with replies, crawl provenance via parseMeta, and Cloudflare challenge detection.

- **URL**: https://apify.com/memo23/gowork-scraper.md
- **Developed by:** [Muhamed Didovic](https://apify.com/memo23) (community)
- **Categories:** Automation, Lead generation, AI
- **Stats:** 3 total users, 2 monthly users, 100.0% runs succeeded, 1 bookmark
- **User rating**: 5.00 out of 5 stars

## Pricing

from $2.00 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

### Overview

Extract structured **employer reviews and company profiles** from **[GoWork.fr](https://gowork.fr/)** (France), **[GoWork.de](https://gowork.de/)** (Germany), and **[GoWork ES](https://es.gowork.com/)** (Spain). The Actor loads **HTML** pages with a **browser-like HTTP client** (Crawlee Impit), parses **Nuxt `__NUXT_DATA__`** when present (preferred for full review threads), and falls back to **JSON-LD** when needed. You get **one dataset row per company** with **flattened header fields** for CSV and a nested **`reviews`** array (each thread includes **`replies`**).

Use it to **monitor employer reputation**, **export review text and ratings**, or **feed analytics** with company metadata (contact, activity, star histogram, opening hours, trusted partners) plus traceable **`parseMeta`** for each crawl source (direct URL, search, homepage, listing hub).

---

### Features

- **Multiple entry URLs** (all routed to listing handlers or direct detail):
  - **Company profile**: `https://gowork.fr/{slug}`, `https://gowork.de/{slug}`, or `https://es.gowork.com/{slug}` (e.g. `b-hive-mulhouse`, `herole-dresden`).
  - **Search**: `…/search?…` on each host with **`page=2`** pagination (Nuxt total + page size).
  - **Homepage**: national roots such as `https://gowork.fr/`, `https://gowork.de/`, or `https://es.gowork.com/` — **recently rated** feed with **`?page=2`** pagination (sequential paginator links only; junk `page=500` links are ignored).
  - **Other listing hubs** (paths containing e.g. `/trouver`, `/recherche`, …): discovers profile links from anchors.

- **Per-company detail pass**:
  - One **HTML** request per company (until **`maxItems`** caps queued detail URLs globally).
  - **Reviews** from Nuxt **`company-reviews`** when available; otherwise **JSON-LD** `Organization.review` (subset, synthetic ids when missing).
  - Optional **`goworkOnlyRatedReviews`**: keep only thread roots with a **1–5** star rating (see Input).

- **Flattened export**:
  - **Org** fields from JSON-LD (`org_*`), **Nuxt** company block (`company*`, `business*`, `rating*`, partners JSON, etc.) at the **root** of the row alongside **`reviews`**.
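
The entry-URL kinds listed above can be told apart from the host and path alone. The following is a minimal sketch of such a classifier, not the Actor's actual routing code: the function name `classify_gowork_url` is hypothetical, only the two listing hints named above (`/trouver`, `/recherche`) are included, and treating any single-segment path as a company slug is an assumption.

```python
from urllib.parse import urlparse

# Hosts named in this README.
GOWORK_HOSTS = {"gowork.fr", "gowork.de", "es.gowork.com"}
# Only the two listing-hub hints spelled out above; the real list is longer.
LISTING_HINTS = ("/trouver", "/recherche")


def classify_gowork_url(url: str) -> str:
    """Return 'homepage', 'search', 'listing', 'detail', or 'unsupported'."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host not in GOWORK_HOSTS:
        return "unsupported"
    path = parsed.path.rstrip("/")
    if path == "":
        return "homepage"  # national root, with or without ?page=N
    if path == "/search":
        return "search"
    if any(hint in path for hint in LISTING_HINTS):
        return "listing"
    if path.count("/") == 1:
        return "detail"  # assumption: a single path segment is a company slug
    return "unsupported"
```

Listing hints are checked before the detail rule so that a hub path such as `/trouver` is not mistaken for a company slug.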

---

### How to Use

1. **Set Up**: Create an Apify account and open this Actor (or run it locally with `apify run` / `npm run start:dev`).
2. **Provide Input**: Add one or more **GoWork URLs** under `startUrls` (and optional `url1`, `url2`, … on the same object for multiple starts).
3. **Configure**: Set **`maxItems`** (cap on **company detail** pages queued), concurrency, retries, and **proxy** (often required if Cloudflare challenges appear).
4. **Run & Export**: Download **JSON** / **CSV** from the dataset. If you see **`isCloudflareChallenge`: true** or an empty Nuxt payload, switch to a **residential proxy** or adjust the client settings.

#### Usage Limitations

**Free / non-paying Apify users** may be subject to platform limits on dataset items or charges. **Paid users** typically get higher limits; adjust **`maxItems`** to control how many **company detail** pages are fetched per run. GoWork may rate-limit or challenge datacenter IPs—**proxy is recommended**.

---

### Input Configuration

Example input:

```json
{
    "startUrls": [
        {
            "url": "https://gowork.fr/b-hive-mulhouse"
        },
        {
            "url": "https://gowork.fr/search?q=fra&city=Paris"
        },
        {
            "url": "https://gowork.fr/"
        },
        {
            "url": "https://gowork.de/herole-dresden"
        },
        {
            "url": "https://es.gowork.com/"
        }
    ],
    "maxItems": 100,
    "goworkOnlyRatedReviews": false,
    "maxConcurrency": 100,
    "minConcurrency": 1,
    "maxRequestRetries": 100,
    "proxy": {
        "useApifyProxy": true,
        "apifyProxyGroups": ["RESIDENTIAL"]
    }
}
````

#### Input Fields Explanation

- **startUrls** (`startUrls`): Objects whose **`url`**, **`url1`**, **`url2`**, … fields are collected in order. Use **GoWork** company URLs on **gowork.fr**, **gowork.de**, or **es.gowork.com**, **`/search?…`**, **homepage** (with optional **`?page=N`**), or supported listing-style paths.
- **maxItems** (`maxItems`): Maximum number of **company detail** pages to **queue** across the run (shared counter for search / homepage / listings). The example above sets **100**; the input schema default is **10000**.
- **goworkOnlyRatedReviews** (`goworkOnlyRatedReviews`): When **true**, each row’s **`reviews`** array includes only **thread roots** with a numeric **1–5** star rating; unrated text threads are dropped. Replies under kept roots stay. Default **false**.
- **maxConcurrency** / **minConcurrency** / **maxRequestRetries**: Standard Crawlee / actor concurrency and retry behavior.
- **proxy** (`proxy`): Apify proxy or custom **`proxyUrls`** for outbound requests.
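
As an illustration of how `url`, `url1`, `url2`, … might be gathered from each `startUrls` object, here is a small sketch. The helper name `collect_start_urls` is hypothetical, and reading "in order" as numeric-suffix order within each object is an assumption rather than documented behavior.

```python
import re

# Matches "url", "url1", "url2", ... and captures the optional numeric suffix.
_URL_KEY_RE = re.compile(r"^url(\d*)$")


def collect_start_urls(start_urls: list[dict]) -> list[str]:
    """Flatten startUrls objects into one ordered list of URL strings."""
    collected: list[str] = []
    for entry in start_urls:
        keyed = []
        for key, value in entry.items():
            match = _URL_KEY_RE.match(key)
            if match and isinstance(value, str) and value.strip():
                # Plain "url" sorts first; "urlN" sorts by N.
                index = int(match.group(1)) if match.group(1) else 0
                keyed.append((index, value.strip()))
        collected.extend(url for _, url in sorted(keyed))
    return collected
```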

***

### Output Structure

The dataset contains **one primary row type** for GoWork:

- **`gowork_detail`** — one row per **company profile** scraped: page metadata, JSON-LD snapshot, flattened org + Nuxt company fields, and the **`reviews`** array.

Filter with **`source === 'gowork_detail'`** when consuming the dataset.
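
On the consumer side, the same filter can be applied before exporting the flat columns yourself. The sketch below is one possible approach, assuming the items have already been downloaded as a list of dictionaries; `detail_rows_to_csv` and the choice of which nested keys to drop are illustrative, not part of the Actor.

```python
import csv
import io

# Nested fields that do not fit a flat CSV row.
NESTED_KEYS = {"reviews", "jsonLd", "parseMeta"}


def detail_rows_to_csv(items: list[dict]) -> str:
    """Keep gowork_detail rows, drop nested fields, and render CSV text."""
    rows = [
        {key: value for key, value in item.items() if key not in NESTED_KEYS}
        for item in items
        if item.get("source") == "gowork_detail"
    ]
    # Union of column names in first-seen order, since rows may differ.
    fieldnames: list[str] = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```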

***

#### Sample: `gowork_detail` (first object in `data.json`)

The JSON below is based on the **first record** of a real export. **`jsonLd`**, **`reviews`**, and long strings are **shortened** for the README; the on-disk file contains the **full** arrays. **`_readme_note`** is documentation-only and does not appear in live output.

```json
{
    "source": "gowork_detail",
    "listingId": "b-hive-mulhouse",
    "slug": "b-hive-mulhouse",
    "url": "https://gowork.fr/b-hive-mulhouse",
    "statusCode": 200,
    "originalSearchUrl": "https://gowork.fr/b-hive-mulhouse",
    "parseMeta": {
        "mode": "direct_detail_url",
        "detailPageUrl": "https://gowork.fr/b-hive-mulhouse",
        "searchIndex": 1
    },
    "scrapedAt": "2026-04-02T08:22:42.226Z",
    "pageTitle": "Avis sur B HIVE Mulhouse - 21 avis - GoWork.fr",
    "metaDescription": "Opportunités de réseautage : Travailler chez B HIVE permet…",
    "ogTitle": "Avis sur B HIVE Mulhouse - 21 avis - GoWork.fr",
    "ogDescription": "Vérifiez ce que les gens disent de B HIVE sur https://gowork.fr/ | 21 avis",
    "ogImage": "https://gowork.fr/assets/images/sharing/thread/cover-fr.jpg",
    "ogUrl": "https://gowork.fr/b-hive-mulhouse",
    "canonicalUrl": "https://gowork.fr/b-hive-mulhouse",
    "h1": "Avis B HIVE",
    "htmlLang": "fr-FR",
    "reviewCountFromTitle": 21,
    "jsonLd": [
        {
            "@context": "http://schema.org/",
            "@type": "Organization",
            "name": "B HIVE",
            "aggregateRating": { "@type": "EmployerAggregateRating", "ratingValue": 4.8, "ratingCount": 4, "reviewCount": 4 },
            "review": [ { "@type": "Review", "author": { "@type": "Person", "name": "BS" }, "reviewBody": "…", "reviewRating": { "ratingValue": 4 } } ]
        }
    ],
    "isCloudflareChallenge": false,
    "org_name": "B HIVE",
    "org_telephone": "+33 3 67 35 04 36",
    "org_tax_id": "831826649",
    "org_description": "B-HIVE est une société d'ingénierie…",
    "org_founding_date": "20170906",
    "org_street_address": "74 rue Jean Monnet, 68200 MULHOUSE",
    "org_address_locality": "Mulhouse",
    "org_address_region": "Grand Est",
    "org_address_country": "FR",
    "org_rating_value": 4.8,
    "org_rating_count": 4,
    "org_review_count": 4,
    "org_review_blocks": 4,
    "pageGlobalRating": 4.8,
    "pageGlobalReviewCount": 21,
    "statisticsRuCount": 10,
    "statisticsRuRootCount": 4,
    "reviewsIncludeAllRuThreads": true,
    "goworkOnlyRatedReviewsApplied": false,
    "pageAggregateRatingCount": 4,
    "siteLocale": "fr",
    "companyEmail": "admin@bhiveunderfloor.co.uk",
    "companyWebsite": "https://www.b-hive.fr/",
    "companyPhone": "+33 3 67 35 04 36",
    "companyLinkedInUrl": "https://fr.linkedin.com/company/b-hive-engineering",
    "companyEmployeeCountLabel": "501-1000",
    "companyBusinessArea": "Industrie manufacturière",
    "companyActivityDescription": "ingénierie conseil opérationnel…",
    "businessTradeName": "Ingénierie et architecture",
    "businessTradeSlug": "ingenierie-et-architecture",
    "ratingHistogramScoredTotal": 4,
    "ratingStar1Count": 0,
    "ratingStar2Count": 0,
    "ratingStar3Count": 0,
    "ratingStar4Count": 1,
    "ratingStar5Count": 3,
    "ratingStar1Percent": 0,
    "ratingStar2Percent": 0,
    "ratingStar3Percent": 0,
    "ratingStar4Percent": 25,
    "ratingStar5Percent": 75,
    "companyCapital": "50000 EUR",
    "companyFoundedDate": "2017-09-06",
    "companyActivityShortLabel": "Ingénierie, études techniques",
    "companyOpeningHoursJson": "{\"lundi\":\"08:00–19:00\",…}",
    "companyTrustedPartnersJson": "[{\"name\":\"OPTIM 67\",\"profileUrl\":\"https://gowork.fr/…\",…}]",
    "reviews": [
        {
            "reviewId": "e270ac96-fed0-4eed-bf23-6681d41ec643",
            "reviewerName": "Charlotte",
            "reviewDate": "17-02-2026 13:11",
            "content": "Les avis sur B HIVE semblent très positifs…",
            "ratingValue": null,
            "languageCode": "fr",
            "authorKind": "SU",
            "replies": []
        },
        {
            "reviewId": "38c2e5bc-75ed-4a35-a526-899a98c12311",
            "reviewerName": "BS",
            "reviewDate": "07-08-2023 16:22",
            "content": "Entreprise jeune en croissance",
            "ratingValue": 4,
            "languageCode": "fr",
            "authorKind": "ANONYMOUS",
            "replies": [
                {
                    "replyId": "511b3dfe-314f-4a82-a248-bc8713fd2257",
                    "authorName": "Audrey",
                    "content": "Pourriez vous me dire ce qui fait la particularité…",
                    "date": "08-08-2023 11:41",
                    "authorKind": "SU"
                }
            ]
        }
    ],
    "_readme_note": "Omitted here: remaining review threads, full jsonLd reviews, optional mainTextPreview when present."
}
```
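
For analysis it is often convenient to unroll the nested `reviews` array into one record per review or reply. The following sketch shows one way to do that with the field names from the sample; the function name `iter_thread_records` and the output keys (`kind`, `id`, `parentId`, `author`, `date`, `rating`) are hypothetical choices.

```python
from typing import Iterator


def iter_thread_records(row: dict) -> Iterator[dict]:
    """Yield one flat record per thread root and per first-level reply."""
    for review in row.get("reviews") or []:
        yield {
            "slug": row.get("slug"),
            "kind": "review",
            "id": review.get("reviewId"),
            "parentId": None,
            "author": review.get("reviewerName"),
            "date": review.get("reviewDate"),
            "rating": review.get("ratingValue"),
            "content": review.get("content"),
        }
        for reply in review.get("replies") or []:
            yield {
                "slug": row.get("slug"),
                "kind": "reply",
                "id": reply.get("replyId"),
                "parentId": review.get("reviewId"),  # link back to the root
                "author": reply.get("authorName"),
                "date": reply.get("date"),
                "rating": None,  # replies carry no star rating in the sample
                "content": reply.get("content"),
            }
```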

***

#### Output fields (`gowork_detail`) — field-by-field

##### Row identity and request metadata

- **`source`** — Always **`gowork_detail`** for GoWork company rows.
- **`listingId`** — Company **slug** (same as URL path segment); stable key for joins.
- **`slug`** — Duplicate of **`listingId`** for clarity in exports.
- **`url`** — Final **HTML** URL fetched for this company.
- **`statusCode`** — HTTP status of that response (**200** when OK).
- **`originalSearchUrl`** — Original URL that led to this detail (direct detail URL, search page, homepage, or listing page) from crawl **`userData`**.
- **`parseMeta`** — How this company was discovered and extra context:
  - **`mode`** — e.g. **`direct_detail_url`**, **`from_search`**, **`from_homepage`**, **`from_listing_page`**.
  - **`detailPageUrl`** — Set for direct starts (**`mode: direct_detail_url`**).
  - **`listingPageUrl`** — Listing / search / homepage URL when enqueued from a hub.
  - **`searchPageUrl`** — Search results URL when **`mode`** is **`from_search`**.
  - **`homepageListUrl`** / **`homepagePage`** — Homepage URL and **1-based** page index when **`mode`** is **`from_homepage`**.
  - **`searchIndex`** — 1-based index among **`startUrls`** when provided by the crawler (direct detail flow).
- **`scrapedAt`** — ISO timestamp when the row was written.
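
Because every row carries `parseMeta.mode`, a quick tally shows where a run's companies came from. This is a minimal consumer-side sketch; `count_rows_by_mode` is a hypothetical helper, and the `"unknown"` bucket for rows without a mode is an assumption.

```python
from collections import Counter


def count_rows_by_mode(items: list[dict]) -> Counter:
    """Count dataset rows per parseMeta.mode value."""
    return Counter(
        (item.get("parseMeta") or {}).get("mode") or "unknown"
        for item in items
    )
```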

##### Page-level HTML metadata

- **`pageTitle`** — Contents of **`<title>`**.
- **`metaDescription`** — **`meta[name=description]`** content when present.
- **`ogTitle`** — Open Graph **`og:title`**.
- **`ogDescription`** — Open Graph **`og:description`**.
- **`ogImage`** — Open Graph **`og:image`**.
- **`ogUrl`** — Open Graph **`og:url`**.
- **`canonicalUrl`** — **`link[rel=canonical]`** `href` when present.
- **`h1`** — First **`<h1>`** text (best-effort selectors).
- **`htmlLang`** — **`html[lang]`** attribute (e.g. **`fr-FR`**).
- **`reviewCountFromTitle`** — Integer parsed from title / OG hints like **“21 avis”** when regex matches; else **`null`** / omitted.
- **`mainTextPreview`** — When present, a long plain-text preview of the main content region (capped in the parser); omitted or empty if selectors find nothing useful.
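
The `reviewCountFromTitle` idea can be sketched with a single regular expression. The pattern below is an assumption that only covers the French hint word "avis" seen in the sample title; the Actor's real pattern and its handling of German or Spanish titles are not documented here.

```python
import re
from typing import Optional

# Assumption: a number followed by the French word "avis", as in "21 avis".
_REVIEW_COUNT_RE = re.compile(r"(\d+)\s+avis\b", re.IGNORECASE)


def review_count_from_title(title: Optional[str]) -> Optional[int]:
    """Return the integer before 'avis', or None when the pattern is absent."""
    match = _REVIEW_COUNT_RE.search(title or "")
    return int(match.group(1)) if match else None
```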

##### Raw JSON-LD

- **`jsonLd`** — Array of parsed **JSON-LD** objects from **`script[type=application/ld+json]`** blocks. Typically includes **`@type: Organization`** with **`aggregateRating`**, **`review`**, address, **`taxID`**, etc. Used for fallback reviews when Nuxt is missing.

##### Challenge flag

- **`isCloudflareChallenge`** — **`true`** when the HTML looks like a Cloudflare interstitial (**“Just a moment”**); parsing quality may be poor—use proxy / browser if this stays **true**.
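
A check of this kind can be approximated by looking at the page title. The sketch below is a simplified heuristic, not the Actor's detector: it assumes the "Just a moment" marker appears at the start of `<title>`, and the function name is hypothetical.

```python
import re

_TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def looks_like_cloudflare_challenge(html: str) -> bool:
    """Heuristic: the interstitial's title starts with 'Just a moment'."""
    match = _TITLE_RE.search(html)
    title = match.group(1).strip().lower() if match else ""
    return title.startswith("just a moment")
```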

##### Organization (flattened from JSON-LD)

- **`org_name`** — Organization **`name`**.
- **`org_telephone`** — **`telephone`**.
- **`org_tax_id`** — **`taxID`** (e.g. SIREN-style id when provided).
- **`org_description`** — **`description`**.
- **`org_founding_date`** — **`foundingDate`** as in schema (often **`YYYYMMDD`**).
- **`org_street_address`** — **`address.streetAddress`**.
- **`org_address_locality`** — **`address.addressLocality`**.
- **`org_address_region`** — **`address.addressRegion`**.
- **`org_address_country`** — **`address.addressCountry`**.
- **`org_rating_value`** — **`aggregateRating.ratingValue`** (employer aggregate).
- **`org_rating_count`** — **`aggregateRating.ratingCount`** when present.
- **`org_review_count`** — **`aggregateRating.reviewCount`** when present.
- **`org_review_blocks`** — Count of **`review`** entries embedded in JSON-LD (subset of site threads).

##### Nuxt / global stats (employer rating and review counters)

- **`pageGlobalRating`** — Star rating from Nuxt **company** payload (matches header UI when present).
- **`pageGlobalReviewCount`** — Broad review counter (**`statistics.reviewsCount`**), same idea as **“N avis”** in the title (e.g. **21**); **not** always equal to length of **`reviews`** in HTML.
- **`statisticsRuCount`** — Number of **thread roots** shipped in **`company-reviews`** for this SSR payload.
- **`statisticsRuRootCount`** — **Rated** root count used for aggregates (often closer to JSON-LD review count).
- **`reviewsIncludeAllRuThreads`** — **`true`** when **`reviews.length === statisticsRuCount`** (all SSR threads captured). **`null`** when **`goworkOnlyRatedReviews`** is **true** (counts not comparable).
- **`goworkOnlyRatedReviewsApplied`** — **`true`** if the input filter **only rated roots** was active for this row.
- **`pageAggregateRatingCount`** — Sample size from the **company.rating** aggregate (often matches the JSON-LD rating count; **4** in the sample above).
- **`siteLocale`** — Pinia / Nuxt locale (e.g. **`fr`**).
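
The rule for `reviewsIncludeAllRuThreads` can be written as a three-valued function. This sketch follows the description above; returning `None` when `statisticsRuCount` is missing is an added assumption, and the function name is hypothetical.

```python
from typing import Optional


def reviews_include_all_ru_threads(
    review_count: int,
    statistics_ru_count: Optional[int],
    only_rated: bool,
) -> Optional[bool]:
    """True when every SSR thread root was captured, None when not comparable."""
    if only_rated:
        return None  # filtered rows cannot be compared with the SSR count
    if statistics_ru_count is None:
        return None  # assumption: no counter in the payload means "unknown"
    return review_count == statistics_ru_count
```

With the sample row, 10 captured threads against `statisticsRuCount` of 10 gives `True`.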

##### Company profile (flattened for CSV)

- **`companyEmail`** — Contact email from Nuxt **infoGraph** when present.
- **`companyWebsite`** — Website URL (**`web_page_found`** or **`web_page`**).
- **`companyPhone`** — Phone.
- **`companyLinkedInUrl`** — LinkedIn profile URL.
- **`companyEmployeeCountLabel`** — Employee range label (e.g. **501-1000**).
- **`companyBusinessArea`** — Business area / sector label.
- **`companyActivityDescription`** — Long activity text (truncated in parser for very long strings).
- **`businessTradeName`** — Trade / category display name.
- **`businessTradeSlug`** — Trade slug for URLs.
- **`ratingHistogramScoredTotal`** — Sum of star bucket counts (matches “based on N ratings” when complete).
- **`ratingStar1Count`** … **`ratingStar5Count`** — Counts per star level **1–5**.
- **`ratingStar1Percent`** … **`ratingStar5Percent`** — Percentages (**0–100**) derived from those counts.
- **`companyCapital`** — Capital string (e.g. **`50000 EUR`**).
- **`companyFoundedDate`** — **`YYYY-MM-DD`** from **`infoGraph.date`** when **`YYYYMMDD`**.
- **`companyActivityShortLabel`** — Short activity line (distinct from long description).
- **`companyOpeningHoursJson`** — JSON string: map **day → hours** (e.g. **`08:00–19:00`**, **`Fermé`**).
- **`companyTrustedPartnersJson`** — JSON string: array of **`{ name, profileUrl, city, logoUrl, companyId }`** for “trusted companies” when present.
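
Two of the derived fields above are simple transformations, sketched here for readers who want to recompute them. Rounding percentages to the nearest integer and returning `None` for dates that are not eight digits are assumptions; both function names are hypothetical.

```python
from typing import Optional


def star_percentages(counts: dict[int, int]) -> dict[int, int]:
    """Convert per-star counts into 0-100 percentages of the scored total."""
    total = sum(counts.values())
    if total == 0:
        return {star: 0 for star in counts}
    return {star: round(100 * count / total) for star, count in counts.items()}


def founded_date_iso(raw: str) -> Optional[str]:
    """Turn 'YYYYMMDD' into 'YYYY-MM-DD'; anything else yields None."""
    if len(raw) == 8 and raw.isdigit():
        return f"{raw[:4]}-{raw[4:6]}-{raw[6:]}"
    return None
```

For the sample row, one 4-star and three 5-star ratings out of four give 25 and 75 percent, and `20170906` becomes `2017-09-06`.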

##### Reviews (Nuxt threads, JSON-LD fallback)

- **`reviews`** — Array of **thread root** objects. Each object:
  - **`reviewId`** — GoWork review UUID (or **synthetic** id in JSON-LD fallback).
  - **`reviewerName`** — Display name (may be **"-"** or empty for anonymous).
  - **`reviewDate`** — Date/time string as on site.
  - **`content`** — Main review / post body.
  - **`ratingValue`** — **1–5** stars when rated; **`null`** for text-only / question-style threads.
  - **`languageCode`** — Primary language code (from page / locale).
  - **`authorKind`** — Author type flag from Nuxt (e.g. **`ANONYMOUS`**, **`SU`**).
  - **`role`** — Optional role (e.g. **`candidate`**) when present.
  - **`replies`** — Array of first-level replies on this thread:
    - **`replyId`** — Reply UUID.
    - **`authorName`** — Reply author display name.
    - **`content`** — Reply body.
    - **`date`** — Reply date/time string.
    - **`authorKind`** — Reply author kind when present.

***

### Benefits of the GoWork scraper

- **One row per company** with **reviews nested** but **company fields flat** for spreadsheets.
- **Honest counters**: distinguish **title / global review count** from **SSR thread count** and **JSON-LD** subset via **`pageGlobalReviewCount`**, **`statisticsRuCount`**, **`org_review_blocks`**.
- **Traceability**: **`parseMeta`** records whether the row came from **search**, **homepage**, **listing**, or **direct** URL.
- **Optional rated-only export** for clients who want **star reviews only** (**`goworkOnlyRatedReviews`**).

***

### Why Choose This Actor?

Built for **French, German, and Spanish employer review research** on GoWork: company discovery from **search**, **homepage**, or **hubs**, then **full profile + threads** where Nuxt allows. Outputs are suitable for **warehouses**, **BI**, or **CRM** enrichment.

**Use cases:**

- Track reviews and aggregates for a **watchlist** of employers.
- Export **Q\&A-style threads** and **star reviews** with **replies** for NLP or moderation.
- Combine **flat firmographics** (contact, capital, hours, partners) with **review content**.

***

### Technical Implementation

1. **URL routing** (`gowork-mapper.ts`): Detects **gowork.fr**, **gowork.de**, and **es.gowork.com** hosts, **detail** slug paths, **search**, **homepage**, and listing hints; builds **CheerioCrawler** requests with **`userData`** (slug, **`goworkOnlyRatedReviews`**, **`maxItems`**, etc.).
2. **Listing handlers** (`routes.ts` — **`GOWORK_LISTINGS`**): Collects company URLs from **anchors**, Nuxt **search SERP**, **homepage `index`** (recently rated + static company strips on page 1), paginates **search** (total + page size) and **homepage** (exact **`page+1`** paginator links, cap **200**).
3. **Detail handler** (`routes.ts` — **`GOWORK_DETAIL`**): Parses HTML with **`parseGoworkDetailHtml`** (`gowork-detail-parser.ts`): Nuxt **`extractGoworkFromNuxtPayload`**, **`extractGoworkCompanyFlat`**, JSON-LD org flattening, optional **rated-only** filter; pushes one dataset row.
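
The two pagination strategies in step 2 can be sketched as follows. This is an illustration under stated assumptions, not the code in `routes.ts`: the helper names are hypothetical, the search page count is taken as `ceil(total / page_size)`, and the cap of **200** is interpreted as a maximum homepage page number.

```python
import math
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

HOMEPAGE_PAGE_CAP = 200  # assumption: the cap applies to the page number


def with_page(url: str, page: int) -> str:
    """Return the URL with its 'page' query parameter set to the given page."""
    parts = urlparse(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != "page"]
    query.append(("page", str(page)))
    return urlunparse(parts._replace(query=urlencode(query)))


def search_page_urls(first_url: str, total: int, page_size: int) -> list[str]:
    """Search: derive the remaining pages (2..N) from total and page size."""
    if page_size <= 0:
        return []
    pages = math.ceil(total / page_size)
    return [with_page(first_url, page) for page in range(2, pages + 1)]


def next_homepage_page(current_page: int, paginator_pages: set[int]) -> Optional[int]:
    """Homepage: follow only the exact page+1 link, ignoring junk like page=500."""
    candidate = current_page + 1
    if candidate in paginator_pages and candidate <= HOMEPAGE_PAGE_CAP:
        return candidate
    return None
```

Following only the exact next page keeps the homepage crawl sequential, which is why a stray `page=500` link never gets queued.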

***

### Explore More Scrapers

If you found this Actor useful, check out other scrapers at [memo23's Apify profile](https://apify.com/memo23).

***

### Support

- For issues or feature requests, use the **Issues** section of this Actor on Apify.
- For further assistance, contact the author:
  - Author's website: <https://muhamed-didovic.github.io/>
  - Email: <muhamed.didovic@gmail.com>

***

### Additional Services

- Request customization or a full dataset: <muhamed.didovic@gmail.com>
- Need other platforms scraped? Contact <muhamed.didovic@gmail.com>
- For API services of this Actor, reach out to <muhamed.didovic@gmail.com>
- Custom integrations and automation solutions available

# Actor input Schema

## `startUrls` (type: `array`):

GoWork URLs on `https://gowork.fr/`, `https://gowork.de/`, or `https://es.gowork.com/` (Spain): the homepage (review feed, paginated with `?page=2`), company profile slug paths, or search (`/search?…`).

## `goworkOnlyRatedReviews` (type: `boolean`):

When enabled, each company row includes only review threads that have a 1–5 star rating. Text-only comments or Q\&A without stars are excluded from the `reviews` array. Replies under kept threads are still included.

## `maxItems` (type: `integer`):

Maximum number of company detail pages to queue per run. Default: `10000`.

## `maxConcurrency` (type: `integer`):

Maximum concurrent requests.

## `minConcurrency` (type: `integer`):

Minimum concurrent requests.

## `maxRequestRetries` (type: `integer`):

Retries per failed request.

## `proxy` (type: `object`):

Proxy settings for the crawler.

## Actor input object example

```json
{
  "startUrls": [
    {
      "url": "https://gowork.fr/b-hive-mulhouse"
    }
  ],
  "goworkOnlyRatedReviews": false,
  "maxItems": 10000,
  "maxConcurrency": 100,
  "minConcurrency": 1,
  "maxRequestRetries": 100,
  "proxy": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  }
}
```

# Actor output Schema

## `results` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        {
            "url": "https://gowork.fr/b-hive-mulhouse"
        }
    ],
    "proxy": {
        "useApifyProxy": true,
        "apifyProxyGroups": [
            "RESIDENTIAL"
        ]
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("memo23/gowork-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrls": [{ "url": "https://gowork.fr/b-hive-mulhouse" }],
    "proxy": {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"],
    },
}

# Run the Actor and wait for it to finish
run = client.actor("memo23/gowork-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    {
      "url": "https://gowork.fr/b-hive-mulhouse"
    }
  ],
  "proxy": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  }
}' |
apify call memo23/gowork-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=memo23/gowork-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "GoWork Scraper (FR, DE & ES)",
        "description": "Structured employer data (FR, DE, ES): one record per company with page and Open Graph metadata, Schema.org JSON-LD, flattened organization attributes, enriched firmographics and rating distribution, threaded reviews with replies, crawl provenance via parseMeta, and Cloudflare challenge detection.",
        "version": "0.0",
        "x-build-id": "EDHy1AqJjjSnIVmtx"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/memo23~gowork-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-memo23-gowork-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/memo23~gowork-scraper/runs": {
            "post": {
                "operationId": "runs-sync-memo23-gowork-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/memo23~gowork-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-memo23-gowork-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "startUrls"
                ],
                "properties": {
                    "startUrls": {
                        "title": "Start URLs",
                        "type": "array",
                        "description": "GoWork URLs on `https://gowork.fr/`, `https://gowork.de/`, or `https://es.gowork.com/` (Spain): the homepage (review feed, paginated with `?page=2`), company profile slug paths, or search (`/search?…`).",
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    },
                    "goworkOnlyRatedReviews": {
                        "title": "Only reviews with star ratings",
                        "type": "boolean",
                        "description": "When enabled, each company row includes only review threads that have a 1–5 star rating. Text-only comments or Q&A without stars are excluded from the `reviews` array. Replies under kept threads are still included.",
                        "default": false
                    },
                    "maxItems": {
                        "title": "Results Limit",
                        "type": "integer",
                        "description": "Maximum number of company records to scrape per run.",
                        "default": 10000
                    },
                    "maxConcurrency": {
                        "title": "Max Concurrency",
                        "type": "integer",
                        "description": "Maximum concurrent requests.",
                        "default": 100
                    },
                    "minConcurrency": {
                        "title": "Min Concurrency",
                        "type": "integer",
                        "description": "Minimum concurrent requests.",
                        "default": 1
                    },
                    "maxRequestRetries": {
                        "title": "Max Request Retries",
                        "type": "integer",
                        "description": "Retries per failed request.",
                        "default": 100
                    },
                    "proxy": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Proxy settings for the crawler.",
                        "default": {
                            "useApifyProxy": true,
                            "apifyProxyGroups": [
                                "RESIDENTIAL"
                            ]
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
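
As a rough illustration of how the `run-sync` path and `inputSchema` above fit together, the sketch below builds an input object and a `POST` request using only the Python standard library. The base URL `https://api.apify.com/v2` and the helper names (`build_input`, `build_request`, `run_sync`) are assumptions made for this example; they are not part of the definition above, and the official `apify-client` package is usually the more convenient choice.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Apify API base URL; it is not stated in this part of the definition.
API_BASE = "https://api.apify.com/v2"
# Path taken from the OpenAPI definition above.
RUN_SYNC_PATH = "/acts/memo23~gowork-scraper/run-sync"


def build_input(urls, only_rated=False, max_items=10000):
    """Build an input object matching inputSchema (hypothetical helper)."""
    # The schema only requires that startUrls is present; rejecting an
    # empty list is an extra guard added for this sketch.
    if not urls:
        raise ValueError("startUrls is required and must not be empty")
    return {
        "startUrls": [{"url": u} for u in urls],
        "goworkOnlyRatedReviews": bool(only_rated),
        "maxItems": int(max_items),
        # Same value as the schema default for the proxy field.
        "proxy": {"useApifyProxy": True, "apifyProxyGroups": ["RESIDENTIAL"]},
    }


def build_request(token, actor_input):
    """Prepare the POST request; the token goes in the query string."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        f"{API_BASE}{RUN_SYNC_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(token, actor_input):
    """Send the request and return the raw response body as bytes."""
    with urllib.request.urlopen(build_request(token, actor_input)) as resp:
        return resp.read()
```

A call such as `run_sync(token, build_input(["https://gowork.fr/"], max_items=10))` would start a small run and block until it finishes. Because a run can take minutes, a production integration would more likely start the run asynchronously and poll its status.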
