# Resume / Candidate Profile Scraper (`coregent/resume-candidate-profile-scraper`) Actor

Extract structured candidate data from public resume, portfolio, GitHub, and profile URLs into flat, CSV-ready rows with skills, visible contacts, profile links, and a completeness score — no login, cookies, or residential proxy.

- **URL**: https://apify.com/coregent/resume-candidate-profile-scraper.md
- **Developed by:** [Delowar Munna](https://apify.com/coregent) (community)
- **Categories:** Jobs, Automation, Lead generation
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $2.40 / 1,000 candidate-results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Since this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Resume / Candidate Profile Scraper

![Resume / Candidate Profile Scraper](https://raw.githubusercontent.com/coregentdevspace/resume-candidate-profile-scraper-assets/main/thumbnail-resume-candidate-profile-scraper.png)

Turn **public resume, portfolio, GitHub, and profile URLs** into clean, flat, **CSV-ready candidate records** — names, titles, skills, location, publicly visible contacts, profile links, and a transparent **profile-completeness score**. Built for **recruiters, sourcing agencies, HR analysts, and staffing teams**.

**Public-only. No login, no cookies, no sessions, no residential proxy, no paid enrichment.** You supply a list of public URLs; the actor fetches each one over HTTP, parses HTML pages and directly public PDF resumes, and returns one flat row per candidate. You pay one flat event per saved unique candidate.

### ✨ Why this scraper

- **Public-only & safe** — no logged-in LinkedIn / Indeed Resume / Naukri / Seek private databases, no cookies, no credentials. Login-required pages are skipped, never bypassed.
- **Mixed inputs** — public HTML profiles, personal sites, portfolios, public GitHub profiles, and directly public PDF/text resumes, all into **one stable schema**.
- **32 flat fields** — identity, visible contacts, profile links, source tracking, detected skills, completeness score. No nested objects; drops straight into Sheets/Excel/CRMs.
- **Transparent completeness score** — rule-based (no AI), explained below.
- **Pay-Per-Event** — one flat `candidate-result` event per saved unique candidate. Failed, skipped, duplicate, and filtered rows are never charged.

***

### 🚀 Quick start — sample inputs

#### Example 1 — mixed public URLs with skill detection

```json
{
    "startUrls": [
        { "url": "https://github.com/addyosmani" },
        { "url": "https://github.com/sindresorhus" },
        { "url": "https://kentcdodds.com" }
    ],
    "sourceType": "auto",
    "maxResults": 100,
    "includePdfText": true,
    "skillKeywords": ["TypeScript", "React", "AWS", "Node.js"],
    "deduplicate": true,
    "proxyConfiguration": { "useApifyProxy": true }
}
````

#### Example 2 — filtered shortlist + custom residential proxy via your own provider

```json
{
    "startUrls": [
        { "url": "https://github.com/yyx990803" },
        { "url": "https://github.com/antfu" },
        { "url": "https://github.com/getify" },
        { "url": "https://feross.org" }
    ],
    "maxResults": 250,
    "requiredKeywords": ["typescript"],
    "minCompletenessScore": 50,
    "deduplicate": true,
    "proxyConfiguration": {
        "useApifyProxy": false,
        "proxyUrls": ["http://user:pass@proxy.iproyal.com:12321"]
    }
}
```

> Tip: a public resume URL like `{ "url": "https://example.com/jane-doe-resume.pdf" }` also works — directly public PDF resumes are parsed with `includePdfText: true` and fill the `education_summary` / `experience_summary` / `certifications_text` fields that profile pages usually leave empty.

> Provide **at least one** valid public HTTP/HTTPS URL in `startUrls`. Unsupported protocols (`file:`, `ftp:`, `mailto:`, `tel:`) are rejected, and duplicate URLs are removed before crawling.

> The actor blocks Apify Residential proxy; if you need residential routing, supply your own provider via `proxyConfiguration.proxyUrls` as shown. See **🚦 Proxy policy** below.
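
The start URL rules above can be checked locally before a run. The sketch below is a hypothetical pre-flight helper (`clean_start_urls` is not part of the Actor). It keeps only HTTP/HTTPS URLs and drops exact duplicates, which approximates the described behavior but may not match the Actor's own validation in every edge case.

```python
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https"}


def clean_start_urls(urls):
    """Return startUrls entries for valid, unique HTTP/HTTPS URLs (order preserved)."""
    seen = set()
    cleaned = []
    for raw in urls:
        url = raw.strip()
        parsed = urlparse(url)
        # Reject file:, ftp:, mailto:, tel: and anything without a host part.
        if parsed.scheme.lower() not in ALLOWED_SCHEMES or not parsed.netloc:
            continue
        # Drop exact duplicates so each URL is listed once.
        if url in seen:
            continue
        seen.add(url)
        cleaned.append({"url": url})
    return cleaned
```

The returned list can be passed as the `startUrls` value of the Actor input.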

***

### 📦 Output

The dataset has one view: **Candidates** — a 32-column flat table.

![Resume / Candidate Profile Scraper — all-fields table view](https://raw.githubusercontent.com/coregentdevspace/resume-candidate-profile-scraper-assets/main/resume-candidate-profile-scraper-output-all-fields-table-view.png)

#### Output fields (32)

`candidate_name`, `headline`, `current_title`, `current_company`, `location_text`, `email`, `phone`, `website_url`, `linkedin_url`, `github_url`, `portfolio_url`, `source_url`, `canonical_url`, `source_domain`, `source_type`, `resume_file_type`, `skills_detected`, `skill_count`, `matched_keywords`, `experience_years_text`, `education_summary`, `experience_summary`, `certifications_text`, `languages_text`, `public_contact_available`, `profile_completeness_score`, `profile_quality_label`, `reason_tags`, `page_title`, `page_text_snippet`, `input_index`, `scraped_at`.

Scalar fields fall back to `null`, comma-joined lists to `""`, counts/scores to `0`, and booleans to `false` when a value isn't visibly present.
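
Because list-like fields arrive as comma-joined strings, downstream code usually needs to split them. The following is a minimal sketch of that step; the helper names (`split_list`, `normalize_row`) are illustrative and the choice of which fields to treat as lists is an assumption based on the samples in this README.

```python
# Fields that the README shows as comma-joined strings (assumed list).
LIST_FIELDS = ("skills_detected", "matched_keywords", "reason_tags")


def split_list(value):
    """Split a comma-joined field into a list; "" or None becomes []."""
    if not value:
        return []
    return [part.strip() for part in value.split(",") if part.strip()]


def normalize_row(row):
    """Return a copy of the row with comma-joined fields expanded to lists."""
    out = dict(row)
    for field in LIST_FIELDS:
        out[field] = split_list(row.get(field))
    return out
```

Scalar fields are left untouched, so `null` values stay `None` after normalization.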

#### Sample records — Candidates

Real output rows (public GitHub / personal-site profiles). Fields populate from what's publicly visible — resume-section fields (`education_summary`, `experience_summary`, `certifications_text`) are blank on profile pages and fill in from public **resume PDFs**.

A public GitHub profile (`github_profile`):

```json
{
    "candidate_name": "Addy Osmani",
    "headline": "Director at Google working on Gemini and Google Cloud",
    "current_title": "Director",
    "current_company": "Google",
    "location_text": "Sunnyvale, California",
    "email": null,
    "phone": null,
    "website_url": "https://www.addyosmani.com/",
    "linkedin_url": "https://www.linkedin.com/in/addyosmani",
    "github_url": "https://github.com/addyosmani",
    "portfolio_url": null,
    "source_url": "https://github.com/addyosmani",
    "canonical_url": "https://github.com/addyosmani",
    "source_domain": "github.com",
    "source_type": "github_profile",
    "resume_file_type": "html",
    "skills_detected": "javascript, html, css, react, vue, angular, google cloud",
    "skill_count": 7,
    "matched_keywords": "react",
    "experience_years_text": null,
    "education_summary": null,
    "experience_summary": null,
    "certifications_text": null,
    "languages_text": null,
    "public_contact_available": false,
    "profile_completeness_score": 70,
    "profile_quality_label": "high",
    "reason_tags": "has_linkedin,has_github,has_skills,keyword_match",
    "page_title": "addyosmani (Addy Osmani) · GitHub",
    "page_text_snippet": null,
    "input_index": 4,
    "scraped_at": "2026-06-07T12:33:34.659Z"
}
```

A personal-site profile (`public_profile`) with a visible contact:

```json
{
    "candidate_name": "Lee Robinson",
    "headline": "VP of Developer Experience",
    "current_title": "VP of Developer Experience",
    "current_company": "Cursor",
    "location_text": null,
    "email": "lee@leerob.com",
    "phone": null,
    "website_url": "https://leerob.com/",
    "linkedin_url": "https://www.linkedin.com/in/leeerob",
    "github_url": "https://github.com/leerob",
    "portfolio_url": null,
    "source_url": "https://leerob.com/",
    "canonical_url": "https://leerob.com/",
    "source_domain": "leerob.com",
    "source_type": "public_profile",
    "resume_file_type": "html",
    "skills_detected": "",
    "skill_count": 0,
    "matched_keywords": "",
    "experience_years_text": "15 years",
    "education_summary": null,
    "experience_summary": null,
    "certifications_text": null,
    "languages_text": null,
    "public_contact_available": true,
    "profile_completeness_score": 65,
    "profile_quality_label": "medium",
    "reason_tags": "has_public_email,has_linkedin,has_github,public_profile",
    "page_title": "Lee Robinson",
    "page_text_snippet": null,
    "input_index": 19,
    "scraped_at": "2026-06-07T12:33:41.035Z"
}
```

***

### 🎯 Profile-completeness score

Transparent rule-based score (0–100) computed from extracted fields — no AI, no external enrichment.

| Signal                                                           | Points |
| ---------------------------------------------------------------- | -----: |
| `candidate_name` present                                         |    +15 |
| `headline` or `current_title` present                            |    +15 |
| `current_company` present                                        |    +10 |
| `location_text` present                                          |    +10 |
| at least one public contact (`email` or `phone`)                 |    +15 |
| any profile link (`linkedin` / `github` / `portfolio` / website) |    +10 |
| `skill_count >= 3`                                               |    +10 |
| `experience_summary` present                                     |    +10 |
| `education_summary` or `certifications_text` present             |     +5 |

Score is capped at 100.

**Labels**: `high` (70–100) · `medium` (40–69) · `low` (0–39).
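
To make the scoring table concrete, here is a small sketch that recomputes the score and label from an output row. It follows the table and label ranges above; the function names are illustrative, and the Actor's internal implementation may differ in details such as how empty strings are treated.

```python
def present(value):
    """Treat None and empty strings as missing."""
    return value not in (None, "")


def completeness_score(row):
    """Recompute the rule-based score (0-100) from a flat output row."""
    score = 0
    if present(row.get("candidate_name")):
        score += 15
    if present(row.get("headline")) or present(row.get("current_title")):
        score += 15
    if present(row.get("current_company")):
        score += 10
    if present(row.get("location_text")):
        score += 10
    if present(row.get("email")) or present(row.get("phone")):
        score += 15
    link_fields = ("linkedin_url", "github_url", "portfolio_url", "website_url")
    if any(present(row.get(field)) for field in link_fields):
        score += 10
    if (row.get("skill_count") or 0) >= 3:
        score += 10
    if present(row.get("experience_summary")):
        score += 10
    if present(row.get("education_summary")) or present(row.get("certifications_text")):
        score += 5
    return min(score, 100)


def quality_label(score):
    """Map a score to the high / medium / low label ranges."""
    if score >= 70:
        return "high"
    if score >= 40:
        return "medium"
    return "low"
```

Applied to the two sample records above, this yields 70 (`high`) for the GitHub profile and 65 (`medium`) for the personal site, matching the published rows.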

`reason_tags` is a comma-separated list explaining the row — e.g. `has_public_email`, `has_public_phone`, `has_linkedin`, `has_github`, `has_portfolio`, `has_skills`, `has_experience`, `has_education`, `resume_pdf`, `public_profile`, `low_information`, plus `keyword_match` / `location_match` when your filters matched.

***

### ⚙️ Filters

| Filter                 | Effect                                                                                                 |
| ---------------------- | ------------------------------------------------------------------------------------------------------ |
| `requiredKeywords`     | Keep only rows whose visible text or detected skills contain at least one keyword. Missing text fails. |
| `locationIncludes`     | Keep only rows whose `location_text` contains one of the values. Missing location fails (when set).    |
| `minCompletenessScore` | Keep only rows scoring at or above the threshold (0–100).                                              |
| `deduplicate`          | Drop duplicates by email, canonical/profile URL, or name + source; the richer duplicate is kept.       |

Filters are applied **after extraction** and **before** any dataset push or event charge. Filtered-out rows are counted in `filtered_out` and never charged.
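
If you want to re-filter an already exported dataset without another run, the same rules can be approximated client-side. This is a sketch under assumptions: the Actor matches against the full visible page text, which is not stored in the dataset, so the helper below (a hypothetical `passes_filters`) only searches `skills_detected`, `headline`, and `page_text_snippet`, and it assumes case-insensitive keyword matching.

```python
def passes_filters(row, required_keywords=(), location_includes=(), min_score=0):
    """Approximate the Actor's row filters on an exported dataset row."""
    if required_keywords:
        # Only fields present in the output row are searched (approximation).
        haystack = " ".join(
            str(row.get(field) or "")
            for field in ("skills_detected", "headline", "page_text_snippet")
        ).lower()
        if not any(keyword.lower() in haystack for keyword in required_keywords):
            return False
    if location_includes:
        # A missing location fails the filter once it is set.
        location = (row.get("location_text") or "").lower()
        if not any(value.lower() in location for value in location_includes):
            return False
    return (row.get("profile_completeness_score") or 0) >= min_score
```

For example, `[r for r in rows if passes_filters(r, required_keywords=["react"], min_score=50)]` keeps only rows mentioning React with a score of at least 50.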

***

### 💰 Pricing

**Pay-Per-Event**. One flat event per saved row (the final per-event price is configured in Apify Console):

| Event              | Charged when                                                                                       |
| ------------------ | -------------------------------------------------------------------------------------------------- |
| `candidate-result` | Once per unique candidate row that passed all filters and was successfully written to the dataset. |

So your bill is simply `results_saved × price_per_event`. The actor honors the user-configured per-run spending cap (Apify `eventChargeLimitReached`): it caps how many results it collects up-front to what the limit can pay for, and stops cleanly the moment the cap is reached during charging.

Not charged:

- Failed inputs and blocked/transient errors.
- Pages skipped because they require login / cookies / private access.
- Duplicates (by email, canonical/profile URL, name + source).
- Rows filtered out by `requiredKeywords` / `locationIncludes` / `minCompletenessScore`.
- Pure low-information / error rows (no useful candidate signal).
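
The billing formula `results_saved × price_per_event` is easy to estimate ahead of a run. The sketch below uses the listed starting price of $2.40 per 1,000 results purely as an example input; your actual per-event price depends on your plan, and both helper names are illustrative.

```python
from decimal import Decimal


def run_cost(results_saved, price_per_1000):
    """Estimated charge: results_saved x price_per_event."""
    price_per_event = Decimal(price_per_1000) / Decimal(1000)
    return Decimal(results_saved) * price_per_event


def max_results_for_budget(budget, price_per_1000):
    """Largest number of saved rows a spending cap can pay for."""
    price_per_event = Decimal(price_per_1000) / Decimal(1000)
    return int(Decimal(budget) // price_per_event)
```

`Decimal` is used instead of floats so that small per-event prices do not accumulate rounding error. Pass prices as strings, for example `run_cost(19, "2.40")`.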

#### 🚦 Proxy policy

Use **Apify Datacenter** proxy or **no proxy** for normal runs — both work for public resume/profile pages at this actor's conservative concurrency.

**Apify Residential proxy is not supported.** The actor will fail at startup if `proxyConfiguration.apifyProxyGroups` includes `RESIDENTIAL`. Reason: in pay-per-event actors, residential bandwidth (metered per GB) is billed to the developer, not the run user, so a single bandwidth-heavy run could cost more than the per-result event revenue it generates.

If you genuinely need residential routing, supply your own residential provider via the proxy editor's **Custom proxy URLs** field — that traffic goes through your provider, not Apify, and is unaffected:

```
http://user:pass@proxy.iproyal.com:12321
http://user:pass@proxy.brightdata.com:22225
http://user:pass@proxy.oxylabs.io:7777
```
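
Since a rejected proxy configuration fails the run at startup, it can be worth checking the input before submitting it. A minimal sketch of such a check follows; `uses_apify_residential` is a hypothetical helper, and the case-insensitive comparison is an assumption rather than documented Actor behavior.

```python
def uses_apify_residential(proxy_configuration):
    """Return True if the input requests the Apify RESIDENTIAL proxy group."""
    config = proxy_configuration or {}
    groups = config.get("apifyProxyGroups") or []
    # Custom proxyUrls are not inspected: that traffic goes through your own provider.
    return any(str(group).upper() == "RESIDENTIAL" for group in groups)
```

A configuration that sets `useApifyProxy` to `false` and lists its own `proxyUrls` passes this check, in line with the policy above.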

***

### 📊 Run summary

After each run, a `RUN_SUMMARY` entry is written to the key-value store:

```json
{
    "inputs_total": 20,
    "successful_inputs": 20,
    "failed_inputs": 0,
    "skipped_private_or_login_required": 0,
    "raw_results_found": 20,
    "results_saved": 19,
    "duplicates_removed": 1,
    "filtered_out": 0,
    "charged_events": 19,
    "charge_failures": 0,
    "blocked_requests": 0,
    "retry_count": 0,
    "pdfs_processed": 0,
    "pdfs_skipped": 0,
    "html_pages_processed": 20,
    "runtime_seconds": 12,
    "scraped_at": "2026-06-07T12:33:45.708Z"
}
```

`charged_events` equals the number of successfully saved unique candidate rows.
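
A run summary like the one above can be sanity-checked automatically after each run. The sketch below flags the conditions most likely to need attention; the helper name and the choice of checks are illustrative, and only the `charged_events` rule is stated by this README.

```python
def summary_warnings(summary):
    """Return human-readable warnings for a RUN_SUMMARY dictionary."""
    warnings = []
    # Documented rule: charged events equal saved unique candidate rows.
    if summary.get("charged_events") != summary.get("results_saved"):
        warnings.append("charged_events differs from results_saved")
    if (summary.get("charge_failures") or 0) > 0:
        warnings.append("some charge attempts failed")
    if (summary.get("failed_inputs") or 0) > 0:
        warnings.append("some inputs failed")
    if (summary.get("blocked_requests") or 0) > 0:
        warnings.append("some requests were blocked")
    return warnings
```

With the Python client, the record would typically be read with something like `client.key_value_store(run["defaultKeyValueStoreId"]).get_record("RUN_SUMMARY")`, whose `value` entry holds the summary; treat that call as an assumption to verify against the client documentation.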

***

### 🚧 Limitations (V1)

- **Public data only**: no login, cookies, sessions, or member-only content. Pages behind an auth/login wall, paywall, or captcha are skipped (counted in `skipped_private_or_login_required`), never bypassed.
- **HTTP-first**: HTML + directly public PDF/text resumes. No browser automation in V1 (a future opt-in), no media/image downloads, and no crawling beyond the URLs you provide.
- **Visible-only contacts**: `email` / `phone` are extracted only when publicly visible (mailto/tel links, structured data, or visible text). No enrichment, verification, or append.
- **No AI**: skills come from a static dictionary plus your `skillKeywords`; the completeness score is rule-based.
- **PDF caps**: PDFs over 10 MB are skipped; extracted text is truncated for memory safety. Only structured fields are stored — not full document text.

***

### ❓ FAQ

**Do I need any account, cookie, or API key?**
No. The actor only fetches public URLs over HTTP. No usernames, passwords, cookies, authorization headers, session tokens, or paid people-data vendor keys are accepted.

**Which URLs work best?**
Public personal sites / "about" pages, public portfolios, public GitHub profiles, and directly public PDF/text resumes. Private resume databases and logged-in LinkedIn/Indeed pages are out of scope.

**Why are some fields empty?**
Fields populate only when the value is visibly present on the page or in the PDF text. Missing scalars are `null`, missing lists are `""`.

**How is `profile_completeness_score` computed?**
A transparent rule-based sum (see above) — no AI. Use it with `minCompletenessScore` to keep only richer profiles.

**Can I export to CSV?**
Yes — every field is flat (no nested objects). Use Apify's CSV / Excel export, or the dataset API with `format=csv`.
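
Because rows contain no nested objects, they can also be written to CSV locally with the standard library. This is a minimal sketch for rows you have already fetched as dictionaries; `rows_to_csv` is an illustrative helper, not part of the Actor or the Apify client.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize flat candidate rows to CSV text; None becomes an empty cell."""
    if not rows:
        return ""
    # Column order follows the first row; extra keys in later rows are ignored.
    fieldnames = list(rows[0].keys())
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

For large datasets, the platform's own CSV export is likely the simpler route, since it avoids loading every row into memory.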

***

### 🛠️ Technical notes

- **Stack**: Node.js 22 · Apify SDK 3 · Crawlee `HttpCrawler` · Cheerio (HTML) · `unpdf` (public PDF text). No browser.
- **Concurrency**: `min=1`, `max=10` (conservative; tune after real runs).
- **Memory**: 1 GB min · 2 GB default · 4 GB max.
- **Proxy**: Apify Proxy enabled by default; custom proxy URLs accepted; Apify Residential rejected at startup.
- **Reliability**: session rotation, realistic headers, and retry/backoff on transient `429`/`5xx`. Auth walls and `401`/`403` are skipped without retry.
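
The reliability rule in the last bullet can be pictured as a simple status classifier. The sketch below is not the Actor's code: the function names, the `"fail"` bucket for other statuses, and the exponential backoff parameters are all assumptions chosen for illustration.

```python
def classify_status(status):
    """Decide how to treat an HTTP status: 'ok', 'retry', 'skip', or 'fail'."""
    if status in (401, 403):
        return "skip"  # auth walls are skipped without retry
    if status == 429 or 500 <= status <= 599:
        return "retry"  # transient: rate limiting or server errors
    if 200 <= status <= 299:
        return "ok"
    return "fail"  # assumption: everything else is a non-retried failure


def backoff_delay(attempt, base=1.0, cap=30.0):
    """Exponential backoff in seconds for a zero-based attempt number."""
    return min(cap, base * (2 ** attempt))
```

Capping the delay keeps a long retry sequence from stalling the run on a single URL.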

# Actor input Schema

## `startUrls` (type: `array`):

Public resume, portfolio, GitHub, or profile URLs to process. Each URL is fetched once and turned into one candidate record. Public pages only — login-required, cookie-required, or private pages are skipped, never bypassed.

## `sourceType` (type: `string`):

Optional hint about the page type. Affects parser priority only, never access rules. Leave on Auto-detect for mixed lists.

## `maxResults` (type: `integer`):

Maximum number of saved unique candidate rows across the whole run. Range 1–5000.

## `includePdfText` (type: `boolean`):

Extract text from directly public PDF resumes (no login). PDFs over 10 MB are skipped. Turn off to skip PDFs entirely.

## `includePageTextSnippet` (type: `boolean`):

Add a short visible-text snippet (under 500 chars) per row for audit/debugging. Off by default to keep rows lean.

## `skillKeywords` (type: `array`):

Extra skills/keywords to detect in visible text (in addition to the built-in skill dictionary). Matched skills appear in `skills_detected` and `matched_keywords`. Max 200.

## `requiredKeywords` (type: `array`):

Save only candidates whose visible text or detected skills contain at least one of these keywords. Leave empty for no keyword filter. Max 100.

## `locationIncludes` (type: `array`):

Save only candidates whose location text contains at least one of these values (case-insensitive). Rows with no detected location fail this filter only when it is set. Max 50.

## `minCompletenessScore` (type: `integer`):

Save only candidates whose `profile_completeness_score` (0–100) is at least this value. A value of 0 saves all rows.

## `deduplicate` (type: `boolean`):

Remove duplicate candidate rows by email, canonical/profile URL, or name + source. The richer of any duplicate pair is kept. Recommended ON.
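
The deduplication rule can be sketched as follows. This is an illustration under assumptions, not the Actor's implementation: "richer" is interpreted here as the higher `profile_completeness_score`, keys are compared case-insensitively, and the helper names are hypothetical.

```python
def candidate_keys(row):
    """Build the identity keys used to detect duplicates."""
    keys = []
    if row.get("email"):
        keys.append(("email", row["email"].lower()))
    url = row.get("canonical_url") or row.get("source_url")
    if url:
        keys.append(("url", url))
    if row.get("candidate_name") and row.get("source_domain"):
        keys.append(("name", row["candidate_name"].lower(), row["source_domain"]))
    return keys


def dedupe(rows):
    """Keep one row per candidate, preferring the higher completeness score."""
    kept = []   # surviving rows, in first-seen order
    index = {}  # identity key -> position in kept
    for row in rows:
        keys = candidate_keys(row)
        pos = next((index[key] for key in keys if key in index), None)
        if pos is None:
            kept.append(row)
            pos = len(kept) - 1
        elif (row.get("profile_completeness_score") or 0) > (
            kept[pos].get("profile_completeness_score") or 0
        ):
            kept[pos] = row  # the richer duplicate replaces the earlier one
        for key in keys:
            index.setdefault(key, pos)
    return kept
```

Rows without any identity key are always kept, since there is nothing to match them against.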

## `proxyConfiguration` (type: `object`):

Apify Proxy configuration. Defaults to Apify Proxy enabled. Apify Residential is NOT supported and will fail the run at startup; if you need residential routing, supply your own provider via Custom proxy URLs (proxyUrls).

## Actor input object example

```json
{
  "startUrls": [
    {
      "url": "https://github.com/torvalds"
    },
    {
      "url": "https://github.com/gaearon"
    },
    {
      "url": "https://github.com/sindresorhus"
    }
  ],
  "sourceType": "auto",
  "maxResults": 100,
  "includePdfText": true,
  "includePageTextSnippet": false,
  "skillKeywords": [
    "Python",
    "SQL",
    "React",
    "AWS"
  ],
  "requiredKeywords": [],
  "locationIncludes": [],
  "minCompletenessScore": 0,
  "deduplicate": true,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
```

# Actor output Schema

## `overview` (type: `string`):

Flat, CSV-friendly table of every candidate row pushed to the dataset — identity, public contact (visible-only), profile links, source tracking, detected skills, and the profile-completeness score.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        {
            "url": "https://github.com/torvalds"
        },
        {
            "url": "https://github.com/gaearon"
        },
        {
            "url": "https://github.com/sindresorhus"
        }
    ],
    "skillKeywords": [
        "Python",
        "SQL",
        "React",
        "AWS"
    ],
    "proxyConfiguration": {
        "useApifyProxy": true
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("coregent/resume-candidate-profile-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrls": [
        { "url": "https://github.com/torvalds" },
        { "url": "https://github.com/gaearon" },
        { "url": "https://github.com/sindresorhus" },
    ],
    "skillKeywords": [
        "Python",
        "SQL",
        "React",
        "AWS",
    ],
    "proxyConfiguration": { "useApifyProxy": True },
}

# Run the Actor and wait for it to finish
run = client.actor("coregent/resume-candidate-profile-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    {
      "url": "https://github.com/torvalds"
    },
    {
      "url": "https://github.com/gaearon"
    },
    {
      "url": "https://github.com/sindresorhus"
    }
  ],
  "skillKeywords": [
    "Python",
    "SQL",
    "React",
    "AWS"
  ],
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}' |
apify call coregent/resume-candidate-profile-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=coregent/resume-candidate-profile-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Resume / Candidate Profile Scraper",
        "description": "Extract structured candidate data from public resume, portfolio, GitHub, and profile URLs into flat, CSV-ready rows with skills, visible contacts, profile links, and a completeness score — no login, cookies, or residential proxy.",
        "version": "1.0",
        "x-build-id": "I1P41upLq1kKeSdsT"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/coregent~resume-candidate-profile-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-coregent-resume-candidate-profile-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/coregent~resume-candidate-profile-scraper/runs": {
            "post": {
                "operationId": "runs-sync-coregent-resume-candidate-profile-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/coregent~resume-candidate-profile-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-coregent-resume-candidate-profile-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "startUrls": {
                        "title": "Public resume / profile URLs",
                        "type": "array",
                        "description": "Public resume, portfolio, GitHub, or profile URLs to process. Each URL is fetched once and turned into one candidate record. Public pages only — login-required, cookie-required, or private pages are skipped, never bypassed.",
                        "default": [],
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    },
                    "sourceType": {
                        "title": "Source type hint",
                        "enum": [
                            "auto",
                            "resume_pdf",
                            "resume_html",
                            "portfolio",
                            "public_profile",
                            "github_profile",
                            "personal_site"
                        ],
                        "type": "string",
                        "description": "Optional hint about the page type. Affects parser priority only, never access rules. Leave on Auto-detect for mixed lists.",
                        "default": "auto"
                    },
                    "maxResults": {
                        "title": "Max results",
                        "minimum": 1,
                        "maximum": 5000,
                        "type": "integer",
                        "description": "Maximum number of saved unique candidate rows across the whole run. Range 1–5000.",
                        "default": 100
                    },
                    "includePdfText": {
                        "title": "Parse public PDF resumes",
                        "type": "boolean",
                        "description": "Extract text from directly public PDF resumes (no login). PDFs over 10 MB are skipped. Turn off to skip PDFs entirely.",
                        "default": true
                    },
                    "includePageTextSnippet": {
                        "title": "Include page text snippet",
                        "type": "boolean",
                        "description": "Add a short visible-text snippet (under 500 chars) per row for audit/debugging. Off by default to keep rows lean.",
                        "default": false
                    },
                    "skillKeywords": {
                        "title": "Skill keywords",
                        "type": "array",
                        "description": "Extra skills/keywords to detect in visible text (in addition to the built-in skill dictionary). Matched skills appear in skills_detected and matched_keywords. Max 200.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "requiredKeywords": {
                        "title": "Required keywords",
                        "type": "array",
                        "description": "Save only candidates whose visible text or detected skills contain at least one of these keywords. Leave empty for no keyword filter. Max 100.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "locationIncludes": {
                        "title": "Location includes",
                        "type": "array",
                        "description": "Save only candidates whose location text contains at least one of these values (case-insensitive). When this filter is set, rows with no detected location are excluded. Max 50.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "minCompletenessScore": {
                        "title": "Minimum completeness score",
                        "minimum": 0,
                        "maximum": 100,
                        "type": "integer",
                        "description": "Save only candidates whose profile_completeness_score (0–100) is at least this value. 0 saves all.",
                        "default": 0
                    },
                    "deduplicate": {
                        "title": "Deduplicate candidates",
                        "type": "boolean",
                        "description": "Remove duplicate candidate rows by email, canonical/profile URL, or name + source. The richer of any duplicate pair is kept. Recommended ON.",
                        "default": true
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Apify Proxy configuration. Defaults to Apify Proxy enabled. Apify Residential is NOT supported and will fail the run at startup; if you need residential routing, supply your own provider via Custom proxy URLs (proxyUrls).",
                        "default": {
                            "useApifyProxy": true
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
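
The input constraints in the schema above can be checked locally before a run is started. The following is a minimal sketch, not part of the Actor: the helper name `check_actor_input` and its error messages are hypothetical, the integer ranges come from the `minimum`/`maximum` values in the schema, and the list limits are taken from the "Max N" notes in the field descriptions (the Actor may enforce them differently).

```python
from typing import Any

# Limits copied from the input schema above. The list limits come from the
# "Max N" notes in the field descriptions; how the Actor enforces them is
# an assumption.
INT_RANGES = {"maxResults": (1, 5000), "minCompletenessScore": (0, 100)}
LIST_LIMITS = {"skillKeywords": 200, "requiredKeywords": 100, "locationIncludes": 50}
BOOL_FIELDS = ("includePdfText", "includePageTextSnippet", "deduplicate")


def check_actor_input(run_input: dict[str, Any]) -> list[str]:
    """Return human-readable problems; an empty list means no issue was found."""
    problems: list[str] = []

    for field, (low, high) in INT_RANGES.items():
        if field in run_input:
            value = run_input[field]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int):
                problems.append(f"{field} must be an integer")
            elif not low <= value <= high:
                problems.append(f"{field} must be between {low} and {high}")

    for field, limit in LIST_LIMITS.items():
        if field in run_input:
            value = run_input[field]
            if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
                problems.append(f"{field} must be a list of strings")
            elif len(value) > limit:
                problems.append(f"{field} accepts at most {limit} items")

    for field in BOOL_FIELDS:
        if field in run_input and not isinstance(run_input[field], bool):
            problems.append(f"{field} must be a boolean")

    return problems
```

Fields that are omitted are skipped, since the schema supplies a default for each of them. A caller would typically refuse to start the run when the returned list is non-empty.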

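The `runsResponseSchema` above wraps all run details in a top-level `data` object. As a rough sketch, the fields most integrations need (run ID, status, default dataset ID, total cost) can be pulled out as shown below. The helper name `summarize_run` and the sample payload are hypothetical; only the key names are taken from the schema.

```python
from typing import Any


def summarize_run(response: dict[str, Any]) -> dict[str, Any]:
    """Pick the commonly used fields out of a run response body.

    Missing keys become None rather than raising, because the schema
    does not mark any of these properties as required.
    """
    data = response.get("data") or {}
    return {
        "run_id": data.get("id"),
        "status": data.get("status"),
        "dataset_id": data.get("defaultDatasetId"),
        "total_usd": data.get("usageTotalUsd"),
    }


# Illustrative payload shaped like the schema; the ID values are made up.
sample_response = {
    "data": {
        "id": "run-123",
        "status": "READY",
        "defaultDatasetId": "dataset-456",
        "usageTotalUsd": 0.00005,
    }
}

summary = summarize_run(sample_response)
print(summary["dataset_id"])
```

The `dataset_id` value is what a client would use to read the candidate rows once the run has finished.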