You can access the Fuzzy Search Dataset Actor programmatically from your own applications through the Apify API. The Actor can also be used from Python, JavaScript, the CLI, plain HTTP, or MCP; this page covers its OpenAPI definition. To use the Apify API, you need an Apify account and your API token, which you can find in the Integrations settings in Apify Console.

{
  "openapi": "3.0.1",
  "info": {
    "version": "0.0",
    "x-build-id": "ba7C4B3zrI1wsI9GW"
  },
  "servers": [
    {
      "url": "https://api.apify.com/v2"
    }
  ],
  "paths": {
    "/acts/dtrungtin~fuzzy-search-dataset-actor/run-sync-get-dataset-items": {
      "post": {
        "operationId": "run-sync-get-dataset-items-dtrungtin-fuzzy-search-dataset-actor",
        "x-openai-isConsequential": false,
        "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
        "tags": [
          "Run Actor"
        ],
        "requestBody": {
          "required": true,
          "content": {
            "application/json": {
              "schema": {
                "$ref": "#/components/schemas/inputSchema"
              }
            }
          }
        },
        "parameters": [
          {
            "name": "token",
            "in": "query",
            "required": true,
            "schema": {
              "type": "string"
            },
            "description": "Enter your Apify token here"
          }
        ],
        "responses": {
          "200": {
            "description": "OK"
          }
        }
      }
    },
    "/acts/dtrungtin~fuzzy-search-dataset-actor/runs": {
      "post": {
        "operationId": "runs-sync-dtrungtin-fuzzy-search-dataset-actor",
        "x-openai-isConsequential": false,
        "summary": "Executes an Actor and returns information about the initiated run in response.",
        "tags": [
          "Run Actor"
        ],
        "requestBody": {
          "required": true,
          "content": {
            "application/json": {
              "schema": {
                "$ref": "#/components/schemas/inputSchema"
              }
            }
          }
        },
        "parameters": [
          {
            "name": "token",
            "in": "query",
            "required": true,
            "schema": {
              "type": "string"
            },
            "description": "Enter your Apify token here"
          }
        ],
        "responses": {
          "200": {
            "description": "OK",
            "content": {
              "application/json": {
                "schema": {
                  "$ref": "#/components/schemas/runsResponseSchema"
                }
              }
            }
          }
        }
      }
    },
    "/acts/dtrungtin~fuzzy-search-dataset-actor/run-sync": {
      "post": {
        "operationId": "run-sync-dtrungtin-fuzzy-search-dataset-actor",
        "x-openai-isConsequential": false,
        "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
        "tags": [
          "Run Actor"
        ],
        "requestBody": {
          "required": true,
          "content": {
            "application/json": {
              "schema": {
                "$ref": "#/components/schemas/inputSchema"
              }
            }
          }
        },
        "parameters": [
          {
            "name": "token",
            "in": "query",
            "required": true,
            "schema": {
              "type": "string"
            },
            "description": "Enter your Apify token here"
          }
        ],
        "responses": {
          "200": {
            "description": "OK"
          }
        }
      }
    }
  },
  "components": {
    "schemas": {
      "inputSchema": {
        "type": "object",
        "required": [
          "datasetId",
          "query"
        ],
        "properties": {
          "datasetId": {
            "title": "Dataset ID",
            "type": "string",
            "description": "The Apify dataset ID that contains the records you want to search. You can find this in the dataset URL or in Apify Console. Example: 'abc123DEF456'."
          },
          "query": {
            "title": "Search Query",
            "type": "string",
            "description": "The text users want to search for. Fuzzy search allows typos and partial matches. Example: searching for 'iphon pro' can still match 'iPhone 15 Pro Max'."
          },
          "fields": {
            "title": "Search Fields",
            "type": "array",
            "description": "List of dataset fields to search in. You can search multiple fields at the same time. Examples: 'title', 'description', 'brand', 'category', or nested fields like 'product.name'.",
            "items": {
              "type": "string"
            },
            "default": [
              "title"
            ]
          },
          "limit": {
            "title": "Maximum Results",
            "minimum": 1,
            "maximum": 1000,
            "type": "integer",
            "description": "Maximum number of search results returned. Lower values improve performance. Recommended: 10 to 100.",
            "default": 20
          },
          "threshold": {
            "title": "Fuzzy Match Strictness",
            "minimum": 0,
            "maximum": 1,
            "type": "number",
            "description": "Controls how strict the fuzzy search is. Lower values require more accurate matches. Higher values allow more typos and loose matching. Recommended values: 0.2 = strict, 0.35 = balanced, 0.6 = very loose.",
            "default": 0.35
          },
          "ignoreLocation": {
            "title": "Ignore Word Position",
            "type": "boolean",
            "description": "When enabled, matches can appear anywhere inside the text. Example: searching 'pro' can match both 'iPhone Pro' and 'Professional Camera'. Recommended: enabled.",
            "default": true
          },
          "minMatchCharLength": {
            "title": "Minimum Match Length",
            "minimum": 1,
            "type": "integer",
            "description": "Minimum number of matched characters required before a result counts as a match. Helps reduce noisy results for very short queries. Recommended: 2 or 3.",
            "default": 2
          },
          "includeScore": {
            "title": "Include Relevance Score",
            "type": "boolean",
            "description": "Adds a relevance score to each result. Lower scores mean better matches. Useful for debugging, sorting, or displaying search confidence.",
            "default": true
          },
          "includeMatches": {
            "title": "Include Match Details",
            "type": "boolean",
            "description": "Returns detailed information about which text fragments matched the query. Useful for highlighting matched keywords in a frontend application.",
            "default": false
          },
          "extendedSearch": {
            "title": "Enable Advanced Search Syntax",
            "type": "boolean",
            "description": "Enables advanced query syntax. Examples: '^apple' = starts with apple, '!samsung' = exclude samsung, '=iphone' = exact match. Recommended for advanced users only.",
            "default": false
          },
          "weights": {
            "title": "Field Importance Weights",
            "type": "object",
            "description": "Optional JSON object that controls which fields are more important during ranking. Higher weight means higher priority in search results. Example: {\"title\": 0.7, \"description\": 0.2, \"brand\": 0.1}"
          }
        }
      },
      "runsResponseSchema": {
        "type": "object",
        "properties": {
          "data": {
            "type": "object",
            "properties": {
              "id": {
                "type": "string"
              },
              "actId": {
                "type": "string"
              },
              "userId": {
                "type": "string"
              },
              "startedAt": {
                "type": "string",
                "format": "date-time",
                "example": "2025-01-08T00:00:00.000Z"
              },
              "finishedAt": {
                "type": "string",
                "format": "date-time",
                "example": "2025-01-08T00:00:00.000Z"
              },
              "status": {
                "type": "string",
                "example": "READY"
              },
              "meta": {
                "type": "object",
                "properties": {
                  "origin": {
                    "type": "string",
                    "example": "API"
                  },
                  "userAgent": {
                    "type": "string"
                  }
                }
              },
              "stats": {
                "type": "object",
                "properties": {
                  "inputBodyLen": {
                    "type": "integer",
                    "example": 2000
                  },
                  "rebootCount": {
                    "type": "integer",
                    "example": 0
                  },
                  "restartCount": {
                    "type": "integer",
                    "example": 0
                  },
                  "resurrectCount": {
                    "type": "integer",
                    "example": 0
                  },
                  "computeUnits": {
                    "type": "integer",
                    "example": 0
                  }
                }
              },
              "options": {
                "type": "object",
                "properties": {
                  "build": {
                    "type": "string",
                    "example": "latest"
                  },
                  "timeoutSecs": {
                    "type": "integer",
                    "example": 300
                  },
                  "memoryMbytes": {
                    "type": "integer",
                    "example": 1024
                  },
                  "diskMbytes": {
                    "type": "integer",
                    "example": 2048
                  }
                }
              },
              "buildId": {
                "type": "string"
              },
              "defaultKeyValueStoreId": {
                "type": "string"
              },
              "defaultDatasetId": {
                "type": "string"
              },
              "defaultRequestQueueId": {
                "type": "string"
              },
              "buildNumber": {
                "type": "string",
                "example": "1.0.0"
              },
              "containerUrl": {
                "type": "string"
              },
              "usage": {
                "type": "object",
                "properties": {
                  "ACTOR_COMPUTE_UNITS": {
                    "type": "integer",
                    "example": 0
                  },
                  "DATASET_READS": {
                    "type": "integer",
                    "example": 0
                  },
                  "DATASET_WRITES": {
                    "type": "integer",
                    "example": 0
                  },
                  "KEY_VALUE_STORE_READS": {
                    "type": "integer",
                    "example": 0
                  },
                  "KEY_VALUE_STORE_WRITES": {
                    "type": "integer",
                    "example": 1
                  },
                  "KEY_VALUE_STORE_LISTS": {
                    "type": "integer",
                    "example": 0
                  },
                  "REQUEST_QUEUE_READS": {
                    "type": "integer",
                    "example": 0
                  },
                  "REQUEST_QUEUE_WRITES": {
                    "type": "integer",
                    "example": 0
                  },
                  "DATA_TRANSFER_INTERNAL_GBYTES": {
                    "type": "integer",
                    "example": 0
                  },
                  "DATA_TRANSFER_EXTERNAL_GBYTES": {
                    "type": "integer",
                    "example": 0
                  },
                  "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                    "type": "integer",
                    "example": 0
                  },
                  "PROXY_SERPS": {
                    "type": "integer",
                    "example": 0
                  }
                }
              },
              "usageTotalUsd": {
                "type": "number",
                "example": 0.00005
              },
              "usageUsd": {
                "type": "object",
                "properties": {
                  "ACTOR_COMPUTE_UNITS": {
                    "type": "integer",
                    "example": 0
                  },
                  "DATASET_READS": {
                    "type": "integer",
                    "example": 0
                  },
                  "DATASET_WRITES": {
                    "type": "integer",
                    "example": 0
                  },
                  "KEY_VALUE_STORE_READS": {
                    "type": "integer",
                    "example": 0
                  },
                  "KEY_VALUE_STORE_WRITES": {
                    "type": "number",
                    "example": 0.00005
                  },
                  "KEY_VALUE_STORE_LISTS": {
                    "type": "integer",
                    "example": 0
                  },
                  "REQUEST_QUEUE_READS": {
                    "type": "integer",
                    "example": 0
                  },
                  "REQUEST_QUEUE_WRITES": {
                    "type": "integer",
                    "example": 0
                  },
                  "DATA_TRANSFER_INTERNAL_GBYTES": {
                    "type": "integer",
                    "example": 0
                  },
                  "DATA_TRANSFER_EXTERNAL_GBYTES": {
                    "type": "integer",
                    "example": 0
                  },
                  "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                    "type": "integer",
                    "example": 0
                  },
                  "PROXY_SERPS": {
                    "type": "integer",
                    "example": 0
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
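
As a rough illustration of the definition above, the following Python sketch builds a request for the `run-sync-get-dataset-items` endpoint using only the standard library. The helper names (`build_run_sync_request`, `search_dataset`), the token placeholder, and the 300-second timeout are assumptions made for this example and are not part of the Actor or the Apify API; the dataset ID is the placeholder from the schema description. The sketch also assumes the endpoint returns the dataset items as a JSON array.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_ID = "dtrungtin~fuzzy-search-dataset-actor"


def build_run_sync_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build the POST request for run-sync-get-dataset-items (nothing is sent yet)."""
    # The definition marks `datasetId` and `query` as required input fields.
    missing = [key for key in ("datasetId", "query") if key not in actor_input]
    if missing:
        raise ValueError(f"missing required input fields: {missing}")

    # The definition declares the token as a required `token` query parameter.
    query_string = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_ID}/run-sync-get-dataset-items?{query_string}"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search_dataset(token: str, actor_input: dict, timeout: float = 300.0) -> list:
    """Send the request and decode the response body (performs a network call)."""
    request = build_run_sync_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Assumption: the body is a JSON array of dataset items.
        return json.load(response)


# Hypothetical input; replace the dataset ID and token with your own values.
example_input = {
    "datasetId": "abc123DEF456",
    "query": "iphon pro",
    "fields": ["title", "description"],
    "limit": 10,
    "threshold": 0.35,
}
request = build_run_sync_request("<YOUR_API_TOKEN>", example_input)
print(request.get_method(), request.full_url)
```

Keeping request construction separate from sending makes the URL and body easy to inspect before any Actor run is started; calling `search_dataset` with a real token is what actually triggers the run.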

Fuzzy Search Dataset Actor OpenAPI definition

OpenAPI is a standard for describing RESTful APIs in a machine-readable way: it defines an API's endpoints, parameters, and request and response formats, which simplifies development, integration, and documentation.

OpenAPI works well with AI agents and GPTs because it standardizes how these systems discover and call APIs, which makes integrations more reliable.

Because the specification is machine-readable, an AI model such as a GPT can determine which operations this Actor exposes and what input each one expects instead of guessing at the request format. This reduces integration errors and speeds up development.
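
To make "machine-readable" concrete, here is a minimal Python sketch of how a tool might enumerate the operations in an OpenAPI document. The function name `list_operations` is an assumption for this example, and `spec_excerpt` is a hand-trimmed copy of the `paths` section of the definition on this page; in practice you would load the full file with `json.load`.

```python
HTTP_METHODS = ("get", "put", "post", "delete", "patch", "head", "options", "trace")


def list_operations(spec: dict) -> list:
    """Return (METHOD, path, operationId) for every operation in an OpenAPI document."""
    operations = []
    for path, path_item in spec.get("paths", {}).items():
        for method, operation in path_item.items():
            # Path items may also hold non-operation keys such as "parameters".
            if method in HTTP_METHODS:
                operations.append((method.upper(), path, operation.get("operationId")))
    return operations


# Trimmed excerpt of the definition on this page (summaries, bodies, responses omitted).
spec_excerpt = {
    "openapi": "3.0.1",
    "paths": {
        "/acts/dtrungtin~fuzzy-search-dataset-actor/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-dtrungtin-fuzzy-search-dataset-actor"
            }
        },
        "/acts/dtrungtin~fuzzy-search-dataset-actor/runs": {
            "post": {"operationId": "runs-sync-dtrungtin-fuzzy-search-dataset-actor"}
        },
        "/acts/dtrungtin~fuzzy-search-dataset-actor/run-sync": {
            "post": {"operationId": "run-sync-dtrungtin-fuzzy-search-dataset-actor"}
        },
    },
}

for method, path, operation_id in list_operations(spec_excerpt):
    print(method, path, operation_id)
```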

You can download the OpenAPI definition for Fuzzy Search Dataset Actor using the link below:

OpenAPI.json
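
One possible use of the downloaded definition is checking Actor input on the client side before starting a run. The sketch below is a deliberately small checker, not a full JSON Schema validator: it covers only the `required`, `type`, `minimum`, and `maximum` keywords that appear in this Actor's `inputSchema`. The function name `validate_input` and the trimmed `input_schema` excerpt are assumptions for illustration; with the real file you would read `components.schemas.inputSchema` from the parsed JSON.

```python
# Python-side checks for the JSON Schema types used in this Actor's inputSchema.
# bool is excluded from the numeric types because it is a subclass of int in Python.
TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "boolean": lambda v: isinstance(v, bool),
    "array": lambda v: isinstance(v, list),
    "object": lambda v: isinstance(v, dict),
}


def validate_input(actor_input: dict, schema: dict) -> list:
    """Return a list of human-readable problems; an empty list means no problem was found."""
    errors = []
    for name in schema.get("required", []):
        if name not in actor_input:
            errors.append(f"{name}: required field is missing")

    properties = schema.get("properties", {})
    for name, value in actor_input.items():
        rules = properties.get(name)
        if rules is None:
            continue  # this sketch does not reject unknown fields
        expected = rules.get("type")
        if expected in TYPE_CHECKS and not TYPE_CHECKS[expected](value):
            errors.append(f"{name}: expected {expected}")
            continue
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            if "minimum" in rules and value < rules["minimum"]:
                errors.append(f"{name}: must be >= {rules['minimum']}")
            if "maximum" in rules and value > rules["maximum"]:
                errors.append(f"{name}: must be <= {rules['maximum']}")
    return errors


# Trimmed excerpt of components.schemas.inputSchema from the definition on this page.
input_schema = {
    "required": ["datasetId", "query"],
    "properties": {
        "datasetId": {"type": "string"},
        "query": {"type": "string"},
        "fields": {"type": "array"},
        "limit": {"type": "integer", "minimum": 1, "maximum": 1000},
        "threshold": {"type": "number", "minimum": 0, "maximum": 1},
    },
}

print(validate_input({"query": "iphon pro", "limit": 0}, input_schema))
```

For anything beyond a quick sanity check, a dedicated JSON Schema validation library is likely the better choice; this version only shows how the bounds and required fields in the definition translate into checks.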

If you’d like to learn more about how OpenAPI powers GPTs, read our blog post.

You can also check out our other API clients:

Fuzzy Search Dataset Actor API in Python

Fuzzy Search Dataset Actor API in JavaScript

Fuzzy Search Dataset Actor API through CLI

Fuzzy Search Dataset Actor API
