expressleasing-machine-search
Under maintenance
Crawler for construction machinery built with Apify, Airtable, and the OpenAI API. It returns the most relevant offer links from whitelisted suppliers, together with parameters such as brand, model, price, and year. Airtable serves both as the search interface and as the place where results are collected. The crawler was built on Apify (JavaScript + Playwright).
Pricing
from $0.00005 / actor start
Developer
DARIUSZ SUCHY
Maintained by Community
ExpressLeasing Machine Search Actor
Project purpose
This Actor automates the machine search process for ExpressLeasing client requests.
Business flow:
- The user enters a search phrase in Airtable, for example:
  Kubota KX027 up to 120,000 PLN net
  kubota kx016 price from 80000 to 120000 pln
  mini excavator 2.5-3 tons up to 130,000 PLN net
  JCB 3CX up to 180,000 PLN net
  Manitou telehandler
- Airtable triggers the Apify Actor.
- The Actor reads the supplier list and inventory URLs from Airtable.
- The Actor searches supplier inventory pages using Playwright.
- The Actor identifies candidate listings, scores them, and writes the results back to Airtable.
- Optionally, the Actor uses OpenAI to rerank the best candidates.
- Final results appear in the Wyniki wyszukiwania table in Airtable.
The MVP goal is to reduce manual browsing across many supplier websites and return the best matching offers and alternatives in one place.
Current MVP scope
The Actor currently supports:
- reading a search request from Airtable,
- reading suppliers from Airtable,
- cleaning supplier inventory URLs,
- skipping contact-page URLs,
- searching inventory pages with Playwright,
- detecting likely offer links,
- opening offer detail pages,
- extracting basic offer data:
  - title,
  - offer URL,
  - supplier,
  - net price,
  - brand,
  - model,
  - production year,
  - operating hours / mth,
  - machine weight in kg,
- rule-based scoring,
- result classification: Dokładne, Alternatywa, Słabe,
- supplier filtering by tags,
- optional OpenAI reranking,
- writing results back to Airtable.
Architecture
Airtable
├── Wyszukiwania maszyn
├── Dostawcy
└── Wyniki wyszukiwania
↓
Airtable Automation
↓
Apify Actor
↓
Playwright / Chromium
↓
Supplier websites / Otomoto / OLX / Machineryline / inventory pages
↓
Rule-based scoring
↓
Optional OpenAI reranking
↓
Airtable: Wyniki wyszukiwania
Main Airtable tables
1. Wyszukiwania maszyn
This is the table where the user enters search requests.
Required fields:
| Field | Type | Description |
|---|---|---|
| Nazwa | Text / formula | Search record name |
| Record ID | Formula | RECORD_ID() |
| Fraza wyszukiwania | Long text | Client search phrase |
| Status | Single select | Robocze, Do wyszukania, W trakcie, Gotowe, Błąd |
| Limit źrodeł / Limit źródeł | Number | Number of supplier sources to check |
| Limit wyników | Number | Number of final results to save |
| Max stron na źródło | Number | Number of inventory pages to scan per source |
| Parametry AI | Long text | Recognized query parameters and supplier filter data |
| Podsumowanie | Long text | Run summary |
| Liczba wyników | Number | Number of saved results |
| Run ID | Text | Apify run ID |
| Data startu | Date/time | Run start time |
| Data zakończenia | Date/time | Run end time |
| Błąd | Long text | Error message |
Setting Status = Do wyszukania starts the search automation.
2. Dostawcy
This table contains suppliers and their inventory links.
Required fields:
| Field | Type | Description |
|---|---|---|
| Name | Text | Supplier name |
| Kod dostawcy | Text | Supplier code |
| Produkty | Linked records / text | Related products or notes |
| Link do dostawcy 1 | URL | Supplier inventory URL |
| Tagi wyszukiwania | Multiple select | Supplier category and brand tags |
The field Link do dostawcy 1 should point to an inventory or offer-listing page, not to a contact page.
Good examples:
https://example.otomoto.pl/inventory
https://example.pl/maszyny-uzywane
https://www.olx.pl/oferty/uzytkownik/...
Weak examples:
https://example.pl/kontakt
https://example.pl/o-nas
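The URL cleaning and contact-page skipping listed in the MVP scope could look roughly like the sketch below. It is not the Actor's actual code: the function names, the list of path hints, and the decision to strip `utm_` parameters are all assumptions made for illustration.

```javascript
// Path segments that usually indicate a non-inventory page (assumed list).
const NON_INVENTORY_HINTS = ['kontakt', 'contact', 'o-nas', 'about'];

// Normalize a raw URL from Airtable: trim it, add https:// if the scheme is
// missing, and drop the fragment and utm_* tracking parameters.
function cleanInventoryUrl(raw) {
  const trimmed = String(raw || '').trim();
  if (!trimmed) return null;
  const withScheme = /^https?:\/\//i.test(trimmed) ? trimmed : `https://${trimmed}`;
  let url;
  try {
    url = new URL(withScheme);
  } catch {
    return null; // not a parseable URL
  }
  url.hash = '';
  for (const key of [...url.searchParams.keys()]) {
    if (key.toLowerCase().startsWith('utm_')) url.searchParams.delete(key);
  }
  return url.toString();
}

// True when any path segment looks like a contact or "about us" page.
function isContactPageUrl(cleanUrl) {
  const segments = new URL(cleanUrl).pathname.toLowerCase().split('/').filter(Boolean);
  return segments.some((segment) => NON_INVENTORY_HINTS.includes(segment));
}
```

A supplier whose cleaned URL is `null` or matches `isContactPageUrl` would simply be skipped before Playwright is started for it.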
3. Wyniki wyszukiwania
This table stores the results returned by the Actor.
Required fields:
| Field | Type | Description |
|---|---|---|
| Tytuł oferty | Text | Offer title |
| Wyszukiwanie | Linked record | Link to the search record |
| Score | Number | Rule-based score |
| Typ wyniku | Single select | Dokładne, Alternatywa, Słabe |
| Score AI | Number | OpenAI score |
| Typ wyniku AI | Single select | Dokładne, Alternatywa, Słabe, Odrzucone |
| Powód AI | Long text | OpenAI reasoning |
| OpenAI użyte | Checkbox | Whether OpenAI was used for this result |
| Dostawca | Linked record | Link to supplier |
| Link do oferty | URL | Direct offer URL |
| Cena netto | Currency / number | Net price, if detected |
| Marka | Text | Detected brand |
| Model | Text | Detected model |
| Rok produkcji | Number | Detected production year |
| Przebieg / mth | Number | Operating hours |
| Masa kg | Number | Machine weight in kg |
| Powód dopasowania | Long text | Rule-based matching reason |
| Braki danych | Long text | Missing data |
| Status wyniku | Single select | Do sprawdzenia, Dobry, Odrzucony |
| Data znalezienia | Date/time | Result creation time |
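As an illustration of how scraped offers might be mapped onto this table, the sketch below builds request bodies for the Airtable "create records" endpoint, which accepts at most 10 records per request. The shape of the `offer` object and both function names are hypothetical; only the Airtable field names come from the table above.

```javascript
const AIRTABLE_BATCH_SIZE = 10; // Airtable create-records limit per request

// Map one scraped offer (hypothetical shape) to Airtable fields.
function toResultFields(offer, searchRecordId) {
  const optional = {
    'Cena netto': offer.netPrice,
    'Marka': offer.brand,
    'Model': offer.model,
    'Rok produkcji': offer.year,
    'Przebieg / mth': offer.hours,
    'Masa kg': offer.weightKg,
  };
  const fields = {
    'Tytuł oferty': offer.title,
    'Wyszukiwanie': [searchRecordId], // linked-record fields take an array of record IDs
    'Link do oferty': offer.url,
    'Score': offer.score,
    'Typ wyniku': offer.type,
    'Status wyniku': 'Do sprawdzenia',
  };
  const missing = [];
  for (const [name, value] of Object.entries(optional)) {
    if (value == null) missing.push(name); // leave the field empty, note it below
    else fields[name] = value;
  }
  if (missing.length > 0) fields['Braki danych'] = missing.join(', ');
  return fields;
}

// Split all offers into request bodies of at most 10 records each.
function toCreateBatches(offers, searchRecordId) {
  const records = offers.map((offer) => ({ fields: toResultFields(offer, searchRecordId) }));
  const batches = [];
  for (let i = 0; i < records.length; i += AIRTABLE_BATCH_SIZE) {
    batches.push({ records: records.slice(i, i + AIRTABLE_BATCH_SIZE) });
  }
  return batches;
}
```

Each batch would then be sent as the JSON body of a `POST` to `https://api.airtable.com/v0/<AIRTABLE_BASE_ID>/<RESULTS_TABLE>` with the `AIRTABLE_PAT` as a bearer token.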
Supplier tags
The field Tagi wyszukiwania is used to reduce the number of supplier sources that must be searched.
Example category tags:
#minikoparki #koparki #koparki_kolowe #koparki_gasienicowe #koparko_ladowarki #ladowarki #ladowarki_teleskopowe #ladowarki_kolowe #mini_ladowarki #wozki_widlowe #podnosniki #walce #wozidla #wywrotki #zurawie_dekarskie #rolnicze #mix
Example brand tags:
#kubota #jcb #takeuchi #yanmar #bobcat #caterpillar #komatsu #volvo #manitou #merlo #liebherr #wacker_neuson #sany #hitachi #doosan #develon #case #new_holland #mecalac #kramer #hamm #bomag #ammann #dynapac #toyota #linde #still #unicarriers
Rules:
- tags describe the supplier's actual specialization,
- not every supplier needs to be tagged immediately,
- broad suppliers can be tagged with #mix,
- if there are too few matching tags, the Actor falls back to searching a broader supplier set.
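One way to read the rules above is sketched below. The function name and the supplier object shape are hypothetical, and two behaviours are assumptions rather than documented facts: that `#mix` suppliers always count as matching, and that the fallback keeps tag matches first and then appends every remaining supplier.

```javascript
// Pick suppliers whose tags overlap the tags derived from the query.
// If fewer than minTaggedSources match, fall back to the full supplier list,
// keeping the tag matches at the front so they are searched first.
function selectSuppliers(suppliers, queryTags, minTaggedSources = 12) {
  const wanted = new Set(queryTags.map((tag) => tag.toLowerCase()));
  const matches = (supplier) =>
    (supplier.tags || []).some((tag) => {
      const normalized = tag.toLowerCase();
      return normalized === '#mix' || wanted.has(normalized);
    });

  const tagged = suppliers.filter(matches);
  if (tagged.length >= minTaggedSources) return tagged;

  const rest = suppliers.filter((supplier) => !matches(supplier));
  return [...tagged, ...rest];
}
```

The default of 12 mirrors the documented `MIN_TAGGED_SOURCES` default; the search record's `Limit źrodeł` would still cap how many of the returned suppliers are actually visited.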
Actor input
The Actor requires one input value:
{"searchRecordId": "recXXXXXXXXXXXXXX"}
searchRecordId is the technical Airtable record ID from the Wyszukiwania maszyn table.
It can be exposed in Airtable using a formula field:
RECORD_ID()
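For illustration, the request that starts the Actor with this input could be assembled as follows. The sketch targets the Apify API "run Actor" endpoint (`POST /v2/acts/{actorId}/runs`); the function name and the actor ID used in the usage note are placeholders, not values from this project.

```javascript
// Build the HTTP request that starts an Actor run with the search record ID
// as input. Returns plain data so it can be passed straight to fetch().
function buildActorRunRequest({ actorId, apifyToken, recordId }) {
  if (!/^rec\w+$/.test(String(recordId || ''))) {
    // Same wording as the Actor's own error message for a missing ID.
    throw new Error('Brak searchRecordId w input Actora');
  }
  return {
    url: `https://api.apify.com/v2/acts/${encodeURIComponent(actorId)}/runs`,
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${apifyToken}`,
      },
      // The request body becomes the Actor input.
      body: JSON.stringify({ searchRecordId: recordId }),
    },
  };
}
```

In an Airtable "Run script" automation step, `recordId` would typically come from `input.config()`, and the result would be used as `await fetch(url, options)` with an actor ID such as `someuser~expressleasing-machine-search` (a made-up example).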
Environment variables
Required
| Name | Example | Secret | Description |
|---|---|---|---|
| AIRTABLE_PAT | pat... | Yes | Airtable Personal Access Token |
| AIRTABLE_BASE_ID | app... | No | Airtable base ID |
| SEARCHES_TABLE | Wyszukiwania maszyn | No | Search table name |
| SUPPLIERS_TABLE | Dostawcy | No | Supplier table name |
| RESULTS_TABLE | Wyniki wyszukiwania | No | Results table name |
Airtable field names
| Name | Default value | Description |
|---|---|---|
| SUPPLIER_NAME_FIELD | Name | Supplier name field |
| SUPPLIER_CODE_FIELD | Kod dostawcy | Supplier code field |
| INVENTORY_URL_FIELD | Link do dostawcy 1 | Supplier inventory URL field |
| SUPPLIER_TAGS_FIELD | Tagi wyszukiwania | Supplier tags field |
| SEARCH_QUERY_FIELD | Fraza wyszukiwania | Search phrase field |
Runtime limits
| Name | Default value | Description |
|---|---|---|
| MIN_TAGGED_SOURCES | 12 | Minimum number of sources after tag filtering |
| MAX_DETAILS_PER_SOURCE | 8 | Maximum number of offer details opened per supplier source |
Recommended test setting:
MAX_DETAILS_PER_SOURCE = 4 or 5
Lower values reduce runtime and cost.
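A minimal sketch of how these variables might be read, assuming the documented defaults and that a missing required variable aborts the run. The function names and the returned object shape are illustrative, not the Actor's real code.

```javascript
const REQUIRED = ['AIRTABLE_PAT', 'AIRTABLE_BASE_ID', 'SEARCHES_TABLE', 'SUPPLIERS_TABLE', 'RESULTS_TABLE'];

// Parse a positive integer from the environment, or use the fallback.
function readIntEnv(env, name, fallback) {
  const parsed = Number.parseInt(env[name] ?? '', 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

function loadConfig(env = process.env) {
  for (const name of REQUIRED) {
    if (!env[name]) throw new Error(`Brak ${name}`);
  }
  return {
    airtablePat: env.AIRTABLE_PAT,
    baseId: env.AIRTABLE_BASE_ID,
    tables: {
      searches: env.SEARCHES_TABLE,
      suppliers: env.SUPPLIERS_TABLE,
      results: env.RESULTS_TABLE,
    },
    fields: {
      supplierName: env.SUPPLIER_NAME_FIELD || 'Name',
      supplierCode: env.SUPPLIER_CODE_FIELD || 'Kod dostawcy',
      inventoryUrl: env.INVENTORY_URL_FIELD || 'Link do dostawcy 1',
      supplierTags: env.SUPPLIER_TAGS_FIELD || 'Tagi wyszukiwania',
      searchQuery: env.SEARCH_QUERY_FIELD || 'Fraza wyszukiwania',
    },
    minTaggedSources: readIntEnv(env, 'MIN_TAGGED_SOURCES', 12),
    maxDetailsPerSource: readIntEnv(env, 'MAX_DETAILS_PER_SOURCE', 8),
  };
}
```

Passing the environment in as a parameter keeps the loader easy to exercise with test values such as `MAX_DETAILS_PER_SOURCE = 4`.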
OpenAI
| Name | Example | Secret | Description |
|---|---|---|---|
| OPENAI_API_KEY | sk-... | Yes | OpenAI API key |
| USE_OPENAI_RERANK | true | No | Enables OpenAI reranking |
| OPENAI_MODEL | gpt-5.4-mini | No | Model used for reranking |
| OPENAI_RERANK_LIMIT | 20 | No | Number of candidates sent to OpenAI |
| OPENAI_FINAL_RESULTS | 5 | No | Final number of results after reranking |
If OpenAI fails, the Actor should still complete successfully and fall back to the rule-based ranking.
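The fallback behaviour described above can be sketched as a wrapper around the reranking call. Here `rerankFn` stands in for whatever function actually calls OpenAI; its name, its expected return shape, and the candidate fields are all assumptions.

```javascript
// Rerank the top rule-based candidates with an injected async function.
// If that function throws (bad key, unavailable model, rate limit, ...),
// log a warning and keep the rule-based order instead of failing the run.
async function rerankWithFallback(candidates, rerankFn, limit = 20) {
  const top = [...candidates].sort((a, b) => b.score - a.score).slice(0, limit);
  try {
    // Assumed result shape: [{ url, scoreAi, typeAi, reasonAi }]
    const aiResults = await rerankFn(top);
    const byUrl = new Map(aiResults.map((result) => [result.url, result]));
    return top
      .map((c) => ({ ...c, ...(byUrl.get(c.url) || {}), openAiUsed: byUrl.has(c.url) }))
      .sort((a, b) => (b.scoreAi ?? b.score) - (a.scoreAi ?? a.score));
  } catch (err) {
    console.warn(`OpenAI reranking failed, using rule-based ranking: ${err.message}`);
    return top.map((c) => ({ ...c, openAiUsed: false }));
  }
}
```

Because the error is caught inside the wrapper, the rest of the run (writing results, setting Status = Gotowe) would proceed the same way in both cases; only the `OpenAI użyte` flag differs.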
How to run a search
From Airtable
- Open the Wyszukiwania maszyn table.
- Create a new search record.
- Enter Fraza wyszukiwania, for example:
  kubota kx016 cena od 80000 do 120000 pln
- Set search parameters:
  Limit źrodeł = 30
  Limit wyników = 20
  Max stron na źródło = 1
- Set:
  Status = Do wyszukania
- Airtable Automation triggers the Apify Actor.
- When the Actor finishes, it updates:
  Status = Gotowe
  Liczba wyników = number of saved results
  Podsumowanie = run summary
- Results appear in the Wyniki wyszukiwania table.
Recommended test settings
Quick test
Limit źrodeł = 10
Limit wyników = 5
Max stron na źródło = 1
MAX_DETAILS_PER_SOURCE = 4
OPENAI_RERANK_LIMIT = 10
MVP test
Limit źrodeł = 30
Limit wyników = 10-20
Max stron na źródło = 1
MAX_DETAILS_PER_SOURCE = 4-5
OPENAI_RERANK_LIMIT = 15-20
Broader test
Limit źrodeł = 50
Limit wyników = 5-20
Max stron na źródło = 1
MAX_DETAILS_PER_SOURCE = 4-5
OPENAI_RERANK_LIMIT = 20
Do not increase Max stron na źródło to 2 until the first-page results are reliable.
How scoring works
The Actor uses two scoring layers.
1. Rule-based scoring
The rule-based scoring evaluates:
- exact model match,
- brand match,
- machine category,
- weight class,
- budget,
- production year,
- operating hours,
- title and description similarity,
- known model catalog.
Example:
Kubota KX016-4
for the query:
kubota kx016 cena od 80000 do 120000 pln
should be classified as Dokładne.
Possible alternatives:
Kubota KX018-4
Kubota KX019-4
JCB 8018
Takeuchi TB216
Yanmar SV17
may be classified as Alternatywa if they are in a similar weight class.
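To make the idea concrete, here is a toy version of rule-based scoring that reproduces the example above. The weights, the thresholds, the 25% weight tolerance, and the object shapes are invented for illustration; the Actor's real rules (including year, hours, text similarity, and the model catalog) are not shown in this document.

```javascript
// Hypothetical weights and thresholds, chosen only for illustration.
const WEIGHTS = { model: 40, category: 25, weight: 15, brand: 10, budget: 10 };
const THRESHOLDS = { exact: 80, alternative: 50 };

const normalize = (value) => String(value || '').toLowerCase().replace(/[^a-z0-9]/g, '');

function scoreOffer(query, offer) {
  const reasons = [];
  let score = 0;
  const add = (key, matched) => {
    if (matched) {
      score += WEIGHTS[key];
      reasons.push(key);
    }
  };

  // "KX016-4" counts as a match for the query model "kx016".
  add('model', Boolean(query.model) && normalize(offer.model).startsWith(normalize(query.model)));
  add('brand', Boolean(query.brand) && normalize(offer.brand) === normalize(query.brand));
  add('category', Boolean(query.category) && offer.category === query.category);
  // Similar weight class: within 25% of the target weight (assumed tolerance).
  add('weight', query.weightKg > 0 && offer.weightKg > 0 &&
    Math.abs(offer.weightKg - query.weightKg) / query.weightKg <= 0.25);
  const hasBudget = query.minPrice != null || query.maxPrice != null;
  add('budget', hasBudget && offer.netPrice > 0 &&
    offer.netPrice >= (query.minPrice ?? 0) && offer.netPrice <= (query.maxPrice ?? Infinity));

  const type = score >= THRESHOLDS.exact ? 'Dokładne'
    : score >= THRESHOLDS.alternative ? 'Alternatywa' : 'Słabe';
  return { score, type, reasons };
}
```

With these made-up weights an offer cannot reach Dokładne without a model match, while a same-category machine of similar weight within budget lands in Alternatywa. The `reasons` array is the kind of data that could feed the Powód dopasowania field.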
2. OpenAI reranking
If USE_OPENAI_RERANK = true, the Actor sends the best candidates to OpenAI.
OpenAI evaluates:
- whether the offer is in the correct machine category,
- whether it is the exact model,
- whether it is a reasonable alternative,
- whether the result is clearly from the wrong category,
- how to justify the match.
OpenAI results are written to:
Score AI
Typ wyniku AI
Powód AI
OpenAI użyte
Supported categories
The rule-based logic recognizes, among others:
minikoparki
midikoparki
koparko_ladowarki
koparki_kolowe
koparki_gasienicowe
ladowarki_teleskopowe
ladowarki_kolowe
mini_ladowarki
wozki_widlowe
podnosniki
walce
wozidla
wywrotki
zurawie_dekarskie
rolnicze
kruszarki_przesiewacze
Example test queries
kubota kx016 cena od 80000 do 120000 pln
Kubota KX027 do 120 000 zł netto
minikoparka 2,5-3 tony do 130 000 zł netto
JCB 3CX do 180 000 zł netto
ładowarka teleskopowa JCB 540-140
ładowarka teleskopowa Manitou 14m
koparka kołowa 15 ton
walec 2-3 tony
wózek widłowy LPG 2,5 tony
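Queries like these mix a machine description with an optional budget. The sketch below shows one possible way to pull the budget out with regular expressions; it is an assumption about how such parsing could work, not the Actor's parser, and it only handles the "od X do Y pln" and "do X zł" patterns seen in the examples.

```javascript
// Amount such as "120000" or "120 000" (space or non-breaking space as separator).
const AMOUNT = String.raw`(\d{1,3}(?:[ \u00a0]\d{3})+|\d+)`;
// Upper bound: "do <amount> zł" or "do <amount> pln".
const MAX_RE = new RegExp(String.raw`\bdo\s+${AMOUNT}\s*(?:zł|pln)`, 'i');
// Lower bound: "od <amount> do ...".
const MIN_RE = new RegExp(String.raw`\bod\s+${AMOUNT}\s+do\b`, 'i');

const toNumber = (text) => Number(text.replace(/[ \u00a0]/g, ''));

function parseBudget(query) {
  const max = MAX_RE.exec(query);
  const min = MIN_RE.exec(query);
  return {
    minPrice: min ? toNumber(min[1]) : null,
    maxPrice: max ? toNumber(max[1]) : null,
    net: /\bnetto\b/i.test(query), // true when the query explicitly says "netto"
  };
}
```

Requiring a currency after the upper bound keeps weight ranges such as "2,5-3 tony" or "15 ton" from being mistaken for prices.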
Expected Airtable output
Example result table:
| Score | Typ wyniku | Score AI | Typ wyniku AI | Offer | Supplier | URL |
|---|---|---|---|---|---|---|
| 94 | Dokładne | 96 | Dokładne | Kubota KX016-4 | Supplier X | URL |
| 78 | Alternatywa | 84 | Alternatywa | JCB 8018 | Supplier Y | URL |
| 72 | Alternatywa | 81 | Alternatywa | Takeuchi TB216 | Supplier Z | URL |
Known limitations
- The Actor does not use official marketplace APIs.
- The crawler works through Playwright, so runtime depends on website speed.
- Some websites may block automation, load offers dynamically, or change their HTML.
- Price, year, weight, and operating hours may not always be extracted correctly.
- Cena netto may be empty if the price is not available in a readable format.
- OpenAI reranking improves result quality but does not reduce scraping time.
- Supplier tags improve speed and relevance only after they are filled in.
- Rerunning the same search record may create duplicate results.
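The duplicate-results limitation could be mitigated by filtering new offers against the offer URLs already stored for the search record. The sketch below is a suggestion rather than existing Actor behaviour; the function names and the normalization rules (ignore `www.`, trailing slashes, and fragments) are assumptions.

```javascript
// Reduce an offer URL to a comparison key: host without "www.", path without
// trailing slashes, plus the query string. Fragments are ignored.
function offerKey(url) {
  const parsed = new URL(url);
  const host = parsed.hostname.replace(/^www\./, '');
  const path = parsed.pathname.replace(/\/+$/, '');
  return `${host}${path}${parsed.search}`.toLowerCase();
}

// Keep only offers whose URL is neither already stored nor repeated in this run.
function filterNewOffers(offers, existingUrls) {
  const seen = new Set(existingUrls.map(offerKey));
  return offers.filter((offer) => {
    const key = offerKey(offer.url);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}
```

`existingUrls` would come from the Link do oferty values of results already linked to the same Wyszukiwanie record.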
Performance and cost
Observed benchmark from an earlier version:
50 sources
1 page per source
5 results
around 33 minutes
around 0.45 USD
Main runtime drivers:
- number of sources,
- Max stron na źródło,
- MAX_DETAILS_PER_SOURCE,
- supplier website speed,
- number of offer detail pages opened,
- whether OpenAI reranking is enabled.
The simplest optimization:
MAX_DETAILS_PER_SOURCE = 4 or 5
Larger future optimization:
global candidate ranking before opening offer detail pages
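As a rough picture of what global candidate ranking could mean, the sketch below scores listing links from all sources by how many distinct query tokens appear in their listing titles, then keeps only the best few for detail-page visits. This is a planned optimization, so everything here (names, the token-overlap heuristic, the candidate shape) is hypothetical.

```javascript
// Lowercase and split on anything that is not a letter or digit (Unicode-aware).
function tokenize(text) {
  return String(text || '').toLowerCase().split(/[^\p{L}\p{N}]+/u).filter(Boolean);
}

// Rank candidates from all sources together and return the top `limit`,
// so detail pages are opened only for the most promising listings.
function pickCandidatesToOpen(candidates, query, limit) {
  const queryTokens = [...new Set(tokenize(query))];
  return candidates
    .map((candidate) => {
      const titleTokens = new Set(tokenize(candidate.listingTitle));
      const preScore = queryTokens.filter((token) => titleTokens.has(token)).length;
      return { ...candidate, preScore };
    })
    .sort((a, b) => b.preScore - a.preScore)
    .slice(0, limit);
}
```

Compared with a fixed per-source cap such as MAX_DETAILS_PER_SOURCE, a single global limit bounds the total number of detail pages regardless of how many sources are searched.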
Troubleshooting
Brak searchRecordId w input Actora
The Actor did not receive the Airtable search record ID.
Check the input:
{"searchRecordId": "recXXXXXXXXXXXXXX"}
Also check the Airtable Automation input variable recordId.
Brak AIRTABLE_PAT
The required environment variable is missing:
AIRTABLE_PAT
Add it in Apify as a secret.
Airtable API error 401 / 403
This usually means an Airtable token or permission issue.
Check:
AIRTABLE_PAT
AIRTABLE_BASE_ID
token access to the base
read/write permissions
Unknown field name
Airtable field names do not match the code.
Check especially:
Fraza wyszukiwania
Link do dostawcy 1
Tagi wyszukiwania
Wyniki wyszukiwania
Score AI
Typ wyniku AI
Powód AI
OpenAI użyte
OpenAI reranking failed
The Actor should fall back to rule-based ranking.
Check:
OPENAI_API_KEY
OPENAI_MODEL
USE_OPENAI_RERANK
model availability
API limits
If the model is not available, change OPENAI_MODEL to a model available in the API project.
Results are from the wrong category
Example: a wheel loader appears for a mini excavator query.
Possible causes:
- missing model alias,
- ambiguous offer title,
- OpenAI reranking disabled,
- too little data on the offer page,
- supplier has no tags.
Actions:
- Check Typ wyniku AI.
- Check Powód AI.
- Check Powód dopasowania.
- Add missing models or aliases to the catalog.
- Add supplier tags.
Too few results
Actions:
- Increase Limit źrodeł.
- Increase MAX_DETAILS_PER_SOURCE.
- Increase OPENAI_RERANK_LIMIT.
- Try Max stron na źródło = 2, but only as a test.
- Check whether supplier tag filtering is too narrow.
Run takes too long
Actions:
- Reduce MAX_DETAILS_PER_SOURCE to 4.
- Use supplier tags.
- Reduce Limit źrodeł.
- Avoid increasing Max stron na źródło unless necessary.
- Future improvement: implement global candidate ranking before opening detail pages.
Operational best practices
- Create a new Wyszukiwania maszyn record for each test.
- Do not rerun the same record repeatedly without clearing old results.
- Start tests with Limit źrodeł = 10-15.
- Run broader tests only after the smaller tests succeed.
- Manually classify results as: Dobry, Do sprawdzenia, Odrzucony.
- Review incorrect results and improve the model catalog or supplier tags.
Roadmap
Stage 1 — current MVP
- live search across known sources,
- rule-based scoring,
- supplier tags,
- OpenAI reranking,
- results in Airtable.
Stage 2 — runtime optimization
- global candidate ranking,
- opening detail pages only for the best candidates,
- limited concurrency,
- lower run cost.
Stage 3 — better query interpretation
- OpenAI query parser,
- automatic category, brand, model, budget, weight, year, and hours extraction,
- automatic alternative model expansion.
Stage 4 — offer cache / offer index
- scheduled inventory crawling,
- local Znalezione oferty table,
- fast search over already collected offers,
- live search only as fallback.
Stage 5 — result handling automation
- mark result as Dobry,
- create supplier contact task,
- prepare supplier availability message,
- prepare a client proposal draft.
Project status
The project is currently in a usable MVP stage.
The current scope satisfies the target flow:
- The user enters a search phrase in Airtable.
- The system searches supplier sources.
- After several to several dozen minutes, the user receives top offers.
- Most top results are from the correct category.
- Exact models and close alternatives rank above random machines.
Current priorities:
1. Test OpenAI reranking across several machine categories.
2. Set MAX_DETAILS_PER_SOURCE to 4 or 5.
3. Fill supplier tags.
4. Collect incorrect results and improve the model catalog.
5. Plan runtime optimization.