expressleasing-machine-search

Under maintenance

Pricing

from $0.00005 / actor start

A crawler for construction machinery built on Apify, Airtable, and the OpenAI API. It returns the most relevant offer links from whitelisted suppliers, together with parameters such as brand, model, price, and year. Airtable serves both as the search interface and as the place where results are gathered. The crawler was built in Apify (JavaScript + Playwright).


Developer: DARIUSZ SUCHY (maintained by Community)


ExpressLeasing Machine Search Actor

Project purpose

This Actor automates the machine search process for ExpressLeasing client requests.

Business flow:

  1. The user enters a search phrase in Airtable, for example:

    • Kubota KX027 up to 120,000 PLN net
    • kubota kx016 price from 80000 to 120000 pln
    • mini excavator 2.5-3 tons up to 130,000 PLN net
    • JCB 3CX up to 180,000 PLN net
    • Manitou telehandler
  2. Airtable triggers the Apify Actor.

  3. The Actor reads the supplier list and inventory URLs from Airtable.

  4. The Actor searches supplier inventory pages using Playwright.

  5. The Actor identifies candidate listings and scores them.

  6. Optionally, the Actor uses OpenAI to rerank the best candidates.

  7. The Actor writes the results back to Airtable, where they appear in the Wyniki wyszukiwania table.

The MVP goal is to reduce manual browsing across many supplier websites and return the best matching offers and alternatives in one place.


Current MVP scope

The Actor currently supports:

  • reading a search request from Airtable,

  • reading suppliers from Airtable,

  • cleaning supplier inventory URLs,

  • skipping contact-page URLs,

  • searching inventory pages with Playwright,

  • detecting likely offer links,

  • opening offer detail pages,

  • extracting basic offer data:

    • title,
    • offer URL,
    • supplier,
    • net price,
    • brand,
    • model,
    • production year,
    • operating hours / mth,
    • machine weight in kg,
  • rule-based scoring,

  • result classification:

    • Dokładne (exact match),
    • Alternatywa (alternative),
    • Słabe (weak match),
  • supplier filtering by tags,

  • optional OpenAI reranking,

  • writing results back to Airtable.
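
The extraction of net price and operating hours listed above could be sketched roughly as follows. This is a minimal illustration, not the Actor's actual parser: the function names `parsePricePln` and `parseHours` and the regular expressions are assumptions, and real supplier pages will need more patterns.

```javascript
// Hypothetical helpers, not the Actor's actual parser.

// A number with optional thousands separators ("120 000", "95.500") or plain
// digits ("80000"). The leading \b keeps digits inside model names such as
// "KX027" from being picked up.
const NUMBER = "\\b(\\d{1,3}(?:[ \\u00a0.]\\d{3})+|\\d+)";

function toInt(raw) {
  return Number(raw.replace(/\D/g, ""));
}

// Returns the first amount followed by "zł" or "PLN", or null if none is found.
function parsePricePln(text) {
  const pattern = new RegExp(NUMBER + "(?:[.,]\\d{1,2})?\\s*(?:zł|pln)", "i");
  const match = text.match(pattern);
  return match ? toInt(match[1]) : null;
}

// Returns the first number followed by "mth" or "motogodzin", or null.
function parseHours(text) {
  const pattern = new RegExp(NUMBER + "\\s*(?:mth|motogodzin)", "i");
  const match = text.match(pattern);
  return match ? toInt(match[1]) : null;
}
```

Returning null instead of a guessed value matches the documented behaviour that Cena netto may stay empty when the price is not readable.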


Architecture

Airtable
├── Wyszukiwania maszyn
├── Dostawcy
└── Wyniki wyszukiwania
        ↓
Airtable Automation
        ↓
Apify Actor
        ↓
Playwright / Chromium
        ↓
Supplier websites / Otomoto / OLX / Machineryline / inventory pages
        ↓
Rule-based scoring
        ↓
Optional OpenAI reranking
        ↓
Airtable: Wyniki wyszukiwania

Main Airtable tables

1. Wyszukiwania maszyn

This is the table where the user enters search requests.

Required fields:

| Field | Type | Description |
| --- | --- | --- |
| Nazwa | Text / formula | Search record name |
| Record ID | Formula | RECORD_ID() |
| Fraza wyszukiwania | Long text | Client search phrase |
| Status | Single select | Robocze, Do wyszukania, W trakcie, Gotowe, Błąd |
| Limit źrodeł / Limit źródeł | Number | Number of supplier sources to check |
| Limit wyników | Number | Number of final results to save |
| Max stron na źródło | Number | Number of inventory pages to scan per source |
| Parametry AI | Long text | Recognized query parameters and supplier filter data |
| Podsumowanie | Long text | Run summary |
| Liczba wyników | Number | Number of saved results |
| Run ID | Text | Apify run ID |
| Data startu | Date/time | Run start time |
| Data zakończenia | Date/time | Run end time |
| Błąd | Long text | Error message |

Setting Status = Do wyszukania starts the search automation.


2. Dostawcy

This table contains suppliers and their inventory links.

Required fields:

| Field | Type | Description |
| --- | --- | --- |
| Name | Text | Supplier name |
| Kod dostawcy | Text | Supplier code |
| Produkty | Linked records / text | Related products or notes |
| Link do dostawcy 1 | URL | Supplier inventory URL |
| Tagi wyszukiwania | Multiple select | Supplier category and brand tags |

The field Link do dostawcy 1 should point to an inventory or offer-listing page, not to a contact page.

Good examples:

https://example.otomoto.pl/inventory
https://example.pl/maszyny-uzywane
https://www.olx.pl/oferty/uzytkownik/...

Weak examples:

https://example.pl/kontakt
https://example.pl/o-nas
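
One way to clean inventory URLs and skip contact pages is sketched below. The function name `cleanInventoryUrl` and the list of skipped path segments are assumptions made for illustration; the Actor's real rules may differ.

```javascript
// Hypothetical list of path segments that indicate a non-inventory page.
const NON_INVENTORY_SEGMENTS = ["kontakt", "contact", "o-nas", "about"];

// Returns a cleaned URL string, or null if the value should be skipped.
function cleanInventoryUrl(raw) {
  if (typeof raw !== "string" || raw.trim() === "") return null;

  let url;
  try {
    url = new URL(raw.trim());
  } catch {
    return null; // not a parseable absolute URL
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return null;

  url.hash = ""; // fragments do not change the page that is loaded

  const segments = url.pathname.toLowerCase().split("/").filter(Boolean);
  if (segments.some((segment) => NON_INVENTORY_SEGMENTS.includes(segment))) {
    return null; // contact or "about us" page, not an offer listing
  }
  return url.toString();
}
```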

3. Wyniki wyszukiwania

This table stores the results returned by the Actor.

Required fields:

| Field | Type | Description |
| --- | --- | --- |
| Tytuł oferty | Text | Offer title |
| Wyszukiwanie | Linked record | Link to the search record |
| Score | Number | Rule-based score |
| Typ wyniku | Single select | Dokładne, Alternatywa, Słabe |
| Score AI | Number | OpenAI score |
| Typ wyniku AI | Single select | Dokładne, Alternatywa, Słabe, Odrzucone |
| Powód AI | Long text | OpenAI reasoning |
| OpenAI użyte | Checkbox | Whether OpenAI was used for this result |
| Dostawca | Linked record | Link to supplier |
| Link do oferty | URL | Direct offer URL |
| Cena netto | Currency / number | Net price, if detected |
| Marka | Text | Detected brand |
| Model | Text | Detected model |
| Rok produkcji | Number | Detected production year |
| Przebieg / mth | Number | Operating hours |
| Masa kg | Number | Machine weight in kg |
| Powód dopasowania | Long text | Rule-based matching reason |
| Braki danych | Long text | Missing data |
| Status wyniku | Single select | Do sprawdzenia, Dobry, Odrzucony |
| Data znalezienia | Date/time | Result creation time |
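
As an illustration of how a scored offer might be mapped onto these fields, here is a minimal sketch. The shape of the `offer` object and both function names are assumptions. The batch size of 10 reflects the Airtable REST API limit on records created per request.

```javascript
// Maps a hypothetical in-memory offer object to Airtable field names.
// Linked-record fields take an array of record IDs.
function toResultFields(offer, searchRecordId, supplierRecordId) {
  const fields = {
    "Tytuł oferty": offer.title,
    "Wyszukiwanie": [searchRecordId],
    "Dostawca": [supplierRecordId],
    "Link do oferty": offer.url,
    "Score": offer.score,
    "Typ wyniku": offer.type,
    "Data znalezienia": new Date().toISOString(),
  };
  // Optional values are written only when they were actually detected.
  const optional = {
    "Cena netto": offer.priceNet,
    "Marka": offer.brand,
    "Model": offer.model,
    "Rok produkcji": offer.year,
    "Przebieg / mth": offer.hours,
    "Masa kg": offer.weightKg,
  };
  for (const [name, value] of Object.entries(optional)) {
    if (value !== undefined && value !== null) fields[name] = value;
  }
  return fields;
}

// Splits records into request-sized batches (Airtable accepts up to 10 per call).
function toBatches(records, size = 10) {
  const batches = [];
  for (let i = 0; i < records.length; i += size) {
    batches.push(records.slice(i, i + size));
  }
  return batches;
}
```

Each batch would then be sent as the `records` array of one create request, with every entry wrapped as `{ fields: ... }`.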

Supplier tags

The field:

Tagi wyszukiwania

is used to reduce the number of supplier sources that must be searched.

Example category tags:

#minikoparki
#koparki
#koparki_kolowe
#koparki_gasienicowe
#koparko_ladowarki
#ladowarki
#ladowarki_teleskopowe
#ladowarki_kolowe
#mini_ladowarki
#wozki_widlowe
#podnosniki
#walce
#wozidla
#wywrotki
#zurawie_dekarskie
#rolnicze
#mix

Example brand tags:

#kubota
#jcb
#takeuchi
#yanmar
#bobcat
#caterpillar
#komatsu
#volvo
#manitou
#merlo
#liebherr
#wacker_neuson
#sany
#hitachi
#doosan
#develon
#case
#new_holland
#mecalac
#kramer
#hamm
#bomag
#ammann
#dynapac
#toyota
#linde
#still
#unicarriers

Rules:

  • tags describe the supplier’s actual specialization,
  • not every supplier needs to be tagged immediately,
  • broad suppliers can be tagged with #mix,
  • if there are too few matching tags, the Actor falls back to searching a broader supplier set.
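
The tag filter with its fallback could look roughly like this. It is a sketch under assumptions: the function name `selectSuppliers` is invented, and "a broader supplier set" is interpreted here as all suppliers, with tag matches listed first. The default of 12 mirrors MIN_TAGGED_SOURCES.

```javascript
// suppliers: [{ name, tags: ["#kubota", "#minikoparki", ...] }]
// queryTags: tags derived from the search phrase (derivation not shown here).
function selectSuppliers(suppliers, queryTags, minTaggedSources = 12) {
  const wanted = new Set(queryTags);
  const matching = suppliers.filter((supplier) =>
    (supplier.tags || []).some((tag) => wanted.has(tag))
  );

  // Enough tagged sources: search only those.
  if (matching.length >= minTaggedSources) return matching;

  // Too few matches: fall back to the full list, matches first (assumption).
  const rest = suppliers.filter((supplier) => !matching.includes(supplier));
  return [...matching, ...rest];
}
```

Keeping the matches at the front means a later Limit źrodeł cut-off still favours the specialised suppliers.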

Actor input

The Actor requires one input value:

{
"searchRecordId": "recXXXXXXXXXXXXXX"
}

searchRecordId is the technical Airtable record ID from the Wyszukiwania maszyn table.

It can be exposed in Airtable using a formula field:

RECORD_ID()
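
For orientation, the request that an Airtable Automation script could send to start the Actor might be built as below. This is a sketch: the helper name `buildActorRunRequest`, the Actor ID, and the token value are placeholders. The endpoint is the Apify API v2 "run Actor" endpoint, which takes the Actor input as the JSON request body.

```javascript
// Builds the HTTP request for starting an Actor run through the Apify API.
function buildActorRunRequest({ actorId, apifyToken, recordId }) {
  if (!/^rec[A-Za-z0-9]+$/.test(recordId || "")) {
    throw new Error("Brak searchRecordId w input Actora");
  }
  return {
    url: `https://api.apify.com/v2/acts/${encodeURIComponent(actorId)}/runs`,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apifyToken}`,
      },
      // The request body becomes the Actor input.
      body: JSON.stringify({ searchRecordId: recordId }),
    },
  };
}

// Usage inside an Airtable "Run a script" action (placeholder values):
//   const { recordId } = input.config();
//   const { url, options } = buildActorRunRequest({
//     actorId: "username~expressleasing-machine-search",
//     apifyToken: "apify_api_...",
//     recordId,
//   });
//   await fetch(url, options);
```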

Environment variables

Required

| Name | Example | Secret | Description |
| --- | --- | --- | --- |
| AIRTABLE_PAT | pat... | Yes | Airtable Personal Access Token |
| AIRTABLE_BASE_ID | app... | No | Airtable base ID |
| SEARCHES_TABLE | Wyszukiwania maszyn | No | Search table name |
| SUPPLIERS_TABLE | Dostawcy | No | Supplier table name |
| RESULTS_TABLE | Wyniki wyszukiwania | No | Results table name |

Airtable field names

| Name | Default value | Description |
| --- | --- | --- |
| SUPPLIER_NAME_FIELD | Name | Supplier name field |
| SUPPLIER_CODE_FIELD | Kod dostawcy | Supplier code field |
| INVENTORY_URL_FIELD | Link do dostawcy 1 | Supplier inventory URL field |
| SUPPLIER_TAGS_FIELD | Tagi wyszukiwania | Supplier tags field |
| SEARCH_QUERY_FIELD | Fraza wyszukiwania | Search phrase field |

Runtime limits

| Name | Default value | Description |
| --- | --- | --- |
| MIN_TAGGED_SOURCES | 12 | Minimum number of sources after tag filtering |
| MAX_DETAILS_PER_SOURCE | 8 | Maximum number of offer details opened per supplier source |

Recommended test setting:

MAX_DETAILS_PER_SOURCE = 4 or 5

Lower values reduce runtime and cost.
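
A configuration loader covering the Airtable settings and runtime limits above might look like this. It is only a sketch: the function names are invented, and treating the example table names as defaults is an assumption. The field-name and limit defaults are the documented ones.

```javascript
// Reads a positive integer from the environment, or returns the fallback.
function intFromEnv(env, name, fallback) {
  const raw = env[name];
  if (raw === undefined || raw === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value < 1) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return value;
}

function loadConfig(env) {
  for (const name of ["AIRTABLE_PAT", "AIRTABLE_BASE_ID"]) {
    if (!env[name]) throw new Error(`Brak ${name}`);
  }
  return {
    airtablePat: env.AIRTABLE_PAT,
    airtableBaseId: env.AIRTABLE_BASE_ID,
    // Assumption: the example table names double as defaults.
    searchesTable: env.SEARCHES_TABLE || "Wyszukiwania maszyn",
    suppliersTable: env.SUPPLIERS_TABLE || "Dostawcy",
    resultsTable: env.RESULTS_TABLE || "Wyniki wyszukiwania",
    supplierNameField: env.SUPPLIER_NAME_FIELD || "Name",
    supplierCodeField: env.SUPPLIER_CODE_FIELD || "Kod dostawcy",
    inventoryUrlField: env.INVENTORY_URL_FIELD || "Link do dostawcy 1",
    supplierTagsField: env.SUPPLIER_TAGS_FIELD || "Tagi wyszukiwania",
    searchQueryField: env.SEARCH_QUERY_FIELD || "Fraza wyszukiwania",
    minTaggedSources: intFromEnv(env, "MIN_TAGGED_SOURCES", 12),
    maxDetailsPerSource: intFromEnv(env, "MAX_DETAILS_PER_SOURCE", 8),
  };
}
```

In the Actor this would typically be called once at startup as `loadConfig(process.env)`, so a missing token fails fast before any page is opened.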

OpenAI

| Name | Example | Secret | Description |
| --- | --- | --- | --- |
| OPENAI_API_KEY | sk-... | Yes | OpenAI API key |
| USE_OPENAI_RERANK | true | No | Enables OpenAI reranking |
| OPENAI_MODEL | gpt-5.4-mini | No | Model used for reranking |
| OPENAI_RERANK_LIMIT | 20 | No | Number of candidates sent to OpenAI |
| OPENAI_FINAL_RESULTS | 5 | No | Final number of results after reranking |

If OpenAI fails, the Actor should still complete successfully and fall back to the rule-based ranking.


From Airtable

  1. Open the Wyszukiwania maszyn table.
  2. Create a new search record.
  3. Enter Fraza wyszukiwania, for example:
     kubota kx016 cena od 80000 do 120000 pln
  4. Set search parameters:
     Limit źrodeł = 30
     Limit wyników = 20
     Max stron na źródło = 1
  5. Set:
     Status = Do wyszukania
  6. Airtable Automation triggers the Apify Actor.
  7. When the Actor finishes, it updates:
     Status = Gotowe
     Liczba wyników = number of saved results
     Podsumowanie = run summary
  8. Results appear in the Wyniki wyszukiwania table.

Quick test

Limit źrodeł = 10
Limit wyników = 5
Max stron na źródło = 1
MAX_DETAILS_PER_SOURCE = 4
OPENAI_RERANK_LIMIT = 10

MVP test

Limit źrodeł = 30
Limit wyników = 10-20
Max stron na źródło = 1
MAX_DETAILS_PER_SOURCE = 4-5
OPENAI_RERANK_LIMIT = 15-20

Broader test

Limit źrodeł = 50
Limit wyników = 5-20
Max stron na źródło = 1
MAX_DETAILS_PER_SOURCE = 4-5
OPENAI_RERANK_LIMIT = 20

Do not increase Max stron na źródło to 2 until the first-page results are reliable.


How scoring works

The Actor uses two scoring layers.

1. Rule-based scoring

The rule-based scoring evaluates:

  • exact model match,
  • brand match,
  • machine category,
  • weight class,
  • budget,
  • production year,
  • operating hours,
  • title and description similarity,
  • known model catalog.

Example:

Kubota KX016-4

for the query:

kubota kx016 cena od 80000 do 120000 pln

should be classified as Dokładne.

Possible alternatives:

Kubota KX018-4
Kubota KX019-4
JCB 8018
Takeuchi TB216
Yanmar SV17

may be classified as Alternatywa if they are in a similar weight class.
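
To make the rule-based layer more concrete, here is a small scoring sketch. The weights, thresholds, and the ±20% weight-class tolerance are hypothetical values chosen for illustration, and `query.weightKg` is assumed to come from the known model catalog. The Actor's actual rules cover more signals.

```javascript
// Hypothetical weights and thresholds, for illustration only.
const WEIGHTS = { model: 40, brand: 15, weightClass: 30, budget: 15 };

function normalize(value) {
  return String(value || "").toLowerCase().replace(/[^a-z0-9]/g, "");
}

function scoreOffer(query, offer) {
  let score = 0;
  const reasons = [];

  // "KX016-4" counts as an exact match for the query model "kx016".
  const exactModel =
    Boolean(query.model) && normalize(offer.model).startsWith(normalize(query.model));
  if (exactModel) { score += WEIGHTS.model; reasons.push("model"); }

  if (query.brand && normalize(offer.brand) === normalize(query.brand)) {
    score += WEIGHTS.brand; reasons.push("brand");
  }

  // Same weight class: within 20% of the target weight (assumed tolerance).
  if (query.weightKg && offer.weightKg &&
      Math.abs(offer.weightKg - query.weightKg) <= 0.2 * query.weightKg) {
    score += WEIGHTS.weightClass; reasons.push("weight class");
  }

  const hasBudget = query.minPrice != null || query.maxPrice != null;
  if (hasBudget && offer.priceNet != null &&
      offer.priceNet >= (query.minPrice ?? 0) &&
      offer.priceNet <= (query.maxPrice ?? Infinity)) {
    score += WEIGHTS.budget; reasons.push("budget");
  }

  let type = "Słabe";
  if (exactModel && score >= 70) type = "Dokładne";
  else if (score >= 40) type = "Alternatywa";
  return { score, type, reasons };
}
```

With these illustrative numbers, a Kubota KX016-4 inside the budget is classified as Dokładne, while a JCB 8018 of similar weight and price lands in Alternatywa because only the weight class and budget rules fire.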

2. OpenAI reranking

If USE_OPENAI_RERANK = true, the Actor sends the best candidates to OpenAI.

OpenAI evaluates:

  • whether the offer is in the correct machine category,
  • whether it is the exact model,
  • whether it is a reasonable alternative,
  • whether the result is clearly from the wrong category,
  • how to justify the match.

OpenAI results are written to:

Score AI
Typ wyniku AI
Powód AI
OpenAI użyte
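
The reranking step with its fallback to rule-based ranking can be sketched as follows. The `callModel` parameter is a hypothetical injected function that sends candidates to OpenAI and resolves to `[{ url, score, type, reason }]`. The actual prompt and API call are not shown and are not taken from the Actor's source.

```javascript
// Attaches AI scores to candidates and sorts by AI score where available.
function mergeAiScores(candidates, aiResults) {
  const byUrl = new Map(aiResults.map((result) => [result.url, result]));
  return candidates
    .map((candidate) => {
      const ai = byUrl.get(candidate.url);
      return ai
        ? { ...candidate, scoreAi: ai.score, typeAi: ai.type, reasonAi: ai.reason, openAiUsed: true }
        : { ...candidate, openAiUsed: false };
    })
    .sort((a, b) => (b.scoreAi ?? b.score) - (a.scoreAi ?? a.score));
}

// Sends the top `limit` candidates to the model; on any failure the
// rule-based order is returned unchanged so the run can still finish.
async function rerankWithFallback(candidates, callModel, limit = 20) {
  const ranked = [...candidates].sort((a, b) => b.score - a.score);
  try {
    const aiResults = await callModel(ranked.slice(0, limit));
    return mergeAiScores(ranked, aiResults);
  } catch (err) {
    return ranked.map((candidate) => ({ ...candidate, openAiUsed: false }));
  }
}
```

The `scoreAi`, `typeAi`, `reasonAi`, and `openAiUsed` properties correspond to the four Airtable fields listed above.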

Supported categories

The rule-based logic recognizes, among others:

minikoparki
midikoparki
koparko_ladowarki
koparki_kolowe
koparki_gasienicowe
ladowarki_teleskopowe
ladowarki_kolowe
mini_ladowarki
wozki_widlowe
podnosniki
walce
wozidla
wywrotki
zurawie_dekarskie
rolnicze
kruszarki_przesiewacze

Example test queries

kubota kx016 cena od 80000 do 120000 pln
Kubota KX027 do 120 000 zł netto
minikoparka 2,5-3 tony do 130 000 zł netto
JCB 3CX do 180 000 zł netto
ładowarka teleskopowa JCB 540-140
ładowarka teleskopowa Manitou 14m
koparka kołowa 15 ton
walec 2-3 tony
wózek widłowy LPG 2,5 tony
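
As an illustration of how the budget part of such queries could be recognized, here is a small regex-based sketch. The function name `parseBudget` and the supported phrasings (od ... do, do, from ... to, up to) are assumptions. It handles only amounts followed by zł or PLN.

```javascript
// Extracts a net budget range from a search phrase.
// Returns { minPrice, maxPrice }, with null for any bound that is not present.
function parseBudget(query) {
  // Lower-case and join thousands groups: "120 000" / "120,000" -> "120000".
  const text = query
    .toLowerCase()
    .replace(/(\d)[ \u00a0,.](?=\d{3}\b)/g, "$1");

  const range = text.match(/\b(?:od|from)\s+(\d+)\s+(?:do|to)\s+(\d+)\s*(?:zł|pln)/);
  if (range) {
    return { minPrice: Number(range[1]), maxPrice: Number(range[2]) };
  }

  const upper = text.match(/\b(?:do|up to)\s+(\d+)\s*(?:zł|pln)/);
  if (upper) {
    return { minPrice: null, maxPrice: Number(upper[1]) };
  }
  return { minPrice: null, maxPrice: null };
}
```

Requiring the currency after the number keeps weight ranges such as "2,5-3 tony" from being read as a price.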

Expected Airtable output

Example result table:

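| Score | Typ wyniku | Score AI | Typ wyniku AI | Offer | Supplier | URL |
| --- | --- | --- | --- | --- | --- | --- |
| 94 | Dokładne | 96 | Dokładne | Kubota KX016-4 | Supplier X | URL |
| 78 | Alternatywa | 84 | Alternatywa | JCB 8018 | Supplier Y | URL |
| 72 | Alternatywa | 81 | Alternatywa | Takeuchi TB216 | Supplier Z | URL |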

Known limitations

  1. The Actor does not use official marketplace APIs.
  2. The crawler works through Playwright, so runtime depends on website speed.
  3. Some websites may block automation, load offers dynamically, or change their HTML.
  4. Price, year, weight, and operating hours may not always be extracted correctly.
  5. Cena netto may be empty if the price is not available in a readable format.
  6. OpenAI reranking improves result quality but does not reduce scraping time.
  7. Supplier tags improve speed and relevance only after they are filled in.
  8. Rerunning the same search record may create duplicate results.
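
One possible mitigation for duplicate results on reruns is to skip offers whose URL is already stored for the same search record. The sketch below assumes the existing Link do oferty values have been fetched beforehand. Both function names are hypothetical, and this is not part of the current Actor.

```javascript
// Normalizes an offer URL so trivial variations compare as equal.
function normalizeOfferUrl(rawUrl) {
  const url = new URL(rawUrl); // host is lower-cased by the URL parser
  url.hash = "";
  url.pathname = url.pathname.replace(/\/+$/, "") || "/";
  return url.toString();
}

// Returns only offers whose URL is not yet stored and not repeated in the batch.
function filterNewOffers(offers, existingUrls) {
  const seen = new Set(existingUrls.map(normalizeOfferUrl));
  const fresh = [];
  for (const offer of offers) {
    const key = normalizeOfferUrl(offer.url);
    if (seen.has(key)) continue;
    seen.add(key);
    fresh.push(offer);
  }
  return fresh;
}
```

Query strings are left untouched on purpose, since some sites identify the offer through a query parameter.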

Performance and cost

Observed benchmark from an earlier version:

50 sources
1 page per source
5 results
around 33 minutes
around 0.45 USD

Main runtime drivers:

  • number of sources,
  • Max stron na źródło,
  • MAX_DETAILS_PER_SOURCE,
  • supplier website speed,
  • number of offer detail pages opened,
  • whether OpenAI reranking is enabled.

The simplest optimization:

MAX_DETAILS_PER_SOURCE = 4 or 5
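
A rough upper bound on page loads shows why this setting matters. The formula below is a simplification made for illustration: it assumes each source loads its listing pages plus at most MAX_DETAILS_PER_SOURCE detail pages, and it ignores retries and blocked pages.

```javascript
// Upper bound on pages opened in one run (simplified model, see assumptions above).
function maxPageLoads({ sources, pagesPerSource, maxDetailsPerSource }) {
  return sources * (pagesPerSource + maxDetailsPerSource);
}

// 50 sources, 1 listing page each, default limit of 8 detail pages:
const withDefault = maxPageLoads({ sources: 50, pagesPerSource: 1, maxDetailsPerSource: 8 });

// Same run with the recommended limit of 4:
const withLimit4 = maxPageLoads({ sources: 50, pagesPerSource: 1, maxDetailsPerSource: 4 });

console.log(withDefault, withLimit4); // 450 250
```

Under this model, lowering the limit from 8 to 4 removes up to 200 of 450 page loads for a 50-source run. The actual time saved depends on how many detail pages each source really yields.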

Larger future optimization:

global candidate ranking before opening offer detail pages
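
The idea of global candidate ranking can be sketched as follows: collect listing links from all sources first, pre-score them cheaply from the title alone, and open detail pages only for the best ones overall. The token-overlap pre-score and all names here are assumptions, since this optimization is not implemented yet.

```javascript
// Splits text into lower-case alphanumeric tokens (including Polish letters).
function tokens(text) {
  return new Set(
    text.toLowerCase().split(/[^a-z0-9ąćęłńóśźż]+/).filter(Boolean)
  );
}

// Cheap pre-score: number of query tokens that appear in the listing title.
function tokenOverlap(query, title) {
  const queryTokens = tokens(query);
  let hits = 0;
  for (const token of tokens(title)) {
    if (queryTokens.has(token)) hits += 1;
  }
  return hits;
}

// listingsBySource: { [sourceName]: [{ title, url }] }
// Returns the globalLimit best links across all sources.
function pickDetailCandidates(listingsBySource, query, globalLimit) {
  const all = [];
  for (const [source, links] of Object.entries(listingsBySource)) {
    for (const link of links) {
      all.push({ source, ...link, preScore: tokenOverlap(query, link.title) });
    }
  }
  return all.sort((a, b) => b.preScore - a.preScore).slice(0, globalLimit);
}
```

Compared with a fixed per-source limit, a global limit spends the detail-page budget where the titles already look promising.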

Troubleshooting

Brak searchRecordId w input Actora ("Missing searchRecordId in Actor input")

The Actor did not receive the Airtable search record ID.

Check the input:

{
"searchRecordId": "recXXXXXXXXXXXXXX"
}

Also check the Airtable Automation input variable recordId.


Brak AIRTABLE_PAT ("Missing AIRTABLE_PAT")

The required environment variable is missing:

AIRTABLE_PAT

Add it in Apify as a secret.


Airtable API error 401 / 403

This usually means an Airtable token or permission issue.

Check:

AIRTABLE_PAT
AIRTABLE_BASE_ID
token access to the base
read/write permissions

Unknown field name

Airtable field names do not match the code.

Check especially:

Fraza wyszukiwania
Link do dostawcy 1
Tagi wyszukiwania
Wyniki wyszukiwania
Score AI
Typ wyniku AI
Powód AI
OpenAI użyte

OpenAI reranking failed

The Actor should fall back to rule-based ranking.

Check:

OPENAI_API_KEY
OPENAI_MODEL
USE_OPENAI_RERANK
model availability
API limits

If the model is not available, change:

OPENAI_MODEL

to a model available in the API project.


Results are from the wrong category

Example: a wheel loader appears for a mini excavator query.

Possible causes:

  • missing model alias,
  • ambiguous offer title,
  • OpenAI reranking disabled,
  • too little data on the offer page,
  • supplier has no tags.

Actions:

  1. Check Typ wyniku AI.
  2. Check Powód AI.
  3. Check Powód dopasowania.
  4. Add missing models or aliases to the catalog.
  5. Add supplier tags.

Too few results

Actions:

  1. Increase Limit źrodeł.
  2. Increase MAX_DETAILS_PER_SOURCE.
  3. Increase OPENAI_RERANK_LIMIT.
  4. Try Max stron na źródło = 2, but only as a test.
  5. Check whether supplier tag filtering is too narrow.

Run takes too long

Actions:

  1. Reduce MAX_DETAILS_PER_SOURCE to 4.
  2. Use supplier tags.
  3. Reduce Limit źrodeł.
  4. Avoid increasing Max stron na źródło unless necessary.
  5. Future improvement: implement global candidate ranking before opening detail pages.

Operational best practices

  1. Create a new Wyszukiwania maszyn record for each test.

  2. Do not rerun the same record repeatedly without clearing old results.

  3. Start tests with Limit źrodeł = 10-15.

  4. Run broader tests only after the smaller tests succeed.

  5. Manually classify results as:

    • Dobry,
    • Do sprawdzenia,
    • Odrzucony.
  6. Review incorrect results and improve the model catalog or supplier tags.


Roadmap

Stage 1 — current MVP

  • live search across known sources,
  • rule-based scoring,
  • supplier tags,
  • OpenAI reranking,
  • results in Airtable.

Stage 2 — runtime optimization

  • global candidate ranking,
  • opening detail pages only for the best candidates,
  • limited concurrency,
  • lower run cost.

Stage 3 — better query interpretation

  • OpenAI query parser,
  • automatic category, brand, model, budget, weight, year, and hours extraction,
  • automatic alternative model expansion.

Stage 4 — offer cache / offer index

  • scheduled inventory crawling,
  • local Znalezione oferty table,
  • fast search over already collected offers,
  • live search only as fallback.

Stage 5 — result handling automation

  • mark result as Dobry,
  • create supplier contact task,
  • prepare supplier availability message,
  • prepare a client proposal draft.

Project status

The project is currently in a usable MVP stage.

The current scope satisfies the target flow:

The user enters a search phrase in Airtable.
The system searches supplier sources.
After a few minutes to a few dozen minutes, the user receives the top offers.
Most top results are from the correct category.
Exact models and close alternatives rank above random machines.

Current priorities:

1. Test OpenAI reranking across several machine categories.
2. Set MAX_DETAILS_PER_SOURCE to 4 or 5.
3. Fill supplier tags.
4. Collect incorrect results and improve the model catalog.
5. Plan runtime optimization.