Discogs Marketplace Price Scraper

Scrape Discogs marketplace listings (asking prices, seller, condition, ships-from) plus public marketplace stats for any release ID or search query. Public API + HTML. No login.

Developer: DevilScrapes (maintained by Community)

We do the dirty work so your dataset stays clean. 😈

$5.05 / 1,000 rows — Scrape Discogs marketplace listings (asking prices, seller, condition, ships-from) plus the per-release marketplace stats aggregate (lowest_price, num_for_sale, blocked_from_sale) for any Discogs release ID or free-text search query. Public Discogs API + public marketplace HTML. No login. No Discogs API key. No browser automation.

Discogs is the world's largest catalog of music releases and the secondary market for vinyl, CD, cassette, and other physical formats — yet its public API exposes only lowest_price and num_for_sale aggregates for any given release. The per-listing data (who is selling, at what price, with what media + sleeve grade, shipping from where) lives on Cloudflare-protected HTML pages and is invisible to any standard REST consumer. This Actor closes that gap: it joins the Discogs release-metadata REST endpoint, the marketplace-stats REST endpoint, and the public marketplace-listings HTML (paginated, 25 listings per page) into a single flat dataset with one row per listing and one optional aggregate row per release.

🎯 What this scrapes

Two row flavours share one flat schema. The row_type discriminator selects which fields are populated — listing rows carry per-seller pricing + condition + ratings; stats rows carry the per-release marketplace aggregate.

| Field | Type | Populated for |
| --- | --- | --- |
| row_type | "listing" or "stats" | both |
| release_id | integer | both |
| release_title, artist, year, country, format_name, format_descriptions, genres, master_id, release_url | from /releases/{id} | both |
| listing_id, listing_url | integer / string | listing rows |
| asking_price, asking_currency | float / ISO 4217 | listing rows |
| shipping_text | string (seller free-form) | listing rows |
| condition_media, condition_sleeve | Discogs grade vocab | listing rows |
| seller_username, seller_rating_pct, seller_rating_count, seller_country | strings + numerics | listing rows |
| stats_lowest_price, stats_lowest_currency, stats_num_for_sale, stats_blocked_from_sale | from /marketplace/stats/{id} | stats rows |
| scraped_at | ISO 8601 UTC | both |
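
As a minimal sketch of how the discriminator might be used downstream, the snippet below splits exported rows into listing and stats groups. The sample rows are made up for illustration and carry only a few of the fields; `split_rows` is a hypothetical helper, not part of the Actor.

```python
from collections import defaultdict


def split_rows(rows):
    """Group dataset rows by their row_type discriminator."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["row_type"]].append(row)
    return dict(groups)


# Illustrative rows only (hypothetical values, trimmed to a few fields).
sample = [
    {"row_type": "listing", "release_id": 249504, "asking_price": 0.5},
    {"row_type": "listing", "release_id": 249504, "asking_price": 3.0},
    {"row_type": "stats", "release_id": 249504, "stats_num_for_sale": 2},
]

groups = split_rows(sample)
print(len(groups["listing"]), len(groups["stats"]))
```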

🔥 Features

  • Per-listing asking prices + sellers + condition + ships-from for every release you supply — the data Discogs does not expose in its REST API.
  • Per-release marketplace stats row (lowest asking price + total listings count + blocked-from-sale flag), on by default (turn off with includeStatsRow).
  • Input by release ID (deterministic, fast) or by free-text search query (resolves the top N matches via the public Discogs search endpoint, no auth required).
  • Single flat schema with a row_type discriminator — easy to join, easy to aggregate downstream by release_id or master_id.
  • Pydantic v2 input validation — XOR between releaseIds and searchQuery enforced before any network call.
  • Cloudflare bypass via curl-cffi chrome131 impersonation + one-shot Discogs homepage warm-up. Verified clean pass on 2026-05-16; no JS challenge, no Camoufox needed.
  • Discogs-compliant User-Agent (DevilScrapes/0.1 (+https://apify.com/DevilScrapes)) on the REST API surface — Discogs rejects default UAs with 403.
  • Conservative rate-limit pacing at one request every 1.5 seconds (~40 req/min) — under Discogs' 60 req/min API limit and the unwritten 25 req/min HTML ceiling.
  • Exponential backoff with Retry-After honoured for 408 / 429 / 503 responses; max 5 attempts.
  • Apify Proxy (BUYPROXIES94952 group) on by default — Cloudflare 403s un-proxied Apify datacenter IPs on the Discogs marketplace HTML surface. Turn off only when running locally from a residential ISP.
  • Per-release fail isolation — one bad release ID logs a WARNING and the run continues to the next.
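
The retry behaviour listed above can be sketched roughly as follows. Only the retryable status codes (408, 429, 503), the Retry-After handling, and the five-attempt cap come from the feature list. The 1.5-second base delay, the doubling schedule, and the function name `retry_delay` are assumptions made for illustration, and only the numeric-seconds form of Retry-After is handled.

```python
RETRYABLE = {408, 429, 503}
MAX_ATTEMPTS = 5
BASE_DELAY = 1.5  # assumed base delay in seconds (not stated in this README)


def retry_delay(status, attempt, retry_after=None):
    """Return seconds to wait before the next attempt, or None to stop.

    `attempt` is 1 for the first failed request. A numeric Retry-After
    header value takes precedence over the exponential schedule.
    """
    if status not in RETRYABLE or attempt >= MAX_ATTEMPTS:
        return None
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # non-numeric (HTTP-date) form: fall back to backoff
    return BASE_DELAY * 2 ** (attempt - 1)


print(retry_delay(429, 1))        # 1.5
print(retry_delay(503, 3))        # 6.0
print(retry_delay(429, 2, "10"))  # 10.0
print(retry_delay(404, 1))        # None
```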

💡 Use cases

  • Vinyl / CD reseller benchmarking — pull every listing for releases in your inventory and benchmark your asking prices against the live competing supply (median, min, max, condition mix).
  • Music-collectibles arbitrage — monitor cross-country shipping spreads (seller_country + asking_price + currency) for the same release; spot regional under-pricing.
  • Catalog / label market intel — for a label's catalog of release IDs, track num_for_sale and lowest_price over time to see which titles are appreciating.
  • Journalism / pricing studies — "the cheapest copy of Nevermind on Discogs right now" generators, automated with one Actor run per article.
  • Marketplace health monitoring — count stats_blocked_from_sale=true rows across a watch-list to flag releases Discogs has quietly removed from sale.
  • Seller-quality screening — filter listings by seller_rating_pct >= 99.0 and seller_rating_count >= 100 for a curated high-trust subset.
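
Two of the use cases above (seller-quality screening and price benchmarking) could be combined along these lines. This is a sketch over hand-written sample rows, not real Discogs data; `trusted_listings` and `median_by_currency` are hypothetical helper names. Medians are computed per currency because the Actor does not normalise prices.

```python
from statistics import median


def trusted_listings(rows, min_pct=99.0, min_count=100):
    """Keep listing rows from sellers meeting both rating thresholds."""
    return [
        r for r in rows
        if r.get("row_type") == "listing"
        and (r.get("seller_rating_pct") or 0) >= min_pct
        and (r.get("seller_rating_count") or 0) >= min_count
    ]


def median_by_currency(rows):
    """Median asking price per currency; prices are not converted."""
    prices = {}
    for r in rows:
        prices.setdefault(r["asking_currency"], []).append(r["asking_price"])
    return {cur: median(vals) for cur, vals in prices.items()}


def _row(price, currency, pct, count):
    # Builds a trimmed, made-up listing row for the example.
    return {"row_type": "listing", "asking_price": price,
            "asking_currency": currency, "seller_rating_pct": pct,
            "seller_rating_count": count}


sample = [
    _row(10.0, "GBP", 100.0, 350),
    _row(14.0, "GBP", 99.5, 120),
    _row(2.0, "GBP", 97.0, 900),   # dropped: rating below 99.0
    _row(20.0, "EUR", 99.9, 40),   # dropped: fewer than 100 ratings
    _row(18.0, "EUR", 99.0, 100),
]

trusted = trusted_listings(sample)
print(median_by_currency(trusted))  # {'GBP': 12.0, 'EUR': 18.0}
```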

⚙️ How to use it

  1. Open the Actor input form on the Apify Console.
  2. Provide either releaseIds (recommended: direct, no ambiguity) or searchQuery, not both. The release ID is the number in any Discogs release URL: discogs.com/release/249504-Rick-Astley-... has release ID 249504.
  3. If using searchQuery, set maxSearchResults to cap how many top hits to fetch (default 5, max 50).
  4. Set maxPagesPerRelease (default 4 = 100 listings) and maxListingsPerRelease (default 100, max 500) to cap the per-release listing volume. The lower of the two wins.
  5. Leave includeStatsRow on (default) to also get one aggregate row_type="stats" row per release. Turn off if you only want per-listing rows.
  6. Leave useProxy on (default) — Cloudflare 403s un-proxied Apify datacenter IPs on the Discogs marketplace surface. Turn off only if you're running the Actor locally from a residential ISP.
  7. Click Start. Results stream into the default dataset as JSON / CSV / Excel / XML.
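
If you are collecting release IDs from a list of Discogs URLs (step 2), a small helper like the one below may save some copy-pasting. It is a sketch that assumes the ID is the run of digits directly after `/release/`; `release_id_from_url` is a hypothetical name and is not part of the Actor.

```python
import re

# Matches the numeric ID directly after "/release/" in a Discogs release URL.
_RELEASE_RE = re.compile(r"/release/(\d+)")


def release_id_from_url(url):
    """Return the release ID as an int, or None if the URL has none."""
    match = _RELEASE_RE.search(url)
    return int(match.group(1)) if match else None


print(release_id_from_url("https://www.discogs.com/release/249504-Rick-Astley-..."))  # 249504
print(release_id_from_url("https://www.discogs.com/sell/item/3761251765"))  # None
```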

Single release, first page

{
  "releaseIds": [249504],
  "maxPagesPerRelease": 1,
  "maxListingsPerRelease": 25,
  "includeStatsRow": true,
  "useProxy": true
}

Search-driven, top 5 results, two pages each

{
  "searchQuery": "nirvana nevermind",
  "maxSearchResults": 5,
  "maxPagesPerRelease": 2,
  "maxListingsPerRelease": 50,
  "includeStatsRow": true
}
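
To see how the two caps in these example inputs interact, the "lower of the two wins" rule from step 4 can be written out as a one-line calculation. This is only a sketch of the documented rule; `effective_listing_cap` is a hypothetical name.

```python
LISTINGS_PER_PAGE = 25  # page size stated in this README


def effective_listing_cap(max_pages, max_listings):
    """Listings per release: the lower of the page cap and the row cap."""
    return min(max_pages * LISTINGS_PER_PAGE, max_listings)


print(effective_listing_cap(1, 25))    # 25  (first example input)
print(effective_listing_cap(2, 50))    # 50  (second example input)
print(effective_listing_cap(4, 500))   # 100 (page cap wins)
print(effective_listing_cap(20, 500))  # 500 (documented ceiling)
```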

📥 Input

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| releaseIds | integer[] | XOR | | List of Discogs release IDs (1-100). XOR with searchQuery. |
| searchQuery | string | XOR | | Free-text Discogs search; top N results become release IDs. XOR with releaseIds. |
| maxSearchResults | integer | no | 5 | Cap on results from searchQuery (1-50). |
| maxPagesPerRelease | integer | no | 4 | Cap on listings pages per release (1-20; 25 listings per page). |
| maxListingsPerRelease | integer | no | 100 | Hard cap on listing rows per release (1-500). |
| includeStatsRow | boolean | no | true | Emit one extra row_type="stats" row per release. |
| useProxy | boolean | no | true | Route through Apify Proxy (BUYPROXIES94952). Default ON, because Cloudflare 403s un-proxied datacenter IPs. |

Exactly one of releaseIds or searchQuery must be provided. Passing both, or neither, raises a validation error before any network call.
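
If you build inputs programmatically, you may want to run the same check locally before starting a run. The sketch below is a plain-Python stand-in for the rule, not the Actor's actual Pydantic model; `validate_input` and its error messages are hypothetical.

```python
def validate_input(data):
    """Raise ValueError unless exactly one of releaseIds / searchQuery is set."""
    has_ids = bool(data.get("releaseIds"))
    has_query = bool((data.get("searchQuery") or "").strip())
    if has_ids == has_query:  # both present, or both missing
        raise ValueError("Provide exactly one of releaseIds or searchQuery")
    if has_ids and len(data["releaseIds"]) > 100:
        raise ValueError("releaseIds accepts at most 100 IDs")
    return data


validate_input({"releaseIds": [249504]})             # passes
validate_input({"searchQuery": "nirvana nevermind"})  # passes

try:
    validate_input({"releaseIds": [249504], "searchQuery": "nirvana"})
except ValueError as exc:
    print(f"rejected: {exc}")
```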

📤 Output

One row per marketplace listing (and optionally one extra stats row per release), pushed to the default dataset and available as JSON, CSV, Excel, or XML.

{
  "row_type": "listing",
  "release_id": 249504,
  "release_title": "Never Gonna Give You Up",
  "artist": "Rick Astley",
  "year": 1987,
  "country": "UK",
  "format_name": "Vinyl",
  "format_descriptions": ["7\"", "45 RPM", "Single", "Stereo"],
  "genres": ["Electronic", "Pop"],
  "master_id": 96559,
  "release_url": "https://www.discogs.com/release/249504",
  "listing_id": 3761251765,
  "listing_url": "https://www.discogs.com/sell/item/3761251765",
  "asking_price": 0.5,
  "asking_currency": "GBP",
  "shipping_text": "+£15.00",
  "condition_media": "Very Good Plus (VG+)",
  "condition_sleeve": "Generic",
  "seller_username": "Ronan266",
  "seller_rating_pct": 100.0,
  "seller_rating_count": 35,
  "seller_country": "United Kingdom",
  "stats_lowest_price": null,
  "stats_lowest_currency": null,
  "stats_num_for_sale": null,
  "stats_blocked_from_sale": null,
  "scraped_at": "2026-05-16T12:00:00.000Z"
}

Export formats

  • JSON: full fidelity, all 27 fields, newline-delimited
  • CSV: flat, one row per listing or stats record
  • Excel: .xlsx via the Apify dataset converter
  • XML: structured per-item

All formats are available via the Apify API: GET /datasets/{id}/items?format=csv&clean=true.
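
As a small sketch, the export URL can be assembled like this. The `https://api.apify.com/v2` base is the public Apify API host; `DATASET_ID` is a placeholder, `dataset_items_url` is a hypothetical helper, and a private dataset would additionally need your API token, which is not shown here.

```python
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id, fmt="csv", clean=True):
    """Build the dataset-items export URL for a given format."""
    query = urlencode({"format": fmt, "clean": "true" if clean else "false"})
    return f"{API_BASE}/datasets/{dataset_id}/items?{query}"


print(dataset_items_url("DATASET_ID"))
# https://api.apify.com/v2/datasets/DATASET_ID/items?format=csv&clean=true
```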

💰 Pricing

Pay-Per-Event (PPE) — you pay only for what you use:

| Event | Price (USD) | When |
| --- | --- | --- |
| actor-start | $0.05 | Once per run, at boot |
| result-row | $0.005 | Per listing OR per stats row written |

Example costs

| Run | Rows | Cost |
| --- | --- | --- |
| 1 release × 25 listings + 1 stats row | 26 | $0.18 |
| 5 releases × 100 listings + 5 stats rows | 505 | $2.58 |
| 10 releases × 100 listings + 10 stats rows | 1,010 | $5.10 |
| 50 releases × 100 listings + 50 stats rows | 5,050 | $25.30 |

At scale the per-row charge dominates: a 1,000-row run costs about $5.05 ($5.00 in row charges plus the $0.05 start fee). Pricing reflects the high commercial value of hand-parsed marketplace data (asking price, condition grade, seller country, seller rating) versus a pure API field copy.
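
The example costs follow directly from the two event prices. A minimal sketch of the arithmetic, assuming every release returns its full listing count (`run_cost` is a hypothetical helper; `Decimal` is used so the half-cent row price does not pick up floating-point noise):

```python
from decimal import Decimal

ACTOR_START = Decimal("0.05")   # charged once per run
RESULT_ROW = Decimal("0.005")   # charged per listing or stats row


def run_cost(releases, listings_per_release, stats_row=True):
    """Return (row count, unrounded cost in USD) for one run."""
    rows = releases * (listings_per_release + (1 if stats_row else 0))
    return rows, ACTOR_START + RESULT_ROW * rows


for releases, listings in [(1, 25), (5, 100), (10, 100), (50, 100)]:
    rows, cost = run_cost(releases, listings)
    print(f"{rows} rows: ${cost}")
```

The second case comes out at $2.575 before rounding, which matches the $2.58 quoted for that run.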

🚧 Limitations

  • No closed-sale price history. Discogs hosts sold-price stats at /sell/history/{release_id} but the page is gated behind Discogs account login (Auth0). Without user OAuth credentials it is unscrapable — out of scope for this Actor. What this Actor does instead: per-listing asking prices (the live offer side) plus the public lowest_price aggregate (the floor of the ask side).
  • Public Discogs surfaces only. No authenticated Discogs API calls, no personal-token usage, no Discogs OAuth.
  • One snapshot per run. Schedule recurring runs via Apify Schedules for time-series tracking; nothing in this Actor persists across runs.
  • Listing pages hold 25 items each. Discogs imposes this page size; combined with the maxPagesPerRelease cap of 20, it gives a hard ceiling of 500 listings per release per run.
  • Currency is not normalised. Discogs serves prices in the seller's local currency for listings and in the request IP's currency for the stats API. Downstream consumers join by the currency field rather than expecting a single canonical USD.
  • Pacing: ~40 req/min throughput. A 10-release run with 4 pages each takes roughly (10 × (1 + 1 + 4)) × 1.5s ≈ 90 seconds plus warm-up.
  • 7-day default storage retention on the Apify FREE tier. Export your dataset immediately after the run or upgrade for longer retention.
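
The pacing estimate in the list above generalises to other run sizes. The sketch below reads the two 1s in that formula as one release-metadata call and one marketplace-stats call per release, which is an assumption based on the endpoints this README names. It ignores the warm-up request, search calls, and any retries; `estimated_seconds` is a hypothetical helper.

```python
PACING_SECONDS = 1.5  # one request every 1.5 seconds, per the feature list


def estimated_seconds(releases, pages_per_release):
    """Rough lower bound on run time from request count alone.

    Assumes 1 metadata call + 1 stats call + N listing pages per release.
    """
    requests_per_release = 1 + 1 + pages_per_release
    return releases * requests_per_release * PACING_SECONDS


print(estimated_seconds(10, 4))    # 90.0
print(estimated_seconds(100, 20))  # 3300.0
```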

❓ FAQ

Why "Marketplace Price" and not "Sold Price"?

Discogs publishes two distinct price signals: (1) active asking prices on the marketplace listings page (public, scrapable), and (2) completed-sale prices at /sell/history/{release_id} (login-walled, not scrapable without user OAuth). This Actor delivers (1), plus the public lowest_price aggregate from the marketplace-stats API. For most reseller and arbitrage workflows the active asking-price distribution is the more actionable signal — it tells you what the market is asking right now, not what it cleared at six months ago.

Why do I need a User-Agent header for Discogs?

Discogs enforces a written policy (Discogs API Terms) requiring every API request to carry an Application-Name/Version User-Agent. Default curl-cffi headers (which impersonate a browser) work for the marketplace HTML surface but get rejected with 403 on the REST API. The Actor sends DevilScrapes/0.1 (+https://apify.com/DevilScrapes) on every API call automatically — you don't need to configure anything.

Is there an equivalent of an is_unavailable field?

There is no is_unavailable field. The closest analog is stats_blocked_from_sale=true (Discogs has flagged the release as un-sellable, e.g. for a legal or ToS reason) combined with stats_num_for_sale=0 (zero active listings). To answer "is this release scarce?", the canonical query is: WHERE stats_num_for_sale < 5.
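
In Python, the same two checks over stats rows might look like the sketch below. The helper names and the sample rows are hypothetical; the threshold of 5 is the one quoted in the answer above.

```python
def is_unavailable(stats_row):
    """Blocked from sale and no active listings."""
    return (stats_row.get("stats_blocked_from_sale") is True
            and stats_row.get("stats_num_for_sale") == 0)


def is_scarce(stats_row, threshold=5):
    """Fewer than `threshold` active listings (missing counts are skipped)."""
    count = stats_row.get("stats_num_for_sale")
    return count is not None and count < threshold


# Made-up stats rows, trimmed to the relevant fields.
rows = [
    {"release_id": 1, "stats_num_for_sale": 0, "stats_blocked_from_sale": True},
    {"release_id": 2, "stats_num_for_sale": 3, "stats_blocked_from_sale": False},
    {"release_id": 3, "stats_num_for_sale": 250, "stats_blocked_from_sale": False},
]

print([r["release_id"] for r in rows if is_unavailable(r)])  # [1]
print([r["release_id"] for r in rows if is_scarce(r)])       # [1, 2]
```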

Can I scrape my own private wantlist or collection?

No. This Actor scrapes only public marketplace data. Private user data (wantlist, collection, friends, messages) requires Discogs OAuth, which is intentionally out of scope.

Can I fetch more than 100 release IDs in one run?

No. The Pydantic input model caps releaseIds at 100 (and maxListingsPerRelease at 500, maxPagesPerRelease at 20). A single run thus emits at most 100 × 500 + 100 = 50,100 rows. Split larger workloads across multiple runs and concatenate the datasets.
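
One way to split a larger watch-list is to chunk it into inputs of at most 100 IDs and start one run per chunk. A minimal sketch follows; `batches` is a hypothetical helper, and the IDs below are placeholders rather than real releases.

```python
def batches(release_ids, size=100):
    """Split a list of release IDs into chunks of at most `size`."""
    return [release_ids[i:i + size] for i in range(0, len(release_ids), size)]


watch_list = list(range(1, 251))  # 250 placeholder IDs
inputs = [{"releaseIds": chunk} for chunk in batches(watch_list)]

print([len(item["releaseIds"]) for item in inputs])  # [100, 100, 50]
```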

Why curl-cffi for a public website?

Discogs marketplace pages are protected by Cloudflare. Plain httpx / requests get a 403 + JS challenge instantly. curl-cffi chrome131 impersonation replays a real browser's TLS + HTTP/2 fingerprint and passes Cloudflare cleanly after a one-shot homepage warm-up. This is the DevilScrapes house default per ADR-0002.

Part of the Devil Scrapes Niche Marketplace Intel suite:

  • Steam Regional Price Scraper — multi-region Steam game prices with USD equivalent.
  • Reverb Sold Listings Scraper (in development) — sold listings on Reverb for music-gear arbitrage.

All three Actors share consistent pricing event names (actor-start, result-row) and field-naming conventions (snake_case) so cross-marketplace arbitrage analyses can join cleanly.

💬 Your feedback

Found a bug, hit a rate limit, or need a new field on the output row (median asking price? seller-country histogram? condition-grade distribution)? Open an issue on the Actor's Apify Store page or contact the Devil Scrapes team at apify.com/DevilScrapes. We ship updates within days of validated reports.