Discogs Marketplace Price Scraper
Pricing
Pay per event
Scrape Discogs marketplace listings (asking prices, seller, condition, ships-from) plus public marketplace stats for any release ID or search query. Public API + HTML. No login.
Developer
DevilScrapes
We do the dirty work so your dataset stays clean. 😈
$5.05 / 1,000 rows — Scrape Discogs marketplace listings (asking prices, seller, condition, ships-from) plus the per-release marketplace stats aggregate (lowest_price, num_for_sale, blocked_from_sale) for any Discogs release ID or free-text search query. Public Discogs API + public marketplace HTML. No login. No Discogs API key. No browser automation.
Discogs is the world's largest catalog of music releases and the secondary market for vinyl, CD, cassette, and other physical formats — yet its public API exposes only lowest_price and num_for_sale aggregates for any given release. The per-listing data (who is selling, at what price, with what media + sleeve grade, shipping from where) lives on Cloudflare-protected HTML pages and is invisible to any standard REST consumer. This Actor closes that gap: it joins the Discogs release-metadata REST endpoint, the marketplace-stats REST endpoint, and the public marketplace-listings HTML (paginated, 25 listings per page) into a single flat dataset with one row per listing and one optional aggregate row per release.
🎯 What this scrapes
Two row flavours share one flat schema. The row_type discriminator selects which fields are populated — listing rows carry per-seller pricing + condition + ratings; stats rows carry the per-release marketplace aggregate.
| Field | Type | Populated for |
|---|---|---|
| row_type | "listing" or "stats" | both |
| release_id | integer | both |
| release_title, artist, year, country, format_name, format_descriptions, genres, master_id, release_url | from /releases/{id} | both |
| listing_id, listing_url | integer / string | listing rows |
| asking_price, asking_currency | float / ISO 4217 | listing rows |
| shipping_text | string (seller free-form) | listing rows |
| condition_media, condition_sleeve | Discogs grade vocab | listing rows |
| seller_username, seller_rating_pct, seller_rating_count, seller_country | strings + numerics | listing rows |
| stats_lowest_price, stats_lowest_currency, stats_num_for_sale, stats_blocked_from_sale | from /marketplace/stats/{id} | stats rows |
| scraped_at | ISO 8601 UTC | both |
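Because both row flavours share one schema, a consumer will usually split on row_type before doing anything else. The following is a minimal sketch, assuming the dataset has already been loaded as a list of plain dicts; the helper name `split_rows` and the sample values are made up for illustration and are not part of the Actor.

```python
from collections import defaultdict


def split_rows(rows):
    """Group listing rows by release_id and index stats rows by release_id."""
    listings = defaultdict(list)
    stats = {}
    for row in rows:
        if row["row_type"] == "listing":
            listings[row["release_id"]].append(row)
        elif row["row_type"] == "stats":
            stats[row["release_id"]] = row
    return dict(listings), stats


# Illustrative rows only; real rows carry every field from the table above.
sample_rows = [
    {"row_type": "listing", "release_id": 249504, "asking_price": 0.5, "asking_currency": "GBP"},
    {"row_type": "listing", "release_id": 249504, "asking_price": 2.0, "asking_currency": "GBP"},
    {"row_type": "stats", "release_id": 249504, "stats_num_for_sale": 2},
]

listings, stats = split_rows(sample_rows)
print(len(listings[249504]), stats[249504]["stats_num_for_sale"])
```

Keeping the stats row in its own lookup avoids mixing its null listing fields into per-listing aggregates.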
🔥 Features
- Per-listing asking prices + sellers + condition + ships-from for every release you supply — the data Discogs does not expose in its REST API.
- Per-release marketplace stats row (lowest asking price + total listings count + blocked-from-sale flag), enabled by default; turn it off with includeStatsRow.
- Input by release ID (deterministic, fast) or by free-text search query (resolves the top N matches via the public Discogs search endpoint, no auth required).
- Single flat schema with a `row_type` discriminator: easy to join, easy to aggregate downstream by release_id or master_id.
- Pydantic v2 input validation: XOR between `releaseIds` and `searchQuery` enforced before any network call.
- Cloudflare bypass via `curl-cffi` chrome131 impersonation + one-shot Discogs homepage warm-up. Verified clean pass on 2026-05-16; no JS challenge, no Camoufox needed.
- Discogs-compliant `User-Agent` (DevilScrapes/0.1 (+https://apify.com/DevilScrapes)) on the REST API surface; Discogs rejects default UAs with 403.
- Conservative rate-limit pacing at one request every 1.5 seconds (~40 req/min), under Discogs' 60 req/min API limit. The marketplace HTML surface has an unwritten ceiling of about 25 req/min, and listing pages account for only part of each release's requests.
- Exponential backoff with `Retry-After` honoured for `408 / 429 / 503` responses; max 5 attempts.
- Apify Proxy (`BUYPROXIES94952` group) on by default: Cloudflare 403s un-proxied Apify datacenter IPs on the Discogs marketplace HTML surface. Turn off only when running locally from a residential ISP.
- Per-release fail isolation: one bad release ID logs a WARNING and the run continues to the next.
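The retry policy in the list above can be pictured as a small pure function. This is only a sketch of the general technique (exponential backoff that prefers a numeric Retry-After header), not the Actor's actual code; the function name `retry_delay`, the 1.5-second base, and the 60-second cap are assumptions chosen for illustration.

```python
RETRYABLE_STATUSES = {408, 429, 503}
MAX_ATTEMPTS = 5


def retry_delay(status, attempt, retry_after=None, base=1.5, cap=60.0):
    """Return seconds to wait before the next attempt, or None to stop retrying.

    attempt is 1-based: attempt=1 means the first request has just failed.
    base and cap are hypothetical values, not taken from the Actor.
    """
    if status not in RETRYABLE_STATUSES or attempt >= MAX_ATTEMPTS:
        return None
    if retry_after is not None:
        try:
            # Only the delay-in-seconds form is handled; HTTP-date values fall through.
            return max(0.0, float(retry_after))
        except ValueError:
            pass
    # Exponential growth: base, 2*base, 4*base, ... bounded by cap.
    return min(cap, base * 2 ** (attempt - 1))


print(retry_delay(429, 1), retry_delay(429, 3), retry_delay(503, 2, "10"))
```

A caller would sleep for the returned number of seconds and give up (logging the release as failed) when the function returns None.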
💡 Use cases
- Vinyl / CD reseller benchmarking — pull every listing for releases in your inventory and benchmark your asking prices against the live competing supply (median, min, max, condition mix).
- Music-collectibles arbitrage: monitor cross-country shipping spreads (seller_country + asking_price + asking_currency) for the same release; spot regional under-pricing.
- Catalog / label market intel: for a label's catalog of release IDs, track `num_for_sale` and `lowest_price` over time to see which titles are appreciating.
- Journalism / pricing studies: "the cheapest copy of Nevermind on Discogs right now" generators, automated with one Actor run per article.
- Marketplace health monitoring: count `stats_blocked_from_sale=true` rows across a watch-list to flag releases Discogs has quietly removed from sale.
- Seller-quality screening: filter listings by `seller_rating_pct >= 99.0` and `seller_rating_count >= 100` for a curated high-trust subset.
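Two of the use cases above (seller-quality screening and price benchmarking) reduce to a filter plus a per-currency summary. The sketch below assumes rows are plain dicts with the output field names from this README; the helper names and the sample prices are invented for illustration.

```python
from statistics import median


def trusted_listings(rows, min_pct=99.0, min_count=100):
    """Keep listing rows whose seller meets both rating thresholds."""
    return [
        r for r in rows
        if r.get("row_type") == "listing"
        and (r.get("seller_rating_pct") or 0) >= min_pct
        and (r.get("seller_rating_count") or 0) >= min_count
    ]


def price_summary(rows):
    """Return {currency: (min, median, max)}; prices are never mixed across currencies."""
    by_currency = {}
    for r in rows:
        by_currency.setdefault(r["asking_currency"], []).append(r["asking_price"])
    return {cur: (min(p), median(p), max(p)) for cur, p in by_currency.items()}


# Made-up sample rows, not real marketplace data.
rows = [
    {"row_type": "listing", "asking_price": 10.0, "asking_currency": "GBP",
     "seller_rating_pct": 99.5, "seller_rating_count": 250},
    {"row_type": "listing", "asking_price": 14.0, "asking_currency": "GBP",
     "seller_rating_pct": 100.0, "seller_rating_count": 1200},
    {"row_type": "listing", "asking_price": 4.0, "asking_currency": "GBP",
     "seller_rating_pct": 100.0, "seller_rating_count": 35},
    {"row_type": "listing", "asking_price": 12.0, "asking_currency": "EUR",
     "seller_rating_pct": 99.1, "seller_rating_count": 400},
]

summary = price_summary(trusted_listings(rows))
print(summary)
```

Grouping by asking_currency before aggregating matters because the Actor does not convert prices to a single currency.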
⚙️ How to use it
- Open the Actor input form on the Apify Console.
- Provide either `releaseIds` (recommended: direct, no ambiguity) or `searchQuery`, not both. Find a release ID in any Discogs URL: `discogs.com/release/249504-Rick-Astley-...` → `249504`.
- If using `searchQuery`, set `maxSearchResults` to cap how many top hits to fetch (default 5, max 50).
- Set `maxPagesPerRelease` (default 4 = 100 listings) and `maxListingsPerRelease` (default 100, max 500) to cap the per-release listing volume. The lower of the two wins.
- Leave `includeStatsRow` on (default) to also get one aggregate `row_type="stats"` row per release. Turn off if you only want per-listing rows.
- Leave `useProxy` on (default): Cloudflare 403s un-proxied Apify datacenter IPs on the Discogs marketplace surface. Turn off only if you're running the Actor locally from a residential ISP.
- Click Start. Results stream into the default dataset as JSON / CSV / Excel / XML.
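Two details from the steps above are easy to get wrong when preparing input by script: pulling the numeric ID out of a Discogs URL, and working out which of the two caps actually limits a run. A small sketch, with hypothetical helper names and a regex that is an assumption about the URL shape shown above:

```python
import re

LISTINGS_PER_PAGE = 25  # page size stated in this README


def release_id_from_url(url):
    """Extract the numeric release ID from a Discogs release URL."""
    match = re.search(r"/release/(\d+)", url)
    if match is None:
        raise ValueError(f"no release ID found in {url!r}")
    return int(match.group(1))


def effective_listing_cap(max_pages=4, max_listings=100):
    """The lower of the page-based cap and the row-based cap wins."""
    return min(max_pages * LISTINGS_PER_PAGE, max_listings)


print(release_id_from_url("discogs.com/release/249504-Rick-Astley-..."))
print(effective_listing_cap(4, 500))
```

With the defaults both caps coincide at 100 listings, so raising only one of them has no effect.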
Single release, first page
    {
      "releaseIds": [249504],
      "maxPagesPerRelease": 1,
      "maxListingsPerRelease": 25,
      "includeStatsRow": true,
      "useProxy": true
    }
Search-driven, top 5 results, two pages each
    {
      "searchQuery": "nirvana nevermind",
      "maxSearchResults": 5,
      "maxPagesPerRelease": 2,
      "maxListingsPerRelease": 50,
      "includeStatsRow": true
    }
📥 Input
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| releaseIds | integer[] | XOR | (none) | List of Discogs release IDs (1-100). XOR with searchQuery. |
| searchQuery | string | XOR | (none) | Free-text Discogs search; top N results become release IDs. XOR with releaseIds. |
| maxSearchResults | integer | no | 5 | Cap on results from searchQuery (1-50). |
| maxPagesPerRelease | integer | no | 4 | Cap on listings pages per release (1-20; 25 listings per page). |
| maxListingsPerRelease | integer | no | 100 | Hard cap on listing rows per release (1-500). |
| includeStatsRow | boolean | no | true | Emit one extra row_type="stats" row per release. |
| useProxy | boolean | no | true | Route through Apify Proxy (BUYPROXIES94952). Default ON: Cloudflare 403s un-proxied datacenter IPs. |
Exactly one of releaseIds or searchQuery must be provided. Passing both, or neither, raises a validation error before any network call.
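The exactly-one-of rule can be checked client-side before a run is started. The Actor itself uses a Pydantic v2 model; the version below is a dependency-free approximation for illustration only, and the function names are hypothetical.

```python
def validate_input(raw):
    """Raise ValueError unless exactly one of releaseIds / searchQuery is set."""
    release_ids = raw.get("releaseIds")
    search_query = raw.get("searchQuery")
    has_ids = bool(release_ids)
    has_query = bool(search_query and search_query.strip())
    # XOR: both present or both absent is an error.
    if has_ids == has_query:
        raise ValueError("provide exactly one of releaseIds or searchQuery")
    if has_ids and not 1 <= len(release_ids) <= 100:
        raise ValueError("releaseIds must contain 1-100 items")
    return raw


def is_valid(raw):
    """Convenience wrapper returning True/False instead of raising."""
    try:
        validate_input(raw)
    except ValueError:
        return False
    return True


print(is_valid({"releaseIds": [249504]}), is_valid({}))
```

Comparing the two booleans for equality is a compact way to express XOR: the input is rejected when both are true or both are false.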
📤 Output
One row per marketplace listing (and optionally one extra stats row per release), pushed to the default dataset and available as JSON, CSV, Excel, or XML.
    {
      "row_type": "listing",
      "release_id": 249504,
      "release_title": "Never Gonna Give You Up",
      "artist": "Rick Astley",
      "year": 1987,
      "country": "UK",
      "format_name": "Vinyl",
      "format_descriptions": ["7\"", "45 RPM", "Single", "Stereo"],
      "genres": ["Electronic", "Pop"],
      "master_id": 96559,
      "release_url": "https://www.discogs.com/release/249504",
      "listing_id": 3761251765,
      "listing_url": "https://www.discogs.com/sell/item/3761251765",
      "asking_price": 0.5,
      "asking_currency": "GBP",
      "shipping_text": "+£15.00",
      "condition_media": "Very Good Plus (VG+)",
      "condition_sleeve": "Generic",
      "seller_username": "Ronan266",
      "seller_rating_pct": 100.0,
      "seller_rating_count": 35,
      "seller_country": "United Kingdom",
      "stats_lowest_price": null,
      "stats_lowest_currency": null,
      "stats_num_for_sale": null,
      "stats_blocked_from_sale": null,
      "scraped_at": "2026-05-16T12:00:00.000Z"
    }
Export formats
- JSON: full fidelity, all 27 fields, newline-delimited
- CSV: flat, one row per listing or stats record
- Excel: `.xlsx` via the Apify dataset converter
- XML: structured per-item
All formats are available via the Apify API: GET /datasets/{id}/items?format=csv&clean=true.
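For scripted exports, the request above only needs a dataset ID and a format. A minimal sketch that builds the URL without making any network call; the base URL `https://api.apify.com/v2` is the public Apify API root, while the helper name and the dataset ID `abc123` are placeholders.

```python
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id, fmt="csv", clean=True):
    """Build the dataset-items export URL for the given format."""
    query = urlencode({"format": fmt, "clean": "true" if clean else "false"})
    return f"{API_BASE}/datasets/{dataset_id}/items?{query}"


# "abc123" is a placeholder, not a real dataset ID.
print(dataset_items_url("abc123"))
print(dataset_items_url("abc123", fmt="xlsx"))
```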
💰 Pricing
Pay-Per-Event (PPE) — you pay only for what you use:
| Event | Price (USD) | When |
|---|---|---|
| actor-start | $0.05 | Once per run, at boot |
| result-row | $0.005 | Per listing OR per stats row written |
Example costs
| Run | Rows | Cost |
|---|---|---|
| 1 release × 25 listings + 1 stats row | 26 | $0.18 |
| 5 releases × 100 listings + 5 stats rows | 505 | $2.58 |
| 10 releases × 100 listings + 10 stats rows | 1,010 | $5.10 |
| 50 releases × 100 listings + 50 stats rows | 5,050 | $25.30 |
At scale the per-row charge dominates: a single 1,000-row run costs $5.05 ($5.00 in result-row charges plus the $0.05 actor-start fee). Pricing reflects the high commercial value of hand-parsed marketplace data (asking price, condition grade, seller country, seller rating) versus a pure API field copy.
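The example costs follow directly from the two event prices. A short sketch for budgeting a run in advance, using Decimal to avoid floating-point rounding; the function name `estimate_cost` is hypothetical, and the estimate assumes every release returns the full number of listings.

```python
from decimal import Decimal

ACTOR_START = Decimal("0.05")   # charged once per run
RESULT_ROW = Decimal("0.005")   # charged per listing or stats row


def estimate_cost(releases, listings_per_release, include_stats_row=True):
    """Return (row_count, cost_in_usd) for a run that fills every cap."""
    rows = releases * (listings_per_release + (1 if include_stats_row else 0))
    return rows, ACTOR_START + RESULT_ROW * rows


for releases, listings in [(1, 25), (5, 100), (10, 100), (50, 100)]:
    rows, cost = estimate_cost(releases, listings)
    print(releases, rows, cost)
```

The exact figure for the 505-row run is $2.575, which the table shows rounded to $2.58.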
🚧 Limitations
- No closed-sale price history. Discogs hosts sold-price stats at `/sell/history/{release_id}` but the page is gated behind Discogs account login (Auth0). Without user OAuth credentials it is unscrapable, so it is out of scope for this Actor. What this Actor does instead: per-listing asking prices (the live offer side) plus the public `lowest_price` aggregate (the floor of the ask side).
- Public Discogs surfaces only. No authenticated Discogs API calls, no personal-token usage, no Discogs OAuth.
- One snapshot per run. Schedule recurring runs via Apify Schedules for time-series tracking; nothing in this Actor persists across runs.
- Listing pages hold 25 items each. Discogs imposes this; combined with the `maxPagesPerRelease` cap of 20 it gives a 500-listing hard ceiling per release per run.
- Currency is not normalised. Discogs serves prices in the seller's local currency for listings and in the request IP's currency for the stats API. Downstream consumers should group by the `asking_currency` and `stats_lowest_currency` fields rather than expect a single canonical USD.
- Pacing: ~40 req/min throughput. A 10-release run with 4 pages each takes roughly (10 × (1 + 1 + 4)) × 1.5s ≈ 90 seconds plus warm-up.
- 7-day default storage retention on the Apify FREE tier. Export your dataset immediately after the run or upgrade for longer retention.
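The pacing arithmetic in the list above generalises to any run size. A rough sketch, assuming one metadata request, one stats request, and a fixed number of listing pages per release; it ignores the warm-up request, any search-query step, and retries, and the function name is hypothetical.

```python
def estimate_runtime_seconds(releases, pages_per_release, interval=1.5):
    """Lower-bound runtime at one request every `interval` seconds."""
    # One metadata call + one stats call + N listing pages per release.
    requests = releases * (1 + 1 + pages_per_release)
    return requests * interval


print(estimate_runtime_seconds(10, 4))
print(estimate_runtime_seconds(100, 20))
```

At the maximum input size (100 releases, 20 pages each) the same formula gives 3,300 seconds, a little under an hour.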
❓ FAQ
Why "Marketplace Price" and not "Sold Price"?
Discogs publishes two distinct price signals: (1) active asking prices on the marketplace listings page (public, scrapable), and (2) completed-sale prices at /sell/history/{release_id} (login-walled, not scrapable without user OAuth). This Actor delivers (1), plus the public lowest_price aggregate from the marketplace-stats API. For most reseller and arbitrage workflows the active asking-price distribution is the more actionable signal — it tells you what the market is asking right now, not what it cleared at six months ago.
Why do I need a User-Agent header for Discogs?
Discogs enforces a written policy (Discogs API Terms) requiring every API request to carry an Application-Name/Version User-Agent. Default curl-cffi headers (which impersonate a browser) work for the marketplace HTML surface but get rejected with 403 on the REST API. The Actor sends DevilScrapes/0.1 (+https://apify.com/DevilScrapes) on every API call automatically — you don't need to configure anything.
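For readers calling the same REST endpoints themselves, the header can be attached as shown below. This is a sketch that only constructs request objects and sends nothing; the host `https://api.discogs.com` is an assumption (this README names the paths but not the host), and `api_request` is a hypothetical helper.

```python
from urllib.request import Request

API_ROOT = "https://api.discogs.com"  # assumed host; not stated in this README
USER_AGENT = "DevilScrapes/0.1 (+https://apify.com/DevilScrapes)"


def api_request(path):
    """Build a request carrying an Application-Name/Version User-Agent."""
    return Request(f"{API_ROOT}{path}", headers={"User-Agent": USER_AGENT})


release_req = api_request("/releases/249504")
stats_req = api_request("/marketplace/stats/249504")

# urllib stores header names capitalised as "User-agent".
print(release_req.full_url, release_req.get_header("User-agent"))
```

If you reuse this pattern in your own tooling, substitute your own application name and version for the DevilScrapes string.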
Is there an equivalent of an is_unavailable field?
There is no is_unavailable field — the analog is stats_blocked_from_sale=true (Discogs has flagged the release as un-sellable, e.g. legal/ToS reason) combined with stats_num_for_sale=0 (zero active listings). If you want to know "is this release scarce", the canonical query is: WHERE stats_num_for_sale < 5.
Can I scrape my own private wantlist or collection?
No. This Actor scrapes only public marketplace data. Private user data (wantlist, collection, friends, messages) requires Discogs OAuth, which is intentionally out of scope.
Can I fetch more than 100 release IDs in one run?
No. The Pydantic input model caps releaseIds at 100 (and maxListingsPerRelease at 500, maxPagesPerRelease at 20). A single run thus emits at most 100 × 500 + 100 = 50,100 rows. Split larger workloads across multiple runs and concatenate the datasets.
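Splitting a larger workload is a matter of batching the ID list into groups of at most 100 and starting one run per batch. A minimal sketch with hypothetical helper names; the ID range used is arbitrary.

```python
def chunk_release_ids(release_ids, size=100):
    """Split a list of release IDs into batches that fit one run each."""
    return [release_ids[i:i + size] for i in range(0, len(release_ids), size)]


def max_rows(n_releases, max_listings=500, include_stats_row=True):
    """Upper bound on rows emitted by a single run."""
    return n_releases * (max_listings + (1 if include_stats_row else 0))


batches = chunk_release_ids(list(range(1, 251)))  # 250 arbitrary IDs
print([len(b) for b in batches])
print(max_rows(100))
```

Each batch becomes the releaseIds value of a separate run, and the resulting datasets can be concatenated because every row carries its own release_id.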
Why curl-cffi for a public website?
Discogs marketplace pages are protected by Cloudflare. Plain httpx / requests get a 403 + JS challenge instantly. curl-cffi chrome131 impersonation replays a real browser's TLS + HTTP/2 fingerprint and passes Cloudflare cleanly after a one-shot homepage warm-up. This is the DevilScrapes house default per ADR-0002.
Related Actors
Part of the Devil Scrapes Niche Marketplace Intel suite:
- Steam Regional Price Scraper — multi-region Steam game prices with USD equivalent.
- Reverb Sold Listings Scraper (in development) — sold listings on Reverb for music-gear arbitrage.
All three Actors share consistent pricing event names (actor-start, result-row) and field-naming conventions (snake_case) so cross-marketplace arbitrage analyses can join cleanly.
💬 Your feedback
Found a bug, hit a rate limit, or need a new field on the output row (median asking price? seller-country histogram? condition-grade distribution)? Open an issue on the Actor's Apify Store page or contact the Devil Scrapes team at apify.com/DevilScrapes. We ship updates within days of validated reports.