Doing Good Leeds Scraper

Pricing

from $2.00 / 1,000 results

Scrape paid jobs, volunteering, events, and training from doinggoodleeds.org.uk via WP-JSON. Pick any subset of 4 entity types. ~178 entities total. Title, employer, location, salary, apply email/URL, full description HTML inline per row. JSON or CSV out, billed per result.

Developer

Muhamed Didovic

Maintained by Community

Actor stats

Bookmarked: 0
Total users: 2
Monthly active users: 1
Last modified: 5 days ago

Scrape paid jobs, volunteering opportunities, events, and training courses from doinggoodleeds.org.uk — Leeds' volunteer hub. Four custom-post-type collections behind one WP-JSON REST API: pick one or many via the entityTypes input. Each row carries full structured data — title, employer, location, salary (when present), apply email/URL, and the complete description HTML. JSON or CSV out, no compute charge per run, just per result.

✨ Why use this scraper?

Tracking Leeds' voluntary-sector hiring? Mapping volunteer opportunities for placement programmes? Building a community events calendar? Cataloguing the training courses charities are funding?

  • 🎯 Four entity types in one actor. job-listings (paid jobs, ~31), volunteers (volunteering opportunities, ~23), event (events, ~34), training-course (training courses, ~90) — pick any subset.
  • WP-JSON REST API as the data source. Each entity is a WordPress custom post type with its own /wp-json/wp/v2/<cpt> endpoint.
  • 🏷️ Custom location taxonomy. WP Job Manager's _job_location meta is usually empty on Doing Good Leeds — we fall back to their custom location taxonomy (Leeds, regional Yorkshire, etc.).
  • 📧 Apply email / URL captured. _application meta is split into applyEmail vs externalApplyUrl automatically.
  • 🌟 Cloudflare-friendly. Only the passive __cf_bm cookie is enforced, so any reasonable User-Agent string passes without a proxy.
  • 📤 Clean exports. One row per entity, all fields inline. JSON + CSV exported automatically.
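The location fallback and the apply email/URL split described above could look roughly like the following. This is a minimal sketch, not the actor's actual code: the function names, the email regex, the "mailto:" handling, and the "url" value for applyType are assumptions made for illustration (only "email" appears in the output sample).

```python
import re
from typing import Optional

# Deliberately loose email check; an assumption, not the actor's real rule.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def split_application(value: Optional[str]) -> dict:
    """Classify a raw _application value as an email address or a URL."""
    value = (value or "").strip()
    if value.lower().startswith("mailto:"):
        value = value[len("mailto:"):]
    if EMAIL_RE.match(value):
        return {"applyType": "email", "applyEmail": value, "externalApplyUrl": None}
    if value.lower().startswith(("http://", "https://")):
        # "url" is a hypothetical label for the non-email case.
        return {"applyType": "url", "applyEmail": None, "externalApplyUrl": value}
    return {"applyType": None, "applyEmail": None, "externalApplyUrl": None}


def pick_location(job_location_meta: Optional[str], taxonomy_terms: list) -> Optional[str]:
    """Prefer the _job_location meta; fall back to the first location taxonomy term."""
    meta = (job_location_meta or "").strip()
    if meta:
        return meta
    return taxonomy_terms[0] if taxonomy_terms else None
```

With the values from the output sample, an empty _job_location plus the taxonomy term "Leeds" resolves to "Leeds", and an _application value holding an address is routed to applyEmail.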

🎯 Use cases

| Team | What they build |
| --- | --- |
| Voluntary sector recruiters | Daily Leeds nonprofit hiring feeds |
| Volunteer co-ordinators | Opportunity mapping for student placement programmes |
| Community events platforms | Pull events into a unified Leeds calendar |
| Training providers | Track what courses other charities are running / funding |
| Workforce strategy | Leeds third-sector pay benchmarks |
| Researchers | Leeds civil-society datasets (jobs + volunteers + events + training) |

📥 Supported inputs

| URL pattern | Behaviour |
| --- | --- |
| https://doinggoodleeds.org.uk/jobs/ etc. | Listings for any CPT (the actor picks based on entityTypes) |
| https://doinggoodleeds.org.uk/job/<slug>/ | Single paid job |
| https://doinggoodleeds.org.uk/volunteer/<slug>/ | Single volunteer opportunity |
| https://doinggoodleeds.org.uk/event/<slug>/ | Single event |
| https://doinggoodleeds.org.uk/training-course/<slug>/ | Single training course |
| https://doinggoodleeds.org.uk/wp-json/wp/v2/{job-listings\|volunteers\|event\|training_course} | WP-JSON endpoint |

Leave startUrls empty and set entityTypes to scrape every entity of the selected types.

Not supported: mixing entity types in a single dataset row (each row is one CPT); hosts outside doinggoodleeds.org.uk.
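As an illustration of how the supported URL shapes could be told apart, here is a small sketch using only the standard library. The classify function, its return tuple, and the prefix-to-entity mapping are hypothetical; the mapping is simply read off the URL patterns listed in this section and is not taken from the actor's source.

```python
from typing import Optional, Tuple
from urllib.parse import urlparse

HOST = "doinggoodleeds.org.uk"

# Detail-page path prefix -> entityTypes value (assumed from the patterns above).
DETAIL_PREFIX_TO_ENTITY = {
    "job": "job-listings",
    "volunteer": "volunteers",
    "event": "event",
    "training-course": "training-course",
}


def classify(url: str) -> Optional[Tuple[str, Optional[str], Optional[str]]]:
    """Return (kind, entity_or_rest_base, slug), or None for unsupported hosts."""
    parsed = urlparse(url)
    if (parsed.hostname or "").lower() != HOST:
        return None  # hosts outside doinggoodleeds.org.uk are not supported
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) >= 4 and parts[:3] == ["wp-json", "wp", "v2"]:
        return ("wp-json", parts[3], None)
    if len(parts) == 2 and parts[0] in DETAIL_PREFIX_TO_ENTITY:
        return ("detail", DETAIL_PREFIX_TO_ENTITY[parts[0]], parts[1])
    # Browser listing pages: the entity is decided by entityTypes, not the URL.
    return ("listing", None, None)


print(classify("https://doinggoodleeds.org.uk/job/young-adults-worker-3/"))
```

Returning None for foreign hosts keeps the "not supported" case explicit, so a caller can skip or log such URLs rather than guess at an endpoint.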

🔄 How it works

  1. Resolve start URLs — either from explicit startUrls, or from entityTypes (default ["job-listings"]).
  2. Classify + translate each URL into the canonical /wp-json/wp/v2/<cpt> shape — tagging it with which CPT it represents.
  3. Walk pagination via X-WP-TotalPages from the response header.
  4. Parse each WP-JSON item — title, content HTML, WP Job Manager meta (where present), _embed taxonomies (categories, types, location).
  5. Push one normalised row per entity to the dataset, tagged with the source CPT via the cpt field.
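Step 3 can be sketched as a generator that keeps requesting pages until the count reported in X-WP-TotalPages is reached. This is an assumption-laden outline rather than the actor's implementation: walk_pages, http_fetch, and the User-Agent string are made up for the example, while per_page, page, and _embed are standard WordPress REST API query parameters.

```python
import json
from typing import Callable, Iterator, Tuple
from urllib.parse import urlencode
from urllib.request import Request, urlopen

Fetch = Callable[[str], Tuple[dict, list]]


def http_fetch(url: str) -> Tuple[dict, list]:
    """Fetch one WP-JSON page; returns (lower-cased headers, decoded items)."""
    # The User-Agent value is an arbitrary placeholder.
    req = Request(url, headers={"User-Agent": "Mozilla/5.0 (compatible; example-client)"})
    with urlopen(req, timeout=30) as resp:
        headers = {k.lower(): v for k, v in resp.headers.items()}
        return headers, json.load(resp)


def walk_pages(endpoint: str, fetch: Fetch = http_fetch,
               per_page: int = 100, embed: bool = True) -> Iterator[dict]:
    """Yield every item of a collection, following X-WP-TotalPages."""
    page, total_pages = 1, 1
    while page <= total_pages:
        params = {"per_page": per_page, "page": page}
        if embed:
            params["_embed"] = "1"  # inline taxonomy terms and featured media
        headers, items = fetch(f"{endpoint}?{urlencode(params)}")
        # The header is only known after the first response arrives.
        total_pages = int(headers.get("x-wp-totalpages", "1"))
        yield from items
        page += 1
```

Passing the fetch function in as a parameter makes the pagination logic testable without network access; for a real run, walk_pages("https://doinggoodleeds.org.uk/wp-json/wp/v2/job-listings") would use http_fetch.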

⚙️ Input parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| startUrls | array | ["https://doinggoodleeds.org.uk/wp-json/wp/v2/job-listings"] | Browser URLs, single-detail URLs, or WP-JSON endpoints. |
| entityTypes | array | ["job-listings"] | Used when startUrls is empty. Allowed values: job-listings, volunteers, event, training-course. |
| enrichTaxonomies | boolean | true | When true, embeds taxonomy term names + featured image via WP-JSON _embed. |
| maxItems | integer | 1000 | Hard cap on rows pushed (~178 total across all CPTs). |
| maxConcurrency / minConcurrency | integer | 5 / 1 | Parallel WP-JSON page-fetch limits. |
| maxRequestRetries | integer | 5 | Retries before a failed request is abandoned. |
| proxy | object | No proxy | Cloudflare lets us through without a proxy. |
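For orientation, a run input covering all four entity types might be assembled like this. The keys come from the parameter table; the specific values (an empty startUrls, a maxItems of 200) are illustrative choices, and proxy is omitted because its object shape is not described here.

```python
import json

# Illustrative run input; values are examples, not recommendations.
run_input = {
    "startUrls": [],  # empty, so entityTypes decides what is scraped
    "entityTypes": ["job-listings", "volunteers", "event", "training-course"],
    "enrichTaxonomies": True,
    "maxItems": 200,  # comfortably above the ~178 entities reported
    "maxConcurrency": 5,
    "minConcurrency": 1,
    "maxRequestRetries": 5,
}

print(json.dumps(run_input, indent=2))
```

The printed JSON can be pasted into the actor's input editor or passed along by whichever client starts the run.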

📊 Output overview

Each entity becomes one dataset row. The type field tells you what kind of entity it is (job, volunteer, event, training), and the cpt field carries the raw CPT slug.

📦 Output sample

{
  "type": "job",
  "cpt": "job-listings",
  "source": "doinggoodleeds.org.uk",
  "jobId": "111130",
  "slug": "young-adults-worker-3",
  "jobUrl": "https://doinggoodleeds.org.uk/job/young-adults-worker-3/",
  "wpJsonUrl": "https://doinggoodleeds.org.uk/wp-json/wp/v2/job-listings/111130",
  "title": "Young Adults Worker",
  "description": "<p>Young Adults Worker role at Waythrough…</p>",
  "descriptionText": "Young Adults Worker role at Waythrough…",
  "companyName": null,
  "companyWebsite": null,
  "companyDomain": null,
  "location": "Leeds",
  "locations": ["Leeds"],
  "remote": false,
  "salary": null,
  "categories": [],
  "employmentTypes": ["Full Time"],
  "contractType": "Full Time",
  "featured": false,
  "filled": false,
  "status": "publish",
  "postedDate": "2026-04-23T10:01:46Z",
  "modifiedDate": "2026-04-23T10:01:46Z",
  "applyType": "email",
  "applyUrl": "https://doinggoodleeds.org.uk/job/young-adults-worker-3/",
  "applyEmail": "olivia.hodgson@waythrough.org.uk",
  "externalApplyUrl": null,
  "featuredImageUrl": null,
  "authorId": 1,
  "authorName": null,
  "scrapedAt": "2026-05-20T00:13:00.000Z"
}

🗂 Key output fields

| Group | Fields |
| --- | --- |
| Identifiers | type, cpt (job-listings / volunteers / event / training-course), source, jobId, slug, jobUrl, wpJsonUrl, scrapedAt |
| Content | title, description (HTML), descriptionText (plain) |
| Dates | postedDate (ISO), modifiedDate (ISO) |
| Employer | companyName (often null), companyWebsite, companyDomain, companyTagline, featuredImageUrl |
| Location | location (primary), locations[] (all taxonomy terms), remote |
| Compensation | salary.{currency, min, max, unit, raw} (when present in WP Job Manager meta) |
| Taxonomies | categories[], employmentTypes[], contractType |
| Flags | featured, filled, status |
| Apply flow | applyType, applyUrl, applyEmail, externalApplyUrl |

❓ FAQ

Can I scrape all four entity types in one run? Yes. Set entityTypes to ["job-listings", "volunteers", "event", "training-course"]. Each row will have a cpt field indicating which collection it came from.

Why are some salaries empty? WP Job Manager's _job_salary meta isn't always populated for voluntary-sector roles. Look at the description HTML for compensation info when salary is null.
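One rough way to act on that advice is to scan descriptionText for pound amounts when salary is null. The helper below is a hypothetical downstream heuristic, not part of the actor; the regex only catches figures written with a £ sign and will miss phrasing such as "28k" or "competitive".

```python
import re

# Matches amounts like £28,500 or £12.60; a heuristic, not a full salary parser.
POUND_AMOUNT_RE = re.compile(r"£\s?\d+(?:,\d{3})*(?:\.\d{1,2})?")


def salary_mentions(row: dict) -> list:
    """Return pound amounts found in the description when salary is missing."""
    if row.get("salary"):
        return []  # structured salary is present, nothing to recover
    text = row.get("descriptionText") or ""
    return POUND_AMOUNT_RE.findall(text)
```

The matches are raw strings; deciding whether a figure is annual, hourly, or unrelated to pay still needs human review or further parsing.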

Why is companyName often null? Doing Good Leeds doesn't enforce the _company_name meta field. Org name is usually in the description or apply email domain — pull it from there if needed.
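A possible post-processing step for the email-domain approach is sketched below. The function name and the list of generic mailbox providers are assumptions for the example; the row keys (companyDomain, applyEmail) are the ones shown in the output sample.

```python
from typing import Optional

# Hypothetical, non-exhaustive list of personal mailbox providers to ignore.
GENERIC_DOMAINS = {"gmail.com", "outlook.com", "hotmail.com", "yahoo.com", "yahoo.co.uk"}


def org_domain_from_row(row: dict) -> Optional[str]:
    """Use companyDomain when set; otherwise derive a domain from applyEmail."""
    if row.get("companyDomain"):
        return row["companyDomain"]
    email = row.get("applyEmail") or ""
    _, sep, domain = email.rpartition("@")
    domain = domain.strip().lower()
    if not sep or not domain or domain in GENERIC_DOMAINS:
        return None
    return domain


sample = {"companyDomain": None, "applyEmail": "olivia.hodgson@waythrough.org.uk"}
print(org_domain_from_row(sample))  # waythrough.org.uk
```

A domain is only a hint at the organisation's name, so it is best treated as a grouping key rather than a display value.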

Can I scrape private pages or applicant data? No. Only the public WP-JSON REST API.

How do I limit results? Set maxItems. To fetch only the ~90 training courses, set entityTypes to ["training-course"].

🔎 Explore more scrapers

See other scrapers at memo23's Apify profile — covering job boards, real estate, social media, and more.


⚠️ Disclaimer

This Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by Doing Good Leeds, Voluntary Action Leeds (VAL), or any of their subsidiaries or affiliates. All trademarks mentioned are the property of their respective owners.

The scraper accesses only the publicly available WP-JSON REST endpoints and public detail pages on doinggoodleeds.org.uk — no authenticated endpoints, recruiter-only features, or content behind a login. Users are responsible for ensuring their use complies with doinggoodleeds.org.uk's Terms of Service, applicable data-protection law (GDPR, CCPA, etc.), and any contractual obligations of their own organisation.

