Advanced Glassdoor Scraper
This Actor is unavailable because the developer has decided to deprecate it.
The most advanced Glassdoor Scraper that you would ever need. Extract millions of companies, salaries, interviews, jobs, and reviews from Glassdoor. You can specify search terms, filters, list pages, and more! Extremely fast, with no limits. Super easy to use!
Glassdoor scraper
Since Glassdoor doesn't provide a good and free API, this actor should help you to retrieve data from it.
The Glassdoor data scraper supports the following features:
- Search any keyword - Search for any keyword and get the matching results. No limits!
- Get extensive company information - Get all information about a company right away!
- Looking for jobs? You are at the right place! - Retrieve all of a company's job information, with filter support. Extremely fast data retrieval!
- All detailed salary information is at your fingertips! - Salary information with detailed pay breakdowns.
- Company reviews! - Extended reviews, company information, ratings, and much more!
- Interviews - Retrieve all the interviews of a company. Simple to use!
Bugs, fixes, updates, and changelog
This scraper is under active development. If you have a feature request, you can create an issue for it.
Input Parameters
The input of this scraper should be JSON containing the list of pages on Glassdoor that should be visited. The supported fields are:
- search: (Optional) (String) Keyword that you want to search on Glassdoor.
- startUrls: (Optional) (Array) List of Glassdoor URLs. Each one should be a company, salary, interview, job, search, or any other listing URL.
- endPage: (Optional) (Number) The last page number that you want to scrape. The default is Infinite. This applies to all search requests and to each of the startUrls individually.
- maxItems: (Optional) (Number) Limits the number of scraped items. This is useful when you go through big lists or search results.
- useFingerprints: (Optional) (Boolean) Use the fingerprint generator to prevent detection and blocking of the scraper. If you are being blocked heavily, try disabling this option as well.
- httpHeaders: (Optional) (Object) Custom HTTP headers to be sent with each request. Leave empty to use default browser fingerprinting. Customizing headers can help prevent bot detection.
- proxy: (Required) (Proxy Object) Proxy configuration.
- extendOutputFunction: (Optional) (String) Function that takes a jQuery handle ($) as an argument and returns an object with data.
- customMapFunction: (Optional) (String) Function that takes each scraped object as an argument and returns the object produced by the function.
This solution requires proxy servers: either your own or Apify Proxy.
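As a rough illustration of how these fields fit together, the sketch below assembles an input object in Python. The helper name build_input and the search value "Elastic" are hypothetical and chosen only for this example; the field names come from the list above, and the only check applied is that proxy is present, since it is the one field marked as required.

```python
import json


def build_input(proxy, search=None, start_urls=None, end_page=None,
                max_items=None, use_fingerprints=None, http_headers=None):
    """Assemble an input object using the field names from the parameter list.

    build_input is a hypothetical helper for illustration; it is not part
    of the actor itself.
    """
    if not proxy:
        # proxy is the only field the parameter list marks as required.
        raise ValueError("proxy configuration is required")

    run_input = {"proxy": proxy}
    optional = {
        "search": search,
        "startUrls": start_urls,
        "endPage": end_page,
        "maxItems": max_items,
        "useFingerprints": use_fingerprints,
        "httpHeaders": http_headers,
    }
    # Include only the optional fields that were actually set.
    run_input.update({key: value for key, value in optional.items()
                      if value is not None})
    return run_input


example = build_input(
    proxy={"useApifyProxy": True},
    search="Elastic",  # illustrative keyword
    max_items=100,
    end_page=5,
)
print(json.dumps(example, indent=2))
```

Leaving unset fields out of the object lets the actor fall back to its own defaults, such as the Infinite default for endPage.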
Tip
When you want to scrape a specific list URL, just copy and paste the link as one of the startUrls.
If you would like to scrape only the first page of a list, provide the link for that page and set endPage to 1.
With the same approach you can also fetch any interval of pages. If you provide the 5th page of a list and set the endPage parameter to 6, you'll get the 5th and 6th pages only.
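The page-interval rule from the tip can be written down as a small worked example. This is only a sketch of the arithmetic described above; pages_scraped is a hypothetical helper, and returning an empty list when endPage is smaller than the starting page is an assumption, since the README does not say what the actor does in that case.

```python
def pages_scraped(start_page, end_page):
    """Return the page numbers covered when a list URL pointing at
    start_page is combined with the given endPage value."""
    if end_page < start_page:
        # Assumption: nothing is scraped if endPage precedes the start page.
        return []
    return list(range(start_page, end_page + 1))


print(pages_scraped(1, 1))  # [1]
print(pages_scraped(5, 6))  # [5, 6]
```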
Compute Unit Consumption
The actor is optimized to run fast and scrape as many items as possible, so it prioritizes the detail requests. If the actor is not blocked very often, it scrapes about 100 listings in 2 minutes using ~0.25-0.45 compute units.
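For budgeting, the figures above can be extrapolated to larger runs. The sketch below assumes that time and compute units scale linearly with the number of listings, which the README does not promise; blocking and retries would push the real numbers higher. The function name estimate_run is hypothetical.

```python
def estimate_run(listings, cu_per_100=(0.25, 0.45), minutes_per_100=2.0):
    """Linearly extrapolate the quoted figures (100 listings in about
    2 minutes for ~0.25-0.45 compute units) to a run of a given size."""
    batches = listings / 100
    return {
        "minutes": batches * minutes_per_100,
        "compute_units_low": batches * cu_per_100[0],
        "compute_units_high": batches * cu_per_100[1],
    }


estimate = estimate_run(1000)
print(estimate)
```

Under this linear assumption, 1,000 listings would take roughly 20 minutes and somewhere between 2.5 and 4.5 compute units.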
Glassdoor Scraper Input example
{
  "startUrls": [
    "https://www.glassdoor.com/Overview/Working-at-Elastic-EI_IE751551.11,18.htm",
    "https://www.glassdoor.com/Reviews/Elastic-Reviews-E751551.htm",
    "https://www.glassdoor.com/Reviews/Elastic-Reviews-E751551.htm?filter.iso3Language=eng&filter.employmentStatus=REGULAR&filter.employmentStatus=PART_TIME&filter.searchCategory=CULTURE",
    "https://www.glassdoor.com/Interview/Elastic-Interview-Questions-E751551.htm",
    "https://www.glassdoor.com/Interview/Elastic-Marketing-Assistant-Interview-Questions-EI_IE751551.0,7_KO8,27.htm#InterviewReview_76289952",
    "https://www.glassdoor.com/Job/elastic-jobs-SRCH_KO0,7.htm",
    "https://www.glassdoor.com/job-listing/federal-account-executive-nga-elastic-JV_IC1138213_KO0,29_KE30,37.htm?jl=1008611278488&pos=107&ao=1136043&s=58&guid=000001884d59c10aafb263d8cfe0037f&src=GD_JOB_AD&t=SR&vt=w&ea=1&cs=1_cd6d3492&cb=1684924907959&jobListingId=1008611278488&jrtk=3-0-1h16ljg9dk6fr801-1h16ljg9ti9j2800-c74621bdf8ed430d-&ctt=1684925008345",
    "https://www.glassdoor.com/Salary/Elastic-Software-Engineer-Salaries-E751551_D_KO8,25.htm"
  ],
  "httpHeaders": {
    "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
    "accept-language": "en-US,en;q=0.9",
    "cache-control": "no-cache",
    "pragma": "no-cache",
    "priority": "u=0, i",
    "sec-ch-ua": "\"Not/A)Brand\";v=\"8\", \"Chromium\";v=\"126\", \"Google Chrome\";v=\"126\"",
    "sec-ch-ua-mobile": "?0",
    "sec-ch-ua-platform": "\"Windows\"",
    "sec-fetch-dest": "document",
    "sec-fetch-mode": "navigate",
    "sec-fetch-site": "none",
    "sec-fetch-user": "?1",
    "upgrade-insecure-requests": "1",
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36"
  },
  "maxItems": 100,
  "endPage": 5,
  "useFingerprints": true,
  "proxy": {
    "useApifyProxy": true
  }
}
During the Run
During the run, the actor outputs messages that let you know what is going on. Each message contains a short label specifying which page from the provided list is currently being processed. When items are loaded from a page, you should see a message reporting the loaded item count and the total item count for that page.
If you provide incorrect input to the actor, it will immediately stop with a failure state and output an explanation of what is wrong.
Glassdoor Export
During the run, the actor stores results in a dataset. Each scraped record is stored as a separate item in the dataset.
You can manage the results in any language (Python, PHP, Node.js/npm). See the FAQ or our API reference to learn more about getting results from this Glassdoor actor.
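As one way of handling the results in Python, the sketch below groups exported items by their type field. It assumes the dataset has already been exported as a JSON array; the inline string stands in for that export and contains trimmed-down items that reuse values from the sample output in this README. The helper name group_by_type is hypothetical.

```python
import json
from collections import defaultdict


def group_by_type(items):
    """Group dataset items by their "type" field (job, review, salary, ...)."""
    groups = defaultdict(list)
    for item in items:
        # Items without a type are collected under "unknown".
        groups[item.get("type", "unknown")].append(item)
    return dict(groups)


# Stand-in for a JSON export of the dataset (assumed to be a JSON array).
exported = json.loads("""[
  {"type": "job", "title": "Federal Account Executive - NGA"},
  {"type": "salary", "title": "Senior Software Engineer", "base": 172000},
  {"type": "interview", "title": "Marketing Assistant"}
]""")

groups = group_by_type(exported)
for item_type, members in groups.items():
    print(item_type, len(members))
```

Because one run can mix companies, jobs, reviews, salaries, and interviews in a single dataset, splitting by type first keeps the later per-type processing simple.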
Scraped Glassdoor Properties
Each item scraped from Glassdoor has one of the following structures, depending on its type:
Job Detail
{
  "type": "job",
  "url": "https://www.glassdoor.com/job-listing/federal-account-executive-nga-elastic-JV_IC1138213_KO0,29_KE30,37.htm?jl=1008611278488&pos=107&ao=1136043&s=58&guid=000001884d59c10aafb263d8cfe0037f&src=GD_JOB_AD&t=SR&vt=w&ea=1&cs=1_cd6d3492&cb=1684924907959&jobListingId=1008611278488&jrtk=3-0-1h16ljg9dk6fr801-1h16ljg9ti9j2800-c74621bdf8ed430d-&ctt=1684925008345",
  "title": "Federal Account Executive - NGA",
  "body": "Elastic is a free and open search company that powers enterprise search, observability, and security solutions built on one technology stack that can be deployed anywhere. ",
  "bodyHTML": "<div><div><p>Elastic is a free and open search company that powers enterprise search, observability, and security solutions built on one technology stack that can be deployed anywhere.</p></div></div>",
  "jobLocation": {
    "country": "United States",
    "cityName": "Washington, DC",
    "stateName": "District of Columbia"
  },
  "company": {
    "id": 1827719,
    "website": "https://elastic.com",
    "name": "Elastic",
    "location": "Mountain View, CA",
    "size": "1001 to 5000 Employees",
    "type": "Company - Public",
    "sector": "Information Technology",
    "founded": 2012,
    "industry": "Enterprise Software & Network Solutions",
    "revenue": "$100 to $500 million (USD)",
    "ratings": [
      { "title": "Overall", "rating": 3.7 },
      { "title": "CEO", "rating": 0.65 },
      { "title": "CEO Ratings Count", "rating": 151 },
      { "title": "Recommend", "rating": 0.65 },
      { "title": "Career Opportunities", "rating": 3.4 },
      { "title": "Compensation & Benefits", "rating": 4.2 },
      { "title": "Culture & Values", "rating": 3.8 },
      { "title": "Senior Management", "rating": 3.2 },
      { "title": "Work/Life Balance", "rating": 4.1 }
    ]
  }
}
Review Detail
{
  "type": "review",
  "url": "https://www.glassdoor.com/Reviews/Employee-Review-Elastic-RVW76656731.htm",
  "title": "Company",
  "isCurrentEmployee": true,
  "date": "2023-05-22T05:45:39.153",
  "isAnonymousEmployee": false,
  "ratings": [
    { "title": "Work/Life Balance", "rating": 4 },
    { "title": "Overall", "rating": 5 },
    { "title": "Culture & Values", "rating": 4 },
    { "title": "Diversity & Inclusion", "rating": 4 },
    { "title": "Senior Management", "rating": 5 },
    { "title": "Career Opportunities", "rating": 4 },
    { "title": "Compensation and Benefits", "rating": 5 }
  ],
  "pros": "Excellent leadership, product, and company vision.",
  "cons": "We have been through many leadership transitions but are on the other side.",
  "reviews": [
    { "title": "Recommend", "isOK": true },
    { "title": "CEO Approval", "isOK": true },
    { "title": "Business Outlook", "isOK": true }
  ]
}
Salary Detail
{
  "type": "salary",
  "title": "Senior Software Engineer",
  "location": "Los Angeles, CA",
  "totalPay": {
    "lower": 172000,
    "upper": null
  },
  "base": 172000,
  "additional": 0,
  "stock": null,
  "yearsOfExperience": "15+ years",
  "submittedDate": "Dec 4, 2022"
}
Company Detail
{
  "type": "company",
  "url": "https://www.glassdoor.com/Overview/Working-at-Elastic-EI_IE751551.11,18.htm",
  "id": 751551,
  "website": "https://www.elastic.co",
  "logo": "https://media.glassdoor.com/sqls/751551/elastic-squarelogo-1559879227505.png",
  "name": "Elastic",
  "size": "1001 to 5000 Employees",
  "companyType": "Company - Public",
  "revenue": "$100 to $500 million (USD)",
  "stock": "ESTC",
  "headquarters": "San Francisco, CA",
  "founded": 2012,
  "industry": "Enterprise Software & Network Solutions",
  "competitors": [],
  "description": "At Elastic, we see endless possibility in a world of endless data. And we use the power of search to help people and organizations turn that possibility into results. \n\nElastic is the leading platform for search-powered solutions. We help organizations, their employees, and their customers accelerate the results that matter. With solutions in Enterprise Search, Observability, and Security, we help enhance customer and employee search experiences, keep mission-critical applications running smoothly, and protect against cyber threats.\n\nDelivered wherever data lives, in one cloud, across many clouds, or on-prem, Elastic enables organizations worldwide to use the power of Elastic, including Netflix, Uber, BBC, Microsoft, and thousands of others.\n\nElastic was built on a foundation of being free and open, which trickles down to how we work. We’re a distributed organization and have been from the beginning. Being distributed isn’t just a way of doing business—it’s a mindset that is at the core of our culture. \n\nElastic is publicly traded on the NYSE under the symbol ESTC. Learn more at elastic.co.",
  "mission": "We help people around the world do great things with their data. \nOur products are extending what's possible with data, and deliver on the promise that good things come from connecting the dots.",
  "awards": [
    {
      "name": "100 Large Best Places to Work New York City",
      "year": 2023,
      "awardedBy": "Builtin"
    }
  ],
  "ratings": [
    { "title": "Overall", "rating": 3.8 },
    { "title": "CEO", "rating": 0.67 },
    { "title": "CEO Ratings Count", "rating": 210 },
    { "title": "Positive Business Outlook", "rating": 0.67 },
    { "title": "Recommend", "rating": 0.71 },
    { "title": "Career Opportunities", "rating": 3.5 },
    { "title": "Compensation & Benefits", "rating": 4.3 },
    { "title": "Culture & Values", "rating": 4 },
    { "title": "Senior Management", "rating": 3.3 },
    { "title": "Work/Life Balance", "rating": 4.1 },
    { "title": "Review Count", "rating": 680 },
    { "title": "Diversity & Inclusion", "rating": 4.1 }
  ]
}
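The ratings field in the job, review, and company items is a list of title/rating pairs, which is awkward to query directly. A minimal sketch of flattening it into a dictionary follows; ratings_to_dict is a hypothetical helper, and the sample values are copied from the company item in this README.

```python
def ratings_to_dict(item):
    """Turn the list of {"title", "rating"} pairs into a title -> rating map."""
    return {entry["title"]: entry["rating"] for entry in item.get("ratings", [])}


# A few entries copied from the company sample in this README.
company = {
    "type": "company",
    "name": "Elastic",
    "ratings": [
        {"title": "Overall", "rating": 3.8},
        {"title": "Recommend", "rating": 0.71},
        {"title": "Review Count", "rating": 680},
    ],
}

ratings = ratings_to_dict(company)
print(ratings["Overall"])
```

Judging only from the sample values, the entries appear to mix different scales: star-style scores such as Overall, fractions between 0 and 1 such as Recommend, and plain counts such as Review Count. Code that compares them should treat each title separately. For job items the list sits under the nested company object, so pass item["company"] instead.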
Interview Detail
{
  "type": "interview",
  "url": "https://www.glassdoor.com/Interview/Elastic-Marketing-Assistant-Interview-Questions-EI_IE751551.0,7_KO8,27.htm#InterviewReview_76289952",
  "reviewDate": "2023-05-10T13:12:11.153",
  "id": 76289952,
  "title": "Marketing Assistant",
  "source": null,
  "location": "Switzerland",
  "interviewDate": null,
  "process": "The interview process is very organised and everyone involved is very positive and supportive. In case you do not get the position you applied for, they give very detailed feedback.",
  "questions": [
    "Tell us about your background and why you would be a good fit for this role?"
  ],
  "reviews": [
    { "title": "Offer", "result": "NO_OFFER" },
    { "title": "Experience", "result": "POSITIVE" },
    { "title": "Interview", "result": "AVERAGE" }
  ]
}
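The samples above use two different date formats: ISO-style timestamps in the review date and interview reviewDate fields, and a short English form ("Dec 4, 2022") in the salary submittedDate field. Some date fields can also be null, as interviewDate is here. A hedged sketch of normalizing all three cases in Python follows; parse_scraped_date is a hypothetical helper, and it only covers the formats visible in these samples.

```python
from datetime import datetime


def parse_scraped_date(value):
    """Parse the date formats seen in the sample items, or return None."""
    if value is None:
        # Fields such as interviewDate may be null.
        return None
    try:
        # ISO-style timestamps, e.g. "2023-05-22T05:45:39.153".
        return datetime.fromisoformat(value)
    except ValueError:
        # Short form, e.g. "Dec 4, 2022". %b assumes English month
        # abbreviations, which holds for the default C locale.
        return datetime.strptime(value, "%b %d, %Y")


print(parse_scraped_date("2023-05-22T05:45:39.153"))
print(parse_scraped_date("Dec 4, 2022"))
print(parse_scraped_date(None))
```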
Contact
Please visit us at epctex.com to see all the products that are available for you. If you are looking for a custom integration, please reach out to us through the chat box on epctex.com. In need of support? devops@epctex.com is at your service.