Advanced Glassdoor Scraper

Deprecated
This Actor is deprecated

This Actor is unavailable because the developer has decided to deprecate it.

epctex/advanced-glassdoor-scraper

The most advanced Glassdoor Scraper that you would ever need. Extract millions of companies, salaries, interviews, jobs, and reviews from Glassdoor. You can specify search terms, filters, list pages, and more! Extremely fast, with no limits. Super easy to use!

Actor - Glassdoor Scraper

Glassdoor scraper

Since Glassdoor doesn't provide a good, free API, this actor helps you retrieve data from it.

The Glassdoor data scraper supports the following features:

  • Search any keyword - Search for any keyword you like and get the results. No limits!

  • Get extensive company information - Get all information about a company right away!

  • Looking for jobs? You are at the right place! - Retrieve all the job information of a company supported by filters. Extremely fast data retrieval!

  • All detailed salary information is at your fingertips! - Salary information, including the detailed pay breakdown, is at your service!

  • Company reviews! - All the extended reviews, company information, ratings, and much more!

  • Interviews - Retrieve all the interviews of a company. Super simple, easy usage!

Bugs, fixes, updates, and changelog

This scraper is under active development. If you have any feature requests, you can create an issue.

Input Parameters

The input of this scraper should be JSON containing the list of pages on Glassdoor that should be visited. The supported fields are:

  • search: (Optional) (String) Keyword that you want to search on Glassdoor.

  • startUrls: (Optional) (Array) List of Glassdoor URLs. It should be a company, salary, interview, job, search, or any listing URL.

  • endPage: (Optional) (Number) The last page number that you want to scrape. The default is Infinite. This applies to all search requests and startUrls individually.

  • maxItems: (Optional) (Number) Limits the number of scraped items. This is useful when you work through big lists or search results.

  • useFingerprints: (Optional) (Boolean) Use the fingerprint generator to prevent detection and blocking of the scraper. If you are being blocked frequently, try disabling this option as well.

  • httpHeaders: (Optional) (Object) Custom HTTP headers to be sent with each request. Leave empty to use default browser fingerprinting. Customizing headers can help prevent bot detection.

  • proxy: (Required) (Proxy Object) Proxy configuration.

  • extendOutputFunction: (Optional) (String) Function that takes a jQuery handle ($) as an argument and returns an object with data.

  • customMapFunction: (Optional) (String) Function that takes each scraped object as an argument and returns the object that results from executing the function.

This solution requires proxy servers. You can use either your own proxy servers or Apify Proxy.
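The field list above can be turned into a quick pre-flight check. The following Python sketch is only an illustration: the helper name `validate_input` and its error messages are assumptions made for this example, and the Actor's own validation may behave differently.

```python
# Hypothetical helper (not part of the Actor): checks an input dict
# against the field types listed in the documentation above.
EXPECTED_TYPES = {
    "search": str,
    "startUrls": list,
    "endPage": int,
    "maxItems": int,
    "useFingerprints": bool,
    "httpHeaders": dict,
    "proxy": dict,
    "extendOutputFunction": str,
    "customMapFunction": str,
}


def validate_input(run_input: dict) -> list:
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []
    # "proxy" is the only field documented as required.
    if "proxy" not in run_input:
        problems.append("proxy is required")
    for key, value in run_input.items():
        expected = EXPECTED_TYPES.get(key)
        if expected is None:
            problems.append(f"unknown field: {key}")
        # bool is a subclass of int in Python, so reject it for numeric fields.
        elif not isinstance(value, expected) or (
            expected is int and isinstance(value, bool)
        ):
            problems.append(f"{key} should be of type {expected.__name__}")
    return problems


print(validate_input({"search": "elastic", "proxy": {"useApifyProxy": True}}))
print(validate_input({"search": "elastic"}))
```

The first call reports no problems; the second reports the missing proxy configuration.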

Tip

When you want to scrape a specific list URL, just copy and paste the link as one of the startUrls.

If you would like to scrape only the first page of a list, provide the link for that page and set endPage to 1.

With the same approach, you can also fetch any interval of pages. If you provide the 5th page of a list and set the endPage parameter to 6, you'll get only the 5th and 6th pages.
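The page-interval rule from the tip can be written out as a small calculation. This is a sketch of the documented behavior only; the function name `pages_to_scrape` is hypothetical, and returning no pages when endPage is lower than the starting page is an assumption, not something the documentation states.

```python
def pages_to_scrape(start_page: int, end_page: int) -> list:
    """Pages covered when the start URL points at start_page and endPage is end_page."""
    # Assumption: an endPage below the starting page yields nothing.
    if end_page < start_page:
        return []
    return list(range(start_page, end_page + 1))


print(pages_to_scrape(5, 6))  # [5, 6]
print(pages_to_scrape(1, 1))  # [1]
```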

Compute Unit Consumption

The actor is optimized to run fast and scrape as many items as possible, so it prioritizes the detail requests. If the actor isn't blocked very often, it scrapes 100 listings in about 2 minutes using ~0.25-0.45 compute units.
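For rough budgeting, the figure above can be extrapolated to larger runs. The sketch below assumes that consumption scales linearly with the number of listings, which the documentation does not promise; blocking and retries can push real usage higher. The function name is hypothetical.

```python
def estimate_compute_units(listings: int) -> tuple:
    """Scale the documented ~0.25-0.45 CU per 100 listings linearly (an assumption)."""
    low_per_100, high_per_100 = 0.25, 0.45
    factor = listings / 100
    return (round(low_per_100 * factor, 2), round(high_per_100 * factor, 2))


print(estimate_compute_units(1000))  # (2.5, 4.5)
```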

Glassdoor Scraper Input example

{
  "startUrls": [
    "https://www.glassdoor.com/Overview/Working-at-Elastic-EI_IE751551.11,18.htm",
    "https://www.glassdoor.com/Reviews/Elastic-Reviews-E751551.htm",
    "https://www.glassdoor.com/Reviews/Elastic-Reviews-E751551.htm?filter.iso3Language=eng&filter.employmentStatus=REGULAR&filter.employmentStatus=PART_TIME&filter.searchCategory=CULTURE",
    "https://www.glassdoor.com/Interview/Elastic-Interview-Questions-E751551.htm",
    "https://www.glassdoor.com/Interview/Elastic-Marketing-Assistant-Interview-Questions-EI_IE751551.0,7_KO8,27.htm#InterviewReview_76289952",
    "https://www.glassdoor.com/Job/elastic-jobs-SRCH_KO0,7.htm",
    "https://www.glassdoor.com/job-listing/federal-account-executive-nga-elastic-JV_IC1138213_KO0,29_KE30,37.htm?jl=1008611278488&pos=107&ao=1136043&s=58&guid=000001884d59c10aafb263d8cfe0037f&src=GD_JOB_AD&t=SR&vt=w&ea=1&cs=1_cd6d3492&cb=1684924907959&jobListingId=1008611278488&jrtk=3-0-1h16ljg9dk6fr801-1h16ljg9ti9j2800-c74621bdf8ed430d-&ctt=1684925008345",
    "https://www.glassdoor.com/Salary/Elastic-Software-Engineer-Salaries-E751551_D_KO8,25.htm"
  ],
  "httpHeaders": {
    "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
    "accept-language": "en-US,en;q=0.9",
    "cache-control": "no-cache",
    "pragma": "no-cache",
    "priority": "u=0, i",
    "sec-ch-ua": "\"Not/A)Brand\";v=\"8\", \"Chromium\";v=\"126\", \"Google Chrome\";v=\"126\"",
    "sec-ch-ua-mobile": "?0",
    "sec-ch-ua-platform": "\"Windows\"",
    "sec-fetch-dest": "document",
    "sec-fetch-mode": "navigate",
    "sec-fetch-site": "none",
    "sec-fetch-user": "?1",
    "upgrade-insecure-requests": "1",
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36"
  },
  "maxItems": 100,
  "endPage": 5,
  "useFingerprints": true,
  "proxy": {
    "useApifyProxy": true
  }
}

During the Run

During the run, the actor outputs messages that let you know what is going on. Each message contains a short label specifying which page from the provided list is currently being processed. When items are loaded from a page, you should see a message with the loaded item count and the total item count for that page.

If you provide incorrect input to the actor, it will immediately stop with a failure state and output an explanation of what is wrong.

Glassdoor Export

During the run, the actor stores results in a dataset. Each scraped record is a separate item in the dataset.

You can manage the results in any language (Python, PHP, Node.js/npm). See the FAQ or our API reference to learn more about getting results from this Glassdoor actor.

Scraped Glassdoor Properties

The structure of each item depends on its type. Examples of each type are shown below:

Job Detail

{
    "type": "job",
    "url": "https://www.glassdoor.com/job-listing/federal-account-executive-nga-elastic-JV_IC1138213_KO0,29_KE30,37.htm?jl=1008611278488&pos=107&ao=1136043&s=58&guid=000001884d59c10aafb263d8cfe0037f&src=GD_JOB_AD&t=SR&vt=w&ea=1&cs=1_cd6d3492&cb=1684924907959&jobListingId=1008611278488&jrtk=3-0-1h16ljg9dk6fr801-1h16ljg9ti9j2800-c74621bdf8ed430d-&ctt=1684925008345",
    "title": "Federal Account Executive - NGA",
    "body": "Elastic is a free and open search company that powers enterprise search, observability, and security solutions built on one technology stack that can be deployed anywhere. ",
    "bodyHTML": "<div><div><p>Elastic is a free and open search company that powers enterprise search, observability, and security solutions built on one technology stack that can be deployed anywhere.</p></div></div>",
    "jobLocation": {
        "country": "United States",
        "cityName": "Washington, DC",
        "stateName": "District of Columbia"
    },
    "company": {
        "id": 1827719,
        "website": "https://elastic.com",
        "name": "Elastic",
        "location": "Mountain View, CA",
        "size": "1001 to 5000 Employees",
        "type": "Company - Public",
        "sector": "Information Technology",
        "founded": 2012,
        "industry": "Enterprise Software & Network Solutions",
        "revenue": "$100 to $500 million (USD)",
        "ratings": [
            {
                "title": "Overall",
                "rating": 3.7
            },
            {
                "title": "CEO",
                "rating": 0.65
            },
            {
                "title": "CEO Ratings Count",
                "rating": 151
            },
            {
                "title": "Recommend",
                "rating": 0.65
            },
            {
                "title": "Career Opportunities",
                "rating": 3.4
            },
            {
                "title": "Compensation & Benefits",
                "rating": 4.2
            },
            {
                "title": "Culture & Values",
                "rating": 3.8
            },
            {
                "title": "Senior Management",
                "rating": 3.2
            },
            {
                "title": "Work/Life Balance",
                "rating": 4.1
            }
        ]
    }
}

Review Detail

{
    "type": "review",
    "url": "https://www.glassdoor.com/Reviews/Employee-Review-Elastic-RVW76656731.htm",
    "title": "Company",
    "isCurrentEmployee": true,
    "date": "2023-05-22T05:45:39.153",
    "isAnonymousEmployee": false,
    "ratings": [
        {
            "title": "Work/Life Balance",
            "rating": 4
        },
        {
            "title": "Overall",
            "rating": 5
        },
        {
            "title": "Culture & Values",
            "rating": 4
        },
        {
            "title": "Diversity & Inclusion",
            "rating": 4
        },
        {
            "title": "Senior Management",
            "rating": 5
        },
        {
            "title": "Career Opportunities",
            "rating": 4
        },
        {
            "title": "Compensation and Benefits",
            "rating": 5
        }
    ],
    "pros": "Excellent leadership, product, and company vision.",
    "cons": "We have been through many leadership transitions but are on the other side.",
    "reviews": [
        {
            "title": "Recommend",
            "isOK": true
        },
        {
            "title": "CEO Approval",
            "isOK": true
        },
        {
            "title": "Business Outlook",
            "isOK": true
        }
    ]
}
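A review item like the one above can be condensed for analysis. The Python sketch below parses the `date` string and turns the `reviews` list into a title-to-flag mapping; the helper name `summarize_review` is hypothetical, and the field names are taken from the sample only, so other reviews may carry different or missing fields.

```python
from datetime import datetime


def summarize_review(item: dict) -> dict:
    """Parse the review date and flatten the yes/no flags into a dict."""
    return {
        # The sample date is an ISO 8601 string without a timezone.
        "date": datetime.fromisoformat(item["date"]),
        "flags": {entry["title"]: entry["isOK"] for entry in item.get("reviews", [])},
    }


sample = {
    "type": "review",
    "date": "2023-05-22T05:45:39.153",
    "reviews": [
        {"title": "Recommend", "isOK": True},
        {"title": "CEO Approval", "isOK": True},
    ],
}
summary = summarize_review(sample)
print(summary["date"].year, summary["flags"]["Recommend"])
```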

Salary Detail

{
    "type": "salary",
    "title": "Senior Software Engineer",
    "location": "Los Angeles, CA",
    "totalPay": {
        "lower": 172000,
        "upper": null
    },
    "base": 172000,
    "additional": 0,
    "stock": null,
    "yearsOfExperience": "15+ years",
    "submittedDate": "Dec 4, 2022"
}
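Salary items mix nullable numbers with a human-readable date, so a little normalization helps before analysis. The sketch below is illustrative: both helper names are hypothetical, and treating a null `upper` as "same as `lower`" is an assumption about what the scraper means, not documented behavior.

```python
from datetime import datetime


def parse_submitted_date(value: str) -> datetime:
    """Parse dates formatted like "Dec 4, 2022" (month names assume an English locale)."""
    return datetime.strptime(value, "%b %d, %Y")


def total_pay_range(item: dict) -> tuple:
    """Return (lower, upper) total pay for a salary item."""
    lower = item["totalPay"]["lower"]
    upper = item["totalPay"]["upper"]
    # Assumption: a missing upper bound means a single reported value.
    if upper is None:
        upper = lower
    return (lower, upper)


salary = {
    "type": "salary",
    "totalPay": {"lower": 172000, "upper": None},
    "submittedDate": "Dec 4, 2022",
}
print(total_pay_range(salary), parse_submitted_date(salary["submittedDate"]).date())
```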

Company Detail

{
    "type": "company",
    "url": "https://www.glassdoor.com/Overview/Working-at-Elastic-EI_IE751551.11,18.htm",
    "id": 751551,
    "website": "https://www.elastic.co",
    "logo": "https://media.glassdoor.com/sqls/751551/elastic-squarelogo-1559879227505.png",
    "name": "Elastic",
    "size": "1001 to 5000 Employees",
    "companyType": "Company - Public",
    "revenue": "$100 to $500 million (USD)",
    "stock": "ESTC",
    "headquarters": "San Francisco, CA",
    "founded": 2012,
    "industry": "Enterprise Software & Network Solutions",
    "competitors": [],
    "description": "At Elastic, we see endless possibility in a world of endless data. And we use the power of search to help people and organizations turn that possibility into results. \n\nElastic is the leading platform for search-powered solutions. We help organizations, their employees, and their customers accelerate the results that matter. With solutions in Enterprise Search, Observability, and Security, we help enhance customer and employee search experiences, keep mission-critical applications running smoothly, and protect against cyber threats.\n\nDelivered wherever data lives, in one cloud, across many clouds, or on-prem, Elastic enables organizations worldwide to use the power of Elastic, including Netflix, Uber, BBC, Microsoft, and thousands of others.\n\nElastic was built on a foundation of being free and open, which trickles down to how we work. We’re a distributed organization and have been from the beginning. Being distributed isn’t just a way of doing business—it’s a mindset that is at the core of our culture. \n\nElastic is publicly traded on the NYSE under the symbol ESTC. Learn more at elastic.co.",
    "mission": "We help people around the world do great things with their data. Our products are extending what's possible with data, and deliver on the promise that good things come from connecting the dots.",
    "awards": [
        {
            "name": "100 Large Best Places to Work New York City",
            "year": 2023,
            "awardedBy": "Builtin"
        }
    ],
    "ratings": [
        {
            "title": "Overall",
            "rating": 3.8
        },
        {
            "title": "CEO",
            "rating": 0.67
        },
        {
            "title": "CEO Ratings Count",
            "rating": 210
        },
        {
            "title": "Positive Business Outlook",
            "rating": 0.67
        },
        {
            "title": "Recommend",
            "rating": 0.71
        },
        {
            "title": "Career Opportunities",
            "rating": 3.5
        },
        {
            "title": "Compensation & Benefits",
            "rating": 4.3
        },
        {
            "title": "Culture & Values",
            "rating": 4
        },
        {
            "title": "Senior Management",
            "rating": 3.3
        },
        {
            "title": "Work/Life Balance",
            "rating": 4.1
        },
        {
            "title": "Review Count",
            "rating": 680
        },
        {
            "title": "Diversity & Inclusion",
            "rating": 4.1
        }
    ]
}
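The `ratings` array is easier to query once it is keyed by title. The sketch below flattens it into a dictionary; the helper name `ratings_by_title` is hypothetical. Note that, judging by the sample, the values are on mixed scales (star ratings such as 3.8, fractions such as 0.67, and plain counts such as 680), so they should not be averaged together.

```python
def ratings_by_title(item: dict) -> dict:
    """Map each rating title to its value; items without ratings give an empty dict."""
    return {entry["title"]: entry["rating"] for entry in item.get("ratings", [])}


company = {
    "type": "company",
    "name": "Elastic",
    "ratings": [
        {"title": "Overall", "rating": 3.8},
        {"title": "Recommend", "rating": 0.71},
        {"title": "Review Count", "rating": 680},
    ],
}
ratings = ratings_by_title(company)
print(ratings["Overall"], ratings["Review Count"])
```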

Interview Detail

{
    "type": "interview",
    "url": "https://www.glassdoor.com/Interview/Elastic-Marketing-Assistant-Interview-Questions-EI_IE751551.0,7_KO8,27.htm#InterviewReview_76289952",
    "reviewDate": "2023-05-10T13:12:11.153",
    "id": 76289952,
    "title": "Marketing Assistant",
    "source": null,
    "location": "Switzerland",
    "interviewDate": null,
    "process": "The interview process is very organised and everyone involved is very positive and supportive. In case you do not get the position you applied for, they give very detailed feedback.",
    "questions": [
        "Tell us about your background and why you would be a good fit for this role?"
    ],
    "reviews": [
        {
            "title": "Offer",
            "result": "NO_OFFER"
        },
        {
            "title": "Experience",
            "result": "POSITIVE"
        },
        {
            "title": "Interview",
            "result": "AVERAGE"
        }
    ]
}
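Because a single run can mix jobs, reviews, salaries, companies, and interviews in one dataset, a common first step is to split the exported items by their `type` field. The following is a minimal sketch that assumes the items have already been loaded into a Python list (for example from a JSON export); the helper name `group_by_type` and the "unknown" fallback are choices made for this example.

```python
from collections import defaultdict


def group_by_type(items: list) -> dict:
    """Group dataset items by their "type" field."""
    grouped = defaultdict(list)
    for item in items:
        # Items without a type are kept under "unknown" rather than dropped.
        grouped[item.get("type", "unknown")].append(item)
    return dict(grouped)


items = [
    {"type": "job", "title": "Federal Account Executive - NGA"},
    {"type": "interview", "title": "Marketing Assistant"},
    {"type": "job", "title": "Another listing"},
]
grouped = group_by_type(items)
print({kind: len(entries) for kind, entries in grouped.items()})
```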

Contact

Please visit us at epctex.com to see all the products that are available for you. If you are looking for a custom integration, please reach out to us through the chat box on epctex.com. In need of support? devops@epctex.com is at your service.

Developer
Maintained by Community