PHAPI - Extract video search results from Pornhub
Pay $5.00 for 1,000 results
A scraper for the popular adult entertainment platform Pornhub. Supports search queries and any user-configurable filters. Download your data as HTML table, JSON, CSV, XML, Excel, RSS, or JSONL.
PHAPI
The PornHub API, or phapi for short, is a web scraper for the popular adult entertainment website PornHub.
It has built-in support for scraping any search, bypassing most of the platform's anti-scraping measures.
While the proxy is configurable, don't disable it: Apify's datacenters appear to be located in Virginia, where PH's services are currently completely blocked.
Apart from this, there's no need to use any of the specialized proxies. Apify's default datacenter proxies work well, and the scraper handles rotation automatically (e.g. if one of the URLs is blocked, it switches to a new one).
Usage
The only option the scraper needs is the search query. All others, e.g. the sorting or the number of pages, are optional, and their defaults are explained in the input configuration.
(Note: the default values mirror PH's defaults)
Queries are automatically escaped in the same way that PH escapes them, so you can pass in any query as-is.
Output
The output is a dataset with the following schema (using zod):
```typescript
import { z } from "zod";

const UserInformationShortSchema = z.object({
  type: z.string(),
  slug: z.string(),
  name: z.string(),
  profilePicture: z.string().optional(),
  isContentPartner: z.boolean(),
  isVerified: z.boolean(),
  isPremium: z.boolean(),
  isAwardsWinner: z.boolean(),
});

const ResultCountSchema = z.object({
  from: z.number(),
  to: z.number(),
  total: z.number(),
});

const SearchCategorySchema = z.object({
  title: z.string(),
  slug: z.string(),
  id: z.number(),
  count: z.number(), // Number of videos in this search that fall under this category
  video_count: z.number(), // Total number of videos in this category
});

const VideoPreviewSchema = z.object({
  videoId: z.string(), // Not sure if this is unique
  segment: z.string(),
  viewKey: z.string(), // The id used to view the video, e.g. .../view_video.php?viewkey=...
  title: z.string().optional(),
  thumbnail: z.string().optional(),
  duration: z.object({
    hours: z.number(),
    minutes: z.number(),
    seconds: z.number(),
  }),
  uploader: UserInformationShortSchema,
  viewCount: z.number(),
  rating: z.number(),
});

const SearchOutputSchema = z.object({
  page: z.number(),
  video: VideoPreviewSchema,
  resultCount: ResultCountSchema,
  correctionSuggestion: z.string().optional(),
  categories: z.array(SearchCategorySchema),
});
```
Most of the properties should be self-explanatory, but here's a quick rundown:
The result count covers the results on a page (from, to) and the total. The correction suggestion (if it is present and not an empty string) is the suggestion that PH gives if the search query is misspelled, etc. The categories are the ones shown on the left side of the search page, so all categories filtered by the search query, together with the number of search results that fall under each category.
Each video comes with a short version of the uploader's information, containing the type (e.g. model, pornstar, user or channel), the name, the slug, and the profile picture (if available).
If the uploader is a model or pornstar, they can have additional information, such as whether they are verified, premium, or an awards winner. The content-partner status only seems to apply to channels; if this changes on PH's side, it will be passed through as well.
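As a rough illustration of how a consumer might read these fields, here is a minimal sketch. The interfaces only mirror a subset of the schema above, and `durationToSeconds` and `summarize` are hypothetical helpers, not part of the scraper.

```typescript
// Plain TypeScript mirror of a subset of the output schema (no zod dependency).
interface Duration {
  hours: number;
  minutes: number;
  seconds: number;
}

interface VideoPreview {
  viewKey: string;
  title?: string;
  duration: Duration;
  uploader: { type: string; name: string; isVerified: boolean };
  viewCount: number;
}

// Hypothetical helper: collapse the structured duration into total seconds.
function durationToSeconds(d: Duration): number {
  return d.hours * 3600 + d.minutes * 60 + d.seconds;
}

// Hypothetical helper: one-line summary of a video, e.g. for logging.
function summarize(video: VideoPreview): string {
  const title = video.title ?? "(untitled)"; // title is optional in the schema
  const badge = video.uploader.isVerified ? " [verified]" : "";
  const seconds = durationToSeconds(video.duration);
  return `${title} by ${video.uploader.name}${badge}, ${seconds}s`;
}
```

For instance, a duration of 1 hour, 2 minutes and 3 seconds comes out as 3723 seconds.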
What are results and what does the pricing mean?
Each result corresponds to a single video. Since there are multiple videos per page, the other properties of the result are the same for all videos on that page.
Each page usually contains between 32 and 44 videos (this applies to the first and all subsequent pages; the last one can obviously have fewer), so if you set the number of pages to 5, you should get around 209 videos.
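Because the page-level properties repeat on every result, it can be handy to regroup the flat dataset by page. The following is a minimal sketch; `groupByPage` is a hypothetical helper, and only the `page` and `video` field names come from the schema.

```typescript
// Minimal shape of one dataset item; V stands for whatever video type you use.
interface SearchResult<V> {
  page: number;
  video: V;
}

// Hypothetical helper: collect the videos of each page under its page number.
function groupByPage<V>(results: SearchResult<V>[]): Map<number, V[]> {
  const pages = new Map<number, V[]>();
  for (const result of results) {
    const videos = pages.get(result.page) ?? [];
    videos.push(result.video);
    pages.set(result.page, videos);
  }
  return pages;
}
```

Shared fields such as `resultCount` or `categories` can then be read once from any result of a page instead of once per video.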
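To get a feeling for what a run might cost, the numbers above can be turned into a quick estimate. This is only a sketch: it assumes every page holds between 32 and 44 videos (ignoring a shorter last page) and the listed price of $5.00 per 1,000 results; `estimate` is a hypothetical helper.

```typescript
// Figures taken from the text above; treat them as approximate.
const MIN_PER_PAGE = 32;
const MAX_PER_PAGE = 44;
const USD_PER_1000_RESULTS = 5;

interface Estimate {
  minResults: number;
  maxResults: number;
  minCostUsd: number;
  maxCostUsd: number;
}

// Hypothetical helper: bounds on result count and cost for a number of pages.
function estimate(pages: number): Estimate {
  const cost = (results: number) => (results * USD_PER_1000_RESULTS) / 1000;
  const minResults = pages * MIN_PER_PAGE;
  const maxResults = pages * MAX_PER_PAGE;
  return {
    minResults,
    maxResults,
    minCostUsd: cost(minResults),
    maxCostUsd: cost(maxResults),
  };
}
```

For 5 pages this gives 160 to 220 results, i.e. roughly $0.80 to $1.10.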
Example - correction suggestion
If you want to build something that takes a user's query, corrects it, and then scrapes the corrected query, set the number of pages to 1, check whether the correction suggestion is present and not an empty string, and if so use it as the new query.
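The flow could look roughly like the sketch below. How you actually start a run is left open: `search` is a hypothetical function you supply (e.g. a wrapper around your own call to the scraper) that returns the dataset items for a query and a page count. Only the `correctionSuggestion` field name comes from the schema.

```typescript
// Only the field this flow needs; real items carry the full output schema.
interface SearchItem {
  correctionSuggestion?: string;
}

// Hypothetical signature: runs the scraper and resolves with its dataset items.
type SearchFn<T extends SearchItem> = (query: string, pages: number) => Promise<T[]>;

async function searchWithCorrection<T extends SearchItem>(
  search: SearchFn<T>,
  query: string,
  pages: number,
): Promise<{ query: string; results: T[] }> {
  // Step 1: probe a single page to see whether PH suggests a correction.
  const probe = await search(query, 1);
  const suggestion = probe.find((item) => item.correctionSuggestion)?.correctionSuggestion;

  // Step 2: use the suggestion only if it is present and non-empty.
  const finalQuery = suggestion && suggestion !== query ? suggestion : query;

  // Reuse the probe when nothing changed and one page was all that was asked for.
  if (finalQuery === query && pages === 1) {
    return { query, results: probe };
  }

  // Step 3: run the real search with the (possibly corrected) query.
  return { query: finalQuery, results: await search(finalQuery, pages) };
}
```

Note that the probe page is scraped in addition to the final run, so its results count towards the total as well.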
Actor Metrics
8 monthly users
1 star
>99% runs succeeded
Created in Aug 2024
Modified 4 months ago