PaperbackSwap Scraper
3 days trial then $20.00/month - No credit card required now

epctex/paperbackswap-scraper
Get information on books right away from PaperBack Swap! Title, prices, rating, ISBN13, ISBN10, publisher, and a lot more are waiting for you. Use any filter, search anything, and retrieve your results without any limits. Get the data as JSON, XML, Excel, CSV, or many other formats. Super fast!

Actor - PaperBack Swap Scraper

PaperBack Swap scraper

Since PaperBack Swap doesn't provide a good and free API, this actor should help you to retrieve data from it.

The PaperBack Swap data scraper supports the following features:

  • Search any keyword - Search any keyword, find the books you are looking for. Retrieve everything right away!

  • Scrape books - Name, description, ISBNs, prices, cover image, author, publisher, and many other things are at your fingertips.

  • Get comments on any book - Get all the comments and ratings for any book.

  • Retrieve books with certain tags - If you are looking for all the books with certain tags, you are in the right place. Just provide the tag URL and the actor will do the rest.

  • Fetch book awards - Get all the awarded books right away!

  • Scrape books of an author - All the books that have been written by an author can be easily scraped.

Bugs, fixes, updates, and changelog

This scraper is under active development. If you have a feature request, you can create an issue.

Input Parameters

The input of this scraper should be JSON containing the list of pages on PaperBack Swap that should be visited. Possible fields are:

  • search: (Optional) (String) Keyword that you want to search on PaperBack Swap.

  • startUrls: (Optional) (Array) List of PaperBack Swap URLs. You should only provide awards, tags, author, or book detail URLs.

  • includeReviews: (Optional) (Boolean) This will add all the reviews that PaperBack Swap provides into the detail objects. Please keep in mind that the time and resources the actor uses will increase proportionally by the number of reviews.

  • endPage: (Optional) (Number) The last page number that you want to scrape. The default is no limit. This applies to each search request and each entry in startUrls individually.

  • maxItems: (Optional) (Number) Maximum number of items to scrape. This is useful when you scrape large lists or search results.

  • proxy: (Required) (Proxy Object) Proxy configuration.

  • extendOutputFunction: (Optional) (String) Function that takes a jQuery handle ($) as an argument and returns an object with data.

  • customMapFunction: (Optional) (String) Function that takes each scraped object as an argument and returns the object produced by running the function on it.
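
Both function fields are passed to the actor as strings. The following is a minimal sketch of how such an input could be assembled, assuming the functions are written as JavaScript arrow functions (which the jQuery handle suggests). The selector `title` and the keys `pageTitle` and `source` are illustrative and do not come from the actor's documentation.

```python
import json

# Hypothetical JavaScript bodies, passed to the actor as plain strings.
# Assumption: the actor evaluates them as arrow functions.
EXTEND_OUTPUT_FUNCTION = """($) => {
    return { pageTitle: $('title').text().trim() };
}"""

CUSTOM_MAP_FUNCTION = """(object) => {
    return { ...object, source: 'paperbackswap' };
}"""


def with_custom_functions(run_input: dict) -> dict:
    """Return a copy of run_input with both function fields attached."""
    extended = dict(run_input)
    extended["extendOutputFunction"] = EXTEND_OUTPUT_FUNCTION
    extended["customMapFunction"] = CUSTOM_MAP_FUNCTION
    return extended


actor_input = with_custom_functions({
    "search": "harry potter",
    "maxItems": 10,
    "proxy": {"useApifyProxy": True},
})
print(json.dumps(actor_input, indent=2))
```

Keeping the JavaScript in named constants makes it easier to edit and test the input without touching the rest of the configuration.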

This actor requires proxy servers: either your own or Apify Proxy.

Tip

When you want to scrape a specific list URL, just copy and paste the link as one of the startUrls.

If you would like to scrape only the first page of a list, provide the link to that page and set endPage to 1.

With the same approach you can also fetch any interval of pages. If you provide the 5th page of a list and set the endPage parameter to 6, you'll get the 5th and 6th pages only.
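
The page-interval behaviour described in these tips can be sketched as a small function. This is only an illustration of the semantics, not the actor's code; `last_page` is a hypothetical value standing in for however many pages the list really has.

```python
from typing import List, Optional


def pages_scraped(start_page: int, end_page: Optional[int], last_page: int) -> List[int]:
    """Pages visited for one start URL, following the tips above.

    start_page: page number the provided URL points at (1 for a plain list URL).
    end_page:   the endPage input, or None for the default (no limit).
    last_page:  hypothetical total number of pages the list has.
    """
    stop = last_page if end_page is None else min(end_page, last_page)
    return list(range(start_page, stop + 1))


print(pages_scraped(1, 1, 40))  # [1]
print(pages_scraped(5, 6, 40))  # [5, 6]
```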

Compute Unit Consumption

The actor is optimized to run fast and scrape as many items as possible, so it prioritizes the detail requests. If the actor isn't blocked very often, it will scrape 100 listings in about 2 minutes with ~0.03-0.05 compute units.
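
For rough budgeting, the figures above can be extrapolated. The sketch below assumes consumption scales linearly with the number of listings, which is an assumption and not something the actor guarantees; blocking, retries, and includeReviews will change the real numbers.

```python
from typing import Tuple

# Figures quoted above: 100 listings take about 2 minutes and 0.03-0.05 CU.
LISTINGS_PER_SAMPLE = 100
CU_RANGE_PER_SAMPLE = (0.03, 0.05)
MINUTES_PER_SAMPLE = 2.0


def estimate_run(listings: int) -> Tuple[float, float, float]:
    """Return (low CU, high CU, minutes), assuming linear scaling."""
    factor = listings / LISTINGS_PER_SAMPLE
    low, high = CU_RANGE_PER_SAMPLE
    return (low * factor, high * factor, MINUTES_PER_SAMPLE * factor)


low_cu, high_cu, minutes = estimate_run(5000)
print(f"{low_cu:.2f}-{high_cu:.2f} CU in about {minutes:.0f} minutes")
```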

PaperBack Swap Scraper Input example

{
  "startUrls": [
    "https://www.paperbackswap.com/5th-Wave-Bk-Rick-Yancey/book/0142425834/",
    "https://www.paperbackswap.com/Pulitzer-Prize/award/2/?g=History",
    "https://www.paperbackswap.com/Mark-Sanborn/author/",
    "https://www.paperbackswap.com/Christian-Fiction/tag/2043/?l=10&ls=10"
  ],
  "search": "harry potter",
  "endPage": 1,
  "maxItems": 10,
  "includeReviews": false,
  "proxy": {
    "useApifyProxy": true
  }
}
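
An input like this can also be sent to the actor programmatically. The sketch below uses only the Python standard library and the Apify API's synchronous "run and get dataset items" endpoint; the helper names `build_request` and `run_actor` are illustrative, and the token is a placeholder you must supply. Synchronous runs are best suited to short jobs such as this 10-item example.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_ID = "epctex~paperbackswap-scraper"  # "/" in the actor name becomes "~"


def build_request(run_input: dict, token: str) -> urllib.request.Request:
    """Build a POST request that runs the actor and returns its dataset items."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_ID}/run-sync-get-dataset-items?{query}"
    body = json.dumps(run_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def run_actor(run_input: dict, token: str) -> list:
    """Send the request and decode the JSON array of scraped items."""
    with urllib.request.urlopen(build_request(run_input, token)) as response:
        return json.loads(response.read().decode("utf-8"))


RUN_INPUT = {
    "search": "harry potter",
    "endPage": 1,
    "maxItems": 10,
    "includeReviews": False,
    "proxy": {"useApifyProxy": True},
}
```

Calling `run_actor(RUN_INPUT, token="<YOUR_APIFY_TOKEN>")` would start a run and return the scraped items as a list of dictionaries. Nothing is sent over the network until that call is made.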

During the Run

During the run, the actor will output messages letting you know what is going on. Each message contains a short label specifying which page from the provided list is currently being processed. When items are loaded from a page, you should see a message about this event with the loaded item count and the total item count for that page.

If you provide incorrect input to the actor, it will immediately stop with a failure state and output an explanation of what is wrong.
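
To catch mistakes before spending a run, the input can be checked locally first. This is a minimal sketch based only on the field list in Input Parameters; the actor's own validation rules are not documented here and may differ. In particular, requiring at least one of search or startUrls and checking the URL prefix are assumptions.

```python
from typing import List


def validate_input(run_input: dict) -> List[str]:
    """Return a list of problems; an empty list means the input looks usable."""
    problems = []
    if not isinstance(run_input.get("proxy"), dict):
        problems.append("proxy is required and must be an object")
    # Assumption: the actor needs at least one source of pages to visit.
    if "search" not in run_input and not run_input.get("startUrls"):
        problems.append("provide search, startUrls, or both")
    # Assumption: every start URL lives on www.paperbackswap.com.
    for url in run_input.get("startUrls", []):
        if not str(url).startswith("https://www.paperbackswap.com/"):
            problems.append(f"not a PaperBack Swap URL: {url}")
    for field in ("endPage", "maxItems"):
        value = run_input.get(field)
        if value is None:
            continue
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < 1:
            problems.append(f"{field} must be a positive whole number")
    if not isinstance(run_input.get("includeReviews", False), bool):
        problems.append("includeReviews must be true or false")
    return problems


print(validate_input({"search": "harry potter", "proxy": {"useApifyProxy": True}}))
print(validate_input({"endPage": 0}))
```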

PaperBack Swap Export

During the run, the actor stores results in a dataset. Each scraped book is a separate item in the dataset.

You can work with the results in any language (Python, PHP, Node.js). See the FAQ or our API reference to learn more about getting results from this PaperBack Swap actor.
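
As one example, finished results can be downloaded from the Apify API's dataset items endpoint in several formats. The sketch below only builds the download URL; `DATASET_ID` is a placeholder for the dataset ID of your own run, and the helper name is illustrative.

```python
import urllib.parse

# Export formats accepted by the dataset items endpoint.
SUPPORTED_FORMATS = {"json", "jsonl", "csv", "xlsx", "xml", "html", "rss"}


def dataset_items_url(dataset_id: str, fmt: str = "json", token: str = "") -> str:
    """Build the URL for downloading a dataset's items in the given format."""
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    params = {"format": fmt}
    if token:
        params["token"] = token
    dataset = urllib.parse.quote(dataset_id, safe="")
    return f"https://api.apify.com/v2/datasets/{dataset}/items?{urllib.parse.urlencode(params)}"


print(dataset_items_url("DATASET_ID", "csv"))
```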

Scraped PaperBack Swap Properties

Each item in the dataset has the following structure:

Item Detail

{
    "url": "https://www.paperbackswap.com/Fantasy-Lover-Dark-Sherrilyn-Kenyon/book/0312979975/",
    "name": "Fantasy Lover",
    "subtitle": "Dark-Hunter, Bk 1",
    "description": "Dear Reader, — Being trapped in a bedroom with a woman is a grand thing. Being trapped in hundreds of bedrooms over two thousand years isn't. And being cursed into a book as a love-slave for eternity can ruin even a Spartan warrior's day. — As a love-slave, I knew everything about women. How to touch them, how to savor them, and most of al...  more »l how to pleasure them. But when I was summoned to fulfill Grace Alexander's sexual fantasies, I found the first woman in history who saw me as a man with a tormented past. She, alone, bothered to take me out of the bedroom and into the world. She taught me to love again.\n\nBut I was not born to know love. I was cursed to walk eternity alone. As a general, I had long ago accepted my sentence. Yet now I have found Grace-the one thing my wounded heart cannot survive without. Sure, love can heal all wounds, but can it break a two thousand year old curse?\n\nJulian of Macedon  « less",
    "author": {
        "name": "Sherrilyn Kenyon",
        "url": "https://www.paperbackswap.com/Sherrilyn-Kenyon/author/"
    },
    "publisher": "St. Martin's Paperbacks",
    "bookFormat": "Mass Market Paperback",
    "image": "https://nationalbookswap.com/pbs/l/73/9973/9780312979973.jpg",
    "amazonBuyLink": "https://www.amazon.com/gp/product/0312979975?SubscriptionId=AKIAJXCVBYSZT4DROVVA&tag=pbs_00005-20&linkCode=xm2&camp=2025&creative=165953&creativeASIN=0312979975",
    "isbn13": "9780312979973",
    "isbn10": "0312979975",
    "numberOfPages": "352",
    "rating": "4.2",
    "numberOfRatings": "1366",
    "highPrice": "",
    "lowPrice": "",
    "reviews": [
        {
            "author": "Rebecca H. (Goldaira)",
            "content": "I don't normally read romance novels of any kind.  I find them unbelievable and boring.  I only read this book because it was a recommendation by a friend and she dared me to read and see if I didn't like it.  I didn't like it, I loved it.  This is a fabulous book that keeps your interest.  Very enjoyable read, even for people who don't normally read this kind of stuff.  I am going to be reading more of this authors work.",
            "publishDate": "12/20/2007",
            "rating": 0
        }
    ]
}
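
Note that numeric fields such as numberOfPages, rating, and numberOfRatings arrive as strings, and missing prices arrive as empty strings. The following is a minimal sketch of a post-processing step that converts them. The helper names are illustrative, and because the sample shows no populated price, any value that cannot be parsed as a plain number is simply mapped to None.

```python
from typing import Any, Callable, Dict, Optional, Union

Number = Union[int, float]


def to_number(value: Any, cast: Callable[[str], Number] = float) -> Optional[Number]:
    """Convert a scraped string such as "4.2" to a number; "" becomes None."""
    if value is None or str(value).strip() == "":
        return None
    try:
        return cast(str(value).replace(",", "").strip())
    except ValueError:
        return None


def normalize_item(item: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a scraped item with numeric fields converted."""
    normalized = dict(item)
    for field in ("numberOfPages", "numberOfRatings"):
        normalized[field] = to_number(item.get(field), int)
    for field in ("rating", "highPrice", "lowPrice"):
        normalized[field] = to_number(item.get(field), float)
    return normalized


sample = {
    "name": "Fantasy Lover",
    "numberOfPages": "352",
    "rating": "4.2",
    "numberOfRatings": "1366",
    "highPrice": "",
    "lowPrice": "",
}
print(normalize_item(sample))
```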

Contact

Please visit us at epctex.com to see all the available products. If you are looking for a custom integration, please reach out to us through the chat box on epctex.com. In need of support? devops@epctex.com is at your service.

Developer
Maintained by Community
Actor metrics
  • 1 monthly users
  • 100.0% runs succeeded
  • Created in Apr 2023