HTML to PDF Converter

jancurn/url-to-pdf

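Loads a web page in headless Chrome using Puppeteer and prints it to PDF. The input is a JSON object with a required "url" string and two optional properties: "sleepMillis" (how long to wait after the page loads) and "pdfOptions" (options passed to Puppeteer's page.pdf()). The output is a PDF file stored under the OUTPUT key of the run's default key-value store.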
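To illustrate the input shape, here is a minimal sketch of an input object together with a helper that applies the same checks main.js performs. The helper name normalizeInput and the example values are assumptions made for illustration and are not part of the Actor; "format" and "printBackground" are standard Puppeteer page.pdf() options.

```javascript
// Hypothetical helper (not part of the Actor): mirrors the input handling in main.js.
function normalizeInput(input) {
    if (!input || typeof input.url !== 'string') {
        throw new Error('Input must be an object with the "url" property');
    }
    // Copy pdfOptions without "path", so the PDF is never written to a file
    // and the caller's object is left unmodified.
    const { path: _ignoredPath, ...pdfOptions } = input.pdfOptions || {};
    // Treat a missing or non-positive sleepMillis as "do not wait".
    const sleepMillis = input.sleepMillis > 0 ? input.sleepMillis : 0;
    return { url: input.url, sleepMillis, pdfOptions };
}

// Example input; the values are illustrative only.
const exampleInput = {
    url: 'https://www.example.com',
    sleepMillis: 2000,
    pdfOptions: { format: 'A4', printBackground: true, path: 'ignored.pdf' },
};

console.log(JSON.stringify(normalizeInput(exampleInput)));
```

Unlike main.js, this sketch copies "pdfOptions" instead of deleting "path" in place, which keeps the original input object intact.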

Dockerfile

# This is a template for a Dockerfile used to run acts in the Actor system.
# The base image name below is set during the act build, based on user settings.
# IMPORTANT: The base image must set a correct working directory, such as /usr/src/app or /home/user
FROM apify/actor-node-puppeteer

# Copy just package.json and package-lock.json, since they should be
# the only files that affect "npm install" in the next step, to speed up the build
COPY package*.json ./

# Install NPM packages, skip optional and development dependencies to
# keep the image small. Avoid logging too much and print the dependency
# tree for debugging
RUN npm --quiet set progress=false \
 && npm install --only=prod --no-optional \
 && echo "Installed NPM packages:" \
 && (npm list --all || true) \
 && echo "Node.js version:" \
 && node --version \
 && echo "NPM version:" \
 && npm --version

# Copy source code to container
# Do this in the last step, to have a fast build if only the source code changed
COPY --chown=node:node . ./

# NOTE: The CMD is already defined by the base image.
# Uncomment this for local node inspector debugging:
# CMD [ "node", "--inspect=0.0.0.0:9229", "main.js" ]

package.json

{
    "name": "apify-project",
    "version": "0.0.1",
    "description": "",
    "author": "It's not you it's me",
    "license": "ISC",
    "dependencies": {
        "apify": "latest"
    },
    "scripts": {
        "start": "node main.js"
    }
}

main.js

const Apify = require('apify');

Apify.main(async () => {
    console.log('Fetching input...');
    const input = await Apify.getValue('INPUT');
    if (!input || typeof input.url !== 'string') {
        throw new Error('Input must be an object with the "url" property');
    }

    console.log('Launching headless Chrome...');
    const browser = await Apify.launchPuppeteer();
    const page = await browser.newPage();

    console.log(`Loading page (url: ${input.url})...`);
    await page.goto(input.url);

    // Optionally wait so that late-loading content can finish rendering
    if (input.sleepMillis > 0) {
        console.log(`Sleeping ${input.sleepMillis} millis...`);
        await new Promise((resolve) => setTimeout(resolve, input.sleepMillis));
    }

    const opts = input.pdfOptions || {};
    delete opts.path; // Don't store to file
    console.log(`Printing to PDF (options: ${JSON.stringify(opts)})...`);
    const pdfBuffer = await page.pdf(opts);

    console.log(`Saving PDF (size: ${pdfBuffer.length} bytes) to output...`);
    await Apify.setValue('OUTPUT', pdfBuffer, { contentType: 'application/pdf' });

    const storeId = process.env.APIFY_DEFAULT_KEY_VALUE_STORE_ID;

    // NOTE: The disableRedirect=1 param is added because, for some reason, Chrome
    // doesn't open URLs pasted into the address bar if they redirect to a PDF
    console.log('PDF file has been stored to:');
    console.log(`https://api.apify.com/v2/key-value-stores/${storeId}/records/OUTPUT?disableRedirect=1`);
});
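
The last lines of main.js build the download link for the stored PDF. A minimal sketch of that step as a reusable function is shown below, assuming the URL pattern logged by main.js; the function name buildRecordUrl and the store ID 'MY_STORE_ID' are hypothetical placeholders, not values from the Actor.

```javascript
// Hypothetical helper (not part of the Actor): builds the record URL that main.js logs.
function buildRecordUrl(storeId, key = 'OUTPUT') {
    const base = 'https://api.apify.com/v2/key-value-stores';
    // Encode both parts so unusual characters cannot break the URL path.
    const path = `${encodeURIComponent(storeId)}/records/${encodeURIComponent(key)}`;
    // disableRedirect=1 is the same parameter main.js appends to the link.
    return `${base}/${path}?disableRedirect=1`;
}

// 'MY_STORE_ID' is a placeholder; inside a run, main.js reads the real ID
// from the APIFY_DEFAULT_KEY_VALUE_STORE_ID environment variable.
console.log(buildRecordUrl('MY_STORE_ID'));
```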
Developer
Maintained by Community
Actor metrics
  • 12 monthly users
  • 94.6% runs succeeded
  • Created in Nov 2017
  • Modified 6 months ago