npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2024 – Pkg Stats / Ryan Hefner

pdf-img-convert

v2.0.0

Published

A simple module to convert PDF files into image buffers (without filesystem usage)

Downloads

121,725

Readme

pdf-img-convert.js

A lightweight JavaScript package for converting PDFs into images, built on Mozilla's powerful PDF.js library.

Motivation

While there are numerous JavaScript solutions for converting PDFs, many rely heavily on the filesystem by creating temporary files and using non-native binaries like Ghostscript.

This solution simplifies the process by exclusively using JavaScript arrays, significantly streamlining the pipeline and, ideally, enhancing performance.

Installation

npm install pdf-img-convert

Usage

The package by default returns an Array of Uint8Array objects, each of which represents an image encoded in png format or a base64-encoded image output if required

Here are some examples of how to use the module. First, make sure to import it

const pdf2img = await import("pdf-img-convert");

The package provides a single function, convert, which accepts the following PDF formats as input:

  • URL of a PDF (e.g., www.example.com/sample.pdf)

  • Path to a local PDF file (e.g., ./examples/sample.pdf)

  • A Buffer object containing PDF data

  • A Uint8Array object containing PDF data

  • Base64-encoded PDF data

Note: The convert function is asynchronous and returns a Promise object.

The output can be manipulated using the conversion_config argument mentioned below.

Here's an example of how to use it in synchronous code, uint8arrays as the default output:

import fs from 'fs';
import path from 'path';

async function processPDFs() {
    const pdf2img = await import("pdf-img-convert");

    // Both HTTP, HTTPS, and local paths are supported
    const outputWithExternalLink = await pdf2img.convert('https://sedl.org/afterschool/toolkits/science/pdf/ast_sci_data_tables_sample.pdf');
    const outputWithLocalSample = await pdf2img.convert('./test_pdfs/sample.pdf');

    // OUTPUT OPTIONS ARRAY
    const outputs = [outputWithExternalLink, outputWithLocalSample];
    
    const pdfArray = outputs[0]; // Change the index to select different outputs

    function saveImages(pdfArray) {
        pdfArray.forEach((image, index) => {
            const outputPath = path.join('./outputImages', `saveImages_${index}.png`);
            fs.writeFile(outputPath, image, (error) => {
                if (error) {
                    console.error(`Error saving image ${index + 1}:`, error);
                } else {
                    console.log(`Image ${index + 1} saved successfully`);
                }
            });
        });
    }
    // Call the function to save images
    saveImages(pdfArray);
}
// Call the async function
processPDFs();

Here's an example of how to use it in synchronous code, base64-encoded output as the chosen option:

import fs from 'fs';
import path from 'path';

let config = {
    base64: true,
    scale: 2
};
async function processPDF() {
    const pdf2img = await import("pdf-img-convert");
    const outputWithLocalSampleAndConfig = await pdf2img.convert('./examples/example.pdf', config);
    const pdfArray = outputWithLocalSampleAndConfig;
    function saveBase64Images(pdfArray) {
        console.log('Processing base64Images...');
        pdfArray.forEach((base64Data, index) => {
            // Convert Base64 string to binary buffer
            const buffer = Buffer.from(base64Data, 'base64');
            // Define an output path
            const outputPath = path.join('./outputImages', `saveBase64Image_${index}.png`);
            // Write the buffer to a PNG file
            fs.writeFile(outputPath, buffer, (error) => {
                if (error) {
                    console.error(`Error saving image ${index + 1} (base64):`, error);
                } else {
                    console.log(`Image ${index + 1} (base64) saved successfully`);
                }
            });
        });
    }
    // Example usage: call this function with the Base64 array
    saveBase64Images(pdfArray);
}
// Call the async function
processPDF();

Here's an example of how to use it in Next.js api folder, base64-encoded output as the chosen option:

Ensure you have a dotenv file with ROOT_PATH pointing to your root directory

ROOT_PATH=./

You have a /public/uploads directory in the root entry of your project, if not create it.

import path from "path";
import fs from "fs";

const UPLOAD_DIR = path.resolve(process.env.ROOT_PATH ?? "", "public/uploads");

/*SHARP ENHANCEMENT*/
const writeToDir = async (imageData, fileName, filePath) => {
    try {
        await fs.promises.writeFile(filePath, imageData);
        console.log(`Image processed and saved successfully: ${filePath}`);
        return {
            fileName: fileName,
            filePath: filePath
        };
    } catch (error) {
        console.error("Error processing the image:", error);
        return null;
    }
};
export const POST = async (req: NextRequest) => {
    const pdf2img = await import("pdf-img-convert");
    const formData = await req.formData();
    const pdfFiles = []
    const convertedImages = []

    // Loop over the FormData content
    for (const [key, value] of formData.entries()) {
        if (value instanceof Blob) {
            if (key === "pdfFiles") {
                pdfFiles.push(value);
            }
        }
    }
    // check for file uploads
    if (pdfFiles.length === 0) {
        return NextResponse.json({
            success: false,
            message: "No files uploaded"
        });
    }
    for (const file of pdfFiles) {
        const pdfBuffer = Buffer.from(await file.arrayBuffer());
        const imagePages = await pdf2img.convert(pdfBuffer, { base64: true }); // Ensure base64 is true to get Base64 data

        for (let i = 0; i < imagePages.length; i++) {
            const imageData = imagePages[i];
            const base64Data = imageData.replace(/^data:image\/\w+;base64,/, ""); // Remove Base64 header
            const imageBuffer = Buffer.from(base64Data, "base64"); // Convert Base64 to Buffer

            const fileName = `${path.basename(file.name, path.extname(file.name))}_page_${i + 1}.png`;
            const filePath = path.join(UPLOAD_DIR, fileName);

            const writeFile = await writeToDir(imageBuffer, fileName, filePath);
            if (writeFile) {
                convertedImages.push(writeFile);
            }
        }
    }
    // if successfully
    return NextResponse.json({
        success: true,
        message: "Files uploaded successfully",
    });
}

Here is how to fetch single or multiple PDF documents from the client side using a .jsx or .tsx component:

In this case my api route points to /app/api/pdf2img, you can change it to suite your own naming

"use client";
import React, { useRef, useState } from "react";
export default function PDFUploader() {
    const MAX_FILES = 4;
    const [fileNames, setFileNames] = useState([]);
    const [pdfFiles, setPdfFiles] = useState([]);
    const [errorMessage, setErrorMessage] = useState("");
    const fileInputRef = useRef(null);

    const handleFileChange = (e) => {
        const files = Array.from(e.target.files);
        if (files.length > MAX_FILES) {
            setErrorMessage(`You can only upload a maximum of ${MAX_FILES} files.`);
            clearStates();
        } else {
            const names = files.map((file) => file.name);
            const pdfs = files.filter(file => file.type === "application/pdf");
            setStates(pdfs, names);
        }
    };

    const handleSubmit = async (e) => {
        e.preventDefault();
        if (pdfFiles.length === 0) {
            setErrorMessage("No PDF files selected.");
            return;
        }
        try {
            const formData = new FormData();
            pdfFiles.forEach((file) => {
                formData.append("pdfFiles", file); // Key is "pdfFiles"
            });

            const response = await fetch("/api/pdf2img", {
                method: "POST",
                body: formData
            });

            const result = await response.json();
            if (result.success) {
                clearStates();
                alert(`Upload successful: ${result.message}`);
            } else {
                alert(`Upload failed: ${result.message}`);
            }
        } catch (error) {
            console.error("Upload error:", error);
            alert("An error occurred during upload.");
        }
    };

    const clearStates = () => {
        setFileNames([]);
        setPdfFiles([]);
        if (fileInputRef.current) {
            fileInputRef.current.value = ""; // Reset the file input via ref
        }
    };

    const setStates = (pdfs, names) => {
        setPdfFiles(pdfs);
        setFileNames(names);
    };

    return (
        <html>
        <body style={{padding:0, left: 0}}>
        <form onSubmit={handleSubmit}>
            <input
                ref={fileInputRef}
                onChange={handleFileChange}
                type="file"
                name="fileUpload"
                multiple
                accept=".pdf"
            />
            <button type="submit">Submit</button>
        </form>
        {errorMessage && (
            <div>
                <p style={{ textAlign: "center", color: "red" }}>{errorMessage}</p>
            </div>
        )}
        </body>
        </html>
    );
}

There is also an optional second conversion_config argument which accepts an object like this:

{
  width: 100 //Number in px
  height: 100 // Number in px
  scale: 2
  page_numbers: [1, 2, 3] // A list of pages to render instead of all of them
  base64: true
}

The following attributes are optional and can be omitted from the configuration object:

  • width or height: Controls the scale of the output images. If both are provided, only width will be used, and height will be ignored.

  • page_numbers: Specifies which pages to render. Pages are 1-indexed.

  • base64: Set to true to receive base64-encoded image output. If omitted or set to false, the output will be an array of Uint8Arrays.

  • scale: Defines the viewport scale ratio, defaulting to 1 (original width and height).

Contributing

If you'd like to contribute, please do submit a pull request!

Once you've finalised your changes, please include a summary of these changes under the [Unreleased] section of CHANGELOG.md in this format.