page-piranha

v0.1.3

LLM-based page parser (PDF to text)

Page Piranha 🦈

Use LLMs to convert PDFs to text, Markdown, or JSON. By default, Page Piranha uses the Gemini 2.0 Flash model.

Features

  • Convert PDFs to plain text, markdown, or JSON
  • Support for local files and remote URLs
  • Pipe-friendly CLI
  • Progress indicators and colorful output
  • Configurable output directory
  • Optional custom prompts for fine-tuned conversions

Installation

npm install page-piranha

Environment Setup

Page Piranha requires Google Cloud Platform credentials to use Vertex AI. Create a .env file with the following variables:

GCP_PROJECT=your_gcp_project
GCP_LOCATION=your_gcp_location
GOOGLE_APPLICATION_CREDENTIALS=path_to_your_gcp_credentials_file
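
A missing variable usually surfaces only as an authentication error at request time, so it can help to check the environment up front. The sketch below is not part of Page Piranha: `missingEnvVars` is a hypothetical helper, and it assumes the three variables above have already been loaded into an environment object such as `process.env`.

```typescript
// Variable names are taken from the Environment Setup section above.
const REQUIRED_ENV_VARS = [
  'GCP_PROJECT',
  'GCP_LOCATION',
  'GOOGLE_APPLICATION_CREDENTIALS',
] as const;

// Hypothetical helper: returns the names of required variables
// that are unset or empty in the given environment object.
function missingEnvVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV_VARS.filter((name) => !env[name]);
}
```

In a Node.js script, calling `missingEnvVars(process.env)` before constructing the converter lets you fail early with a message that names the missing variables.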

CLI Usage

Basic usage:

.bin/page-piranha -f input.pdf -m text -o output

Options:

  • -f, --file <file> - The PDF file to convert (required)
  • -m, --mode <mode> - Conversion mode: text, markdown, or json (default: text)
  • -o, --outDir <directory> - Output directory (default: out)
  • -t, --tee - Output to both file and stdout
  • -v, --verbose - Enable verbose logging
  • -p, --prompt <prompt> - Additional hints for conversion
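
If you drive the CLI from a script, the options above map naturally onto a typed object. The following is a minimal sketch under that assumption: `CliOptions` and `buildCliArgs` are hypothetical names, not part of the package, and only the short flags listed above are used.

```typescript
type Mode = 'text' | 'markdown' | 'json';

// Hypothetical options object mirroring the documented CLI flags.
interface CliOptions {
  file: string;      // -f, required
  mode?: Mode;       // -m
  outDir?: string;   // -o
  tee?: boolean;     // -t
  verbose?: boolean; // -v
  prompt?: string;   // -p
}

// Builds an argument array; omitted options fall back to the CLI defaults.
function buildCliArgs(opts: CliOptions): string[] {
  const args = ['-f', opts.file];
  if (opts.mode) args.push('-m', opts.mode);
  if (opts.outDir) args.push('-o', opts.outDir);
  if (opts.tee) args.push('-t');
  if (opts.verbose) args.push('-v');
  if (opts.prompt) args.push('-p', opts.prompt);
  return args;
}
```

Passing the resulting array to Node's `child_process.execFile` rather than assembling a shell string means a prompt containing spaces or quotes needs no escaping.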

Examples:

Convert to text

.bin/page-piranha -f document.pdf -m text

Convert to markdown with custom output directory

.bin/page-piranha -f document.pdf -m markdown -o converted

Convert to JSON and pipe to jq

.bin/page-piranha -f assets/demo.pdf -m json -p "Make sure to use camel case. This is an invoice. Feel free to nest fields" -t | jq

Programmatic Usage

Page Piranha can be used programmatically in your TypeScript/JavaScript projects:

import { PagePiranha } from 'page-piranha';
import { JorEl } from 'jorel';

// Initialize JorEl with Vertex AI enabled and pass it to Page Piranha
const jorEl = new JorEl({ vertexAi: true });
const piranha = new PagePiranha(jorEl);

// Convert to text
const text = await piranha.toText('document.pdf');

// Convert to markdown with additional prompt
const markdown = await piranha.toMarkdown('document.pdf', 'Focus on headers and lists');

// Convert to JSON
const json = await piranha.toJson('document.pdf');

API Reference

PagePiranha Class

Constructor

  • constructor(jorEl: JorEl, options?: PagePiranhaOptions)

Methods

  • toText(fileOrFiles: string | Buffer, additionalPrompt?: string): Promise<string>
  • toMarkdown(fileOrFiles: string | Buffer, additionalPrompt?: string): Promise<string>
  • toJson(fileOrFiles: string | Buffer, additionalPrompt?: string): Promise<object>
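
Because the three methods share the same shape, choosing one from a mode string is straightforward. The sketch below is illustrative only: `PdfConverter` is a hypothetical structural interface covering the file-path form of the signatures above (no `Buffer`), and `convert` is a hypothetical helper that returns a string in every mode.

```typescript
// Structural subset of the documented methods, restricted to file paths.
interface PdfConverter {
  toText(file: string, additionalPrompt?: string): Promise<string>;
  toMarkdown(file: string, additionalPrompt?: string): Promise<string>;
  toJson(file: string, additionalPrompt?: string): Promise<object>;
}

type ConversionMode = 'text' | 'markdown' | 'json';

// Hypothetical helper: picks the method for the mode and serializes
// JSON results so every mode yields a string.
async function convert(
  converter: PdfConverter,
  mode: ConversionMode,
  file: string,
  prompt?: string,
): Promise<string> {
  switch (mode) {
    case 'text':
      return converter.toText(file, prompt);
    case 'markdown':
      return converter.toMarkdown(file, prompt);
    case 'json':
      return JSON.stringify(await converter.toJson(file, prompt), null, 2);
  }
}
```

Typing the parameter as an interface rather than the concrete class keeps the helper easy to exercise with a stub, so no model call is needed in tests.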

License

MIT