text-from-pdf
v1.1.2
A PDF to Text Extractor
PDF-TO-TEXT
A PDF-to-text wrapper that extracts text from a PDF. It works with both searchable and non-searchable (image-based) PDFs.
Installation
npm install text-from-pdf
Mac Users
brew install poppler
Linux Users
sudo apt-get update && sudo apt-get install poppler-utils
Windows Users
No installation required
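On macOS and Linux, it can help to confirm that the Poppler command-line tools are reachable before running the examples. The following is a minimal sketch, not part of text-from-pdf: `isCommandAvailable` is a hypothetical helper, and checking for `pdftotext` (one of the tools shipped with Poppler) is an assumption about which binary matters; the package may rely on other Poppler tools.

```javascript
const { spawnSync } = require('node:child_process');

// Hypothetical helper: returns true if the command could be started at all.
// The exit code is ignored on purpose, because "-v" exit codes vary by tool and version.
function isCommandAvailable(command) {
  const result = spawnSync(command, ['-v'], { stdio: 'ignore' });
  // result.error is set when the process could not be spawned (for example ENOENT).
  return !result.error;
}

// Assumption: pdftotext is a reasonable proxy for "Poppler is installed".
// Windows is skipped because no separate installation is required there.
if (process.platform !== 'win32' && !isCommandAvailable('pdftotext')) {
  console.warn('pdftotext was not found on PATH; install Poppler for your platform.');
}
```

The check only reports whether the binary can be launched; it does not verify the Poppler version.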
Usage
- Standard Input PDF with horizontally aligned text:
const text = await pdfToText('<PATH_TO_PDF_FILE/fileName.pdf>');
console.log(text);
- Input PDFs with vertically aligned text:
const options = {
  rotationDegree: -90,
};
const text = await pdfToText('<PATH_TO_PDF_FILE/fileName.pdf>', options);
console.log(text);
- Text from first and second page:
const options = {
  firstPageToConvert: 1,
  lastPageToConvert: 2,
};
const text = await pdfToText('<PATH_TO_PDF_FILE/fileName.pdf>', options);
console.log(text);
- Text from third to fifth page:
const options = {
  firstPageToConvert: 3,
  lastPageToConvert: 5,
};
const text = await pdfToText('<PATH_TO_PDF_FILE/fileName.pdf>', options);
console.log(text);
- Enable progress bar logging:
const options = {
  firstPageToConvert: 1,
  lastPageToConvert: 1,
  enableProgressBarLogging: true,
};
const text = await pdfToText('<PATH_TO_PDF_FILE/fileName.pdf>', options);
console.log(text);
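The page-range options above can be wrapped in a small helper that rejects invalid ranges before any extraction starts. This is a sketch under stated assumptions: `buildPageOptions` and `extractPages` are hypothetical names, not part of text-from-pdf, and the extractor is passed in as an argument so the helper makes no claim about how the package is imported. Only the option names `firstPageToConvert` and `lastPageToConvert` come from the examples above.

```javascript
// Hypothetical helper: build an options object for a 1-based, inclusive page range.
function buildPageOptions(firstPage, lastPage, extra = {}) {
  if (!Number.isInteger(firstPage) || !Number.isInteger(lastPage)) {
    throw new TypeError('Page numbers must be integers');
  }
  if (firstPage < 1 || lastPage < firstPage) {
    throw new RangeError(`Invalid page range: ${firstPage}-${lastPage}`);
  }
  // Extra options (for example rotationDegree) are merged in; the page range wins on conflict.
  return { ...extra, firstPageToConvert: firstPage, lastPageToConvert: lastPage };
}

// Hypothetical helper: `extract` is assumed to behave like
// pdfToText(path, options) and return a Promise that resolves to a string.
async function extractPages(extract, pdfPath, firstPage, lastPage, extra = {}) {
  const options = buildPageOptions(firstPage, lastPage, extra);
  return extract(pdfPath, options);
}
```

With `pdfToText` in scope, a call such as `extractPages(pdfToText, '<PATH_TO_PDF_FILE/fileName.pdf>', 3, 5)` would request pages three to five, and `extractPages(pdfToText, '<PATH_TO_PDF_FILE/fileName.pdf>', 1, 1, { enableProgressBarLogging: true })` would mirror the progress bar example.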
Feature requests
Fork the repository, add your changes, and open a pull request.