fork-pdf-parse-with-pagepertext
v1.1.2
Pure javascript cross-platform module to extract text from PDFs.
Info
This is a fork of https://gitlab.com/autokent/pdf-parse. All credit goes to mehmet.kozan. I forked the package to add the per-page text output shown in "Basic Usage". If you don't need the text of every page for indexing, use the original package instead. I'm publishing this fork only so that I can host it on one of my Elastic Beanstalk instances without giving access to the private repo, so it can autoscale easily. Thanks so much for the package; I hope this helps. I will also open a PR on the original project.
Installation
npm install fork-pdf-parse-with-pagepertext
Basic Usage - Local Files
const fs = require('fs');
const pdf = require('fork-pdf-parse-with-pagepertext');
const dataBuffer = fs.readFileSync('path to PDF file...');
pdf(dataBuffer).then(function(data) {
    // Text per page: an array of {page: number, text: string} objects
    console.log(data.textPerPage);
}).catch(function(error) {
    // Reading or parsing the PDF failed
    console.error(error);
});
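Since the per-page text is meant for indexing, here is a minimal sketch of one way to use it: finding the pages that mention a search term. The helper `findPagesContaining` and the sample data are hypothetical and not part of the package. The sketch only assumes that `data.textPerPage` has the `{page: number, text: string}` shape described above.

```javascript
// Hypothetical helper: not part of the package's API.
// Expects the array shape described above: [{ page: number, text: string }].
function findPagesContaining(textPerPage, term) {
  const needle = term.toLowerCase();
  return textPerPage
    .filter((entry) => entry.text.toLowerCase().includes(needle))
    .map((entry) => entry.page);
}

// Made-up sample data standing in for data.textPerPage.
const samplePages = [
  { page: 1, text: 'Introduction to invoices' },
  { page: 2, text: 'Payment terms and conditions' },
  { page: 3, text: 'Invoice totals' },
];

console.log(findPagesContaining(samplePages, 'invoice')); // [ 1, 3 ]
```

In real use, `samplePages` would be replaced by the `data.textPerPage` value from the promise callback. The match is a plain case-insensitive substring search; a real index would likely need tokenization as well.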
License
MIT licensed, and all its dependencies are MIT or BSD licensed.