ocr-space-api-alt2

v2.0.1

Fork of Dennnisk's original ocr-space-api with added support for PDF uploading. Provides an easy way to send images to the ocr.space API and get the OCR result.

Provides access to the OCR.SPACE API to send images and retrieve the embedded text.

More Details: https://ocr.space/ocrapi.

IMPORTANT:

The OCR is provided by OCR.SPACE. I am not affiliated with them; I just want to help by reworking and sharing this library.

Main changes

  1. The original library used request, which is now deprecated, so I migrated away from it. Requests are now made with axios.

  2. Since axios serializes request bodies as JSON by default, I've used query-string to encode the form data.
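
As a rough illustration of the second change, the sketch below turns an options object into a URL-encoded form body, the kind of string that would be sent with a `Content-Type: application/x-www-form-urlencoded` header. It is not the library's actual code: Node's built-in `URLSearchParams` stands in for query-string so the snippet runs without dependencies, and `buildFormBody` is a hypothetical helper name.

```javascript
// Sketch only: URLSearchParams (built into Node.js) stands in for query-string.
// buildFormBody is a hypothetical helper, not part of ocr-space-api-alt2.
function buildFormBody(options) {
  const params = new URLSearchParams()

  for (const [key, value] of Object.entries(options)) {
    // Skip unset options so they are not sent as the string "undefined"
    if (value === undefined || value === null) continue
    params.append(key, String(value))
  }

  // Produces key=value pairs joined with "&", with values percent-encoded
  return params.toString()
}

const body = buildFormBody({ apikey: 'abc', language: 'eng', scale: true })
console.log(body)
```

The resulting string could then be passed as the request body to whichever HTTP client is in use.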

Installation

First - Register and Get your API key

Get your API key at this link. Just follow their steps.

Second - Install npm package

  npm i ocr-space-api-alt2
  yarn add ocr-space-api-alt2

Usage example

You can see an example in the example folder.

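const ocrSpaceApi = require('ocr-space-api-alt2')

const options = {
  apikey: '<YOUR API KEY HERE>',
  // Overrides automatic file type detection; should match the file below
  filetype: 'jpg',
  // true returns the full OCR API response instead of just the text
  verbose: true,
  // Local file path; a URL starting with http or a base64 image also works
  url: `${__dirname}/loveText.jpg`
}

// Send the image to the OCR API and log the result
const getText = async () => {
  try {
    const result = await ocrSpaceApi(options)

    console.log({ result })
  } catch (error) {
    console.error(error)
  }
}

getText()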
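
With `verbose: true`, the result is the full response rather than plain text. The sketch below shows one way to pull the text back out of it. It assumes the result follows the OCR.space JSON layout (a `ParsedResults` array whose entries carry a `ParsedText` string) and that the package returns that body unchanged; neither is confirmed here, and `extractText` is a hypothetical helper.

```javascript
// Assumption: the full response follows the OCR.space JSON layout, where
// ParsedResults is an array of pages and each page has a ParsedText string.
// extractText is a hypothetical helper, not part of ocr-space-api-alt2.
function extractText(response) {
  // Return an empty string when the response has no parsed pages
  if (!response || !Array.isArray(response.ParsedResults)) return ''

  return response.ParsedResults
    .map(page => page.ParsedText || '')
    .join('\n')
    .trim()
}

// Hand-written sample shaped like a two-page response, for illustration only
const sample = {
  ParsedResults: [{ ParsedText: 'I love' }, { ParsedText: 'OCR' }]
}

console.log(extractText(sample))
```

Joining the pages with a newline keeps multi-page PDF or TIFF results readable as a single string.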

Options

The available options are adapted from the OCR.space API docs.

| Key | Value | Description |
|--|--|--|
| apikey | [Required] - String. API key from the OCR.space API | Get your API key at this link. |
| url | [Required] - String | URL that points to the file you want to extract text from. It can be a URL (starting with http), a base64 image or a local file. |
| language | [Optional] - String. Arabic = ara, Bulgarian = bul, Chinese (Simplified) = chs, Chinese (Traditional) = cht, Croatian = hrv, Czech = cze, Danish = dan, Dutch = dut, English = eng, Finnish = fin, French = fre, German = ger, Greek = gre, Hungarian = hun, Korean = kor, Italian = ita, Japanese = jpn, Polish = pol, Portuguese = por, Russian = rus, Slovenian = slv, Spanish = spa, Swedish = swe, Turkish = tur | Language used for OCR. If no language is specified, English (eng) is used as the default. IMPORTANT: The language code always has 3 letters (not 2), so it is "eng" and not "en". Engine 2 has automatic Western language detection, so this value is ignored when Engine 2 is used. |
| isOverlayRequired | [Optional] - Boolean | Default = False. If true, returns the coordinates of the bounding boxes for each word. If false, the OCR'ed text is returned only as a text block (this makes the JSON response smaller). Overlay data can be used, for example, to show text over the image. |
| filetype | [Optional] - String. Available values: PDF, GIF, PNG, JPG, TIF, BMP | Overrides the automatic file type detection based on content-type. Supported image file formats are png, jpg (jpeg), gif, tif (tiff) and bmp. For document OCR, the API supports the Adobe PDF format. Multi-page TIFF files are supported. |
| detectOrientation | [Optional] - Boolean | If set to true, the API autorotates the image correctly and sets the TextOrientation parameter in the JSON response. If the image is not rotated, then TextOrientation=0; otherwise it is the degree of the rotation, e.g. "270". |
| isCreateSearchablePdf | [Optional] - Boolean | Default = False. If true, the API generates a searchable PDF. This parameter automatically sets isOverlayRequired = true. |
| isSearchablePdfHideTextLayer | [Optional] - Boolean | Default = False. If true, the text layer is hidden (not visible). |
| scale | [Optional] - Boolean | Default = False. If set to true, the API does some internal upscaling. This can improve the OCR result significantly, especially for low-resolution PDF scans. Note that the front page demo uses scale=true, but the API uses scale=false by default. See also this OCR forum post. |
| isTable | [Optional] - Boolean | If set to true, the OCR logic makes sure that the parsed text result is always returned line by line. This switch is recommended for table OCR, receipt OCR, invoice processing and all other types of input documents that have a table-like structure. |
| OCREngine | [Optional] - Number. Available values: 1, 2 | Engine 1 is the default. See OCR Engines. |
| verbose | [Optional] - Boolean | Whether you want the full response from the OCR API or just the extracted text. |
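
Several constraints in the table (required keys, 3-letter language codes, the allowed filetype and OCREngine values) can be checked before a request is made. The following is a minimal sketch of such a check; `validateOptions` is a hypothetical helper and not something the package exports, and accepting filetype in either case is an assumption based on the lowercase value used in the usage example.

```javascript
// Hypothetical pre-flight check based on the Options table above.
// Not part of ocr-space-api-alt2.
const FILE_TYPES = ['PDF', 'GIF', 'PNG', 'JPG', 'TIF', 'BMP']

function validateOptions(options) {
  const errors = []

  // apikey and url are the only required options
  if (typeof options.apikey !== 'string' || options.apikey === '') {
    errors.push('apikey is required')
  }
  if (typeof options.url !== 'string' || options.url === '') {
    errors.push('url is required')
  }

  // Language codes always have 3 letters ("eng", not "en")
  if (options.language !== undefined && !/^[a-z]{3}$/.test(options.language)) {
    errors.push('language must be a 3-letter code such as "eng"')
  }

  // Assumption: filetype is accepted in either upper or lower case
  if (
    options.filetype !== undefined &&
    !FILE_TYPES.includes(String(options.filetype).toUpperCase())
  ) {
    errors.push(`filetype must be one of ${FILE_TYPES.join(', ')}`)
  }

  if (options.OCREngine !== undefined && ![1, 2].includes(options.OCREngine)) {
    errors.push('OCREngine must be 1 or 2')
  }

  return errors
}

// A 2-letter language code is reported as a problem
console.log(validateOptions({ apikey: 'abc', url: 'http://example.com/a.png', language: 'en' }))
```

Returning a list of messages instead of throwing lets the caller decide whether to abort or just log a warning.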

Authors

  • Denis - Initial Work - Initial Documentation - dennnisk.
  • Anthony Luzquiños - Rework - AnthonyLzq.

Important

This package has not been fully tested, and every contribution is appreciated.