chrome-lens-ocr
v4.0.3
Package to use Google Lens OCR for free, via API used in Chrome.
Chrome Lens OCR
Library to use Google Lens OCR for free, via the API used in Chromium. It doesn't require running a headless browser, and is much faster than using Puppeteer or similar. It works without any options, and no authorization is needed (no Google account required).
Installation
npm install chrome-lens-ocr
Usage
import Lens from 'chrome-lens-ocr';
import { inspect } from 'util';
const lens = new Lens();
const log = data => console.log(inspect(data, { depth: null, colors: true }));
lens.scanByFile('shrimple.png').then(log).catch(console.error);
lens.scanByBuffer(Buffer.from('...')).then(log).catch(console.error);
// use image url to scan quicker, but pixel coordinates will be 0
lens.scanByURL('https://lune.dimden.dev/7949f833fa42.png').then(log).catch(console.error);
All methods above return a LensResult object (see docs below). If an error occurs during the process, a LensError is thrown.
API
All of the classes are exported. Lens is the default export, and LensCore, LensResult, Segment, BoundingBox and LensError are named exports.
class Lens extends LensCore
constructor(options?: Object): Lens
Creates a new instance of Lens. options is optional.
scanByFile(path: String): Promise<LensResult>
Scans an image from a file.
scanByBuffer(buffer: Buffer): Promise<LensResult>
Scans an image from a buffer.
class LensCore
This is the core class, which is extended by Lens. You can use it if you want to use the library in environments that don't support Node.js APIs, as it doesn't include the scanByFile and scanByBuffer methods. Keep in mind that the Lens class extends LensCore, so all methods and properties of LensCore are available in Lens.
constructor(options?: Object, fetch?: Function): LensCore
Creates a new instance of LensCore. options is optional. fetch is the function that will be used to send requests; by default it's fetch from the global scope.
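Because the second constructor argument is just a function, any fetch-compatible wrapper can be passed in. The sketch below shows one possible wrapper that records each request before forwarding it. The names withRequestLog and log are hypothetical, and it is an assumption (not confirmed by this README) that the library calls the function with the URL as the first argument, the way the standard fetch is called.

```javascript
// Wrap any fetch-compatible function so that each call is recorded and then forwarded unchanged.
// `withRequestLog` and `log` are hypothetical names used only for this sketch.
function withRequestLog(baseFetch, log = []) {
  const wrapped = async (...args) => {
    const startedAt = Date.now();
    const response = await baseFetch(...args);
    // Record the first argument (assumed to be the URL) and the elapsed time in milliseconds.
    log.push({ target: String(args[0]), ms: Date.now() - startedAt });
    return response;
  };
  wrapped.log = log;
  return wrapped;
}

// Assumed usage, following the documented constructor signature:
//   const lens = new LensCore({}, withRequestLog(fetch));
```

The wrapper does not change the arguments or the response, so it should be safe to combine with fetchOptions such as a dispatcher.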
scanByURL(url: String, dimensions?: [Number, Number] = [0, 0]): Promise<LensResult>
Scans an image from a remote URL; supports large image resolutions. You may provide an optional array of image dimensions ([width, height]); otherwise the pixel coordinates of text segments (result.segments[].boundingBox.pixelCoords) returned by this method will always be 0.
scanByData(data: Uint8Array, mime: String, originalDimensions: Array): Promise<LensResult>
Scans an image from a Uint8Array. originalDimensions is an array in [width, height] format. You must provide the width and height of the image before it was resized to get accurate pixel coordinates. You should only use this method if you're using the library in environments that don't support Node.js APIs, because it doesn't automatically resize images to less than 1000x1000 dimensions, like the methods in Lens do.
updateOptions(options: Object): void
Updates the options for the instance.
fetch(options?: RequestInit & { endpoint: String } = {}, originalDimensions: Array): Promise<LensResult>
Internal method to send a request to the API. You can use it to send a custom request, but you'll have to handle the formdata and dimensions yourself. Original dimensions ([width, height]) are used to calculate pixel coordinates of the text. You should supply dimensions before any resizing (hence 'original') if you want to get correct coordinates for the original image.
cookies
This property contains an object with the cookies that are set for the instance. You can use it to save and load cookies to avoid going through the consent process every time.
Options object
Options can be empty, or contain the following (default values):
{
chromeVersion: '124.0.6367.60', // Version of Chromium to "use"
userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36', // user agent to use, major Chrome version should match the previous value
headers: {}, // you can add headers here, they'll override the default ones
fetchOptions: {}, // options to pass to fetch function (like agent, dispatcher, etc.)
}
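Since the major version in userAgent should match chromeVersion, it can help to derive both from a single value. The following is a minimal sketch; optionsForChromeVersion is a hypothetical helper, not part of the library, and the user agent template is simply the default string shown above with the major version substituted.

```javascript
// Build an options object whose userAgent major version matches chromeVersion.
// `optionsForChromeVersion` is a hypothetical helper for illustration only.
function optionsForChromeVersion(chromeVersion) {
  // '124.0.6367.60' -> '124'
  const major = chromeVersion.split('.')[0];
  return {
    chromeVersion,
    userAgent:
      'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 ' +
      `(KHTML, like Gecko) Chrome/${major}.0.0.0 Safari/537.36`,
  };
}

// Assumed usage: const lens = new Lens(optionsForChromeVersion('124.0.6367.60'));
```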
class LensResult
An instance of this class is returned by all scan methods. It contains the following properties:
{
language: String, // language of the text in 2-letter format
segments: Array<Segment>
}
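If only the recognized text is needed, the segments can be flattened into a single string. This is a small sketch that relies only on the documented segments and text properties; resultToText is a hypothetical helper, and the segments are joined in whatever order the result provides them.

```javascript
// Join the text of every segment in a LensResult-shaped object.
// `resultToText` is a hypothetical helper; it only reads `segments[].text`.
function resultToText(result, separator = '\n') {
  return result.segments.map(segment => segment.text).join(separator);
}

// Assumed usage: lens.scanByFile('shrimple.png').then(r => console.log(resultToText(r)));
```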
class Segment
An instance of this class is contained in LensResult's segments property. It contains the following properties:
{
text: String, // text of the segment
boundingBox: BoundingBox,
}
class BoundingBox
An instance of this class is contained in Segment's boundingBox property. It contains the following properties:
{
centerPerX: Number, // center of the bounding box on X axis, in % of the image width
centerPerY: Number, // center of the bounding box on Y axis, in % of the image height
perWidth: Number, // width of the bounding box, in % of the image width
perHeight: Number, // height of the bounding box, in % of the image height
pixelCoords: {
x: Number, // top-left corner X coordinate, in pixels
y: Number, // top-left corner Y coordinate, in pixels
width: Number, // width of the bounding box, in pixels
height: Number, // height of the bounding box, in pixels
}
}
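When a scan was made without dimensions (for example scanByURL with the default [0, 0]), pixel coordinates can be recomputed from the relative values once the image size is known. The sketch below assumes the relative values are fractions between 0 and 1; if your results use a 0 to 100 scale, divide by 100 first. toPixelCoords is a hypothetical helper, and its rounding may differ from what the library does internally.

```javascript
// Convert center-based relative coordinates into a top-left pixel box.
// ASSUMPTION: centerPerX, centerPerY, perWidth and perHeight are fractions in the range 0..1.
// `toPixelCoords` is a hypothetical helper, not part of the library.
function toPixelCoords(box, [imageWidth, imageHeight]) {
  const width = box.perWidth * imageWidth;
  const height = box.perHeight * imageHeight;
  return {
    // The box is described by its center, so shift by half the size to get the top-left corner.
    x: Math.round(box.centerPerX * imageWidth - width / 2),
    y: Math.round(box.centerPerY * imageHeight - height / 2),
    width: Math.round(width),
    height: Math.round(height),
  };
}
```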
class LensError extends Error
Instance of this class is thrown when an error happens during the process. It contains the following properties:
{
name: "LensError",
message: String, // error message
code: String, // error code
headers: HeadersList, // headers of the response
body: String, // body of the response
}
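In a catch handler it is often useful to tell a LensError apart from any other failure. The sketch below checks the documented name property and formats the documented code, message and body fields; describeError is a hypothetical helper, and the 200-character body preview is an arbitrary choice.

```javascript
// Produce a short, loggable description of an error thrown during a scan.
// `describeError` is a hypothetical helper; it reads only the documented LensError properties.
function describeError(err) {
  if (err && err.name === 'LensError') {
    // Response bodies can be long, so keep only a short preview.
    const preview = typeof err.body === 'string' ? err.body.slice(0, 200) : '';
    return `LensError [${err.code}]: ${err.message}` + (preview ? `\n${preview}` : '');
  }
  return `Unexpected error: ${err && err.message ? err.message : String(err)}`;
}

// Assumed usage: lens.scanByFile('shrimple.png').catch(e => console.error(describeError(e)));
```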
Using proxy
By default, this library uses undici's fetch to make requests. You can pass an undici dispatcher to proxy requests. Here's an example:
import Lens from 'chrome-lens-ocr';
import { ProxyAgent } from 'undici';
const lens = new Lens({
fetchOptions: {
// placeholder credentials and host, replace with your own proxy URL
dispatcher: new ProxyAgent('http://user:pass@example.com:8080')
}
});
If you use the core class with a different fetch function, you can pass different options instead of dispatcher in fetchOptions (for example, agent for node-fetch).
Using your cookies
You can use your own cookies to be authorized in Google. This is optional. Here's an example:
import Lens from 'chrome-lens-ocr';
const lens = new Lens({
headers: {
// 'cookie' is the only 'special' header that can also accept an object, all other headers should be strings
// use only ONE of the two forms below; if both are present, the later key overrides the earlier one
'cookie': '__Secure-ENID=17.SE=-dizH-; NID=511=---bcDwC4fo0--lgfi0n2-', // way #1
'cookie': { // way #2, better because you can set expiration date and it will be automatically handled, all 3 fields are required in this way
'__Secure-ENID': {
name: '__Secure-ENID',
value: '17.SE=-dizH-',
expires: 1634025600,
},
'NID': {
name: 'NID',
value: '511=---bcDwC4fo0--lgfi0n2-',
expires: 1634025600,
}
}
}
});
Using in other environments
You can use this library in environments that don't support Node.js APIs by importing only the core, which doesn't include the scanByFile and scanByBuffer methods. Instead, it has the scanByData method, which accepts a Uint8Array, a mime type and the original image dimensions. Here's an example:
import LensCore from 'chrome-lens-ocr/src/core.js';
const lens = new LensCore();
lens.scanByData(new Uint8Array([41, 40, 236, 244, 151, 101, 118, 16, 37, 138, 199, 229, 2, 75, 33]), 'image/png', [1280, 720]);
But in this case, you'll need to handle resizing images to less than 1000x1000 dimensions yourself, as larger images aren't supported by Google Lens.
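The target size for such a resize can be computed before handing the image to whatever resizing tool is available. The sketch below only does the arithmetic: it scales the image down to fit a bounding square while keeping the aspect ratio, and never scales up. fitWithin is a hypothetical helper, and treating 1000 as an inclusive limit is an assumption; pass a smaller maxSide if the limit turns out to be strict.

```javascript
// Compute dimensions that fit inside maxSide x maxSide while preserving the aspect ratio.
// ASSUMPTION: a side of exactly 1000 pixels is acceptable; lower `maxSide` if it is not.
// `fitWithin` is a hypothetical helper, not part of the library.
function fitWithin(width, height, maxSide = 1000) {
  // A scale of 1 means the image already fits; never enlarge.
  const scale = Math.min(1, maxSide / width, maxSide / height);
  // Floor so the result never exceeds the limit, but keep at least 1 pixel per side.
  return [
    Math.max(1, Math.floor(width * scale)),
    Math.max(1, Math.floor(height * scale)),
  ];
}
```

After resizing the image to the computed size, the resized bytes go to scanByData together with the dimensions from before the resize (for example [1280, 720]), so that pixel coordinates refer to the original image.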
Additional information
In some of the EU countries, using any Google services requires cookie consent. This library handles it automatically, but it's pretty slow on first scan of the instance. So if you make a lot of new instances or always need it to be fast on first launch, you need to save cookies somewhere to avoid this. There's an example of how to do it in cli.js.
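One way to persist cookies between instances is to serialize the cookies property and feed it back through the cookie header on the next run. The sketch below is not the code from cli.js. It assumes that the cookies property is JSON-serializable and has the same shape as the object form accepted by the cookie header (name, value, expires per cookie). exportCookies and optionsFromSavedCookies are hypothetical helpers, and where the string is stored (file, database, localStorage) is left to the caller.

```javascript
// Serialize the cookies of an existing instance into a string that can be stored anywhere.
// ASSUMPTION: `lens.cookies` is a plain, JSON-serializable object.
function exportCookies(lens) {
  return JSON.stringify(lens.cookies ?? {});
}

// Build constructor options that restore previously exported cookies.
// ASSUMPTION: the saved object matches the object form accepted by the 'cookie' header.
function optionsFromSavedCookies(json, baseOptions = {}) {
  const cookie = JSON.parse(json);
  // Nothing saved yet: leave the options untouched so the library handles consent itself.
  if (Object.keys(cookie).length === 0) return { ...baseOptions };
  return {
    ...baseOptions,
    headers: { ...(baseOptions.headers ?? {}), cookie },
  };
}

// Assumed usage:
//   const saved = exportCookies(lens);                      // after a successful scan
//   const next = new Lens(optionsFromSavedCookies(saved));  // on the next launch
```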
Custom ShareX OCR
It's possible to use this package with ShareX to OCR images using the Google Lens API, instead of the weaker default OCR in ShareX. Please refer to SHAREX.md for instructions.
CLI Usage
You may install this package globally by adding -g on install:
npm install -g chrome-lens-ocr
Doing this will allow you to use the OCR from a terminal.
Usage: chrome-lens-ocr [-d] ./path/to/image.png
-d Do not copy text to clipboard
Example:
chrome-lens-ocr ./shrimple.png
chrome-lens-ocr -d ./shrimple.png
chrome-lens-ocr -d https://lune.dimden.dev/7949f833fa42.png