aws-lambda-tesseract-french
v1.0.0
Tesseract 5.1 (with French training data) to fit inside AWS Lambda, forked from shelf.io's work
aws-lambda-tesseract
Tesseract 5.1 (with French training data) to fit inside AWS Lambda
Forked from https://github.com/shelfio/aws-lambda-tesseract. All credit goes to shelf.io; I only compiled Tesseract 5.1 with French language data, changed the parameters passed to the CLI, and published it.
Inspired by chrome-aws-lambda & lambda-scanner-ocr
Install
$ yarn add aws-lambda-tesseract-french
Works with the Node 16.x runtime and is compiled with Tesseract 5.1.0. For now it supports x86_64 CPUs only.
How does it work?
This package contains an archive with Tesseract 5.1 compiled for use in the AWS Lambda environment.
When a Lambda starts, it unpacks the archive containing the binary into the /tmp
folder and makes sure this happens only once per Lambda cold start.
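The "unpack once per cold start" behaviour can be sketched roughly as below. This is only an illustration, not the package's actual code: `ensureUnpacked`, the `unpack` callback, and the target directory are hypothetical names. It relies on the fact that module-level state in a Lambda survives between warm invocations of the same container.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical target directory; the real package may use a different path.
const DEFAULT_TARGET_DIR = path.join('/tmp', 'tesseract-standalone');

// Module-level flag: reset only when a new container starts (cold start).
let unpacked = false;

// `unpack` is a hypothetical callback that extracts the bundled archive
// into `targetDir`. It runs at most once per container.
function ensureUnpacked(unpack, targetDir = DEFAULT_TARGET_DIR) {
  if (!unpacked) {
    // Skip extraction if the directory is already there.
    if (!fs.existsSync(targetDir)) {
      unpack(targetDir);
    }
    unpacked = true;
  }
  return targetDir;
}
```

A handler would call `ensureUnpacked(...)` before invoking the binary; on warm invocations the call returns immediately without touching the archive.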
Usage
const {getTextFromImage, isSupportedFile} = require('aws-lambda-tesseract-french');

module.exports.handler = async event => {
  // assuming there is a photo.jpg inside /tmp dir
  // original file will be deleted afterwards
  if (!isSupportedFile('/tmp/photo.jpg')) {
    return false;
  }

  return getTextFromImage('/tmp/photo.jpg');
};
isSupportedFile
checks that the file has an image-like file extension and that the extension is not on the list of
file extensions Tesseract does not support.
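A check of this kind could look roughly like the sketch below. It is not the package's implementation: the function name `looksSupported` and both extension lists are made up for illustration, and the real lists may differ.

```javascript
const path = require('path');

// Hypothetical lists, for illustration only.
const IMAGE_EXTENSIONS = new Set(['jpg', 'jpeg', 'png', 'tif', 'tiff', 'bmp', 'svg', 'psd']);
const UNSUPPORTED_EXTENSIONS = new Set(['svg', 'psd']);

// Returns true when the extension looks like an image format
// and is not on the unsupported list.
function looksSupported(filePath) {
  const ext = path.extname(filePath).slice(1).toLowerCase();
  return IMAGE_EXTENSIONS.has(ext) && !UNSUPPORTED_EXTENSIONS.has(ext);
}
```

Note that a check like this inspects only the file name, not the file contents, so a mislabeled file would still pass.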
Compile It Yourself
Smoke test that it works by running the test.sh script.
See Also
Publish
$ git checkout master
$ yarn version
$ yarn publish
$ git push origin master --tags
License
MIT © Shelf