aws-textract-client
v1.1.1
A comprehensive helper library for AWS Textract, including S3, SNS, and SQS.
AWS Textract Client
A helper class for performing server-side AWS Textract actions.
Install
npm i aws-textract-client
Features
- upload documents to S3 bucket
- create and delete SNS Topics
- create and delete SQS Queues and subscribe to SNS topic (required for async multi-page processing)
- various AWS Textract actions
  - Analyze document
  - Detect document
  - Analyze invoices
- process & simplify results
Example
Create SNS Topic and SQS Queue
Topic and queue names for Textract should start with AmazonTextract!
import { AWSTextractClient } from "aws-textract-client";
// config contains AWS credentials settings and is used for all client instances
const textractClient = new AWSTextractClient(config);
// create topic and queue once
const topicArn = await textractClient.createTopic("AmazonTextractMyTopic");
const queueUrl = await textractClient.createQueue(
  "AmazonTextractMyQueue",
  topicArn
);
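The naming rule above is easy to get wrong, so a small guard can catch a bad name before any AWS call is made. The following is a minimal sketch; `assertTextractName` is a hypothetical helper and not part of aws-textract-client. The prefix requirement most likely comes from the AWS managed policy AmazonTextractServiceRole, which only permits publishing to SNS topics whose names begin with AmazonTextract.

```typescript
// Prefix expected for Textract notification resources.
const TEXTRACT_PREFIX = "AmazonTextract";

// Hypothetical helper (not part of aws-textract-client):
// returns the name unchanged, or throws if the prefix is missing.
function assertTextractName(name: string): string {
  if (!name.startsWith(TEXTRACT_PREFIX)) {
    throw new Error(`"${name}" must start with "${TEXTRACT_PREFIX}"`);
  }
  return name;
}

// Validate names before passing them to createTopic / createQueue.
const checkedTopicName = assertTextractName("AmazonTextractMyTopic");
const checkedQueueName = assertTextractName("AmazonTextractMyQueue");
console.log(checkedTopicName, checkedQueueName);
```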
Process Invoice
import { createReadStream } from "fs";
import { AWSTextractClient } from "aws-textract-client";
// config contains AWS credentials settings and is used for all client instances
const textractClient = new AWSTextractClient(config);
// set role and topic of notification channel
textractClient.setNotificationChannel(roleArn, topicArn);
// upload document to S3 bucket
await textractClient.uploadDocument(
  S3_BUCKET,
  file,
  createReadStream(pathToYourFile)
);
// process document and get results
const results = await textractClient.processDocument(
  AWSTextractClient.TYPE_EXPENSE,
  S3_BUCKET,
  file,
  queueUrl
);
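For context on what travels through the queue during asynchronous processing: Textract publishes a job-completion message to the SNS topic, and the subscribed SQS queue receives it wrapped in an SNS envelope whose `Message` field is itself a JSON string. The sketch below unwraps such a body. It assumes raw message delivery is disabled on the subscription; `parseJobNotification` and the sample values are hypothetical, and this README does not show how aws-textract-client handles this step internally.

```typescript
// Shape of the completion message Textract publishes to the SNS topic.
interface TextractJobNotification {
  JobId: string;
  Status: string; // e.g. "SUCCEEDED" or "FAILED"
  API: string;
  JobTag?: string;
  Timestamp: number;
  DocumentLocation: { S3Bucket: string; S3ObjectName: string };
}

// Hypothetical helper: unwrap the SNS envelope found in an SQS message body.
function parseJobNotification(sqsBody: string): TextractJobNotification {
  const envelope = JSON.parse(sqsBody) as { Message?: unknown };
  if (typeof envelope.Message !== "string") {
    throw new Error("SQS body is not an SNS envelope");
  }
  return JSON.parse(envelope.Message) as TextractJobNotification;
}

// Illustrative body with made-up values.
const sampleBody = JSON.stringify({
  Type: "Notification",
  Message: JSON.stringify({
    JobId: "example-job-id",
    Status: "SUCCEEDED",
    API: "StartExpenseAnalysis",
    Timestamp: 1700000000000,
    DocumentLocation: { S3Bucket: "my-bucket", S3ObjectName: "invoice.pdf" },
  }),
});

const notification = parseJobNotification(sampleBody);
console.log(notification.JobId, notification.Status);
```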
Results
Each result contains up to three confidence values: usually one for the label, one for the value, and one for the pre-trained invoice field mapping.
You can also get the position of the text on the document in the form of bounding boxes.
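Textract reports each bounding box as ratios of the page width and height rather than in pixels, so drawing a box on a rendered page needs a conversion. A minimal sketch, assuming the raw Textract `BoundingBox` shape (`Left`, `Top`, `Width`, `Height`); `toPixelBox` and the page size are hypothetical and not part of aws-textract-client.

```typescript
// Bounding box as returned by Textract: all values are ratios in [0, 1].
interface BoundingBox {
  Left: number;
  Top: number;
  Width: number;
  Height: number;
}

interface PixelBox {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Hypothetical helper: scale the ratios by the rendered page size.
function toPixelBox(
  box: BoundingBox,
  pageWidth: number,
  pageHeight: number
): PixelBox {
  return {
    x: Math.round(box.Left * pageWidth),
    y: Math.round(box.Top * pageHeight),
    width: Math.round(box.Width * pageWidth),
    height: Math.round(box.Height * pageHeight),
  };
}

// Example: a box on a page rendered at 1000 x 2000 pixels.
const pixelBox = toPixelBox(
  { Left: 0.1, Top: 0.25, Width: 0.5, Height: 0.05 },
  1000,
  2000
);
console.log(pixelBox);
```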
Example response:
{
  "$metadata": {
    "httpStatusCode": 200,
    "requestId": "d15315f4-239b-40d8-87f2-8e2731a31e4c",
    "attempts": 1,
    "totalRetryDelay": 0
  },
  "AnalyzeExpenseModelVersion": "1.0",
  "DocumentMetadata": {
    "Pages": 1
  },
  "ExpenseDocuments": [
    {
      "Blocks": [...],
      "ExpenseIndex": 1,
      "LineItemGroups": [...],
      "SummaryFields": [...]
    }
  ],
  "JobStatus": "SUCCEEDED"
}
The JobStatus field is only present for multi-page (asynchronous) processing.
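To illustrate how the three confidence values line up with the response, the sketch below flattens `SummaryFields` into simple records. It assumes the raw AnalyzeExpense shape shown in the example response, where each summary field has `Type`, `LabelDetection`, and `ValueDetection` entries; the interfaces are trimmed to the fields used, and `summarizeFields` and the sample data are hypothetical rather than part of aws-textract-client.

```typescript
// Trimmed-down view of the AnalyzeExpense response (only fields used here).
interface Detection {
  Text?: string;
  Confidence?: number; // 0 to 100
}

interface ExpenseField {
  Type?: Detection; // pre-trained invoice field mapping, e.g. "TOTAL"
  LabelDetection?: Detection; // label as printed on the document
  ValueDetection?: Detection; // detected value
}

interface ExpenseResult {
  ExpenseDocuments?: { SummaryFields?: ExpenseField[] }[];
  JobStatus?: string; // only present for multi-page processing
}

interface SimpleField {
  type?: string;
  label?: string;
  value?: string;
  typeConfidence?: number;
  labelConfidence?: number;
  valueConfidence?: number;
}

// Hypothetical helper: collect summary fields from all expense documents.
function summarizeFields(result: ExpenseResult): SimpleField[] {
  // JobStatus is absent for single-page calls, so only reject explicit failures.
  if (result.JobStatus !== undefined && result.JobStatus !== "SUCCEEDED") {
    throw new Error(`Job did not succeed: ${result.JobStatus}`);
  }
  const fields: SimpleField[] = [];
  for (const doc of result.ExpenseDocuments || []) {
    for (const field of doc.SummaryFields || []) {
      fields.push({
        type: field.Type ? field.Type.Text : undefined,
        label: field.LabelDetection ? field.LabelDetection.Text : undefined,
        value: field.ValueDetection ? field.ValueDetection.Text : undefined,
        typeConfidence: field.Type ? field.Type.Confidence : undefined,
        labelConfidence: field.LabelDetection
          ? field.LabelDetection.Confidence
          : undefined,
        valueConfidence: field.ValueDetection
          ? field.ValueDetection.Confidence
          : undefined,
      });
    }
  }
  return fields;
}

// Illustrative input with made-up values.
const sampleResult: ExpenseResult = {
  ExpenseDocuments: [
    {
      SummaryFields: [
        {
          Type: { Text: "TOTAL", Confidence: 99.5 },
          LabelDetection: { Text: "Total", Confidence: 98.1 },
          ValueDetection: { Text: "$42.00", Confidence: 97.3 },
        },
      ],
    },
  ],
  JobStatus: "SUCCEEDED",
};

const simpleFields = summarizeFields(sampleResult);
console.log(simpleFields);
```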