@ascari/reco

v1.0.1

Generic text classifier for Mexican electronic invoices.

reco

A text recognition engine that classifies concept descriptions found in electronic invoices used in Mexico.

Installation

Command Line Utility

You must install reco globally to use the command line interface.

npm i @ascari/reco -g

Module

npm i @ascari/reco --save

Usage

Command Line Utility

Create reco.json

reco init

A ./reco.json file will be created with default values. By default it uses a sqlite3 database.

You may edit the configuration now.

Scaffold a new project

Requires a valid ./reco.json file to be present.

reco create

A database will be scaffolded in the current directory.

By default it creates a ./database folder where a sqlite3 database file will be stored.

You may edit the first migration, ./database/migrations/0.js, to better accommodate your database structure. Keep in mind that the autogenerated tables and columns are required.


NOTE: The following commands can only be called after creating a project.


Add a single invoice

Loads an XML invoice, parses it, and stores the unique suppliers, clients, and concepts found.

reco xml path/to/invoice.xml

Note: Ideally, valid SAT invoices should be fed to reco. However, reco does not verify their integrity, so you can feed non-compliant XML invoices as well, as long as they follow a similar structure:

<?xml version="1.0" encoding="utf-8"?>
<Comprobante fecha="{{INVOICE_DATE}}" sello="{{SELLO_DIGITAL}}">
  <Emisor rfc="{{CLIENT_RFC}}" name="{{CLIENT_NAME}}" />
  <Receptor rfc="{{SUPPLIER_RFC}}" name="{{SUPPLIER_NAME}}" />
  <Conceptos>
    <Concepto descripcion="{{CONCEPT_DESCRIPTION_A}}" />
    <Concepto descripcion="{{CONCEPT_DESCRIPTION_B}}" />
  </Conceptos>
</Comprobante>
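
For experimenting, a string with this structure can be generated programmatically. The following is a minimal sketch, not part of reco: `buildInvoiceXml` and `escapeAttr` are hypothetical helpers, and every sample value is made up. The element and attribute names are copied from the template above.

```javascript
// Hypothetical helpers, not part of reco. They only reproduce the
// minimal structure shown in the template above.

// Escape characters that are unsafe inside a double-quoted XML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build an invoice string with one <Concepto> per description.
function buildInvoiceXml({ fecha, sello, emisor, receptor, conceptos }) {
  const conceptLines = conceptos
    .map((d) => `    <Concepto descripcion="${escapeAttr(d)}" />`)
    .join('\n');
  return [
    '<?xml version="1.0" encoding="utf-8"?>',
    `<Comprobante fecha="${escapeAttr(fecha)}" sello="${escapeAttr(sello)}">`,
    `  <Emisor rfc="${escapeAttr(emisor.rfc)}" name="${escapeAttr(emisor.name)}" />`,
    `  <Receptor rfc="${escapeAttr(receptor.rfc)}" name="${escapeAttr(receptor.name)}" />`,
    '  <Conceptos>',
    conceptLines,
    '  </Conceptos>',
    '</Comprobante>',
  ].join('\n');
}

// Sample values are invented for illustration.
const xml = buildInvoiceXml({
  fecha: '2024-01-15T10:00:00',
  sello: 'PLACEHOLDER',
  emisor: { rfc: 'AAA010101AAA', name: 'Example Issuer' },
  receptor: { rfc: 'XXX0123456X7', name: 'Example Receiver' },
  conceptos: ['PIZZA "GRANDE"', 'REFRESCO 600 ML'],
});
console.log(xml);
```

The resulting string can be written to a file and passed to `reco xml`, or passed to `Reco::addXmlInvoice` when using the module.
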

Add all invoices in a folder

Load and store all invoice files found in a folder.

reco xmls path/to/invoices

Currently, reco cannot read folders recursively.

Label a concept for training

Creates a label that is used to train a classifier.

reco label "LABEL" "CONCEPT"

You may use the --rfc option to scope a label to a supplier.

reco label "LABEL" "CONCEPT" --rfc XXX0123456X7

Scoping a label improves recognition accuracy. The improvement comes from giving a higher weight to classifications that belong to a supplier when the recognition test is also scoped to that supplier. The reasoning is that a supplier generally has its own set of concepts for its products and services, and these are more likely to match a label scoped to the same supplier.

In other words, when we classify a concept from a Pizza supplier, labels scoped to that supplier will tend to identify concepts containing the word "pizza" better, since its products have "pizza" in their names. Otherwise, a Toy supplier with toy pizza games may rank higher.

One more time: when we know an invoice concept comes from a certain supplier, it is better to test it against a classifier that has been trained only on invoice concepts and labels from that same supplier.
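
The effect of scoping can be pictured with a toy ranking function. This is only an illustration of the idea and not reco's actual scoring: `rankLabels`, the `boost` multiplier, the scores, and the RFC values are all invented for the example.

```javascript
// Toy illustration only; reco's real scoring may work differently.
// `boost` is a made-up multiplier applied to labels scoped to the
// supplier the test is scoped to.
function rankLabels(candidates, supplierRfc = null, boost = 1.5) {
  return candidates
    .map((c) => ({
      label: c.label,
      score: supplierRfc !== null && c.rfc === supplierRfc ? c.score * boost : c.score,
    }))
    .sort((a, b) => b.score - a.score);
}

// Invented raw scores for the concept "PIZZA GRANDE".
const candidates = [
  { label: 'toy-pizza-game', rfc: 'TOY0123456T1', score: 0.6 },
  { label: 'pizza', rfc: 'PIZ0123456P1', score: 0.5 },
];

console.log(rankLabels(candidates)[0].label);                 // toy-pizza-game
console.log(rankLabels(candidates, 'PIZ0123456P1')[0].label); // pizza
```

Without a scope the toy label wins on raw score. Once the test is scoped to the Pizza supplier, the label from that supplier is weighted higher and ranks first.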

Add all labels found in a file

Add labels found in a list.

reco labels path/to/labels.lst

Entries are separated by new lines. Within each entry, the label and the concept are separated by a colon (:).

Example:

apple:I WANT AN APPLE
orange:ORANGE YOU GLAD?
lemon:EAT SOME LEMON PIE

You may use the -v option to see progress.

You may use the --rfc option to scope labels to a supplier.

You may use the --delim option to specify a different delimiter.

You may use the --no-delim option to specify that the list is not delimited. In that case each line does not contain a separate label and concept; the label and the concept are the same.

This is useful for adding a supplier's catalog when identifying their concepts.
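
To make the file format concrete, here is a small sketch of how such a list could be parsed. It is not reco's parser: `parseLabels` is a hypothetical function, and it assumes that blank lines are skipped, that a line is split at the first occurrence of the delimiter, and that a missing delimiter is an error. reco may behave differently in each of these cases.

```javascript
// Hypothetical parser, not part of reco. The `delim` and `noDelim`
// options mirror the --delim and --no-delim CLI options.
function parseLabels(text, { delim = ':', noDelim = false } = {}) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0) // assumption: blank lines are ignored
    .map((line) => {
      // Undelimited list: the label and the concept are the same text.
      if (noDelim) return { label: line, concept: line };
      // Assumption: split at the first delimiter only.
      const at = line.indexOf(delim);
      if (at === -1) throw new Error(`Missing "${delim}" in line: ${line}`);
      return { label: line.slice(0, at), concept: line.slice(at + delim.length) };
    });
}

const list = 'apple:I WANT AN APPLE\norange:ORANGE YOU GLAD?\nlemon:EAT SOME LEMON PIE\n';
console.log(parseLabels(list));
console.log(parseLabels('PIZZA GRANDE\nPIZZA CHICA\n', { noDelim: true }));
```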

Train classifiers

Train classifiers.

Be patient, it may take a while.

reco train

Test an arbitrary concept

Tests recognition by classifying the specified concept. Returns the label with the best score.

reco test "CONCEPT"

You may use the --rfc option to scope the test to a supplier.

You may use the -v option to return classification information. 10 rows are returned by default. Pass a number, for example -v 20, to set how many classification rows are returned, ordered from best match to worst.

Module

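// The package is published under the @ascari scope.
const Reco = require('@ascari/reco');

// Reco configuration (see the API section for the available options)
const recoConfig = { /* ... */ };

// instantiate
const reco = new Reco(recoConfig);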

API

Reco::constructor(recoConfig);

Where recoConfig can have the following options:

Note: the database property is passed to knex.

{
  database: {
    client: 'sqlite3',
    connection: {
      filename: './database/database.sqlite'
    },
    migrations: {
      tableName: 'migrations',
      directory: './database/migrations',
      stub: './database/stub.migration.js'
    },
    seeds: {
      directory: './database/seeds',
    },
    useNullAsDefault: true,
  },
}

Promise Reco::addLabel(String label, String concept, String [supplierRfc=null]);

Add a label.

Promise Reco::addXmlInvoice(String xml);

Add an invoice.

Promise Reco::train();

Train classifiers.

Promise Reco::test(String input, String [supplierRfc=null]);

Test classifiers.
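
Put together, the methods above might be used as follows. This is a hedged sketch: `labelTrainAndTest` is a hypothetical wrapper, the label, concepts, and RFC are sample values, and the shape of the value that `test` resolves to is not documented here, so it is returned untouched. It also assumes the database has already been scaffolded.

```javascript
// Hypothetical wrapper, not part of reco. `reco` is any object that
// exposes the addLabel, train and test methods documented above.
async function labelTrainAndTest(reco, supplierRfc = null) {
  // Sample label and concept, optionally scoped to a supplier.
  await reco.addLabel('pizza', 'PIZZA GRANDE DE PEPPERONI', supplierRfc);

  // Training may take a while.
  await reco.train();

  // Resolve with whatever test() resolves to.
  return reco.test('PIZZA GRANDE', supplierRfc);
}

// Usage, assuming the module is installed and a project was created:
// const Reco = require('@ascari/reco');
// const reco = new Reco(recoConfig);
// labelTrainAndTest(reco, 'XXX0123456X7').then(console.log).catch(console.error);
```

Taking the instance as a parameter keeps the flow easy to try against a stub before pointing it at a real database.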

License

See LICENSE in the repository.