

Relexer


Regular Expression based lexing of Node.js streams

Relexer on GitHub

Relexer will lex a Node.js readable stream with user-controllable buffering. This allows large files to be analyzed without buffering the entire file in memory.

Quick Start

Below is a TypeScript example lexer that generates tokens from words separated by whitespace:

//example.ts
import * as relexer from 'relexer';

let tokens: string[] = [];

const rules: relexer.Rules = [
    // Match one-or-more non-whitespace characters
    { re: '[^\\s]+', action: async (match, pos) => { tokens.push(match); } },
    // Ignore whitespace
    { re: '\\s+', action: async (match, pos) => { } }
];

const lexer = relexer.create(rules);

lexer.lex(process.stdin).then(() => {
  console.log(tokens);
});

Rules

In the example above, rules is an array of objects, each pairing a regular expression string (re) with an action (action). Note that RegExp objects will not work, since a RegExp's toString() encloses the pattern in slashes, e.g., /xxx/.

Actions are functions that return a promise, resolved when the action is complete. relexer will not process the next token until the returned promise is fulfilled; in the meantime the input stream is paused. If an action rejects its promise, lex (see below) rejects its own promise with the same error. In the example, async arrow functions serve as a concise way to write actions.
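
For instance, an action can do asynchronous work before the next token is read, and relexer keeps the stream paused until that work finishes. A minimal sketch, reusing the create/lex/Rules API from the Quick Start (the delay helper is just an illustration, not part of relexer):

// pause.ts -- illustrative sketch, not part of the relexer distribution
import * as relexer from 'relexer';

// Hypothetical helper: resolve after ms milliseconds.
const delay = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

const rules: relexer.Rules = [
    // The stream stays paused until this promise resolves, so tokens are
    // always handled one at a time, in input order.
    { re: '[^\\s]+', action: async (match, pos) => { await delay(50); console.log(match); } },
    { re: '\\s+', action: async (match, pos) => { } }
];

relexer.create(rules).lex(process.stdin);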

Lex

The lex method begins lexing based on the rules passed to relexer.create. It returns a promise that is resolved when the stream ends and all tokens have been processed. The promise is rejected if no rules match, if there is an I/O error, or if an action rejects its promise.
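
Since all three failure modes surface through that same promise, they can be handled in one place. For instance, the Quick Start snippet could be extended along these lines (a sketch only):

lexer.lex(process.stdin)
  .then(() => console.log(tokens))
  .catch((err) => {
    // One handler covers unmatched input ("No rules matched"),
    // stream I/O errors, and errors used by an action to reject its promise.
    console.error('lexing failed:', err);
    process.exitCode = 1;
  });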

When this program is compiled and run with

$ echo "Hello, how are you today?" | node example.js

it should produce

[ 'Hello,', 'how', 'are', 'you', 'today?']

Buffering

Relexer works by constructing a single large regular expression and using the built-in JavaScript RegExp engine to evaluate it, so relexer has to work within the restrictions of that engine. In particular, there is no way to tell whether a prefix of a string can never match a regular expression. Thus, perfect matching may require buffering the entire stream contents, which is clearly not acceptable for large streams.

As a workaround, relexer limits the size of the buffer it will aggregate in order to match a token against the rule set. By default relexer will buffer 1024 bytes, but this can be changed using the aggregateUntil option to lex. E.g.,

lexer.lex(process.stdin, { aggregateUntil: 10 });

In this case, no more than 10 bytes will be buffered, so any token longer than 10 characters may not come back as a match. However, because the stream can return chunks larger than the aggregateUntil limit, there may be cases where relexer can still return larger tokens.

To continue the Quick Start example with an aggregateUntil limit of 10: if the stream returns data 1 byte at a time, the lexer will not match the token 01234567890, because it is 11 characters long; instead, lex will reject its promise with a No rules matched error. However, if the stream delivers the token in a chunk of 11 or more bytes, relexer will match 01234567890.
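
That scenario can be sketched with a helper stream that emits one character at a time (illustrative only; Readable.from here stands in for any finely chunked source, and the rules are the ones from the Quick Start):

// aggregate-limit.ts -- illustrative sketch
import { Readable } from 'stream';
import * as relexer from 'relexer';

const rules: relexer.Rules = [
    { re: '[^\\s]+', action: async (match, pos) => { console.log('token:', match); } },
    { re: '\\s+', action: async (match, pos) => { } }
];

// Emit the 11-character token one chunk per character, so no more than
// 10 bytes are ever buffered at once.
const oneByteAtATime = Readable.from('01234567890'.split(''));

relexer.create(rules)
    .lex(oneByteAtATime, { aggregateUntil: 10 })
    .catch((err) => {
        // Expected: the buffer fills without any rule matching the whole token,
        // so lex rejects with a "No rules matched" error.
        console.error(err);
    });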

Generally, you should set aggregateUntil to a value large enough for the largest expected token, but small enough that allocating a buffer of that size is not a problem.

The default aggregateUntil limit of 1024 bytes should suffice for most applications.

More information

If you have additional documentation to contribute, please submit a pull request. As time and contributions permit, more documentation will be added. In the meantime, the unit tests in the test directory of the GitHub repository have more examples.