
lexington v1.0.3

A simple yet flexible lexer (or tokenizer).

Downloads: 4

lexington

Lexington is a simple library to transform text into a flat list of tokens. It works in Node and in the browser.

Installation

npm install lexington

Example Usage


var Lexington = require('lexington');

var lexer = new Lexington('Synergy lol wut hahahaa', (stream, state) => {
    // Each rule tries to consume characters from the stream; the first
    // match tags the token with the returned types.
    if (stream.eat('synergy|cloud|greenshoots', 'i')) {
        return ['buzzword'];
    }

    if (stream.eat('[ha]{2,}([^\\w]|$)', 'i')) {
        return ['laughter'];
    }

    if (stream.eat('[a-zA-Z0-9]+')) {
        return ['word'];
    }

    if (stream.eat('[\\s]+')) {
        return ['whitespace'];
    }

    return ['other'];
});

console.log(lexer.getTokens());

Will output:

[
    { text: 'Synergy', types: [ 'buzzword' ],   startIndex: 0,  endIndex: 6  },
    { text: ' ',       types: [ 'whitespace' ], startIndex: 7,  endIndex: 7  },
    { text: 'lol',     types: [ 'word' ],       startIndex: 8,  endIndex: 10 },
    { text: ' ',       types: [ 'whitespace' ], startIndex: 11, endIndex: 11 },
    { text: 'wut',     types: [ 'word' ],       startIndex: 12, endIndex: 14 },
    { text: ' ',       types: [ 'whitespace' ], startIndex: 15, endIndex: 15 },
    { text: 'hahahaa', types: [ 'laughter' ],   startIndex: 16, endIndex: 22 }
]

Documentation

To lex some text, create a new instance of Lexington and then get the tokens back out.

var lexer = new Lexington('string to lex', function(stream, state){
    // Your code here.
});

// This will spit out your tokens:
console.log(lexer.getTokens()); 

The first argument is the text that you would like to lex.

The second argument is the function that defines how you would like to lex the text. :loudspeaker: On every call, your lexer function must return an array describing that token's types; this advances the stream to the next unlexed character and calls your lexing function again.

Your lexer function is always called with stream and state.

The state argument is a simple object whose contents are kept across function calls. For example, if one call of your lexer function sets state.foo = true, the next call can read state.foo. This is very much inspired by CodeMirror.
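As a quick sketch of how state can be used (illustrative only, assuming state starts out as an empty object), here is a lexer that counts words as it goes:

var Lexington = require('lexington');

var lexer = new Lexington('one two three four five six', (stream, state) => {
    if (stream.eat('[a-zA-Z]+')) {
        // state persists across calls, so the count accumulates.
        state.count = (state.count || 0) + 1;
        return state.count % 3 === 0 ? ['word', 'third'] : ['word'];
    }
    if (stream.eat('[\\s]+')) {
        return ['whitespace'];
    }
    return ['other'];
});

// Every third word token also carries the 'third' type.
console.log(lexer.getTokens());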

The so-called 'stream' has the following methods:

  • stream.current(): Returns the text that the current call of your lexer function has matched so far (its 'scope', if you like).

  • stream.match(regex, flags): Given a regex string, e.g. '[a-z]+', and optional regex flags (like the RegExp constructor accepts), returns true or false depending on whether the next characters in the stream match.

  • stream.eat(regex, flags): The same as stream.match(), but it also 'consumes' whatever matched, advancing the stream past it. See the example above, and the sketch below.
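To make the match/eat distinction concrete, here is a small sketch (not from the original README; the rules are illustrative): match() only peeks, while eat() advances the stream.

var Lexington = require('lexington');

var lexer = new Lexington('abc123', (stream, state) => {
    // match() peeks without consuming, so the stream does not move...
    if (stream.match('[0-9]+')) {
        // ...and eat() then consumes those same characters.
        stream.eat('[0-9]+');
        return ['number'];
    }
    if (stream.eat('[a-z]+')) {
        return ['letters'];
    }
    return ['other'];
});

// Expect a 'letters' token for 'abc' followed by a 'number' token for '123'.
console.log(lexer.getTokens());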

In the browser

If you would like to use this in the browser, just use /build/lexington.js. You can use the module loader of your choice, or access it globally as Lexington.
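For example, once the file has been loaded with a script tag, the global can be used directly (a minimal sketch; the rules are illustrative):

// Assumes /build/lexington.js has been loaded via a <script> tag,
// which exposes Lexington as a global.
var lexer = new Lexington('hello world', function(stream, state) {
    if (stream.eat('[a-zA-Z]+')) return ['word'];
    if (stream.eat('[\\s]+')) return ['whitespace'];
    return ['other'];
});
console.log(lexer.getTokens());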

TODO

  • If the user-defined lexer function doesn't return anything, Lexington currently recurses endlessly. It might be nice to add a check for this and throw an error instead.
  • If you want to use regex character classes such as /[\w]+/ in stream.match() or stream.eat(), you have to escape the backslash, e.g. stream.match('[\\w]+'). This is due to how RegExp() handles string patterns :poop: (see the sketch after this list).
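For illustration (a quick sketch, not part of the library), the escaping issue comes from JavaScript string literals rather than from Lexington itself:

// In a string literal, '\w' collapses to a plain 'w', so the character
// class silently changes meaning:
new RegExp('[\w]+');   // same as /[w]+/  -- matches only the letter 'w'
new RegExp('[\\w]+');  // same as /[\w]+/ -- matches word characters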