atok-parser

v0.4.4

Parser generator based on the atok tokenizer

Parser builder

Synopsis

Writing parsers is quite a common but sometimes lengthy task. To ease this process atok-parser leverages the atok tokenizer and performs the basic steps to set up a streaming parser, such as:

  • Automatically instantiate a tokenizer with provided options
  • Provide a mechanism to locate an error in the input data
    • track([Boolean]): keep track of the line and column positions to be used when building errors. Note that when set, tracking incurs a performance penalty.
  • Proxy basic node.js streaming methods: write(), end(), pause() and resume()
  • Proxy basic node.js streaming events (note that [data] and [end] are not automatically proxied) and some atok events:
    • [drain]
    • [debug]
  • Provide preset variables within the parser constructor
    • atok {Object}: atok tokenizer instance
    • self {Object}: this
  • Provide helpers that simplify parsing rules (see below for description)
    • whitespace()
    • number()
    • float()
    • word()
    • string()
    • utf8()
    • chunk()
    • stringList()
    • match()
    • noop()
    • wait()
    • nvp()

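The proxying described in the list above can be pictured with a small stand-alone sketch. It is not atok-parser's code: FakeTokenizer and ProxyParser are hypothetical names, and only Node's built-in events and util modules are used. The wrapper forwards write(), end(), pause() and resume() to an inner object and re-emits its drain and debug events, while data and end are left to the parser author.

```javascript
var EventEmitter = require('events').EventEmitter
var util = require('util')

// Hypothetical stand-in for a tokenizer: it records the calls it receives
// and emits 'drain' after each write.
function FakeTokenizer () {
	EventEmitter.call(this)
	this.calls = []
}
util.inherits(FakeTokenizer, EventEmitter)

;['write', 'end', 'pause', 'resume'].forEach(function (method) {
	FakeTokenizer.prototype[method] = function () {
		this.calls.push(method)
		if (method === 'write') this.emit('drain')
		return true
	}
})

// Hypothetical wrapper playing the role of the generated parser
function ProxyParser () {
	EventEmitter.call(this)
	var self = this
	var atok = this.atok = new FakeTokenizer()

	// Forward only drain and debug; data and end are not forwarded
	;['drain', 'debug'].forEach(function (event) {
		atok.on(event, function () {
			var args = Array.prototype.slice.call(arguments)
			self.emit.apply(self, [event].concat(args))
		})
	})
}
util.inherits(ProxyParser, EventEmitter)

// Proxy the streaming methods to the inner tokenizer
;['write', 'end', 'pause', 'resume'].forEach(function (method) {
	ProxyParser.prototype[method] = function () {
		return this.atok[method].apply(this.atok, arguments)
	}
})

var parser = new ProxyParser()
var drained = 0
parser.on('drain', function () { drained++ })

parser.write('abc')
parser.pause()
parser.resume()
parser.end()

console.log(parser.atok.calls.join(','), drained)
```
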
Download

atok-parser is published on npm, the node package manager. To install it, run:

npm install atok-parser

Usage

A silly example to illustrate the various predefined variables and the parser definition. It parses a float number and returns the value via its #parse method.

function myParser (options) {
	function handler (num) {
		// The options are set from the myParser function parameters
		// self is already set to the Parser instance
		if ( options.check && !isFinite(num) )
			return self.emit('error', new Error('Invalid float: ' + num))

		self.emit('data', num)
	}
	// the float() and whitespace() helpers are provided by atok-parser
	atok.float(handler)
	atok.whitespace()
}

// Use require('..') instead when running from the repository's examples/ directory
var Parser = require('atok-parser').createParser(myParser)

// Add the #parse() method to the Parser
Parser.prototype.parse = function (data) {
	var res

	// One (silly) way to make parse() look synchronous...
	this.once('data', function (data) {
		res = data
	})
	this.write(data)

	// ...write() is synchronous
	return res
}

// Instantiate a parser
var p = new Parser({ check: true })

// Parse a valid float
var validfloat = p.parse('123.456 ')
console.log('parsed data is of type', typeof validfloat, 'value', validfloat)

// The following data will produce an invalid float and an error
p.on('error', console.error)
var invalidfloat = p.parse('123.456e1234 ')
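The synchronous-looking parse() above only works because write() emits its events before it returns. A stand-alone sketch of that pattern, using only Node's built-in modules, is given below. ToyFloatParser is a hypothetical stand-in, not something produced by atok-parser, and its write() simply converts the input with Number().

```javascript
var EventEmitter = require('events').EventEmitter
var util = require('util')

// Hypothetical stand-in for a generated parser: write() converts its input
// with Number() and emits the outcome before returning.
function ToyFloatParser (options) {
	EventEmitter.call(this)
	this.options = options || {}
}
util.inherits(ToyFloatParser, EventEmitter)

ToyFloatParser.prototype.write = function (data) {
	var num = Number(String(data).trim())
	if (this.options.check && !isFinite(num))
		return this.emit('error', new Error('Invalid float: ' + num))
	this.emit('data', num)
}

// Same trick as in the example above: listen once, then write
ToyFloatParser.prototype.parse = function (data) {
	var res
	function onData (value) { res = value }

	this.once('data', onData)
	this.write(data)
	// Drop the listener in case write() emitted an error instead of data
	this.removeListener('data', onData)

	return res
}

var toy = new ToyFloatParser({ check: true })
var errors = []
toy.on('error', function (err) { errors.push(err.message) })

var valid = toy.parse('123.456 ')
// The exponent overflows, so Number() yields Infinity and the check fails
var invalid = toy.parse('123.456e1234 ')

console.log(typeof valid, valid)
console.log(invalid, errors)
```

The removeListener() call is an addition of this sketch: without it, a parse that ends in an error would leave the once('data') listener registered, and it would fire on the next successful parse.
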

Methods

  • createParserFromFile(file[, parserOptions, parserEvents, atokOptions]): return a parser class (Function) based on the input file.

    • file {String}: file to read the parser from (.js extension is optional)
    • parserOptions {String}: comma-separated list of parser options
    • parserEvents {Object}: events emitted by the parser with their arguments count
    • atokOptions {Object}: tokenizer options

    The following variables are made available to the parser javascript code:

    • atok {Object}: atok tokenizer instantiated with the provided options. Also set as this.atok (do not delete it)
    • self {Object}: reference to this

    Predefined methods:

    • write(data)
    • end([data])
    • pause()
    • resume()
    • debug([logger (Function)])
    • track(flag (Boolean))

    Events automatically forwarded from tokenizer to parser:

    • drain
    • debug
  • createParser(data[, parserOptions, parserEvents, atokOptions]): same as createParserFromFile() but with supplied content instead of a file name

    • data {String | Array | Function}: the content to be used, can also be an array of strings or a function. If a function, its parameters are used as parser options unless parserOptions is set
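One way to picture "its parameters are used as parser options" is to read the parameter names from the function's source text. The sketch below is an assumption about how this could work, not a description of atok-parser's internals. parameterNames and resolveOptions are hypothetical helpers, and only simple parameter lists (no default values or destructuring) are handled.

```javascript
// Hypothetical helper: list the parameter names of a plain function by
// inspecting its source text. Only simple parameter lists are supported.
function parameterNames (fn) {
	var source = Function.prototype.toString.call(fn)
	var list = source.slice(source.indexOf('(') + 1, source.indexOf(')'))
	return list
		.split(',')
		.map(function (name) { return name.trim() })
		.filter(function (name) { return name.length > 0 })
}

// Hypothetical helper: an explicit comma-separated parserOptions string
// takes precedence over the function's own parameters.
function resolveOptions (fn, parserOptions) {
	return typeof parserOptions === 'string'
		? parserOptions
		: parameterNames(fn).join(',')
}

function myParser (options) {}
function otherParser (check, strict) {}

console.log(resolveOptions(myParser))
console.log(resolveOptions(otherParser))
console.log(resolveOptions(otherParser, 'options'))
```
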

Helpers

Helpers are a set of standard Atok rules organized to match a specific type of data. If the data is encountered, the handler is fired with the results. If not, the rule is ignored. The behaviour of a single helper is the same as a single Atok rule:

  • go to the next rule if no match, unless continue(jump, jumpOnFail) was applied to the helper
  • go back to the first rule of the rule set upon match, unless continue(jump) was applied to the helper
  • next rule set can be set using next(ruleSetId)
  • rules can be jumped over using continue(jump, jumpOnFail). A helper counts as exactly one rule, which makes complex rule sets much easier to define.

// Parse a whitespace separated list of floats
var myParser = [
	'atok.float(function (n) { self.emit("data", n) })'
,	'atok.continue(-1, -2)'
,	'atok.whitespace()'
]

var Parser = require('atok-parser').createParser(myParser)
var p = new Parser

p.on('data', function (num) {
	console.log(typeof num, num)
})
p.end('0.133  0.255')
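The default rule flow (back to the first rule on a match, on to the next rule otherwise) can be sketched without the library. The toy model below makes several assumptions: it works on a complete string rather than a stream, it ignores continue() and next(), and the runRules function and its rule objects are hypothetical. It is not how atok is implemented.

```javascript
// Toy model of the default flow only (no continue() or next()):
// - on a match, run the handler and go back to the first rule
// - on no match, try the next rule
// Each hypothetical rule is a sticky regular expression plus an optional handler.
function runRules (rules, input) {
	var pos = 0
	var index = 0

	while (pos < input.length && index < rules.length) {
		var rule = rules[index]
		rule.pattern.lastIndex = pos
		var m = rule.pattern.exec(input)

		if (m && m[0].length > 0) {
			if (rule.handler) rule.handler(m[0])
			pos += m[0].length
			index = 0 // match: back to the first rule
		} else {
			index++ // no match: next rule
		}
	}
	// Number of characters consumed before the input or the rules ran out
	return pos
}

var floats = []
var rules = [
	{ pattern: /\d+(\.\d+)?/y, handler: function (text) { floats.push(Number(text)) } }
,	{ pattern: /\s+/y }
]

var consumed = runRules(rules, '0.133  0.255')
console.log(floats, consumed)
```
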

Arguments are not required. If no handler is specified, the [data] event will be emitted with the corresponding data.

  • whitespace(handler): ignore consecutive spaces, tabs, line breaks.
    • handler(whitespace)
  • number(handler): process positive integers
    • handler(num)
  • float(handler): process float numbers. NB. the result can be an invalid float (NaN or Infinity).
    • handler(floatNumber)
  • word(handler): process a word containing letters, digits and underscores
    • handler(word)
  • string([start, end, esc,] handler): process a delimited string. If end is not supplied, it is set to start.
    • start {String}: starting pattern (default=")
    • end {String}: ending pattern (default=")
    • esc {String}: escape character (default=\)
    • handler(string)
  • utf8([start, end,] handler): process a delimited string containing UTF-8 encoded characters. If end is not supplied, it is set to start.
    • start {String}: starting pattern (default=")
    • end {String}: ending pattern (default=")
    • handler(UTF-8String)
  • chunk(charSet, handler): process a chunk of characters matching the given charset
    • charSet {Object}: object defining the charsets to be used as matching characters, e.g. { start: 'aA', end: 'zZ' } matches all letters
    • handler(chunk)
  • stringList([start, end, separator,] handler): process a delimited list of strings
    • start {String}: starting pattern (default=()
    • end {String}: ending pattern (default=))
    • separator {String}: separator character (default=,)
    • handler(listOfStrings)
  • match(start, end, stringQuotes, handler): find a matching pattern (e.g. bracket matching), skipping string content if required
    • start {String}: starting pattern to look for
    • end {String}: ending pattern to look for
    • stringQuotes {Array}: array of string delimiters (default=['"', "'"]). Use an empty array to disable string content processing
    • handler(token)
  • noop(next): passthrough - does not do anything except applying given properties (useful to branch rules without having to use atok#saveRuleSet() and atok#loadRuleSet())
    • next {String}: next ruleset to load
  • wait(atokPattern[...atokPattern], handler): wait for the given pattern. Nothing happens until data that triggers the pattern is received. Must be preceded by continue() to work properly. A typical use is a string whose starting quote has been received but not its ending quote: wait until it arrives, then resume the rules workflow.
  • nvp([nameCharSet, separator, endPattern,] handler): parse a named value pair (defaults: nameCharSet={ start: 'aA0_', end: 'zZ9_' }, separator='=', endPattern={ firstOf: ' \t\n\r' }). Disable endPattern by setting it to '' or [].
    • handler(name, value)
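The charset objects used by chunk() and nvp() pair up the characters of start and end. Going by the two examples given above, character i of start and character i of end appear to delimit one inclusive range. The sketch below implements that reading in plain JavaScript. inCharSet and leadingChunk are hypothetical helpers, not part of atok-parser.

```javascript
// Hypothetical reading of a charset object: character i of `start` and
// character i of `end` delimit one inclusive range.
function inCharSet (ch, charSet) {
	for (var i = 0; i < charSet.start.length; i++) {
		if (ch >= charSet.start[i] && ch <= charSet.end[i]) return true
	}
	return false
}

// Return the leading run of characters that belong to the charset
function leadingChunk (input, charSet) {
	var i = 0
	while (i < input.length && inCharSet(input[i], charSet)) i++
	return input.slice(0, i)
}

var letters = { start: 'aA', end: 'zZ' }
var nameChars = { start: 'aA0_', end: 'zZ9_' }

console.log(leadingChunk('abcDEF123', letters))
console.log(leadingChunk('key_1=value', nameChars))
```
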

Examples

A set of examples is located under the examples/ directory.