
ret

v0.5.0


Tokenizes a string that represents a regular expression.

Downloads

81,269,840

Readme

Regular Expression Tokenizer

Tokenizes strings that represent regular expressions.


Usage

const ret = require('ret');

let tokens = ret(/foo|bar/.source);

tokens will contain the following object:

{
  "type": ret.types.ROOT,
  "options": [
    [ { "type": ret.types.CHAR, "value": 102 },
      { "type": ret.types.CHAR, "value": 111 },
      { "type": ret.types.CHAR, "value": 111 } ],
    [ { "type": ret.types.CHAR, "value":  98 },
      { "type": ret.types.CHAR, "value":  97 },
      { "type": ret.types.CHAR, "value": 114 } ]
  ]
}
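As a rough illustration of reading that structure, the sketch below decodes each alternative back into text. It uses a hand-written copy of the object above rather than calling ret, so it runs without the package installed. The `types` stand-in (including its numeric values) and the `alternativesToStrings` helper are assumptions made for this example, not part of ret's API.

```javascript
// Stand-in for ret.types so this sketch runs without the package installed.
// The numeric values are assumptions; with ret available, use require('ret').types.
const types = { ROOT: 0, CHAR: 7 };

// Hand-written copy of the token tree shown above for /foo|bar/.
const tokens = {
  type: types.ROOT,
  options: [
    [102, 111, 111].map((value) => ({ type: types.CHAR, value })),
    [98, 97, 114].map((value) => ({ type: types.CHAR, value })),
  ],
};

// Hypothetical helper: decode each alternative's CHAR tokens back into text.
function alternativesToStrings(root) {
  return root.options.map((branch) =>
    branch.map((token) => String.fromCharCode(token.value)).join('')
  );
}

console.log(alternativesToStrings(tokens)); // [ 'foo', 'bar' ]
```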

Reconstructing Regular Expressions from Tokens

The reconstruct function accepts any token and returns, as a string, the component of the regular expression that is associated with that token.

import ret, { reconstruct, types } from 'ret'
const tokens = ret(/foo|bar/.source)
const setToken = {
    "type": types.SET,
    "set": [
      { "type": types.CHAR, "value": 97 },
      { "type": types.CHAR, "value": 98 },
      { "type": types.CHAR, "value": 99 }
    ],
    "not": true
  }
reconstruct(tokens)                               // 'foo|bar'
reconstruct({ "type": types.CHAR, "value": 102 }) // 'f'
reconstruct(setToken)                             // '^abc'

Token Types

ret.types is a collection of the various token types exported by ret.

ROOT

Only used in the root of the regexp. This is needed due to the possibility of the root containing a pipe | character. In that case, the token will have an options key that will be an array of arrays of tokens. If not, it will contain a stack key that is an array of tokens.

{
  "type": ret.types.ROOT,
  "stack": [token1, token2...],
}
{
  "type": ret.types.ROOT,
  "options": [
    [token1, token2...],
    [othertoken1, othertoken2...]
    ...
  ],
}
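Because a root token carries either stack or options, code that walks the tree usually has to handle both shapes. One possible way to smooth that over is sketched below. `branchesOf` is a hypothetical helper, not something ret exports, and the token objects are hand-written stand-ins.

```javascript
// Hypothetical helper: return the alternatives of a token as an array of
// token arrays, whether or not the pattern contained a pipe.
function branchesOf(token) {
  if (Array.isArray(token.options)) return token.options;
  return [token.stack || []];
}

// Hand-written stand-ins for the two shapes described above.
const withoutPipe = { stack: [{ value: 97 }, { value: 98 }] };
const withPipe = { options: [[{ value: 97 }], [{ value: 98 }]] };

console.log(branchesOf(withoutPipe).length); // 1
console.log(branchesOf(withPipe).length);    // 2
```

A single-branch pattern then looks like a one-element list of alternatives, so downstream code only needs one loop.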

GROUP

A group contains the tokens inside a pair of parentheses. If the group begins with ? followed by another character, it is a special type of group. A ':' tells the group not to be remembered when exec is used. '=' means the previous token matches only if followed by this group, and '!' means the previous token matches only if NOT followed by it.

Like root, it can contain an options key instead of stack if there is a pipe.

{
  "type": ret.types.GROUP,
  "remember": true,
  "followedBy": false,
  "notFollowedBy": false,
  "stack": [token1, token2...],
}
{
  "type": ret.types.GROUP,
  "remember": true,
  "followedBy": false,
  "notFollowedBy": false,
  "options": [
    [token1, token2...],
    [othertoken1, othertoken2...]
    ...
  ],
}
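The three flags can be read back into the kind of group they came from. The sketch below is one possible mapping, based only on the description above. `groupKind` and its return labels are made up for this example.

```javascript
// Hypothetical helper: name the kind of group from its flags.
function groupKind(group) {
  if (group.followedBy) return 'lookahead';             // (?=...)
  if (group.notFollowedBy) return 'negative lookahead'; // (?!...)
  return group.remember ? 'capturing' : 'non-capturing'; // (...) or (?:...)
}

console.log(groupKind({ remember: true }));                      // capturing
console.log(groupKind({ remember: false }));                     // non-capturing
console.log(groupKind({ remember: false, followedBy: true }));   // lookahead
console.log(groupKind({ remember: false, notFollowedBy: true })); // negative lookahead
```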

POSITION

\b, \B, ^, and $ specify positions in the regexp.

{
  "type": ret.types.POSITION,
  "value": "^",
}

SET

Contains a key set specifying what tokens are allowed and a key not specifying if the set should be negated. A set can contain other sets, ranges, and characters.

{
  "type": ret.types.SET,
  "set": [token1, token2...],
  "not": false,
}

RANGE

Used in set tokens to specify a character range. from and to are character codes.

{
  "type": ret.types.RANGE,
  "from": 97,
  "to": 122,
}
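To make the set and range semantics concrete, here is a minimal sketch that checks whether a character code is accepted by a set token, including nested sets and negation. The `types` stand-in, its numeric values, and both helper names are assumptions for illustration. The token objects are hand-written, not produced by ret.

```javascript
// Stand-in for ret.types so this sketch runs without the package installed.
// The numeric values are assumptions; with ret available, use require('ret').types.
const types = { SET: 3, RANGE: 4, CHAR: 7 };

// Hypothetical helper: does a single member of a set accept this code?
function memberMatches(token, code) {
  switch (token.type) {
    case types.CHAR:
      return token.value === code;
    case types.RANGE:
      return code >= token.from && code <= token.to;
    case types.SET:
      return setMatches(token, code); // sets may contain other sets
    default:
      return false;
  }
}

// Hypothetical helper: apply the "not" flag on top of the member checks.
function setMatches(setToken, code) {
  const hit = setToken.set.some((member) => memberMatches(member, code));
  return setToken.not ? !hit : hit;
}

const lowercase = {
  type: types.SET,
  set: [{ type: types.RANGE, from: 97, to: 122 }],
  not: false,
};

console.log(setMatches(lowercase, 'q'.charCodeAt(0))); // true
console.log(setMatches(lowercase, 'Q'.charCodeAt(0))); // false
```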

REPETITION

{
  "type": ret.types.REPETITION,
  "min": 0,
  "max": Infinity,
  "value": token,
}

REFERENCE

References a group token. value is 1-9.

{
  "type": ret.types.REFERENCE,
  "value": 1,
}

CHAR

Represents a single character token. value is the character code. Emitting one token per character may look more cluttered than concatenating characters into strings, but a repetition token applies only to the token immediately before it, not to the whole preceding clause the way a pipe does, so keeping each character separate is simpler.

{
  "type": ret.types.CHAR,
  "value": 123,
}
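For readers who want to move between literal text and character tokens, the following sketch shows one way to do it. It assumes the character code is the UTF-16 code unit returned by `charCodeAt`. The `CHAR` constant is a stand-in for ret.types.CHAR with an assumed numeric value, and both helpers are hypothetical.

```javascript
// Stand-in for ret.types.CHAR; the numeric value is an assumption.
const CHAR = 7;

// Hypothetical helper: one CHAR token per character of the input text.
function literalToTokens(text) {
  return Array.from({ length: text.length }, (_, i) => ({
    type: CHAR,
    value: text.charCodeAt(i),
  }));
}

// Hypothetical helper: the inverse, joining CHAR tokens back into text.
function tokensToLiteral(tokens) {
  return tokens.map((token) => String.fromCharCode(token.value)).join('');
}

console.log(literalToTokens('abc').map((t) => t.value)); // [ 97, 98, 99 ]
console.log(tokensToLiteral(literalToTokens('abc')));    // abc
```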

Errors

ret.js will throw an error if given a string that is not a valid regular expression. The possible errors are:

  • Invalid group. When a group with an immediate ? character is followed by an invalid character. It can only be followed by !, =, or :. Example: /(?_abc)/
  • Nothing to repeat. Thrown when a repetition token is the first token in the current clause, that is, at the very beginning of the regexp or of a group, or right after a pipe. Example: /foo|?bar/, /{1,3}foo|bar/, /foo(+bar)/
  • Unmatched ). A group was not opened, but was closed. Example: /hello)2u/
  • Unterminated group. A group was not closed. Example: /(1(23)4/
  • Unterminated character class. A custom character set was not closed. Example: /[abc/
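Since tokenizing untrusted input can throw, callers may want to wrap the call. The sketch below shows one way to do that. The tokenizer is passed in as a parameter so the example runs without ret installed; in real use you would pass the function returned by require('ret'). `tryTokenize` and `fakeTokenize` are hypothetical, and the error text thrown by the stand-in is made up rather than copied from ret.

```javascript
// Hypothetical wrapper: turn a thrown error into a plain result object.
function tryTokenize(tokenize, source) {
  try {
    return { ok: true, tokens: tokenize(source) };
  } catch (err) {
    return { ok: false, message: err.message };
  }
}

// Stand-in tokenizer for this sketch only: it just detects an unclosed "[".
// Its error message is invented and may differ from what ret reports.
function fakeTokenize(source) {
  if (source.includes('[') && !source.includes(']')) {
    throw new SyntaxError('Unterminated character class');
  }
  return { stack: [] };
}

console.log(tryTokenize(fakeTokenize, 'abc').ok);  // true
console.log(tryTokenize(fakeTokenize, '[abc').ok); // false
```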

Regular Expression Syntax

Regular expressions follow the JavaScript syntax.

The following latest JavaScript additions are not supported yet:

Examples

/abc/

{
  "type": ret.types.ROOT,
  "stack": [
    { "type": ret.types.CHAR, "value": 97 },
    { "type": ret.types.CHAR, "value": 98 },
    { "type": ret.types.CHAR, "value": 99 }
  ]
}

/[abc]/

{
  "type": ret.types.ROOT,
  "stack": [{
    "type": ret.types.SET,
    "set": [
      { "type": ret.types.CHAR, "value": 97 },
      { "type": ret.types.CHAR, "value": 98 },
      { "type": ret.types.CHAR, "value": 99 }
    ],
    "not": false
  }]
}

/[^abc]/

{
  "type": ret.types.ROOT,
  "stack": [{
    "type": ret.types.SET,
    "set": [
      { "type": ret.types.CHAR, "value": 97 },
      { "type": ret.types.CHAR, "value": 98 },
      { "type": ret.types.CHAR, "value": 99 }
    ],
    "not": true
  }]
}

/[a-z]/

{
  "type": ret.types.ROOT,
  "stack": [{
    "type": ret.types.SET,
    "set": [
      { "type": ret.types.RANGE, "from": 97, "to": 122 }
    ],
    "not": false
  }]
}

/\w/

// Similar logic for `\W`, `\d`, `\D`, `\s` and `\S`
{
  "type": ret.types.ROOT,
  "stack": [{
    "type": ret.types.SET,
    "set": [
      { "type": ret.types.CHAR, "value": 95 },
      { "type": ret.types.RANGE, "from": 97, "to": 122 },
      { "type": ret.types.RANGE, "from": 65, "to": 90 },
      { "type": ret.types.RANGE, "from": 48, "to": 57 }
    ],
    "not": false
  }]
}

/./

// any character but CR, LF, U+2028 or U+2029
{
  "type": ret.types.ROOT,
  "stack": [{
    "type": ret.types.SET,
    "set": [ 
      { "type": ret.types.CHAR, "value": 10 },
      { "type": ret.types.CHAR, "value": 13 },
      { "type": ret.types.CHAR, "value": 8232 },
      { "type": ret.types.CHAR, "value": 8233 }
    ],
    "not": true
  }]
}

/a*/

{
  "type": ret.types.ROOT,
  "stack": [{ 
    "type": ret.types.REPETITION, 
    "min": 0,
    "max": Infinity,
    "value": { "type": ret.types.CHAR, "value": 97 }
  }]
}

/a+/

{
  "type": ret.types.ROOT,
  "stack": [{ 
    "type": ret.types.REPETITION, 
    "min": 1,
    "max": Infinity,
    "value": { "type": ret.types.CHAR, "value": 97 }
  }]
}

/a?/

{
  "type": ret.types.ROOT,
  "stack": [{ 
    "type": ret.types.REPETITION, 
    "min": 0,
    "max": 1,
    "value": { "type": ret.types.CHAR, "value": 97 }
  }]
}

/a{3}/

{
  "type": ret.types.ROOT,
  "stack": [{ 
    "type": ret.types.REPETITION, 
    "min": 3,
    "max": 3,
    "value": { "type": ret.types.CHAR, "value": 97 }
  }]
}

/a{3,5}/

{
  "type": ret.types.ROOT,
  "stack": [{ 
    "type": ret.types.REPETITION, 
    "min": 3,
    "max": 5,
    "value": { "type": ret.types.CHAR, "value": 97 }
  }]
}

/a{3,}/

{
  "type": ret.types.ROOT,
  "stack": [{ 
    "type": ret.types.REPETITION, 
    "min": 3,
    "max": Infinity,
    "value": { "type": ret.types.CHAR, "value": 97 }
  }]
}
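The six repetition examples above suggest a simple mapping from min and max back to quantifier syntax. The sketch below is one way to write that mapping. `quantifierOf` is a hypothetical helper and is not claimed to match how ret's own reconstruct function formats quantifiers.

```javascript
// Hypothetical helper: render a repetition token's bounds as a quantifier.
function quantifierOf(rep) {
  const { min, max } = rep;
  if (min === 0 && max === Infinity) return '*';
  if (min === 1 && max === Infinity) return '+';
  if (min === 0 && max === 1) return '?';
  if (max === Infinity) return `{${min},}`;
  if (min === max) return `{${min}}`;
  return `{${min},${max}}`;
}

console.log(quantifierOf({ min: 0, max: Infinity })); // *
console.log(quantifierOf({ min: 3, max: 3 }));        // {3}
console.log(quantifierOf({ min: 3, max: 5 }));        // {3,5}
console.log(quantifierOf({ min: 3, max: Infinity })); // {3,}
```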

/(a)/

{
  "type": ret.types.ROOT,
  "stack": [{ 
    "type": ret.types.GROUP, 
    "stack": [{ "type": ret.types.CHAR, "value": 97 }],
    "remember": true
  }]
}

/(?:a)/

{
  "type": ret.types.ROOT,
  "stack": [{ 
    "type": ret.types.GROUP, 
    "stack": [{ "type": ret.types.CHAR, "value": 97 }],
    "remember": false
  }]
}

/(?=a)/

{
  "type": ret.types.ROOT,
  "stack": [{ 
    "type": ret.types.GROUP, 
    "stack": [{ "type": ret.types.CHAR, "value": 97 }],
    "remember": false,
    "followedBy": true
  }]
}

/(?!a)/

{
  "type": ret.types.ROOT,
  "stack": [{ 
    "type": ret.types.GROUP, 
    "stack": [{ "type": ret.types.CHAR, "value": 97 }],
    "remember": false,
    "notFollowedBy": true
  }]
}

/a|b/

{
  "type": ret.types.ROOT,
  "options": [
    [{ "type": ret.types.CHAR, "value": 97 }], 
    [{ "type": ret.types.CHAR, "value": 98 }] 
  ]
}

/(a|b)/

{
  "type": ret.types.ROOT,
  "stack": [{
    "type": ret.types.GROUP,
    "remember": true,
    "options": [
      [{ "type": ret.types.CHAR, "value": 97 }],
      [{ "type": ret.types.CHAR, "value": 98 }]
    ]
  }]
}

/^/

{
  "type": ret.types.ROOT,
  "stack": [{
    "type": ret.types.POSITION,
    "value": "^"
  }]
}

/$/

{
  "type": ret.types.ROOT,
  "stack": [{
    "type": ret.types.POSITION,
    "value": "$"
  }]
}

/\b/

{
  "type": ret.types.ROOT,
  "stack": [{
    "type": ret.types.POSITION,
    "value": "b"
  }]
}

/\B/

{
  "type": ret.types.ROOT,
  "stack": [{
    "type": ret.types.POSITION,
    "value": "B"
  }]
}

/\1/

{
  "type": ret.types.ROOT,
  "stack": [{
    "type": ret.types.REFERENCE,
    "value": 1
  }]
}
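All of the examples above share the same nesting keys (stack, options, set, and a token-valued value), so a generic walker is short. The sketch below counts every token in a tree. `countTokens` is a hypothetical helper, and the trees it is applied to are hand-written copies of the /a|b/ and /a*/ examples with the type fields omitted for brevity.

```javascript
// Hypothetical helper: count a token plus everything nested under it.
function countTokens(token) {
  const children = [
    ...(token.stack || []),
    ...(token.options || []).flat(),
    ...(token.set || []),
    // REPETITION stores its repeated token under "value";
    // CHAR and POSITION store a number or string there, which is skipped.
    ...(token.value && typeof token.value === 'object' ? [token.value] : []),
  ];
  return children.reduce((sum, child) => sum + countTokens(child), 1);
}

// Hand-written copies of two example trees (type fields omitted).
const aOrB = { options: [[{ value: 97 }], [{ value: 98 }]] };
const aStar = { stack: [{ min: 0, max: Infinity, value: { value: 97 } }] };

console.log(countTokens(aOrB));  // 3
console.log(countTokens(aStar)); // 3
```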

Install

npm install ret

Tests

Tests are written with vows.

npm test

Security

To report a security vulnerability, please use the Tidelift security contact. Tidelift will coordinate the fix and disclosure.