npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2024 – Pkg Stats / Ryan Hefner

revexp

v1.2.1

Published

reverse regular expression computation

Downloads

7

Readme

RevExp

This library calculates the possible inputs for a given regular expression.

The supported syntax is modelled on the standard Javascript RegExp object's syntax.

You can try it online or in the terminal:

npx revexp 'H(alp|el{2,}[o]) (Wor|wo)lds?' '??????W????'

Supported Features

| Pattern Feature | Examples | Support | |--------------------------|-----------------------------|---------------| | Character classes | a [aeiou] [^a-z] \d | ✅ Full | | Character escapes | \u0020 \\ \n \( | ✅ Full | | Unicode property escapes | \p{Letter} | ❌ None | | Branching | x\|y | ✅ Full | | Anchors | ^ $ | ✅ Full | | Word boundary assertions | \b \B | Partial † | | Standard quantifiers | a* a+ a? | ✅ Full | | Range quantifiers | a{2} a{2,} a{2,3} | ✅ Full | | Lazy quantifiers | a+? a*? a?? a{2,}? | ✅ Full | | Possessive quantifiers | a++ a*+ a?+ a{2,}+ | ❌ None | | Groups | (abc) (?<name>abc) | ✅ Full | | Backreferences | (abc)\1 (?<x>abc)\k<x> | Partial ‡ | | Non-capturing groups | (?:abc) | ✅ Full | | Positive lookaheads | (?=abc) | ❌ None | | Negative lookaheads | (?!abc) | ❌ None | | Positive lookbehinds | (?<=abc) | ❌ None | | Negative lookbehinds | (?<!abc) | ❌ None |

†: Two passes may be required to fully narrow an input when using this feature.

‡: Backreferences are currently handled as if the group were repeated (with an optional qualifier), meaning the syntax is accepted and any full or partial valid input will be accepted, but full or partial invalid inputs may not be rejected, and the library may not be able to fully narrow inputs when using patterns which include this feature.

| Flag | Meaning | Support | |------|--------------------------------|----------------| | d | Output substring match indices | Not applicable | | g | Global search | Not applicable | | i | Case-insensitive search | None (false) | | m | Multi-line search | ✅ Full | | s | . matches newline | ✅ Full | | u | Unicode mode | None (false) | | y | Sticky search | Not applicable |

Usage

Library

npm install --save revexp
import RevExp from 'revexp';

// compile the pattern
const pattern = new RevExp('a[bc]|x[yz]');

// provide already known information - the first character is unknown, the second must be "c"
const known = RevExp.string('?c', '?');

// resolve as much of the data as possible
const resolved = pattern.reverse(known);

// print it in a human-friendly format
console.log(String(resolved)); // "ac"

CLI

npx revexp 'a[bc]|x[yz]' '?c'

You can also optionally change the "unknown character":

npx revexp -u'.' 'a[bc]|x[yz]' '.c'

Examples

| pattern | input | output | |----------------------|------------|------------------------| | (hello\|hi) | ?? | hi | | (hello\|hi) | ????? | hello | | (hello\|hi) * | ????? | h[ei][ l][ l][ o] | | (hello\|hi) * | ???? | hi | | (a[bc]d\|e(f\|gh)) | ??? | [ae][bcg][dh] | | (a[bc]d\|e(f\|gh)) | ?g? | egh | | *a{4} +a{2} * | ???????? | [ a]aaa[ a][ a]a[ a] |

Performance and Security Considerations

Matching is performed using an algorithm which executes in polynomial time on the size of the input for a given pattern. This means it is not vulnerable to catastrophic backtracking (but note that untrusted patterns can take arbitrarily long to compile and use an unbounded amount of memory).

API

RevExp

This object stores a compiled regular expression. It is analogous to Javascript's native RegExp. Construction is expensive, as it must compile the regular expression into a graph form. Once created, the RevExp object can be used repeatedly.

constructor

const revexp = new RevExp('pattern');
const revexp = new RevExp(/existing regex/);
const revexp = RevExp.compile('pattern');

Creates a new RevExp object, compiling the pattern into an internal graph ready for use. The .compile form is just an alias; functionally both forms are the same.

If you pass in a RegExp or RevExp object, the pattern will be extracted and compiled. Note that this project is not 100% compatible with RegExp features (see above for details).

RevExp.string(input[, unknown])

const characterClassString = RevExp.string('foobar');
const characterClassString = RevExp.string('foob**', '*');
const characterClassString = RevExp.string(myCharClassArray);

Converts input into a character class array.

When invoked with a string input and no unknown parameter, the string is taken literally (i.e. it assumes all characters are known).

When invoked with a string input and an unknown parameter, any occurence of unknown within input is replaced with CharacterClass.ANY.

When invoked with an existing character class array, this will copy the array but make no changes to the values (this can be used to avoid mutating the original, for example).

The returned array has an overloaded toString which allows easy pretty-printing, and an isSingular method which returns true if all the contained character classes are singular.

.test(input[, unknown])

const success = myRevExp.test('foobar');
const success = myRevExp.test('foob**', '*');
const success = myRevExp.test(myCharClassArray);

Tests if input matches (or can match) the regular expression. Returns true if a match is possible, or false if no match is possible.

For an explanation of the input and unknown parameters, see the information for RevExp.string above.

For the most control, invoke with a character class array. This allows you to pass in exact ranges for each character. For example:

const input = [
  RevExp.CharacterClass.of('f'), // literal 'f'
  RevExp.CharacterClass.of('o'),
  RevExp.CharacterClass.of('o'),
  RevExp.CharacterClass.of('bz'), // 'b' or 'z'
  RevExp.CharacterClass.WORD, // a-zA-Z0-9_
  RevExp.CharacterClass.ANY, // anything
];
// input is equivalent to a pattern of 'foo[bz]\w.'
const success = myRevExp.test(input);
Examples
const revexp = new RevExp('(hello|hi+)');

console.log(revexp.test('hello')); // true
console.log(revexp.test('hi')); // true
console.log(revexp.test('nope')); // false
console.log(revexp.test('words hello words')); // false
console.log(revexp.test('h????')); // false
console.log(revexp.test('he???', '?')); // true
console.log(revexp.test('he??', '?')); // false

.reverse(input[, unknown])

const restrictions = myRevExp.reverse('foobar');
const restrictions = myRevExp.reverse('foob**', '*');
const restrictions = myRevExp.reverse(myCharClassArray);

Returns a character class array with the tightest possible restrictions on the characters based on the regular expression and the given input. If no matches are possible, returns null.

For an explanation of the input and unknown parameters, see the information for RevExp.string above.

The returned array has an overloaded toString which allows easy pretty-printing, and an isSingular method which returns true if all the contained character classes are singular.

Examples
const revexp = new RevExp('(hello|hi+)');

console.log(String(revexp.reverse('hello'))); // hello
console.log(String(revexp.reverse('nope'))); // null
console.log(String(revexp.reverse('he???'))); // null
console.log(String(revexp.reverse('he???', '?'))); // hello
console.log(String(revexp.reverse('h????', '?'))); // h[ei][il][il][io]

RevExp.CharacterClass

This object stores character class information (e.g. [a-z]). It also has various convenience constants defined for common character classes.

The CharacterClass object is immutable.

constants

RevExp.CharacterClass.ANY     // .
RevExp.CharacterClass.NONE    // [^]
RevExp.CharacterClass.NUMERIC // [0-9]
RevExp.CharacterClass.WORD    // [a-zA-Z0-9_]
RevExp.CharacterClass.NEWLINE // various unicode newline characters
RevExp.CharacterClass.SPACE   // various unicode space characters

constructor

There are various helpers available for building custom CharacterClasses:

RevExp.CharacterClass.of('c')
RevExp.CharacterClass.of('ax')
RevExp.CharacterClass.of(['a', 'x'])
RevExp.CharacterClass.ofCode(0x20)
RevExp.CharacterClass.range('a', 'z')
RevExp.CharacterClass.union(cc1, cc2, cc3, ...)

toString

As a convenience, CharacterClass has a toString which will render the class in the standard regular expression format;

console.log(String(RevExp.CharacterClass.of('c'))) // 'c'
console.log(String(RevExp.CharacterClass.of('ax'))) // '[ax]'
console.log(String(RevExp.CharacterClass.range('a', 'z'))) // '[a-z]'
console.log(String(RevExp.CharacterClass.of('abcdz'))) // '[a-dz]'
console.log(String(RevExp.CharacterClass.of('a').inverse())) // '[^a]'

includes(char)

Returns true if the character class contains the given character.

isEmpty()

Returns true if the character class contains no characters (i.e. is impossible).

isSingular()

Returns true if the character class contains exactly one character (i.e. is fully resolved).

singularChar()

Returns the single character represented by this character class. If the class represents more than one character (or no characters), this throws an error.

intersects(other)

Returns true if there is any overlap between the two character classes.

intersect(other)

Returns a new character class which contains only the characters shared by both character classes.

union(other)

Returns a new character class which contains all characters contained in either character class.

rangeTo(other)

Returns a new character class which represents the range of characters from the current character to the target character (inclusive at both sides). The order does not matter.

This method can only be invoked on singular character classes.

inverse()

Returns a new character class which is the inverse of the current class (i.e. any character in this class is not in the returned class, and vice-versa).