@adguard/re2-wasm

v1.2.0

Google's RE2 library distributed as a WASM module patched by AdGuard.

Fork disclaimer

This is a fork of re2-wasm that allows specifying a maximum memory limit for a regular expression. This version is important for properly validating regular expressions in a browser extension that uses Declarative Net Request.

How to clone: This library contains a submodule, so you need to clone it recursively:

    git clone --recursive git@github.com:AdguardTeam/re2-wasm.git

How to compile locally:

If supported Docker images are available for your machine (Mac OS on M1 does not have them yet):

npm install
npm run compile-emcc # or check prerequisites for Mac OS below
npm run compile-ts

If not:

Prerequisites for Mac OS: you need to have the Emscripten SDK installed. See https://emscripten.org/docs/getting_started/downloads.html#download-and-install

# Get the emsdk repo
git clone https://github.com/emscripten-core/emsdk.git

# Enter that directory
cd emsdk

# Fetch the latest version of the emsdk (not needed the first time you clone)
git pull

# Download and install the latest SDK tools.
./emsdk install latest

# Make the "latest" SDK "active" for the current user. (writes .emscripten file)
./emsdk activate latest

# Activate PATH and other environment variables in the current terminal
source ./emsdk_env.sh
npm install
npm run compile
npm run compile-ts

Run the tests to check that it works (see test_invalid.js for the test that checks that the memory limit actually works):

npm run test

Below is the original README content.

re2-wasm

This is not an officially supported Google product.

This README is modified from the node-re2 README, licensed under The "New" BSD License.

This project provides bindings for RE2, a fast, safe alternative to backtracking regular expression engines, written by Russ Cox. To learn more about RE2, start with the overview "Regular Expression Matching in the Wild". More resources can be found on his "Implementing Regular Expressions" page.

RE2's regular expression language is almost a superset of what is provided by RegExp (see Syntax), but it lacks two features: backreferences and lookahead assertions. See below for more details.

The RE2 object emulates the standard RegExp, making it a practical drop-in replacement in most cases. RE2 is extended to provide String-based regular expression methods as well. To help convert RegExp objects to RE2, its constructor can take a RegExp directly, honoring all of its properties.

Why use re2-wasm?

The built-in Node.js regular expression engine can run in exponential time with a special combination:

  • A vulnerable regular expression
  • "Evil input"

This can lead to what is known as a Regular Expression Denial of Service (ReDoS). To tell if your regular expressions are vulnerable, you might try one of these projects:

However, neither project is perfect.

re2-wasm can protect your Node.js application from ReDoS. re2-wasm makes vulnerable regular expression patterns safe by evaluating them in RE2 instead of the built-in Node.js regex engine.
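To make the "vulnerable regular expression plus evil input" combination concrete, here is a minimal sketch that times the built-in engine only. The pattern `^(a+)+$` and the helper `timeMatch` are illustrative choices, not part of re2-wasm, and the exact timings depend on your machine and Node.js version.

```javascript
// A classic vulnerable pattern: nested quantifiers force the built-in
// engine to backtrack heavily when the input almost matches.
var vulnerable = /^(a+)+$/;

// Hypothetical helper: run one test() call and report how long it took.
function timeMatch(re, input) {
  var start = process.hrtime.bigint();
  var matched = re.test(input);
  var elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { matched: matched, ms: elapsedMs };
}

// "Evil input": a run of a's followed by a character that makes the match fail.
var fast = timeMatch(vulnerable, "a".repeat(10) + "!");
var slow = timeMatch(vulnerable, "a".repeat(22) + "!");

console.log("10 a's:", fast.matched, fast.ms.toFixed(3) + " ms");
console.log("22 a's:", slow.matched, slow.ms.toFixed(3) + " ms");
```

Neither input matches, but the longer one should take noticeably more time, and each extra `a` roughly doubles the work. An engine such as RE2, which does not backtrack, is designed to avoid this growth.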

Standard features

An RE2 object can be created just like a RegExp:

Supported properties:

Supported methods:

The following well-known symbol-based methods are supported (see Symbols):

This allows RE2 instances to be used on strings directly, just like RegExp instances:

var re = new RE2("1", 'u');
"213".match(re);        // [ '1', index: 1, input: '213' ]
"213".search(re);       // 1
"213".replace(re, "+"); // 2+3
"213".split(re);        // [ '2', '3' ]

Named groups are supported.
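The string examples above work because `String.prototype.match`, `search`, `replace` and `split` look up well-known symbols on their argument instead of requiring a real RegExp. The sketch below shows that protocol with a hypothetical `DelegatingRegex` class that simply forwards to a built-in RegExp. It is a stand-in for illustration, not the RE2 implementation.

```javascript
// Hypothetical stand-in for a custom regex class (NOT the real RE2 class).
// It delegates to a built-in RegExp just to show the symbol protocol.
class DelegatingRegex {
  constructor(source, flags) {
    this.inner = new RegExp(source, flags);
  }
  [Symbol.match](str) { return this.inner[Symbol.match](str); }
  [Symbol.search](str) { return this.inner[Symbol.search](str); }
  [Symbol.replace](str, replacement) { return this.inner[Symbol.replace](str, replacement); }
  [Symbol.split](str, limit) { return this.inner[Symbol.split](str, limit); }
}

var custom = new DelegatingRegex("1", "u");
console.log("213".match(custom)[0]);     // 1
console.log("213".search(custom));       // 1
console.log("213".replace(custom, "+")); // 2+3
console.log("213".split(custom));        // [ '2', '3' ]
```

Any object that defines these symbol-keyed methods can be passed to the String methods, which is how a non-RegExp class can behave like one at the call site.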

Extensions

Shortcut construction

An RE2 object can be created from a regular expression:

var re1 = new RE2(/ab*/igu); // from a RegExp object
var re2 = new RE2(re1);     // from another RE2 object

String methods

The standard String defines four more methods that can use regular expressions (match, search, replace, and split). RE2 provides them as methods of its own, with the positions of the string and the regular expression exchanged.

Property: internalSource

Starting with 1.8.0, the source property emulates the same property of RegExp, meaning that it can be used to create an identical RE2 or RegExp instance. Sometimes, for troubleshooting purposes, a user wants to inspect the source as translated for RE2. It is available as a read-only property called internalSource.

Unicode Mode

The RE2 engine only works in Unicode mode, so the RE2 class must always be constructed with the u flag to enable Unicode mode.
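For context on what the u flag changes, this short sketch uses only the built-in RegExp, which supports both modes. It shows that without the flag a pattern sees UTF-16 code units, while with it the pattern sees whole code points. The sample character is an arbitrary choice.

```javascript
var emoji = "\u{1F4A9}"; // one code point, stored as two UTF-16 code units

console.log(emoji.length);        // 2
console.log(/^.$/.test(emoji));   // false: "." matches a single code unit
console.log(/^.$/u.test(emoji));  // true: "." matches the whole code point
```

Patterns written for non-Unicode mode may therefore need a quick review before they are constructed with the u flag.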

How to install

Installation:

npm install --save re2-wasm

How to use

It is used just like a RegExp object.

var { RE2 } = require("re2-wasm");

// with default flags
var re = new RE2("a(b*)", 'u');
var result = re.exec("abbc");
console.log(result[0]); // "abb"
console.log(result[1]); // "bb"

result = re.exec("aBbC");
console.log(result[0]); // "a"
console.log(result[1]); // ""

// with explicit flags
re = new RE2("a(b*)", "iu");
result = re.exec("aBbC");
console.log(result[0]); // "aBb"
console.log(result[1]); // "Bb"

// from regular expression object
var regexp = new RegExp("a(b*)", "iu");
re = new RE2(regexp);
result = re.exec("aBbC");
console.log(result[0]); // "aBb"
console.log(result[1]); // "Bb"

// from regular expression literal
re = new RE2(/a(b*)/iu);
result = re.exec("aBbC");
console.log(result[0]); // "aBb"
console.log(result[1]); // "Bb"

// from another RE2 object
var rex = new RE2(re);
result = rex.exec("aBbC");
console.log(result[0]); // "aBb"
console.log(result[1]); // "Bb"

// shortcut
result = new RE2("ab*", 'u').exec("abba");

Limitations (things RE2 does not support)

RE2 consciously avoids any regular expression features that require worst-case exponential time to evaluate. These features are essentially those that describe a Context-Free Language (CFL) rather than a Regular Expression, and are extensions to the traditional regular expression language because some people don't know when enough is enough.

The most noteworthy missing features are backreferences and lookahead assertions. If your application uses these features, you should continue to use RegExp. But since these features are fundamentally vulnerable to ReDoS, you should strongly consider replacing them.

RE2 will throw a SyntaxError if you try to declare a regular expression using these features. If you are evaluating an externally provided regular expression, wrap your RE2 declarations in a try-catch block. This lets you fall back to RegExp when RE2 lacks a feature:

var { RE2 } = require("re2-wasm");

var sample = "aab"; // any input string
var re = /(a)+(b)*/u;
try {
  re = new RE2(re);
  // use RE2 as a drop-in replacement
} catch (e) {
  // suppress the error and keep using
  // the original RegExp
}
var result = re.exec(sample);
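The same fallback can be wrapped in a small helper that also reports which engine ended up being used. The sketch below is self-contained, so it uses a hypothetical `FakeRE2` class that only imitates "throw a SyntaxError on backreferences and lookaheads". In a real project you would pass the actual RE2 class instead. The helper name `compileWithFallback` is also an assumption.

```javascript
// Stand-in for the RE2 class so the sketch runs without the package installed.
// It is hypothetical: it only mimics rejecting backreferences and lookaheads.
class FakeRE2 {
  constructor(source, flags) {
    if (/\\[1-9]|\(\?[=!]/.test(source)) {
      throw new SyntaxError("unsupported feature in: " + source);
    }
    this.inner = new RegExp(source, flags);
  }
  exec(str) { return this.inner.exec(str); }
}

// Try the safe engine first; fall back to the built-in RegExp if it rejects the pattern.
function compileWithFallback(Engine, source, flags) {
  try {
    return { regex: new Engine(source, flags), engine: "safe" };
  } catch (e) {
    if (!(e instanceof SyntaxError)) throw e; // unrelated errors should not be hidden
    return { regex: new RegExp(source, flags), engine: "builtin" };
  }
}

var plain = compileWithFallback(FakeRE2, "(a)+(b)*", "u");
var backref = compileWithFallback(FakeRE2, "(cat|dog)\\1", "u");
console.log(plain.engine);                    // safe
console.log(backref.engine);                  // builtin
console.log(backref.regex.exec("dogdog")[0]); // dogdog
```

Recording the engine makes it possible to log or reject patterns that fell back to the built-in engine, since those remain exposed to ReDoS.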

In addition to these missing features, RE2 also behaves somewhat differently from the built-in regular expression engine in corner cases.

Backreferences

RE2 doesn't support backreferences, which are numbered references to previously matched groups, like so: \1, \2, and so on. Example of backreferences:

/(cat|dog)\1/.test("catcat"); // true
/(cat|dog)\1/.test("dogdog"); // true
/(cat|dog)\1/.test("catdog"); // false
/(cat|dog)\1/.test("dogcat"); // false
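When the referenced group has only a few alternatives, one way to drop the backreference is to spell out each repeated alternative. The sketch below checks, with the built-in engine, that the rewritten pattern agrees with the original on the four inputs above. The variable names are illustrative, and the rewrite only covers this simple case.

```javascript
var withBackref = /(cat|dog)\1/;
// Same matches, spelled out without a backreference.
var withoutBackref = /catcat|dogdog/;

var inputs = ["catcat", "dogdog", "catdog", "dogcat"];
inputs.forEach(function (s) {
  console.log(s, withBackref.test(s), withoutBackref.test(s));
});
// catcat true true
// dogdog true true
// catdog false false
// dogcat false false
```

The rewritten pattern uses only alternation, which does not rely on any of the features RE2 leaves out.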

Lookahead assertions

RE2 doesn't support lookahead assertions, which make a match depend on the content that follows it.

/abc(?=def)/; // match abc only if it is followed by def
/abc(?!def)/; // match abc only if it is not followed by def
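Simple lookaheads can often be replaced by a capture group or by a check in code. The sketch below shows one possible rewrite for each of the two patterns above, using plain patterns that contain no assertions. The function names are hypothetical, and more complex lookaheads may not translate this directly.

```javascript
// Positive lookahead /abc(?=def)/: capture the part you want and match the rest plainly.
function matchAbcBeforeDef(str) {
  var m = /(abc)def/.exec(str);
  return m ? { match: m[1], index: m.index } : null;
}

// Negative lookahead /abc(?!def)/: find each "abc", then check what follows in code.
function matchAbcNotBeforeDef(str) {
  var re = /abc/g;
  var m;
  while ((m = re.exec(str)) !== null) {
    if (!str.startsWith("def", m.index + 3)) {
      return { match: m[0], index: m.index };
    }
  }
  return null;
}

console.log(matchAbcBeforeDef("xabcdef"));       // { match: 'abc', index: 1 }
console.log(matchAbcNotBeforeDef("abcdefabcx")); // { match: 'abc', index: 6 }
```

Both helpers return the same text and position that the lookahead versions would report for these inputs, or null when there is no match.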

Mismatched behavior

RE2 and the built-in regex engines disagree a bit. Before you switch to RE2, verify that your regular expressions continue to work as expected. They should do so in the vast majority of cases.

Here is an example of a case where they may not:

var { RE2 } = require("re2-wasm");

var pattern = '(?:(a)|(b)|(c))+';

var built_in = new RegExp(pattern);
var re2 = new RE2(pattern, 'u');

var input = 'abc';

var bi_res = built_in.exec(input);
var re2_res = re2.exec(input);

console.log('bi_res: ' + bi_res);    // prints: bi_res: abc,,,c
console.log('re2_res : ' + re2_res); // prints: re2_res : abc,a,b,c
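One way to verify a pattern before switching is to run both engines over a set of representative inputs and list the inputs on which they disagree. The sketch below assumes a hypothetical helper `findMismatches` that accepts any two objects with an `exec` method. To stay runnable without the package, `re2Like` is a stub that reproduces the RE2 output reported above for the input 'abc'. In practice you would pass a real RE2 instance.

```javascript
// Collect the inputs on which two regex-like objects (anything with exec) disagree.
function findMismatches(a, b, inputs) {
  return inputs.filter(function (input) {
    var ra = a.exec(input);
    var rb = b.exec(input);
    // JSON.stringify keeps only the array elements; unset groups become null.
    return JSON.stringify(ra) !== JSON.stringify(rb);
  });
}

var builtIn = new RegExp("(?:(a)|(b)|(c))+");

// Stub standing in for an RE2 instance (an assumption for this sketch only).
var re2Like = {
  exec: function (input) {
    return input === "abc" ? ["abc", "a", "b", "c"] : builtIn.exec(input);
  }
};

console.log(findMismatches(builtIn, re2Like, ["abc", "a", "zzz"])); // [ 'abc' ]
```

An empty result on a good sample of real inputs is a reasonable signal that the switch is safe for that pattern.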

Unicode

RE2 only works in the Unicode mode. The u flag must be passed to the RE2 constructor.

License

Apache 2.0