

RegExact 0.2.2

MIT license

RegExact extends RegExp to return indexes of matched substrings related to capturing groups.

Installation

npm install regexact

Usage

To use the RegExact class in a JavaScript file:

const RegExact = require('regexact').RegExact

const regExact = new RegExact(/Hello (.*(ld))/i)
const matches = regExact.exec('I wish you a big Hello World')

if (matches) {
  for (let i = 0; i < matches.length; i++) {
    console.log(i + ': ' + matches[i] + ' at ' + matches.indexes[i])
  }
} else {
  console.log('No matches.')
}

// 0: Hello World at 17
// 1: World at 23
// 2: ld at 26

To use the RegExact class in a TypeScript file:

import { RegExact } from 'regexact';

const regExact = new RegExact(/Hello (.*(ld))/i);
const matches = regExact.exec('I wish you a big Hello World');

if (matches) {
  for (let i = 0; i < matches.length; i++) {
    console.log(i + ': ' + matches[i] + ' at ' + matches.indexes[i]);
  }
} else {
  console.log('No matches.');
}

// 0: Hello World at 17
// 1: World at 23
// 2: ld at 26

RegExact is a wrapper around the built-in RegExp that redefines its constructor and its exec() method. The redefined exec() returns the same array as the original (including the extra properties index and input) and adds one more property, indexes. This is an array holding the start index of each match, in the same order as the matches in the returned array.

The following table shows the results for the code above:

| Variable | Value | Description |
| -------- | ----- | ----------- |
| matches[0] | "Hello World" | The string of the full match. |
| matches[1] | "World" | The substring of the match related to the first capturing group. |
| matches[2] | "ld" | The substring of the match related to the second capturing group. |
| matches.indexes[0] | 17 | The index of the full match. This is a duplicate of index. |
| matches.indexes[1] | 23 | The index of the match related to the first capturing group. |
| matches.indexes[2] | 26 | The index of the match related to the second capturing group. |
| matches.index | 17 | The index of the full match. This is a duplicate of indexes[0]. |
| matches.input | "I wish you a big Hello World" | The string that was matched against. |
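The shape of the returned object can be illustrated with a small stand-in class. This is only a sketch and not how RegExact is implemented: the hypothetical IndexedRegExp below swaps in the native d (hasIndices) flag from ES2022, which is newer than the ES2018 syntax RegExact targets, so it needs a runtime that supports that flag.

```javascript
// Illustrative stand-in, NOT RegExact's implementation: it relies on the
// native `d` (hasIndices) flag from ES2022 to obtain the group positions.
class IndexedRegExp extends RegExp {
  constructor (pattern, flags) {
    const isRegExp = pattern instanceof RegExp
    const source = isRegExp ? pattern.source : pattern
    const base = flags !== undefined ? flags : (isRegExp ? pattern.flags : '')
    // Make sure the `d` flag is present so exec() fills in `indices`
    super(source, base.includes('d') ? base : base + 'd')
  }

  exec (text) {
    const matches = super.exec(text)
    if (matches) {
      // matches.indices holds [start, end] pairs; unmatched groups are undefined
      matches.indexes = matches.indices.map(pair => pair ? pair[0] : undefined)
    }
    return matches
  }
}

const matches = new IndexedRegExp(/Hello (.*(ld))/i).exec('I wish you a big Hello World')
console.log(matches.indexes) // [ 17, 23, 26 ]
console.log(matches.index) // 17
console.log(matches.input) // I wish you a big Hello World
```

As in the table, the original index and input properties stay untouched and indexes[0] duplicates index.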

Motivation

JavaScript's standard built-in RegExp class matches text against a regex pattern and reports the matched substring together with its index. For capturing groups, however, RegExp (as of ES2018, the version RegExact targets) returns only the matching substrings and offers no way to obtain their indexes. RegExact extends RegExp so that the indexes of substrings matched by capturing groups are returned as well.
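To see why the group substrings alone are not enough, here is a minimal sketch that uses only the built-in RegExp. The input string and pattern are made up for illustration. Searching for the group's text with indexOf looks like an obvious workaround, but it finds the first occurrence of that text, which is not necessarily the one the group matched.

```javascript
// Built-in exec() reports where the whole match starts, but not the groups.
const text = 'x121'
const m = /(\d)\d*(\d)/.exec(text)

console.log(m.index) // 1
console.log(m[1], m[2]) // 1 1

// Naive workaround: search for the group's substring from the match start.
const naive = text.indexOf(m[2], m.index)
console.log(naive) // 1, but group 2 actually starts at index 3
```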

RegExp Features Support

RegExact returns indexes of matched substrings defined by capturing groups for any correct JavaScript (ES2018) regular expression, including nested groups, quantified groups, non-capturing groups, lookaheads, lookbehinds, named capturing groups and alternations.

Execution time

To support and be compatible with all correct JavaScript regular expressions, RegExact is built on top of the built-in JavaScript RegExp class. It is implemented as a wrapper around RegExp: it first has to construct, and later manipulate, its own (simplified) abstract syntax tree of the regular expression. The downside of this approach is an execution time penalty.

The constructor (which is typically called only once) consistently takes around 10 times longer than the RegExp constructor. The slowdown of the exec method (which is typically called many times) depends mainly on the complexity of the regular expression. For a simple regular expression there may be no time penalty at all. For a regular expression with many (nested) groups, exec may take 20 times or more of the original execution time.

Some execution time optimizations are already implemented, but there is still room for further optimizations that are planned for future releases.
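The figures above depend on the regular expression, so it may be worth measuring your own patterns. The following is a rough sketch of a timing helper, not a rigorous benchmark. timeExec is a hypothetical name, and the helper only assumes that the object passed in has an exec(text) method, so the same function can time both a plain RegExp and a RegExact instance.

```javascript
const { performance } = require('perf_hooks')

// Hypothetical helper: average time in milliseconds of one exec() call.
// Works with anything that has an exec(text) method.
function timeExec (regexLike, text, iterations = 10000) {
  const start = performance.now()
  for (let i = 0; i < iterations; i++) {
    regexLike.lastIndex = 0 // keep global/sticky regexes starting from 0
    regexLike.exec(text)
  }
  return (performance.now() - start) / iterations
}

const pattern = /(?:(<.+>)[0-9]+){2}/
const text = 'What is this:<name>0<Mark>31415'

const baseline = timeExec(pattern, text)
console.log('RegExp exec: ' + baseline.toFixed(6) + ' ms per call')

// With the package installed, the same helper could time the wrapper:
// const RegExact = require('regexact').RegExact
// const wrapped = timeExec(new RegExact(pattern), text)
// console.log('RegExact exec: ' + wrapped.toFixed(6) + ' ms per call')
```

Constructing the RegExact object once, outside the loop, keeps the constructor cost out of the measurement.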

Limitations

RegExact supports named capturing groups, but it does not yet allow getting their indexes by name. To get the index of a named capturing group, access it by its group number, as if it were an ordinary numbered capturing group.
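One possible workaround is to translate group names into group numbers yourself. The helper below is a hypothetical sketch and not part of RegExact. It scans the pattern source, skips escapes and character classes, and counts capturing groups in order. It assumes ES2018 syntax and has not been checked against every edge case.

```javascript
// Hypothetical helper (not part of RegExact): map each named group to its
// 1-based capturing-group number by scanning the pattern source.
function groupNumbersByName (source) {
  const numbers = {}
  let count = 0
  let inClass = false
  for (let i = 0; i < source.length; i++) {
    const ch = source[i]
    if (ch === '\\') { i++; continue } // skip the escaped character
    if (inClass) {
      if (ch === ']') inClass = false
      continue
    }
    if (ch === '[') { inClass = true; continue }
    if (ch !== '(') continue
    if (source[i + 1] !== '?') { count++; continue } // plain capturing group
    // (?<name>...) is a named group; (?<=...) and (?<!...) are lookbehinds
    const named = /^\(\?<([^=!>][^>]*)>/.exec(source.slice(i))
    if (named) {
      count++
      numbers[named[1]] = count
    }
  }
  return numbers
}

const re = /(?:on )(?<year>\d{4})-(?<month>\d{2})/
const numbers = groupNumbersByName(re.source)
console.log(numbers) // { year: 1, month: 2 }

const m = re.exec('on 2024-05')
console.log(m[numbers.month]) // 05
```

With a RegExact result, the same number should then select the position, for example matches.indexes[numbers.month].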

Examples

RegExact object constructed from a regex literal with capturing groups that are greedily quantified with *:

const RegExact = require('regexact').RegExact

const regExact = new RegExact(/([a-z])*([a-z])/)
const matches = regExact.exec('abcde12345')

if (matches) {
  for (let i = 0; i < matches.length; i++) {
    console.log(i + ': ' + matches[i] + ' at ' + matches.indexes[i])
  }
} else {
  console.log('No matches.')
}

// 0: abcde at 0
// 1: d at 3
// 2: e at 4

RegExact object constructed from a RegExp object with capturing groups that are lazily quantified with *?:

const RegExact = require('regexact').RegExact

const regExact = new RegExact(new RegExp(/([a-z])*?([a-z])/))
const matches = regExact.exec('abcde12345')

if (matches) {
  for (let i = 0; i < matches.length; i++) {
    console.log(i + ': ' + matches[i] + ' at ' + matches.indexes[i])
  }
} else {
  console.log('No matches.')
}

// 0: a at 0
// 1: undefined at undefined
// 2: a at 0

RegExact object constructed from a RegExp object using a pattern string and a flags string, with alternation:

const RegExact = require('regexact').RegExact

const regExact = new RegExact(new RegExp('ab(c)|cd(e)', 'i'))
const matches = regExact.exec('cde')

if (matches) {
  for (let i = 0; i < matches.length; i++) {
    console.log(i + ': ' + matches[i] + ' at ' + matches.indexes[i])
  }
} else {
  console.log('No matches.')
}

// 0: cde at 0
// 1: undefined at undefined
// 2: e at 2

RegExact object with nested capturing and non-capturing groups and quantifiers:

const RegExact = require('regexact').RegExact

const regExact = new RegExact(/(?:(<.+>)[0-9]+){2}/)
const matches = regExact.exec('What is this:<name>0<Mark>31415')

if (matches) {
  for (let i = 0; i < matches.length; i++) {
    console.log(i + ': ' + matches[i] + ' at ' + matches.indexes[i])
  }
} else {
  console.log('No matches.')
}

// 0: <name>0<Mark>31415 at 13
// 1: <Mark> at 20

RegExact objects with a multi-digit backreference and with an octal escape, each interpreted according to the expression context:

const RegExact = require('regexact').RegExact

const text = '123446789ab\t'
const matches1 = new RegExact(/(.)(.)()()()()()()()()(.)\11/).exec(text) // \11 is a group ref
const matches2 = new RegExact(/(.)(.)\11/).exec(text) // \11 is the escape of a tab

if (matches1) {
  for (let i = 0; i < matches1.length; i++) {
    console.log(i + ': ' + matches1[i] + ' at ' + matches1.indexes[i])
  }
} else {
  console.log('No matches.')
}
console.log()
if (matches2) {
  for (let i = 0; i < matches2.length; i++) {
    console.log(i + ': ' + matches2[i] + ' at ' + matches2.indexes[i])
  }
} else {
  console.log('No matches.')
}

// 0: 2344 at 1
// 1: 2 at 1
// 2: 3 at 2
// 3: at 3
// 4: at 3
// 5: at 3
// 6: at 3
// 7: at 3
// 8: at 3
// 9: at 3
// 10: at 3
// 11: 4 at 3

// 0: ab    at 9
// 1: a at 9
// 2: b at 10

RegExact object with a negative lookbehind:

const RegExact = require('regexact').RegExact

const regExact = new RegExact(/(?<!\$|usd|\d|\.)(\d+)\.?(\d{1,2})?/i)
const matches = regExact.exec('$1.23 eur108.00 USD0123 eur1999')

if (matches) {
  for (let i = 0; i < matches.length; i++) {
    console.log(i + ': ' + matches[i] + ' at ' + matches.indexes[i])
  }
} else {
  console.log('No matches.')
}

// 0: 108.00 at 9
// 1: 108 at 9
// 2: 00 at 13

Credits

Developed by Marko Privosnik and released under the MIT License.