napi-oniguruma

v0.1.0

N-API bindings for the Oniguruma regex library

Features

This module uses the latest Oniguruma regex release (6.9.4), and aims to keep it up to date. It uses N-API bindings, so the same binary should keep working across future Node.js versions. The implementation is based on node-oniguruma, but is rewritten in C, with some interface differences and added promise support.

There are three ways to run a search:

  • Synchronous calls with scanner.findNextMatchSync (fastest, but blocks the main thread)
  • Callbacks with scanner.findNextMatchCb (the regex match runs on a separate thread)
  • Promises with scanner.findNextMatch (runs on a separate thread, like the callback variant)
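
The three calling styles can be sketched as follows. This is a minimal illustration, not the module's actual API: only the three method names come from this README. The argument lists, the `Match` shape and the `StandInScanner` class are assumptions, and the stand-in uses JavaScript's RegExp on the main thread so the example runs without the native binary.

```typescript
// Hypothetical shapes: only the three method names come from the README.
interface Match {
  index: number; // position of the matching pattern in the pattern list
  start: number; // JS string index where the match starts
  end: number;   // JS string index just past the match
}
type MatchCallback = (error: Error | null, match: Match | null) => void;

// Stand-in scanner (assumption) built on RegExp instead of Oniguruma.
class StandInScanner {
  private readonly sources: string[];

  constructor(sources: string[]) {
    this.sources = sources;
  }

  // Synchronous style: returns the earliest match at or after startIndex.
  findNextMatchSync(text: string, startIndex: number): Match | null {
    let best: Match | null = null;
    for (let i = 0; i < this.sources.length; i++) {
      const re = new RegExp(this.sources[i], "g");
      re.lastIndex = startIndex;
      const found = re.exec(text);
      if (found !== null && (best === null || found.index < best.start)) {
        best = { index: i, start: found.index, end: found.index + found[0].length };
      }
    }
    return best;
  }

  // Callback style: the result is delivered later (here via a timer,
  // standing in for work done on a separate thread).
  findNextMatchCb(text: string, startIndex: number, callback: MatchCallback): void {
    setTimeout(() => {
      let match: Match | null;
      try {
        match = this.findNextMatchSync(text, startIndex);
      } catch (error) {
        callback(error as Error, null);
        return;
      }
      callback(null, match);
    }, 0);
  }

  // Promise style: a thin wrapper over the callback style.
  findNextMatch(text: string, startIndex: number): Promise<Match | null> {
    return new Promise((resolve, reject) => {
      this.findNextMatchCb(text, startIndex, (error, match) => {
        if (error) reject(error);
        else resolve(match);
      });
    });
  }
}

const scanner = new StandInScanner(["var", "this"]);
const sourceText = "if (this) var x;";

const syncMatch = scanner.findNextMatchSync(sourceText, 0);
console.log("sync:", syncMatch);

scanner.findNextMatchCb(sourceText, 5, (error, match) => {
  console.log("callback:", error, match);
});

scanner.findNextMatch(sourceText, 5).then((match) => {
  console.log("promise:", match);
});
```

The synchronous call is the only one that returns its result directly; the other two hand the result back once the search has finished.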

Performance

When I run the node-oniguruma benchmarks against node-oniguruma itself, I get the following results:

oneline.js
sync:  149652 matches in 1778ms
large.js
sync:  14594 matches in 162ms
Segmentation fault (core dumped)

When I run the same benchmarks on this library, I get:

oneline.js
sync:  149652 matches in 2985ms
large.js
sync:  14594 matches in 186ms
async: 14594 matches in 1055ms

The numbers vary somewhat between runs, but the figures above are fairly typical. On the single-line JSON benchmark (oneline.js) this library is roughly 1.7 times slower (2985ms against 1778ms), and it is about 1.15 times slower when searching for 'this', 'var', 'selector', 'window' in the jQuery source (large.js).
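
For reference, the slowdown factors can be recomputed from the sync timings listed above. The snippet below is only that arithmetic; the variable and function names are made up for the illustration.

```typescript
// Sync timings in milliseconds, copied from the benchmark output above.
const syncTimings = {
  oneline: { nodeOniguruma: 1778, napiOniguruma: 2985 },
  large: { nodeOniguruma: 162, napiOniguruma: 186 },
};

// Slowdown factor: how many times longer this library takes than the baseline.
function slowdown(baselineMs: number, candidateMs: number): number {
  return candidateMs / baselineMs;
}

const onelineSlowdown = slowdown(
  syncTimings.oneline.nodeOniguruma,
  syncTimings.oneline.napiOniguruma
);
const largeSlowdown = slowdown(
  syncTimings.large.nodeOniguruma,
  syncTimings.large.napiOniguruma
);

console.log(`oneline.js: ${onelineSlowdown.toFixed(2)}x`); // 1.68x
console.log(`large.js:   ${largeSlowdown.toFixed(2)}x`);   // 1.15x
```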

In general, a drop in performance is to be expected. N-API is an abstraction over the underlying JS VM, so it cannot compete with node-oniguruma, which uses the V8 APIs directly. However, what we lose in performance we gain in stability, so you can depend on prebuilt binaries staying valid without active maintenance. I am also interested in seeing whether NAN bindings can be implemented in this project, and in comparing their performance to those in node-oniguruma.

One point in this library's favor, though: it doesn't segfault on async operations. Improving the benchmarks to better reflect real usage is a goal for this module.

Thread safety

The entry and exit points of all async methods run on the main thread, so they are inherently safe. The biggest concern is the work that gets done in the threadpool. In particular, the regex cannot cache results without protection. To avoid this, the cache is only updated during sync searches (where everything runs on the main thread). The goal is to eventually use atomic operations to update values, even during async work (but Visual Studio on Windows doesn't seem to support atomics). Another option is to make the cache a separate object, so that a regex can have multiple caches.
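
The caching rule can be illustrated with a small sketch. It is not the module's C implementation: the `CachedRegex` class, its fields and its exact-match cache key are all assumptions, and TypeScript has no threadpool race here, so the code only shows the policy that the sync path may write the cache while the async path leaves it alone. Whether the real async path reads the cache is not stated above; in this sketch it does not.

```typescript
type SearchResult = { start: number; end: number } | null;

// Hypothetical illustration of "only sync searches update the cache".
class CachedRegex {
  private readonly source: string;
  private lastText: string | null = null;
  private lastStart = -1;
  private lastResult: SearchResult = null;
  cacheHits = 0;

  constructor(source: string) {
    this.source = source;
  }

  // Uncached search, shared by both paths.
  private run(text: string, startIndex: number): SearchResult {
    const re = new RegExp(this.source, "g");
    re.lastIndex = startIndex;
    const found = re.exec(text);
    return found ? { start: found.index, end: found.index + found[0].length } : null;
  }

  // Sync path: runs on the main thread, so it may read and write the cache.
  searchSync(text: string, startIndex: number): SearchResult {
    if (this.lastText === text && this.lastStart === startIndex) {
      this.cacheHits++;
      return this.lastResult;
    }
    const result = this.run(text, startIndex);
    this.lastText = text;
    this.lastStart = startIndex;
    this.lastResult = result;
    return result;
  }

  // Async path: stands in for threadpool work and never touches the cache.
  search(text: string, startIndex: number): Promise<SearchResult> {
    return Promise.resolve().then(() => this.run(text, startIndex));
  }
}

const cached = new CachedRegex("window");
cached.searchSync("var window;", 0); // fills the cache
cached.searchSync("var window;", 0); // served from the cache
console.log("cache hits:", cached.cacheHits);
```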

Indexing

The API supports starting a search from a given index within the string. This index is interpreted as a JS string index (a UTF-16 code unit offset), not as a Unicode code point index or similar. Hopefully support for actual code point indexing can be added.
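
The difference only shows up for characters outside the Basic Multilingual Plane, which take two UTF-16 code units each. Until code point indexing is supported, a caller holding a code point index could convert it first. The helper below is a hypothetical sketch and is not part of this module.

```typescript
// Convert a code point index into a JS string (UTF-16 code unit) index.
// Hypothetical helper, not provided by the library.
function codePointIndexToStringIndex(text: string, codePointIndex: number): number {
  let stringIndex = 0;
  let seen = 0;
  while (seen < codePointIndex && stringIndex < text.length) {
    const codePoint = text.codePointAt(stringIndex);
    // Code points above U+FFFF occupy two code units (a surrogate pair).
    stringIndex += codePoint !== undefined && codePoint > 0xffff ? 2 : 1;
    seen++;
  }
  return stringIndex;
}

const sample = "a\u{1F600}b"; // "a", an emoji outside the BMP, "b"

console.log(sample.length);                          // 4 code units
console.log(Array.from(sample).length);              // 3 code points
console.log(sample.indexOf("b"));                    // 3 (JS string index)
console.log(codePointIndexToStringIndex(sample, 2)); // 3
```

Here "b" is the third code point (code point index 2) but sits at JS string index 3, which is the value a start index for this API would need.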

NOTE

  • There is a bug in Node 10.15.3 through 12.0.0 that causes a memory leak when using the async (promise and callback) methods.

TODO

  • [X] Refactor C code to reduce duplication
  • [X] Fix the \G anchor detection
  • [X] Comprehensive tests
  • [ ] Set up proper benchmarks
  • [X] TypeScript types
  • [ ] Batch the property setting into a single N-API call (dropped)
    • Not done: no measurable improvement could be detected, and it makes the properties invisible to the tests.