auntie
v0.25.0
Auntie
Auntie, my dear ultra-fast module for untying/splitting/counting a stream of data by a chosen separator sequence.
It uses Bop under the hood, a Boyer-Moore parser, optimized for sequence lengths up to 255 bytes.
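To give a feel for this kind of search, here is a minimal sketch of Boyer-Moore-Horspool, a simplified Boyer-Moore variant. It is not Bop's actual code. The function names `buildShiftTable` and `findAll` are hypothetical, and the link between the 255-byte limit and a byte-sized shift table is only an assumption made for illustration.

```javascript
'use strict';

// Build the bad-character shift table: for each byte value, how far the
// search window may slide when that byte is the last byte of the window.
function buildShiftTable(pattern) {
    if (pattern.length === 0 || pattern.length > 255) {
        throw new RangeError('pattern length must be between 1 and 255 bytes');
    }
    // Assumption: a Uint8Array is enough, since no shift exceeds 255.
    const table = new Uint8Array(256).fill(pattern.length);
    for (let i = 0; i < pattern.length - 1; ++i) {
        table[pattern[i]] = pattern.length - 1 - i;
    }
    return table;
}

// Return the indexes of all non-overlapping matches of pattern in data.
function findAll(pattern, data) {
    const table = buildShiftTable(pattern);
    const m = pattern.length;
    const indexes = [];
    let pos = 0;
    while (pos + m <= data.length) {
        // Compare the window against the pattern from right to left.
        let j = m - 1;
        while (j >= 0 && data[pos + j] === pattern[j]) --j;
        if (j < 0) {
            indexes.push(pos);
            pos += m; // skip the whole match
        } else {
            pos += table[data[pos + m - 1]];
        }
    }
    return indexes;
}

const hits = findAll(Buffer.from('\r\n'), Buffer.from('a\r\nbb\r\n\r\nc'));
console.log(hits); // [ 1, 5, 7 ]
```

The right-to-left comparison plus the shift table is what lets the search skip bytes instead of testing every position.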
Install
$ npm install auntie [-g]
require:
const Auntie = require( 'auntie' );
Run Tests
to run all test files, install devDependencies:
$ cd auntie/
# install or update devDependencies
$ npm install
# run tests
$ npm test
to execute a single test file, simply do:
$ node test/file-name.js
output example and running time:
...
- current path is 'test'.
- time elapsed: 106.596 secs.
26 test files were loaded.
26 test files were launched.
1272671 assertions succeeded.
Run Benchmarks
$ cd auntie/
$ npm run bench
to execute a single bench file, simply do:
$ node bench/file-name.js
Constructor
Arguments between [ ] are optional.
Auntie( [ Buffer | String | Number sequence ] )
or
new Auntie( [ Buffer | String | Number sequence ] )
NOTE: the default is the CRLF sequence \r\n.
Properties
NOTE: do not mess with these properties.
The current sequence for splitting data
Auntie.seq : Buffer
the Boyer-Moore parser, under the hood.
Auntie.bop : Bop
a Boyer-Moore parser, to search for generic (sub)sequences
Auntie.gbop : Bop
the remaining data, without any match found.
Auntie.snip : Buffer
the remaining data, used for counting.
Auntie.csnip : Buffer
the current number of matches, min/max distance, remaining bytes.
Auntie.cnt : Array
Methods
| name | description |
|:------|:-------------------------------------------------------------------------------|
| count | count (only) how many times the sequence appears in the current data. |
| dist | count occurrences, min and max distance between sequences and remaining bytes. |
| do | split data or a stream of data by the current sequence. |
| flush | flush the remaining data, resetting internal state/counters. |
| set | set a new sequence for splitting data. |
| comb | search for a char or a sequence in the current data. |
Arguments between [ ] are optional.
Auntie.count
the fastest/lightest way to count how many times the sequence appears in the current data.
/*
* it returns an Array with the current number of occurrences.
*
* NOTE: it saves the minimum necessary data that does not contain
* the sequence, for the next #count call with fresh data (to check
* for single occurrences between 2 chunks of data).
*/
'count' : function ( Buffer data ) : Array
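The note about saving the minimum necessary data can be illustrated with a small sketch. It assumes that keeping the last `seq.length - 1` unmatched bytes is enough to catch a sequence split across two chunks. It uses `Buffer.indexOf` rather than Bop, returns a plain number instead of an Array, and `makeCounter` is a hypothetical helper, not part of Auntie.

```javascript
'use strict';

// Hypothetical helper: returns a function that counts occurrences of seq
// across successive chunks, keeping only a small tail between calls.
function makeCounter(seq) {
    let tail = Buffer.alloc(0); // bytes carried over from the previous chunk
    let total = 0;
    return function count(chunk) {
        const data = Buffer.concat([tail, chunk]);
        let pos = 0;
        let idx;
        while ((idx = data.indexOf(seq, pos)) !== -1) {
            ++total;
            pos = idx + seq.length;
        }
        // Keep at most seq.length - 1 unmatched bytes: enough to complete
        // a sequence that is split across two chunks.
        const keepFrom = Math.max(pos, data.length - (seq.length - 1));
        tail = data.subarray(keepFrom);
        return total;
    };
}

const count = makeCounter(Buffer.from('\r\n'));
const first = count(Buffer.from('one\r\ntwo\r'));  // ends in the middle of a CRLF
const second = count(Buffer.from('\nthree\r\n')); // completes the split CRLF
console.log(first, second); // 1 3
```

Without the carried-over `\r`, the CRLF that straddles the two chunks would be missed and the second call would report 2 instead of 3.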
Auntie.dist
count occurrences, min and max distance between sequences and remaining bytes.
/*
* it returns an Array with:
* - the current number of occurrences
* - the minimum distance, in bytes, between any 2 sequences
* - the maximum distance, in bytes, between any 2 sequences
* - the remaining bytes to the end of data (without any matching sequence)
*
* NOTE:
* - also the distance from index 0 to the first match will be considered
* - it saves the remaining data that does not contain the sequence,
* for the next #dist call with fresh data (to check for occurrences
* between chunks).
*/
'dist' : function ( Buffer data ) : Array
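As a rough illustration of the four values, the sketch below computes them for a single chunk. It assumes that "distance" means the number of bytes between the end of one match (or index 0) and the start of the next one; the README does not spell this out. The function is a hypothetical stand-in that omits the carry-over between chunks.

```javascript
'use strict';

// Hypothetical single-chunk helper returning [count, min, max, remaining].
// Assumption: a distance is the gap, in bytes, between the end of the
// previous match (or index 0) and the start of the next match.
function dist(seq, data) {
    let count = 0;
    let min = Infinity;
    let max = 0;
    let pos = 0;
    let idx;
    while ((idx = data.indexOf(seq, pos)) !== -1) {
        const gap = idx - pos;
        if (gap < min) min = gap;
        if (gap > max) max = gap;
        ++count;
        pos = idx + seq.length;
    }
    const remaining = data.length - pos; // trailing bytes with no match
    return [count, count ? min : 0, max, remaining];
}

const result = dist(Buffer.from('\r\n'), Buffer.from('ab\r\ncdef\r\n\r\ngh'));
console.log(result); // [ 3, 0, 4, 2 ]
```

Here the three gaps are 2, 4 and 0 bytes, and the trailing `gh` accounts for the 2 remaining bytes.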
Auntie.do
split data or a stream of data by the current sequence
/*
* if collect is true, it returns an Array of data slices; otherwise, it
* emits a 'snap' event for every slice; then, once it has finished
* parsing the data, it emits a 'snip' event with the remaining data that
* does not contain the sequence ( the current Auntie.snip property ).
*
* NOTE: it saves the remaining data that does not contain the
* sequence, for the next #do call on fresh data (to check for
* occurrences between chunks).
*/
'do' : function ( Buffer data [, Boolean collect ] ) : [ Array ]
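The collect mode and the saved remainder can be pictured with the following sketch. `makeSplitter` is a hypothetical stand-in built on `Buffer.indexOf`, not Auntie's implementation, and it assumes that the returned slices exclude the separator itself.

```javascript
'use strict';

// Hypothetical stand-in mirroring the documented collect mode of do/flush.
function makeSplitter(seq) {
    let snip = Buffer.alloc(0); // unmatched remainder kept between calls
    return {
        // Split chunk by seq and return the complete slices found so far.
        do(chunk) {
            const data = Buffer.concat([snip, chunk]);
            const slices = [];
            let pos = 0;
            let idx;
            while ((idx = data.indexOf(seq, pos)) !== -1) {
                slices.push(data.subarray(pos, idx)); // separator excluded (assumption)
                pos = idx + seq.length;
            }
            snip = data.subarray(pos);
            return slices;
        },
        // Return the remainder and reset the internal state.
        flush() {
            const rest = snip;
            snip = Buffer.alloc(0);
            return rest;
        }
    };
}

const splitter = makeSplitter(Buffer.from('\r\n'));
const a = splitter.do(Buffer.from('id,name\r\n1,An')).map(String);
const b = splitter.do(Buffer.from('na\r\n2,Bob')).map(String);
const rest = splitter.flush().toString();
console.log(a, b, rest); // [ 'id,name' ] [ '1,Anna' ] 2,Bob
```

The record `1,Anna` is split across the two chunks and only comes out of the second call, because the first call keeps `1,An` as the remainder.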
Auntie.flush
flush the remaining data, resetting internal state/counters
/*
* if collect is true, it returns a Buffer; otherwise, it emits
* a 'snip' event with the data. Obviously, the snip doesn't contain
* the sequence (no match). It is equivalent to getting and then
* resetting the internal me.snip property.
*/
'flush' : function ( [ Boolean collect ] ) : [ Buffer ]
Auntie.set
set a new sequence for splitting data.
// default sequence is '\r\n' or CRLF sequence.
'set' : function ( [ Buffer | String | Number sequence ] ) : Auntie
Auntie.comb
search for a char or a sequence in the current data.
/*
* parse current data for a generic sequence. It returns an Array of indexes.
* NOTE: it doesn't affect the current streaming parser and it doesn't save
* any data. It simply parses a chunk of data for the specified sequence,
* optionally from a starting index and limiting results to a specified number
* of occurrences (like Bop.parse does).
*/
'comb' : function ( Buffer | String seq, Buffer data [, Number from [, Number limit ] ] ) : Array
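The meaning of the optional `from` and `limit` arguments can be sketched as follows. This is a hypothetical stand-in using `Buffer.indexOf`, not Auntie's code, and it assumes that matches are reported without overlapping.

```javascript
'use strict';

// Hypothetical stand-in for a stateless search: returns the indexes of seq
// in data, starting at `from` and stopping after `limit` matches.
function comb(seq, data, from = 0, limit = Infinity) {
    const needle = Buffer.isBuffer(seq) ? seq : Buffer.from(String(seq));
    const indexes = [];
    if (needle.length === 0) return indexes; // nothing to search for
    let pos = from;
    let idx;
    while (indexes.length < limit && (idx = data.indexOf(needle, pos)) !== -1) {
        indexes.push(idx);
        pos = idx + needle.length; // non-overlapping matches (assumption)
    }
    return indexes;
}

const data = Buffer.from('a,b,,c,d');
const all = comb(',', data);        // every comma
const some = comb(',', data, 2, 2); // start at index 2, at most 2 results
console.log(all, some); // [ 1, 3, 4, 6 ] [ 3, 4 ]
```

Because nothing is stored between calls, the same chunk can be searched repeatedly with different sequences.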
Events
Auntie emits only 2 types of events: snap and snip.
!snap a result.
'snap' : function ( Buffer result )
!snip current remaining data (with no match found).
'snip' : function ( Buffer result )
NOTE: if the 'collect' switch for do/flush is set (true), no event will be emitted.
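The order in which the two events fire can be seen in the sketch below. `MiniSplitter` is a hypothetical stand-in built on Node's `EventEmitter` and `Buffer.indexOf`; it only mimics the documented event names and is not Auntie's implementation.

```javascript
'use strict';

const { EventEmitter } = require('events');

// Hypothetical stand-in: emits 'snap' per slice and 'snip' for the remainder.
class MiniSplitter extends EventEmitter {
    constructor(seq) {
        super();
        this.seq = seq;
        this.snip = Buffer.alloc(0); // unmatched remainder kept between calls
    }
    do(chunk) {
        const data = Buffer.concat([this.snip, chunk]);
        let pos = 0;
        let idx;
        while ((idx = data.indexOf(this.seq, pos)) !== -1) {
            this.emit('snap', data.subarray(pos, idx));
            pos = idx + this.seq.length;
        }
        this.snip = data.subarray(pos);
        this.emit('snip', this.snip); // fired once, after all the snaps
    }
}

const events = [];
const splitter = new MiniSplitter(Buffer.from('\r\n'));
splitter.on('snap', (result) => events.push('snap:' + result));
splitter.on('snip', (result) => events.push('snip:' + result));
splitter.do(Buffer.from('GET /\r\nHost: x\r\nAcc'));
console.log(events); // [ 'snap:GET /', 'snap:Host: x', 'snip:Acc' ]
```

With the real module, handlers would presumably be attached in the same way, assuming Auntie exposes the standard `on` method of an event emitter.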
Examples
split lines from a CSV file (CRLF):
count lines from a file (CRLF):
snap event and collect (CRLF):
See All examples.
MIT License
Copyright (c) 2017-present < Guglielmo Ferri : [email protected] >
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the 'Software'), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.