segmenta

v1.0.37

A fast API for managing and querying arbitrary data segments stored in Redis

Segmenta: a segments api using Redis for storage

What does it do?

Provides a mechanism for storing and retrieving sets of numbers quickly as well as performing operations with those sets. Currently supported are:

  • and: produce the set C of numbers which are in both A and B
    • [ 1, 2, 3 ] and [ 2, 3, 4 ] = [ 2, 3 ]
  • or: produce the set C of numbers which are in either A or B
    • [ 1, 2, 3 ] or [ 2, 3, 4 ] = [ 1, 2, 3, 4 ]
  • not: produce the set C of numbers which are in A excluding those in B
    • [ 1, 2, 3 ] not [ 2, 3, 4 ] = [ 1 ]
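
As a rough illustration of the three operations above, here is a minimal sketch using plain arrays. The helper names (intersect, union, difference) are made up for this example and are not part of segmenta, which works on data stored in Redis rather than on arrays like these.

```javascript
// Plain-array sketch of and / or / not; NOT segmenta's implementation.
// intersect, union and difference are hypothetical helper names.

// and: numbers which are in both a and b
const intersect = (a, b) => {
  const inB = new Set(b);
  return a.filter((n) => inB.has(n));
};

// or: numbers which are in either a or b (deduplicated, ascending)
const union = (a, b) => [...new Set([...a, ...b])].sort((x, y) => x - y);

// not: numbers which are in a, excluding those in b
const difference = (a, b) => {
  const inB = new Set(b);
  return a.filter((n) => !inB.has(n));
};

console.log(intersect([1, 2, 3], [2, 3, 4]));  // [ 2, 3 ]
console.log(union([1, 2, 3], [2, 3, 4]));      // [ 1, 2, 3, 4 ]
console.log(difference([1, 2, 3], [2, 3, 4])); // [ 1 ]
```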

How to use?

  1. import or require segmenta
    • javascript: const Segmenta = require("segmenta");
    • typescript: import Segmenta from "segmenta";
    • the only export of the library is the Segmenta class
  2. create an instance with options, if required:
    • const segmenta = new Segmenta()
    • const segmenta = new Segmenta(options)
      • options have the structure:
        {
          redisOptions?: RedisOptions,
          segmentsPrefix?: string,
          bucketSize?: number,
          resultsTTL?: number,
        }
        where:
        • RedisOptions are the options which would be passed to ioredis to initialize (eg host, port, etc)
        • segmentsPrefix is a prefix to apply to all segments keys (defaults to "segments")
        • bucketSize is the max size to use when creating buckets (defaults to 50kb)
        • resultsTTL is the time you'd like result snapshots to live for when not explicitly released (defaults to 1 day)
  3. Populating data
    • add adds ids to the segment
      await segmenta.add("my-segment", [ 1, 2, 3 ]);
      // my-segment now contains [ 1, 2, 3 ]
    • del deletes ids from the segment
      await segmenta.del("my-segment", [ 2, 3 ]);
      // my-segment now contains just [ 1 ]
    • put takes a sequence of add / del commands and performs them in order (useful for batching streaming data)
      await segmenta.put("my-segment", [
        { add: 5 },
        { del: 1 },
        { add: 4 },
        { del: 5 },
        { add: 1 }
      ]);
      // my-segment now contains [ 1, 4 ]
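
To see why the put example above ends with [ 1, 4 ], it may help to replay the commands in order against an in-memory set. This is only a sketch: applyCommands is a hypothetical helper, and a plain Set stands in for the Redis-backed segment.

```javascript
// Hypothetical in-memory stand-in for a segment; segmenta itself stores data in Redis.
function applyCommands(ids, commands) {
  const segment = new Set(ids);
  // Commands are applied strictly in the order given.
  for (const command of commands) {
    if ("add" in command) segment.add(command.add);
    if ("del" in command) segment.delete(command.del);
  }
  return [...segment].sort((a, b) => a - b);
}

// Start from [ 1 ], the state of my-segment after the del example.
const afterPut = applyCommands([1], [
  { add: 5 },
  { del: 1 },
  { add: 4 },
  { del: 5 },
  { add: 1 },
]);
console.log(afterPut); // [ 1, 4 ]
```

Because order matters, swapping the last two commands would still give [ 1, 4 ] here, but moving { del: 1 } to the end would leave only [ 4 ].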
  4. Query
    • results are returned as an object with the shape:

      {
        ids: number[],
        skipped: number,
        count: number,
        resultSetId: string, // used to re-query against snapshot
        total: number
      }
    • simple queries are supported (entire segments), however a DSL is provided for easier querying. The client can perform and, or, and not operations on results if they are retrieved as buffers (see example below in DSL area).

      1. Simple query, all results returned:
      await segmenta.add("my-new-segment", [ 10, 20, 30 ]);
      const result = await segmenta.query("my-new-segment");
      /* result looks like:
      {
        ids: [ 10, 20, 30 ],
        skipped: 0,
        count: 3,
        resultSetId: "4deee554-da28-4029-8231-98060fa014dc",
        total: 3
      }
      */
      2. Paged query:
      await segmenta.add("paged-results-segment", [ 1, 2, 3, 4, 5 ]);
      const result1 = await segmenta.query({
        query: "paged-results-segment",
        skip: 0,
        take: 2
      });
      /* result1 looks like:
      {
        ids: [ 1, 2 ],
        skipped: 0,
        count: 2,
        resultSetId: "63c6a1f0-8aec-4249-9d80-63c5de13b942",
        total: 5
      }
      */
      // the rest of the results can be obtained with:
      const result2 = await segmenta.query({
        query: "63c6a1f0-8aec-4249-9d80-63c5de13b942",
        skip: 2
      });
      3. Paged results are snapshotted and can be re-queried by using their id (uuid). Snapshots automatically expire after 24 hours (or the number of seconds specified by resultsTTL in your constructor arguments). You may manually dispose of results when you no longer need them:
      const result = await segmenta.query({ query: "my-set", skip: 0, take: 10 });
      await segmenta.dispose(result.resultSetId);

      Snapshots are only created when queries are performed with a positive integer skip or take value.
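
The paging fields in the result shape can be sketched with a plain array standing in for a snapshot. This is an assumption-laden illustration: page is a hypothetical helper, resultSetId is left out, and the second output is what the first example's fields suggest rather than something taken from segmenta's output.

```javascript
// Hypothetical helper: pages an in-memory id list so that ids, skipped,
// count and total line up with the result shape described above.
function page(allIds, { skip = 0, take } = {}) {
  // With no take, return everything after the skipped ids.
  const end = take === undefined ? undefined : skip + take;
  const ids = allIds.slice(skip, end);
  return { ids, skipped: skip, count: ids.length, total: allIds.length };
}

const snapshot = [1, 2, 3, 4, 5];

console.log(page(snapshot, { skip: 0, take: 2 }));
// { ids: [ 1, 2 ], skipped: 0, count: 2, total: 5 }

console.log(page(snapshot, { skip: 2 }));
// { ids: [ 3, 4, 5 ], skipped: 2, count: 3, total: 5 }
```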

    • There is a DSL for querying in a more readable manner:

      await segmenta.add("set1", [ 1, 2 ]);
      await segmenta.add("set2", [ 3, 4, 5 ]);
      await segmenta.add("set3", [ 2, 3, 5, 6, 7 ]);
      await segmenta.add("set4", [ 5, 6 ]);
      // ... some time later ...
      const query = "get where in 'set1' or 'set2' and 'set3' not 'set4'";
      const result1 = await segmenta.query(query);
      // or, with paging options:
      const result2 = await segmenta.query({ query, skip: 10, take: 100 });
      
      // the query syntax above is analogous to the following
      //  more manual query mechanism:
      const set1 = await segmenta.getBuffer("set1");
      const set2 = await segmenta.getBuffer("set2");
      const set3 = await segmenta.getBuffer("set3");
      const set4 = await segmenta.getBuffer("set4");
      // these operations are fast, acting on bitfields in memory.
      const final = set1          // [ 1, 2 ]
                      .or(set2)   // [ 1, 2, 3, 4, 5 ]
                      .and(set3)  // [ 2, 3, 5 ]
                      .not(set4)  // [ 2, 3 ]
                      .getOnBitPositions()
                      .values; // returns the numeric array for bit positions

      One may also query for counts only:

      await segmenta.query("count where in 'x'");

      One may also query for results to come back in a random order:

      await segmenta.query("random where in 'x'");

      When paging is requested and a resultSetId is returned, re-querying against that result set will maintain the original randomized order.

      Query syntax is quite simple:

      (GET | COUNT | RANDOM) WHERE IN('segment-id')
                  [(AND|OR|NOT) IN('other-segment')]...
                  [MIN {int}]
                  [MAX {int}]
                  [SKIP {int}]
                  [TAKE {int}]
      • segments are identified by strings (single- or double-quoted)
      • three operations are supported: GET, COUNT and RANDOM
      • the results of COUNT look like GET except no segment data is returned. Use the total field in the result to read your count value.
      • boolean operations are run left-to-right
      • operations may be grouped with brackets, in which case they are evaluated first, eg: GET WHERE IN('x') AND NOT (IN('y') OR IN('z'))
        • retrieves values which are in 'x' and in neither 'y' nor 'z'
      • brackets around segment ids are optional: GET WHERE IN 'x' is equivalent to GET WHERE IN('x')
      • the IN keyword is optional after the first usage: GET WHERE IN 'x' and IN 'y' is equivalent to GET WHERE IN 'x' AND 'y'
      • syntax is case-insensitive: GET WHERE IN 'x' is equivalent to get where in 'x' and Get Where In('x')
      • skip and take can also be set on the query options -- when doing so, the skip/take values on query options take precedence. This allows easy re-use of natural-language query with changing paging values, but also facilitates natural language paging if that is your preference.
      • MIN and MAX set the minimum and maximum values to bring back in the result set. These values are inclusive. This may be useful when SKIP doesn't suit your chunking needs and you would rather page by setting a MIN and a TAKE.
      • min and max can also be set on query options. As with skip and take, the query options values for min and max override any natural language specification
      • segment ids are case-sensitive
        • get where in 'MY-SEGMENT' is NOT equivalent to get where in 'my-segment'
      • segment ids may not contain quotations
        • they must be valid redis keys
      • some queries will never make sense, so expect either strange results or parse errors: GET WHERE NOT IN 'x'
        • since the segments are open-ended, this is essentially an infinite set of all numbers, excluding those in segment 'x'
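
The left-to-right and bracket rules above can be sketched on in-memory arrays. Everything here is an assumption made for illustration: the segments object, the ops table and the evaluate helper are hypothetical, and the real library parses a query string and works against Redis. A chain is written as a first term followed by [operation, term] steps, and a nested chain plays the role of a bracketed group.

```javascript
// Hypothetical in-memory segments (same values as the DSL example above).
const segments = {
  set1: [1, 2],
  set2: [3, 4, 5],
  set3: [2, 3, 5, 6, 7],
  set4: [5, 6],
};

const ops = {
  and: (a, b) => a.filter((n) => b.includes(n)),
  or: (a, b) => [...new Set([...a, ...b])],
  not: (a, b) => a.filter((n) => !b.includes(n)),
};

// A term is a segment name (string) or a nested chain (array).
// Nested chains are evaluated first, like a bracketed group.
function evaluate(chain) {
  const resolve = (term) => (Array.isArray(term) ? evaluate(term) : segments[term]);
  const [first, ...steps] = chain;
  // Fold the steps strictly left to right.
  const result = steps.reduce((acc, [op, term]) => ops[op](acc, resolve(term)), resolve(first));
  return [...result].sort((a, b) => a - b);
}

// get where in 'set1' or 'set2' and 'set3' not 'set4'
const flat = evaluate(["set1", ["or", "set2"], ["and", "set3"], ["not", "set4"]]);
console.log(flat); // [ 2, 3 ]

// get where in 'set1' or ('set2' and 'set3')
const grouped = evaluate(["set1", ["or", ["set2", ["and", "set3"]]]]);
console.log(grouped); // [ 1, 2, 3, 5 ]
```

Without the brackets, 'set1' or 'set2' and 'set3' folds to [ 2, 3, 5 ], so grouping changes the result.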