npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2024 – Pkg Stats / Ryan Hefner

@plato/analytics

v0.1.6

Published

Track custom analytics in Node.js, and persist to long-term storage

Downloads

105

Readme

@plato/analytics

Node.js CI npm version

Track custom analytics in Node.js, and persist to long-term storage, such as a data lake.

Usage

Install via Yarn (or npm):

yarn add @plato/analytics

Define a schema, create a collector, and store events in S3:

import { Collector, StoreS3 } from "@plato/analytics";

// STEP 1: Create a schema for your data =======================================
// This will give you strong typing when calling .track

/** Demo analytics schema */
type ExampleSchema = {
	/** A table of user events associated with a resource */
	user_events: {
		/** The time at which the event occurred */
		event_time: Date;
		/** The type of event which occurred */
		event_type: string;
		/** A resource ID associated with this event */
		resource_id: string;
		/** A user ID who performed the event */
		user_id: string;
	};
};

// STEP 2: Create a long-term data store =======================================
// Batches of analytics data are sent to long term storage when they reach
// specified thresholds, or when the analytics collector is stopped

// Create S3 storage layer which writes data to a specified bucket
const store = new StoreS3("my_bucket");

// STEP 3: Create an analytics collector =======================================

// Create analytics collector respecting our DemoSchema
const analytics = new Collector<ExampleSchema>(store);

// Receive errors from the collector
analytics.onError.receive((e) => console.log(`ERROR: ${e.message}`));

// Receive flush notifications from the collector
analytics.onFlush.receive((info) => console.log(`FLUSH: ${JSON.stringify(info)}`));

// STEP 4: TRACK ALL THE THINGS! ===============================================

// Track a demo event, for example
analytics.track("user_events", {
	event_time: new Date(),
	event_type: "touch",
	resource_id: "9438b068-c25b-47a0-b0fe-e8acc6f80ace",
	user_id: "03d00e69-ea1e-4c39-ade6-803ff3f00e99",
});

// STEP 5: Stop the collector ==================================================

// Stop collecting and wait for graceful shutdown
await analytics.stop();

See the API Reference for more information.

Wildcard Tables

The collector supports wildcard table names, which share a schema. For example, let's say we required a set of tables with custom events for specific games. We would define the schema for these table as follows:

type ExampleWildcardSchema = {
	game_custom_event_$: { // <--- Notice the "$" wildcard token in the table name
		event_time: Date;
		event_type: string;
		event_value: string;
		session_id: string;
	};
}

When tracking events, supply a third parameter with the wildcard value, for example:

// Track a custom event for Pool
analytics.track("game_custom_event_$", {
	event_time: new Date(),
	event_type: "foo",
	event_value: "bar",
	session_id: "9438b068-c25b-47a0-b0fe-e8acc6f80ace",
}, "pool"); // <--- Notice the wildcard value

// Track a custom event for Bowling
analytics.track("game_custom_event_$", {
	event_time: new Date(),
	event_type: "foo",
	event_value: "bar",
	session_id: "9438b068-c25b-47a0-b0fe-e8acc6f80ace",
}, "bowling"); // <--- Same schema, different wildcard value

This will result in two separate tables of data: game_custom_event_pool and game_custom_event_bowing (yet both sharing the schema defined for game_custom_event_$).

Specification

Objects

Objects are files within a data lake (such as AWS S3). Each object uploaded to the data lake will conform to the following specifications:

  1. Contain data for a single table adhering to the defined CSV format
  2. Compressed with gzip
  3. Object key names must conform to the following format: YYYY/MM/DD/HH/{TABLE_NAME}/{GUID}.csv(.gz)
    • YYYY/MM/DD/HH is the 4 digit year, 2 digit month, 2 digit day, and 2 digit hour of the UTC time when this object was created
    • {TABLE_NAME} must be unique across all data sources within this lake and contain only alphanumeric characters and underscores, e.g. [a-zA-Z0-9_]
    • Per convention, {TABLE_NAME} should contain a unique prefix indicating the data source, e.g. game_*
    • {GUID} is a Version 4 GUID

CSV Format

CSV files uploaded to the data lake will adhere to the following specifications:

  1. Conform to RFC 4180 - Common Format for Comma-Separated Values
  2. Contain only records for a single table
  3. Column headers must be declared in the first non-comment line
  4. Column headers must contain only alphanumeric characters ([a-zA-Z0-9]) and underscores (_)
  5. Column types should be declared as a comment in the first line, with the following specifications:
    • Aside from the comment character (#), the types row must contain a comma-separated list of text values indicating the column's data type
    • Valid data types are as follows:
      • text
      • timestamp(z)
      • int

Example CSV Data

#text,timestampz,int
session_id,end_time,end_reason
oiccvmz0cvuu-2my7wvf99enmi,2020-06-08T17:13:49.062Z,0
2slv18vtkjyl4-28nz20jhsdt3w,2020-06-08T17:13:49.112Z,2
2jk1eocbnkcys-2zffiy4q44z4o,2020-06-08T17:13:49.523Z,0