etl-influx

This is a Node module that provides functions to help you easily ETL data from different data sources into InfluxDB. Currently, ETL from CSV files and from Splunk is supported. Other sources, such as Elasticsearch, may be added in the future.

Installation

$ npm install --save etl-influx

Usage

Query and Mapping API

Please see the source for JSDoc-style comments describing the query and mapping interfaces:

  • src/ingesters/map-to-point.ts

  • src/ingesters/ingester.ts
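
As a rough orientation, the mapping entries used in the examples below have this shape. This is a sketch inferred from usage, not the authoritative definition (which lives in src/ingesters/map-to-point.ts); in particular, the optionality of to, default, and transform is an assumption.

interface Mapping {
    // where the value lands on the influx point
    type: "timestamp" | "tag" | "field"
    // source column (CSV) or result field (Splunk) to read from
    from: string
    // name of the tag/field on the influx point (assumed to default to `from`)
    to?: string
    // fallback value when the source field is missing (assumption)
    default?: string
    // optional conversion, e.g. parsing a string into a number or epoch millis
    transform?: (value: string) => any
}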

Monitoring progress

Basic progress events are emitted by each ingester. The CSV ingester emits one event for every 10,000 lines processed, and the Splunk ingester emits one after each time slice has been processed.

Clean Up / Destroying ingesters

After you are finished with an ingester, it is recommended that you destroy it by calling its destroy() method. This cleans up the intervals that are used to batch point submissions to InfluxDB.
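
For example, a minimal sketch using the config and query objects from the sections below. It assumes ingest() returns a promise that resolves when ingestion completes, which the chained examples below don't confirm:

import {CsvIngester} from "etl-influx"

const ingester = await CsvIngester.create(config)
try {
    // assumed: ingest() resolves once all points have been submitted
    await ingester.ingest(query)
} finally {
    // always stop the batching interval, even if ingestion fails
    ingester.destroy()
}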

CSV File Ingester

Given a CSV file with a format like:

timestamp,service,route,result,elapsed
"2019-08-08T15:51:53.175+0000","my-service","some/api/endpoint",200,14
"2019-08-08T15:51:52.145+0000","my-service","some/api/endpoint",200,24

you can ETL it into InfluxDB like so. Note that the mappings' from fields should correspond to the column names defined on the first line of the file.

The parseLine function is required: it lets you specify the field delimiter and, optionally, clean up the data (e.g. removing the quotes, as in the example).

import moment from "moment"
import {CsvIngester} from "etl-influx"

const config = {
    influx: {
        host: "localhost",
        protocol: "http",
        username: "root",
        password: "root",
        database: "test"
    }
}

const query = {
    filename: "./data/1565279533_36391.csv",
    // split on the field delimiter and strip the surrounding quotes
    parseLine: line => line.split(",").map(it => it.replace(/"/g, "")),
    measurement: "request_durations",
    mappings: [
        // each `from` matches a column name on the first line of the file
        { type: "timestamp", from: "timestamp", transform: value => moment(value).valueOf() },
        { type: "tag", from: "service", to: "service_name", default: "unknown" },
        { type: "tag", from: "route", to: "operation_name", default: "unknown" },
        { type: "tag", from: "result", to: "status_code", default: "unknown" },
        { type: "field", from: "elapsed", to: "duration", transform: value => parseInt(value, 10) },
    ],
}

const ingester = await CsvIngester.create(config)
ingester
    .onProgress(({totalPoints}) => console.log(`processed ${totalPoints} so far`))
    .ingest(query)

Splunk Historic Rest Ingester

You can ETL data out of Splunk using its REST API. Note that this is implemented as a series of searches, each covering a time range based on the stepSize duration parameter.

The mappings provided are parsed into a | table col1, col2, col3 statement, which is appended to the provided query; col1, etc. are sourced from the mappings' from properties.
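
For instance, the mappings in the example below would produce a search along these lines (illustrative only; the exact spacing and ordering of the generated statement is an assumption):

index=* message="request complete" elapsed != null | table _time, service, route, result, elapsed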

import moment from "moment"
import {SplunkHistoricRestIngester} from "etl-influx"

const config = {
    influx: {
        host: "localhost",
        protocol: "http",
        username: "root",
        password: "root",
        database: "test"
    },
    splunk: {
        host: "localhost",
        username: "admin",
        password: "admin"
    }
}

const query = {
    // search from the start of yesterday up to now, one search per hour
    from: moment().subtract(1, "day").startOf("day"),
    to: moment(),
    stepSize: moment.duration(1, "hour"),

    query: `index=* message="request complete" elapsed != null`,

    measurement: "request_durations",
    mappings: [
        { type: "timestamp", from: "_time", transform: value => moment(value).valueOf() },
        { type: "tag", from: "service", to: "service_name", default: "unknown" },
        { type: "tag", from: "route", to: "operation_name", default: "unknown" },
        { type: "tag", from: "result", to: "status_code", default: "unknown" },
        { type: "field", from: "elapsed", to: "duration", transform: value => parseInt(value, 10) },
    ],
}

const ingester = await SplunkHistoricRestIngester.create(config)
ingester
    .onProgress(({totalPoints}) => console.log(`processed ${totalPoints} so far`))
    .ingest(query)

Future Plans (contributions welcome)

  • Enhance progress reporting

  • A real-time Splunk ingester (I have one written, but it is untested due to insufficient permissions on the Splunk instance I'm testing against)

  • An Elasticsearch ingester

  • Add unit tests