Kafka Pipelines for Flo.w

Introduction

Kafka Pipelines for Flo.w is a NodeJS library for building Kafka stream processing microservices. The library is built around the idea of a 'pipeline' that consumes messages from a source and can produce messages to a sink. A simple domain-specific language (DSL) is provided to build a pipeline by specifying a source, a sink, and intermediate processing steps such as map and filter operations.

The library builds heavily on NodeJS Readable and Writable streams and Transforms, and on KafkaJS, the underlying Kafka client library. Refer to the KafkaJS documentation to understand the available configuration options.
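
To give a flavour of the DSL, here is a minimal sketch of a consume-transform-produce pipeline. The Pipeline export, its constructor options, and the run() call are illustrative assumptions; the step names (fromTopic, map, toTopic) are those listed under Pipeline DSL below.

import { Pipeline } from 'flow-kafka-pipelines';

// Assumed constructor: connection options passed through to KafkaJS.
const pipeline = new Pipeline({ brokers: ['localhost:9092'] });

// Consume from 'topic1', upper-case each value, produce to 'topic2'.
pipeline
  .fromTopic('topic1')
  .map((value: string) => value.toUpperCase())
  .toTopic('topic2');

// Assumed start method; the real entry point may differ.
pipeline.run().catch(console.error);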

Installation

The repository for this library can be found at https://bitbucket.org/emu-analytics/flow-kafka-pipelines. You can install it as an NPM package into your project as shown below:

# Install NPM package
npm install flow-kafka-pipelines

API Documentation

To build the API documentation run:

npm run gen-doc

To view the documentation run:

npm run view-doc

Examples

See the examples directory for a set of example pipelines that work together (in two flavours: JSON format and AVRO format).

Producer Pipeline

The producer pipeline reads lines from STDIN and produces a Kafka message to the 'topic1' topic for each line.
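
A sketch of what this might look like with the DSL; fromReadable and toTopic are the documented steps, while the Pipeline constructor and run() are assumptions as in the sketch above.

import { Pipeline } from 'flow-kafka-pipelines';

const pipeline = new Pipeline({ brokers: ['localhost:9092'] }); // assumed options

// STDIN is a NodeJS Readable stream; one Kafka message is produced per line
// (line splitting is assumed to be handled by the source step or a transform).
pipeline
  .fromReadable(process.stdin)
  .toTopic('topic1');

pipeline.run().catch(console.error); // assumed start method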

Processor Pipeline

The processor pipeline consumes messages from the 'topic1' topic and performs some simple manipulation to demonstrate a typical consume-process-produce pipeline, producing the processed results to the 'topic2' topic. It also demonstrates the Kafka-backed in-memory cache provided by the Cache class, which is used to store the number of messages processed and the total line length. You should be able to stop and restart the processor pipeline without losing this ongoing state.
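
A sketch of the consume-process-produce shape with the cache. The Cache construction options, async map support, and run() are assumptions; get and set are the documented cache methods.

import { Pipeline, Cache } from 'flow-kafka-pipelines';

// Kafka-backed cache for the running totals (construction options assumed).
const cache = new Cache({ topic: 'processor-state', brokers: ['localhost:9092'] });

const pipeline = new Pipeline({ brokers: ['localhost:9092'] });

pipeline
  .fromTopic('topic1')
  .map(async (line: string) => {
    // Persist running totals so a restarted processor resumes where it left off.
    const count = ((await cache.get('count')) ?? 0) + 1;
    const totalLength = ((await cache.get('totalLength')) ?? 0) + line.length;
    await cache.set('count', count);
    await cache.set('totalLength', totalLength);
    return line.trim(); // some simple manipulation
  })
  .toTopic('topic2');

pipeline.run().catch(console.error);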

Consumer Pipeline

The consumer pipeline consumes messages from the 'topic2' topic and writes its output to STDOUT.
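
The consumer end of the chain is the mirror image of the producer sketch above, under the same assumptions:

import { Pipeline } from 'flow-kafka-pipelines';

const pipeline = new Pipeline({ brokers: ['localhost:9092'] }); // assumed options

// STDOUT is a NodeJS Writable stream.
pipeline
  .fromTopic('topic2')
  .toWritable(process.stdout);

pipeline.run().catch(console.error); // assumed start method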

Aggregator Pipeline

The aggregator pipeline consumes messages from the 'topic1' topic and counts the occurrences of messages, grouped by message content.
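
A sketch using the aggregate step. The aggregate signature below (window size, grouping key, reducer) is purely illustrative, so check the generated API documentation for the real one.

import { Pipeline } from 'flow-kafka-pipelines';

const pipeline = new Pipeline({ brokers: ['localhost:9092'] }); // assumed options

pipeline
  .fromTopic('topic1')
  .aggregate({
    windowMs: 60_000,                          // hypothetical: 1-minute window
    key: (message: string) => message,         // hypothetical: group by content
    reduce: (count: number = 0) => count + 1,  // hypothetical: count occurrences
  })
  .toTopic('counts');

pipeline.run().catch(console.error);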

Pipeline DSL

| Pipeline Step | Description |
|---------------|-------------|
| fromTopic | Consume messages from a Kafka topic |
| toTopic | Produce messages to a Kafka topic |
| fromReadable | Consume messages from a generic NodeJS Readable stream |
| toWritable | Produce messages to a generic NodeJS Writable stream |
| map | Transform a message via a map function |
| filter | Filter messages using a predicate function |
| aggregate | Perform windowed aggregation |
| pipe | Transform a message via a generic NodeJS Transform |

Stateful Cache

The Cache class provided by this library is a Kafka-backed in-memory cache. It is designed to allow a stateful processing microservice to be restarted and continue from where it left off. The cache provides the usual get and set methods for storing key/value pairs. Keys should be strings; values can be any JavaScript type.

Cache updates and deletions are persisted to a Kafka topic as a change stream. The topic is created automatically if it doesn't already exist. Log compaction is enabled by default so that, logically, only the latest value for each key is retained. On initialization, the persisted change stream is fully consumed to rebuild the state of the cache. You should wait for initialization to finish before starting your processing pipeline.
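
A sketch of the cache lifecycle; get and set are the documented methods, while the constructor options and the init() call are assumptions for illustration.

import { Cache } from 'flow-kafka-pipelines';

async function main() {
  // Backing topic for the change stream (construction options assumed).
  const cache = new Cache({ topic: 'my-service-state', brokers: ['localhost:9092'] });

  // Assumed init(): fully consumes the change-stream topic to rebuild state.
  // Wait for this before starting the processing pipeline.
  await cache.init();

  const count = ((await cache.get('count')) as number | undefined) ?? 0;
  await cache.set('count', count + 1); // persisted to the change-stream topic
}

main().catch(console.error);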