node-red-contrib-sparkml

v1.0.0

NodeRED Extension Pack for SparkML

This is a Node-RED extension pack containing a set of nodes that offer Spark DataFrame, SQL, and machine learning functionality. All nodes have a Python/PySpark core.

The pack provides a visual, drag-and-drop interface for building machine learning flows with Spark.

Features

Drag-and-drop Spark ML

Functionalities

This project is a work in progress. I plan to add more nodes, ideally one for every Transformer and Estimator available in Spark. In the lists below, checked items are already implemented and unchecked items are planned.

Feature Extractors

  • [x] TF-IDF
  • [ ] Word2Vec
  • [x] CountVectorizer
  • [ ] FeatureHasher

Feature Transformers

  • [x] Tokenizer
  • [ ] StopWordsRemover
  • [ ] n-gram
  • [ ] Binarizer
  • [ ] PCA
  • [x] StringIndexer
  • [ ] IndexToString
  • [ ] OneHotEncoderEstimator
  • [x] VectorIndexer
  • [x] SQLTransformer
  • [x] VectorAssembler

Classification Algorithms

  • [x] Decision Tree Classifier
  • [x] Logistic Regression
  • [x] Gradient-boosted Tree Classifier
  • [x] Multilayer Perceptron
  • [x] Random Forest Classifier
  • [ ] Support Vector Machines
  • [ ] k-Nearest Neighbour Classifier

Clustering Algorithms

  • [ ] K-Means Clustering
  • [ ] Latent Dirichlet Allocation (LDA)

Prerequisites

Make sure you have a working installation of Node-RED, then install the following:

  • Python 3.6.4 or higher, accessible through the command 'python' (on Linux, 'python3')
  • PySpark
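
The prerequisites above can be checked from Python itself. The following is a minimal sketch using only the standard library; the helper names `python_is_recent_enough` and `module_is_installed` are made up for this illustration and are not part of this package.

```python
import importlib.util
import sys

# Minimum interpreter version stated in the prerequisites.
MIN_PYTHON = (3, 6, 4)


def python_is_recent_enough(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the given version meets the stated minimum."""
    return tuple(version_info[:3]) >= minimum


def module_is_installed(name):
    """Return True when a top-level module can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print("Python version OK:", python_is_recent_enough())
    print("PySpark found:", module_is_installed("pyspark"))
```

Run it with the same command the nodes will use ('python', or 'python3' on Linux) so that the check applies to the right interpreter.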

Install

To install the latest version, use the Manage palette option in the Node-RED menu and search for node-red-contrib-sparkml, or run the following command in your Node-RED user directory (typically ~/.node-red):

npm i node-red-contrib-sparkml
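
To confirm that the install landed in the user directory, one option is to read the package manifest that npm writes under node_modules. This is a small sketch, assuming the default ~/.node-red location mentioned above; the function name `installed_version` is hypothetical.

```python
import json
from pathlib import Path

PACKAGE = "node-red-contrib-sparkml"


def installed_version(user_dir=None, package=PACKAGE):
    """Return the installed version string, or None if the package is absent."""
    if user_dir is None:
        # Assumption: the default Node-RED user directory.
        user_dir = Path.home() / ".node-red"
    manifest = Path(user_dir) / "node_modules" / package / "package.json"
    if not manifest.is_file():
        return None
    with manifest.open(encoding="utf-8") as fh:
        return json.load(fh).get("version")


if __name__ == "__main__":
    print(installed_version())
```

If your Node-RED user directory is somewhere else, pass its path as `user_dir`.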

Usage

These flows create a dataset, train a model, and then evaluate it. After training, models can be used in real scenarios to make predictions.

There is an example flow and a test dataset available in the 'test' folder.
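
For orientation, here is roughly what a create, train, and evaluate flow corresponds to in plain PySpark. This is an illustrative sketch, not the code inside the nodes: the function name `train_and_evaluate`, the `SAMPLE_ROWS` data, and the column names are made up, TF-IDF is built here from `HashingTF` followed by `IDF`, and the model is scored on its own training data only to keep the example short. It assumes PySpark is installed.

```python
# Hypothetical toy dataset: (text, label) pairs with labels as floats.
SAMPLE_ROWS = [
    ("spark makes machine learning scalable", 1.0),
    ("node red wires flows together", 0.0),
    ("spark trains models on dataframes", 1.0),
    ("drag nodes onto the canvas", 0.0),
]


def train_and_evaluate(rows=SAMPLE_ROWS):
    """Train a small text classifier and return its area under the ROC curve."""
    # Imported lazily so the module loads even where PySpark is missing.
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.evaluation import BinaryClassificationEvaluator
    from pyspark.ml.feature import IDF, HashingTF, Tokenizer
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]")
        .appName("sparkml-sketch")
        .getOrCreate()
    )
    try:
        # Step 1: create the dataset.
        df = spark.createDataFrame(rows, ["text", "label"])

        # Step 2: chain feature extraction and a classifier, then train.
        pipeline = Pipeline(stages=[
            Tokenizer(inputCol="text", outputCol="words"),
            HashingTF(inputCol="words", outputCol="tf", numFeatures=1024),
            IDF(inputCol="tf", outputCol="features"),
            LogisticRegression(featuresCol="features", labelCol="label", maxIter=10),
        ])
        model = pipeline.fit(df)

        # Step 3: evaluate. A real flow would score a held-out test set.
        predictions = model.transform(df)
        evaluator = BinaryClassificationEvaluator(labelCol="label")
        return evaluator.evaluate(predictions)
    finally:
        spark.stop()
```

Calling `train_and_evaluate()` starts a local Spark session, so expect a few seconds of startup time. Each pipeline stage maps to one of the node types listed under Functionalities (Tokenizer, TF-IDF, Logistic Regression).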

Tip: You can run 'node-red' (or 'sudo node-red' if you are using Linux or macOS) from the folder '.node-red/node_modules/node-red-contrib-sparkml' to avoid confusion.

Example Deployment

Contributors Welcome

I am looking for contributors! Feel free to open an issue on GitHub or email me with questions, feature suggestions, or general feedback.