node-red-contrib-sparkml
v1.0.0
NodeRED Extension Pack for SparkML
This is a Node-RED extension pack containing a set of nodes that offer Spark DataFrame, SQL, and machine learning functionality. All nodes have a Python/PySpark core.
It provides a visual interface for drag-and-drop machine learning with Spark.
Features
Functionalities
This project is a work in progress. I plan to add more nodes, ideally as many as there are Transformers and Estimators available in Spark.
Feature Extractors
- [x] TF-IDF
- [ ] Word2Vec
- [x] CountVectorizer
- [ ] FeatureHasher
Feature Transformers
- [x] Tokenizer
- [ ] StopWordsRemover
- [ ] n-gram
- [ ] Binarizer
- [ ] PCA
- [x] StringIndexer
- [ ] IndexToString
- [ ] OneHotEncoderEstimator
- [x] VectorIndexer
- [x] SQLTransformer
- [x] VectorAssembler
Classification Algorithms
- [x] Decision Tree Classifier
- [x] Logistic Regression
- [x] Gradient-boosted Tree Classifier
- [x] Multilayer Perceptron
- [x] Random Forest Classifier
- [ ] Support Vector Machines
- [ ] k-Nearest Neighbour Classifier
Clustering Algorithms
- [ ] K-Means Clustering
- [ ] Latent Dirichlet Allocation (LDA)
Prerequisites
Be sure to have a working installation of Node-RED.
Install Python and the following libraries:
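Before deploying a flow, it can help to confirm that the required libraries are importable by the Python interpreter you expect the nodes to use. The following is a minimal sketch, not part of this package: the helper `missing_modules` and the `REQUIRED` list are hypothetical, and `pyspark` is included only on the assumption that it is needed, since the nodes have a PySpark core. Add any other libraries your setup requires.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# ASSUMPTION: "pyspark" is listed because the nodes have a PySpark core.
# Extend this list with the other libraries your installation needs.
REQUIRED = ["pyspark"]

missing = missing_modules(REQUIRED)
if missing:
    print("Missing Python modules:", ", ".join(missing))
else:
    print("All required Python modules were found.")
```

Run the script with the same `python` executable that is on the PATH of the user running Node-RED; a different interpreter may see a different set of installed packages.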
Install
To install the latest version, use the Menu - Manage palette option and search for node-red-contrib-sparkml, or run the following command in your Node-RED user directory (typically ~/.node-red):
npm i node-red-contrib-sparkml
Usage
These flows create a dataset, train a model, and then evaluate it. Once trained, a model can be used in real scenarios to make predictions.
There is an example flow and a test dataset available in the 'test' folder.
Tip: You can run 'node-red' (or 'sudo node-red' if you are using Linux/macOS) from the folder '.node-red/node_modules/node-red-contrib-sparkml' to avoid confusion.
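For readers who want to see what a create/train/evaluate flow roughly corresponds to in plain PySpark, here is a hedged sketch. It is not the code the nodes run internally. The function name `train_and_evaluate`, the column names, and the choice of stages (Tokenizer, CountVectorizer, StringIndexer, Logistic Regression) are assumptions made for illustration, picked from the nodes marked as available above.

```python
def train_and_evaluate(spark, rows):
    """Train a text classifier on (text, category) rows.

    Returns the fitted pipeline model and its accuracy on a held-out split.
    `spark` is an existing SparkSession; `rows` is a list of 2-tuples.
    """
    # Imported inside the function so this file loads even without PySpark.
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.evaluation import MulticlassClassificationEvaluator
    from pyspark.ml.feature import CountVectorizer, StringIndexer, Tokenizer

    # 1. Create a dataset and split it into training and test parts.
    df = spark.createDataFrame(rows, ["text", "category"])
    train, test = df.randomSplit([0.8, 0.2], seed=42)

    # 2. Chain feature extraction and a classifier into one pipeline.
    pipeline = Pipeline(stages=[
        Tokenizer(inputCol="text", outputCol="words"),
        CountVectorizer(inputCol="words", outputCol="features"),
        # "skip" drops test rows whose category was not seen during training.
        StringIndexer(inputCol="category", outputCol="label",
                      handleInvalid="skip"),
        LogisticRegression(featuresCol="features", labelCol="label",
                           maxIter=20),
    ])

    # 3. Train on the training split, then evaluate on the test split.
    model = pipeline.fit(train)
    predictions = model.transform(test)
    evaluator = MulticlassClassificationEvaluator(
        labelCol="label", predictionCol="prediction", metricName="accuracy")
    return model, evaluator.evaluate(predictions)
```

To try it, create a session with `SparkSession.builder.getOrCreate()` and pass a list such as `[("spark is fast", "tech"), ("the match ended in a draw", "sport"), ...]`. With only a handful of rows the random split can leave the test part empty, so use a reasonably sized dataset. The returned model can then be applied to new data with `model.transform(new_df)` to make predictions.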
Example Deployment
Contributors Welcome
I am looking for contributors! Feel free to open issues directly on GitHub or email me with questions, feature suggestions, or general feedback.