node-red-contrib-sparkml

    This is a Node-RED extension pack containing a set of nodes that offer Spark DataFrame, SQL and machine learning functionality. All nodes have a Python/PySpark core.

    The nodes provide a visual interface for drag-and-drop machine learning with Spark.

    Features

    Drag Drop Spark ML

    Functionalities

    This project is a work in progress. I plan to add more nodes, ideally one for each of the Transformers and Estimators available in Spark.

    Feature Extractors

    • TF-IDF
    • Word2Vec
    • CountVectorizer
    • FeatureHasher

    Feature Transformers

    • Tokenizer
    • StopWordsRemover
    • n-gram
    • Binarizer
    • PCA
    • StringIndexer
    • IndexToString
    • OneHotEncoderEstimator
    • VectorIndexer
    • SQLTransformer
    • VectorAssembler

    Classification Algorithms

    • Decision Tree Classifier
    • Logistic Regression
    • Gradient-boosted Tree Classifier
    • Multilayer Perceptron
    • Random Forest Classifier
    • Support Vector Machines
    • k-Nearest Neighbour Classifier

    Clustering Algorithms

    • K-Means Clustering
    • Latent Dirichlet allocation (LDA)

    Prerequisites

    Be sure to have a working installation of Node-RED.
    Then install the following:

    • Python 3.6.4 or higher, accessible by the command 'python' (on Linux, 'python3')
    • PySpark
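
The two prerequisites above can be checked from Python itself before installing the nodes. The following is a minimal sketch, not part of this package: the function name `check_environment` and its parameters are hypothetical, and it only tests that the interpreter is at least 3.6.4 and that a module named `pyspark` can be located.

```python
import importlib.util
import sys

# Minimum interpreter version stated in the prerequisites.
MIN_PYTHON = (3, 6, 4)


def check_environment(version_info=None, module_name="pyspark"):
    """Report whether the Python version and the PySpark module look usable.

    Both parameters are hypothetical conveniences for this sketch: they let
    the check be exercised with another version tuple or module name.
    """
    if version_info is None:
        version_info = sys.version_info
    version = tuple(version_info[:3])
    return {
        # Tuples compare element by element, so (3, 7, 0) >= (3, 6, 4).
        "python_ok": version >= MIN_PYTHON,
        # find_spec returns None when a top-level module cannot be located.
        "pyspark_found": importlib.util.find_spec(module_name) is not None,
    }
```

Running `print(check_environment())` with the same interpreter that Node-RED will call ('python', or 'python3' on Linux) shows both flags. If `pyspark_found` is false, PySpark is typically installed with `pip install pyspark`.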

    Install

    To install the latest version use the Menu - Manage palette option and search for node-red-contrib-sparkml, or run the following command in your Node-RED user directory (typically ~/.node-red):

    npm i node-red-contrib-sparkml
    

    Usage

    These flows create a dataset, train a model and then evaluate it. Once trained, models can be used in real scenarios to make predictions.

    There is an example flow and a test dataset available in the 'test' folder.
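
For orientation, the create, train and evaluate sequence can also be written directly in PySpark. The sketch below is an assumption about what such a flow amounts to, not the code that runs inside the nodes: the toy rows, the function name `run_sketch`, and the choice of Tokenizer, HashingTF, IDF and Logistic Regression are all illustrative.

```python
# Hypothetical toy rows (id, text, label); this is not the dataset
# shipped in the package's 'test' folder.
ROWS = [
    (0, "spark makes machine learning simple", 1.0),
    (1, "node red wires flows together", 0.0),
    (2, "spark ml pipelines train models", 1.0),
    (3, "drag nodes onto the red canvas", 0.0),
]


def run_sketch(rows=ROWS):
    """Create a DataFrame, train a text classifier and return its AUC."""
    # PySpark is imported here so that this module can be loaded (and ROWS
    # inspected) even on a machine where PySpark is not installed yet.
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.evaluation import BinaryClassificationEvaluator
    from pyspark.ml.feature import IDF, HashingTF, Tokenizer
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]")
        .appName("sparkml-sketch")
        .getOrCreate()
    )
    try:
        # Step 1: create the dataset.
        df = spark.createDataFrame(rows, ["id", "text", "label"])

        # Step 2: train. Each stage corresponds to one kind of node
        # from the lists above (transformer, extractor, classifier).
        pipeline = Pipeline(stages=[
            Tokenizer(inputCol="text", outputCol="words"),
            HashingTF(inputCol="words", outputCol="tf", numFeatures=1024),
            IDF(inputCol="tf", outputCol="features"),
            LogisticRegression(maxIter=10),
        ])
        model = pipeline.fit(df)

        # Step 3: evaluate. For brevity this scores the training rows;
        # a real flow would hold out separate test data.
        predictions = model.transform(df)
        evaluator = BinaryClassificationEvaluator(
            labelCol="label", metricName="areaUnderROC"
        )
        return evaluator.evaluate(predictions)
    finally:
        spark.stop()
```

Calling `run_sketch()` starts a local Spark session, so it needs PySpark and a Java runtime. The fitted `model` is the object that would later be reused to make predictions on new rows.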

    Tip: You can run 'node-red' (or 'sudo node-red' if you are using Linux/macOS) from the folder '.node-red/node_modules/node-red-contrib-sparkml' to avoid confusion.

    Example Deployment

    Contributors Welcome

    I am looking for contributors! Feel free to open issues directly on GitHub, or email me with questions, feature suggestions or general feedback.

    Version: 1.0.0
    License: ISC
    Unpacked size: 1.44 MB
    Total files: 46
    Collaborators: alivcor