
js-spark v0.5.1

Distributed calculation / data processing system. Run computations/jobs on 1000+ cores.

What is JS-Spark

Distributed real-time computation/job/work queue using JavaScript. A JavaScript reimagining of the fabulous Apache Spark and Storm projects.

If you know underscore.js or lodash.js, you can think of JS-Spark as a distributed version of them.

If you know Distributed-RPC systems like Storm, you will feel right at home.

If you have ever worked with a distributed work queue such as Celery, you will find JS-Spark easy to use.

[Screenshot: the main page computing queue]

Why

There are no JS tools that can offload your processing to 1000+ CPUs. Furthermore, existing tools in other languages, such as SETI@home or Gearman, require a time-expensive server setup and later setting up and supervising client machines.

JS-Spark aims to do better: your clients just need to click a URL, and the server side is a one-line installation (less than 5 minutes).

Hadoop is quite slow and requires maintaining a cluster; we can do better. Imagine that there is no need to set up expensive cluster/cloud solutions. Use web browsers! Easily scale to multiple clients. Clients do not need to install anything like Java or other plugins.

Set up in a matter of minutes and you are good to go.

Possibilities are endless:

No need to set up an expensive cluster: the setup takes 5 minutes and you are good to go. You can do it on one machine, even on a Raspberry Pi.

  • Use as an ML tool: process huge streams of data in real time... while all clients keep browsing their favorite websites.

  • Use for big data analytics: connect to Hadoop HDFS and process even terabytes of data.

  • Use to safely transfer huge amounts of data to remote computers.

  • Use as a CDN... Today most websites run slower the more clients use them, but using JS-Spark you can totally reverse this trend: build websites that run FASTER the more people use them.

  • Synchronize data between multiple smartphones... even in Africa.

  • No expensive cluster setup required!

  • Free to use.

How (Getting started with npm)

To add a distributed job queue to any Node app, simply:

    npm i --save js-spark

See Usage with npm below.
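For the impatient, here is a minimal quick start (a sketch only: it assumes the API shown under Usage with npm below, and that a server and at least one connected client are already running):

    // minimal job: double a few numbers on the connected clients
    var core = require('js-spark')({workers: 2});
    var jsSpark = core.jsSpark;

    jsSpark([1, 2, 3])
        // runs on the clients
        .map(function double(n) {
            return n * 2;
        })
        .run()
        .then(function (result) {
            // back on the server
            console.log(result); // e.g. [2, 4, 6]
        });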

Examples of running multicore jobs in JS:

Simple example: Node multicore jobs

example-js-spark-usage

    git clone git@github.com:syzer/example-js-spark-usage.git && cd $_
    npm install

Game of life example

distributed-game-of-life

    git clone https://github.com/syzer/distributed-game-of-life.git && cd $_
    npm install

Example: NLP

This example shows how to use one of the Natural Language Processing tools, called an N-gram, in a distributed manner using jsSpark:

Distributed-N-Gram

If you would like to know more about N-grams, please read:

http://en.wikipedia.org/wiki/N-gram
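The repository above has the full example; below is only a minimal sketch of what a distributed bigram (2-gram) job might look like with the map/reduce API described under Usage with npm (the splitting logic here is illustrative, not taken from Distributed-N-Gram):

    var core = require('js-spark')({workers: 4});
    var jsSpark = core.jsSpark;

    jsSpark(['the quick brown fox', 'jumps over the lazy dog'])
        // on the clients: split each sentence into word bigrams
        .map(function toBigrams(sentence) {
            var words = sentence.split(' ');
            var bigrams = [];
            for (var i = 0; i < words.length - 1; i++) {
                bigrams.push(words[i] + ' ' + words[i + 1]);
            }
            return bigrams;
        })
        // merge the per-sentence bigram lists into one list
        .reduce(function concatAll(all, bigrams) {
            return all.concat(bigrams);
        })
        .run()
        .then(function (allBigrams) {
            console.log(allBigrams);
        });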

How (Getting started)

Prerequisites: install Node.js, then install grunt and bower:

    sudo npm install -g bower
    sudo npm install -g grunt

Install js-spark:

    npm i --save js-spark
    # or clone from source:
    git clone git@github.com:syzer/JS-Spark.git && cd $_
    npm install

Then run:

    node index &
    node client

Or:

    npm start

After that you can watch the clients do all the calculations and all the heavy lifting.

Usage with npm

    var core = require('js-spark')({workers: 8});
    var jsSpark = core.jsSpark;

    jsSpark([20, 30, 40, 50])
        // this is executed on the client side
        .map(function addOne(num) {
            return num + 1;
        })
        .reduce(function sumUp(sum, num) {
            return sum + num;
        })
        .thru(function addString(num) {
            return 'It was a number but I will convert it to ' + num;
        })
        .run()
        .then(function (data) {
            // this is executed back on the server
            console.log(data);
        });
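As the comments indicate, the callbacks passed to .map(), .reduce() and .thru() are shipped to and executed on the connected clients, while .run() dispatches the job and returns a promise whose .then() handler runs back on the server once the results come in.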

Usage (Examples)

Client-side heavy CPU computation (MapReduce)

    var task = jsSpark([20, 30, 40, 50])
        // this is executed on the client side
        .map(function addOne(num) {
            return num + 1;
        })
        .reduce(function sumUp(sum, num) {
            return sum + num;
        })
        .run();

Distributed version of lodash/underscore

    jsSpark(_.range(10))
        // https://lodash.com/docs#sortBy
        .add('sortBy', function _sortBy(el) {
            return Math.sin(el);
        })
        .map(function multiplyBy2(el) {
            return el * 2;
        })
        .filter(function remove5and10(el) {
            return el % 5 !== 0;
        })
        // sum of [2, 4, 6, 8, 12, 14, 16, 18] => 80
        .reduce(function sumUp(sum, el) {
            return sum + el;
        })
        .run();
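For comparison, here is the same pipeline in plain, single-machine lodash (minus the custom .add() registration, since sortBy is built in); this sketch only shows how closely the distributed API mirrors the original:

    var _ = require('lodash');

    var result = _.chain(_.range(10))
        .sortBy(function (el) { return Math.sin(el); })
        .map(function (el) { return el * 2; })
        .filter(function (el) { return el % 5 !== 0; })
        .reduce(function (sum, el) { return sum + el; })
        .value();

    console.log(result); // 80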

Multiple retries and client elections

If you run calculations on unknown clients, it is better to recalculate the same tasks on different clients:

    jsSpark(_.range(10))
        .reduce(function sumUp(sum, num) {
            return sum + num;
        })
        // how many times to repeat the calculation
        .run({times: 6})
        .then(function whenClientsFinished(data) {
            // may also get the 2 most relevant answers
            console.log('Most clients believe that:');
            console.log('Total sum of numbers from 1 to 10 is:', data);
        })
        .catch(function whenClientsArgue(reason) {
            console.log('Most clients could not agree: ' + reason.toString());
        });
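Judging from this example, {times: 6} re-runs the same job on six different clients, and the returned promise resolves with the answer most clients agree on; if the clients cannot agree, it rejects and the .catch() handler fires.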

Combined usage with server-side processing

    var task3 = task
        .then(function serverSideComputingOfData(data) {
            var basesNumber = data + 21;
            // All your 101 base are belong to us
            console.log('All your ' + basesNumber + ' base are belong to us');
            return basesNumber;
        })
        .catch(function (reason) {
            console.log('Task could not compute: ' + reason.toString());
        });
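Because .run() returns an ordinary promise, distributed tasks compose naturally: any amount of server-side post-processing can be chained onto a task with further .then()/.catch() handlers, as shown above.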

More references

This project reimplements some nice ideas from the world of big data, so the Apache Spark and Storm projects mentioned above are good starting points if you want to dive deeper into the topic.

Running with UI

    git clone git@github.com:syzer/JS-Spark.git && cd $_
    npm install
    grunt build
    grunt serve

To spawn more lightweight (headless) clients:

    node client

Required to run UI

  • MongoDB default connection parameters: mongodb://localhost/jssparkui-dev, user: 'js-spark', pass: 'js-spark1'.

  • Install mongo, make sure mongod (the mongo service) is running, then run the mongo shell:

        mongo
        use jssparkui-dev
        db.createUser({
          user: "js-spark",
          pwd: "js-spark1",
          roles: [
            { role: "readWrite", db: "jssparkui-dev" }
          ]
        })

  • Older MongoDB engines can use db.addUser() with the same API.

  • To run without the UI, the DB setup is not required!

  • On the first run you need to seed the DB: change the option seedDB: false => seedDB: true in ./private/srv/server/config/environment/development.js.
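To sanity-check the credentials before starting the UI, something like the following should work (a sketch; it assumes the classic MongoDB Node driver 2.x callback API and the default parameters listed above):

    var MongoClient = require('mongodb').MongoClient;

    // connection string built from the default parameters above
    var url = 'mongodb://js-spark:js-spark1@localhost/jssparkui-dev';

    MongoClient.connect(url, function (err, db) {
        if (err) {
            console.error('Could not connect:', err.message);
            return;
        }
        console.log('Connected to', db.databaseName);
        db.close();
    });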

Tests

    npm test

TODO

  • [ ] remove
  • [ ] service/file -> move to a separate module
  • [ ] di -> separate module
  • [ ] bower package for the js-spark client
  • [ ] config -> merge different config files
  • [ ] server/auth -> do we need that?
  • [ ] server/api/jobs -> separate module?
  • [ ] split the UI
  • [X] more examples
  • [X] example with CLI usage (not daemon)
  • [X] example using thru
  • [ ] .add() might be broken... maybe fix or remove