npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2024 – Pkg Stats / Ryan Hefner

crowdprocess

v0.5.12

Published

Distributed map in CrowdProcess

Downloads

7

Readme

#CrowdProcess

CrowdProcess is a browser-powered distributed computing platform.

This is the easiest entry-level module to use CrowdProcess in node.js (and the browser).

If you're not sure about how CrowdProcess works, you should go through the guide, it should take you less than 5 minutes.

##Install

npm install crowdprocess

##Examples speak louder than words

var credentials = require('./credentials');
var CrowdProcess = require('..')(credentials);

function Run (d) {
  return d*2;
}

var data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

CrowdProcess(data, Run, function (results) {
  console.log('results:', results);
});

Notice any resemblance;

This pretty much covers it. There are more examples using streams.

##More detailed use

Require and Authenticate

var credentials = require('./credentials');
var CrowdProcess = require('..')(credentials);

The credentials object should be something like

{
  "email": "[email protected]",
  "password": "secret"
}

or

{
  "token": "eaa35d67-2aef-4b14-a50e-9d6c13f5012e"
}

If you don't have CrowdProcess account yet, you should get one. It takes less than 20 seconds ;)

CrowdProcess([data, ] Run[, cb]) is a Duplex stream

The CrowdProcess object you get after requiring and authenticating is a "job builder", meaning you can use it to create CrowdProcess Jobs. Just pass it [at least] a Run function and you'll get a Duplex stream:

var job = CrowdProcess(Run);
console.log(job instanceof Duplex) // true

This means you can pipe data do it, and results from it!

Run is the only mandatory parameter.

The input data

Input data must be either an Array of objects or a Readable objectMode stream.

If you choose to use an Array, you must pass it as the first parameter:

var dataArray = [1, 2, 3];
CrowdProcess(dataArray, Run, onResults);

If you choose to use a Readable stream, then you can either pass it as the first parameter:

var dataStream = new Readable();
CrowdProcess(dataStream, Run, onResults);

Or pipe it:

var dataStream = new Readable();
dataStream.pipe(CrowdProcess(Run, onResults));

The program

A program is Javascript code that will run on Web Workers.

In this module, it may be specified as a function or a string. You'll want to specify it as a string if it has external dependencies (like using functions declared outside the Run function), so you can send all functions.

That comes with some restrictions. It must be valid Javascript code and must use only features available in the Worker scope (so there is no DOM, window, document and parent objects). We also took away some dangerous objects like XMLHTTPRequest, WebSocket, localStorage, Worker, and others, for the obvious possible security/abuse issues.

In addition, programs submitted to CrowdProcess must have a Run function. You must declare a function named Run, so that we can execute that in the web workers. Your input data elements will be passed as arguments (each one in a different worker and execution context).

You may have more functions, but there must be at least one Run function that uses whatever you want to use, like this:

function avg (X) { // will be used inside Run
  var N = X.length;
  var sum = 0;
  for (var i = 0; i < N; i++) {
    sum += X[i];
  }
  return sum / N;
}

function Run (X) { // calculate the standard deviation
  var N = X.length;
  var sum = 0;
  var avgX = avg(X); // uses the avg function here
  for (var i = 0; i < N; i++) {
    sum += Math.pow(X[i] - avgX, 2);
  }
  return Math.sqrt(sum/(N-1));
}

[Browserify](http://browserify.org/)ed code will run on CrowdProcess, as long as you use the standalone option set to Run, like --standalone Run. That means you can use lot's of node modules in CrowdProcess!

Getting the results

If you want the results to be buffered, just pass a callback:

var dataArray = [1, 2, 3];
CrowdProcess(dataArray, Run, function (results) {
  console.log('results: ', results);
});

If you don't want them buffered, don't pass the callback:

var dataArray = [1, 2, 3];

var job = CrowdProcess(dataArray, Run);

job.on('data', function (result) {
  console.log('got a result: ', result);
});

job.on('end', function () {
  console.log('got all results!');
});

Listening for errors

If your job yields errors, you should catch them so you can make it better next time:

var dataArray = [1, 2, 3];

var job = CrowdProcess(dataArray, Run);

job.on('error', function (err) {
  console.error('oh no:', err);
});

If you don't listen for errors and they occur, an uncaught exception will be thrown.

Mock mode

Mock mode allows you to use this module without actually creating a Job, sending input data and receiving results to and from CrowdProcess. Everything will happen locally. Whatever you pipe into the stream will be passed to your Run as an argument, and the output will be piped back to you.

It's good if you're not sure you're using this module correctly or just don't want to be computing all the time during development.

To use it, create the stream with an options object, and set mock to true as one of the options, like so:

var dataArray = [1, 2, 3];

function onResults (results) {
  console.log('yay got some results:', results);
}

var job = CrowdProcess({
  data: dataArray,
  program: Run,
  onResults: onResults,
  mock: true
});

It also works with the duplex stream:

var job = CrowdProcess({
  program: Run,
  mock: true
});

dataStream.pipe(job).pipe(resultsStream);

Using a previously created job

When you create a new job, the stream will emit a created event, where you can get the job's id, like this:


var job = CrowdProcess(function Run (d) { return d; });

var myJobsId;

job.on('created', function (id) {
  myJobsId = id; // save it for later
  console.log('a job was created with id:', id);
});

You can store this id and use it later to pipe some more data in and get some more results out, without creating a new job, by passing the id in the id option, like this:

var newJobStream = CrowdProcess({
  id: myJobsId // myJobsId was declared in the previous code block
});

And then use it as you would if it was a new job.

Pipe the results of one job to another job

Yes, you can pipe the results of one job to another, as with any Duplex stream. It's useful to build a job pipeline (images.pipe(resize).pipe(applyBlackAndWhiteFilter).pipe(results)) and if you have a group stream in the middle, you can pipe a map job to a reduce job that will take two or more results! This test serves as an example.

Caveats

  1. The Duplex stream exposed accepts and delivers objectMode streams, so you can't, for instance, pipe directly it to process.stdout. You need to pass them through a stringifier like JSONStream or newline-json

  2. You cannot use null as input data. All streams used in this implementation are Streams 2 and if you have a null inside your input data array or stream, somewhere along the implementation there will be a Readable.push(null), which closes the stream.

##Tests and Examples

If you run npm test, you'll be testing the module in mock mode. If you want to test this module's functionality with the live CrowdProcess platform, you have to create a credentials.json file in the root of this repository with your security credentials, that may be either email and password or token. They're the same object described above, and then run node test/all.js to test the general functionality, and other files you'll find in the test directory.

Don't worry, your credentials.json will be .gitignored.

##Under the hood

This module is basically a wrapper around two modules that deal with job creation and data handling in CrowdProcess. If you want to learn more check the documentation: