jupyter-ijavascript-utils
v1.50.0
Utilities for working with iJavaScript - a Jupyter Kernel
Overview
This library helps people who understand JavaScript use Jupyter with the iJavaScript kernel to load and explore data, and ultimately tell compelling stories with visuals.
Notebooks are a way to explore and experiment, in addition to writing and explaining ideas.
All of the tutorials provided here, including this one, were written as notebooks and simply exported.
Documentation
See documentation at: https://jupyter-ijavascript-utils.onrender.com/
What is this?
The jupyter-ijavascript-utils library is simply a collection of utility methods for Node and JavaScript Developers interested in Data Science.
- Load
- Aggregate
- Manipulate
- Format / Visualize
- Refine and Explore
- Export
Currently, we assume you'll be using nriesco's iJavaScript Jupyter Kernel and Jupyter Lab (the latest interface for Jupyter). Installation is fairly simple and is covered in the How to Use guide (although suggestions are welcome).
This is not intended to be the only way to accomplish many of these tasks, and alternatives are mentioned in the documentation where available.
What's New
- 1.50 - added in color/colour - and addressed issue #65
- 1.49 - Additional documentation for issues #18, #63, #62, #61 (like table.generateObjectCollection)
- 1.48 - Correct table rendering html if filter was used (#46)
- 1.47 -
  - Allow conversion from a collection of objects to arrays and back with the object module (objectCollectionFromArray, objectCollectionToArray)
  - Easier iterating over values and peeking at arrays with PeekableArrayIterator
  - Support for delayed and asynchronous chaining of functions (array.delayedFn, chainFunction, etc)
- 1.46 - Make it easier to extract data from "hard-spaced arrays" - (array.multiLineSubstr, array.multiStepReduce)
- 1.45 - more ways to understand the data - aggregate.coalesce(), convert properties to arrow/dot notation / reverse it : object.flatten() / object.expand() and Object.isObject()
- 1.43 - esm module fix since still not supported yet in ijavascript
- 1.41 - object.propertyInherit - to simplify inheriting values from one record to the next
- 1.40 - array.extract and array.applyArrayValue to allow for extracting values from arrays, transforming them on a separate process and applying them back
- 1.39 - format.exportWords - to identify distinct words in strings using unicode character properties
- 1.38 - object.extract / object.apply
- 1.37 - format.replaceString as a convenience for replacing only a single string.
- 1.36 - replaceStrings to allow for replacement dictionaries and tuplets
- 1.35 - object.extractObjectProperties / object.extractObjectProperty - to do horizontal transposes on objects
- 1.34 - format.mapArrayDomain and add notes in to random on using non-uniform distributions.
- 1.33 - Object.augmentInherit and Object.union
- 1.32 - Array.indexify to identify sections within a 1d array into a hierarchy.
- 1.31 - harden Array.transpose for arrays with nulls, and Table.generateTSV
- 1.30 - add Format.wordWrap and Format.lineCount
- 1.29 - Updated TableGenerator.format method
- 1.28 - Sticky table headers for table.render
- 1.27 - Multi-Dimensional arange (initialize array along multiple dimensions)
- 1.26 - Support for file.writeFile and file.writeJSON to append
- 1.25 - Additional chain methods and documentation
- 1.24 - format.stripHtmlTags, TableGenerator.offset, chain.chainFlatMap, chain.chainFilter
- 1.23 - add format.parseNumber and TableGenerator.styleColumn, align group.separateByFields to vega-lite fold transform
- 1.22 - make chain iJavaScript aware, but still able to work outside of Jupyter
- 1.21 - include chain - simple monoid
- 1.20 - fix vega dependency
- 1.19 - add in describe and hashMap modules, along with format.limitLines
- 1.18 - tie to vega-datasets avoiding esmodules until ijavascript can support them
- 1.17 - provide object.propertyValueSample - as a way to list 'non-empty' property values
- 1.16 - provide file.matchFiles - as a way to find files or directories
- 1.15 - provide object.formatProperties - as a way to quickly convert to string, number, etc.
- 1.14 - provide format.compactNumber and object.mapProperties
- 1.13 - provide utils.random() to generate random values
- 1.12 - provide utils.table(...) instead of new utils.TableGenerator(...)
- 1.11 - provide topValues (like top 5, bottom 3)
- 1.10 - provide percentile (like 50th percentile) aggregates
- 1.9 - allow transposing results on TableGenerator.
- 1.8 - add in What can I Do tutorial, and object.join methods
- 1.7 - revamp of animation method to htmlScript
- 1.6 - add SVG support for rendering SVGs and animations
- 1.5 - Add Latex support for rendering Math formulas and PlantUML support for Diagrams
- 1.4 - Add in vega embed, vega mimetypes and example choropleth tutorial
- 1.3 - Add Leaflet for Maps, allow Vega to use explicit specs (so Examples can be copied and pasted), and add in htmlScripts
Running Your Own Notebooks
Many of the tutorials are simply exports of Jupyter notebooks (*.ipynb), found under the docResources/notebooks folder.
(Note that if you wish to require additional packages, like jupyter-ijavascript-utils, simply create a package in the folder where you will run the jupyter lab command, such as the sample one under docResources/notebooks/package.json)
ESM Modules + D3
Note that we strongly recommend using this with other modules like D3, which now only support ESM modules.
There is a known issue #210 in the iJavaScript kernel.
So if you try to import libraries like d3, you may get errors like this:
$ node -e "import defaultExport from './test.mjs'"
[eval]:1
import defaultExport from './test.mjs'
^^^^^^
SyntaxError: Cannot use import statement outside a module
at new Script (vm.js:88:7)
at createScript (vm.js:263:10)
at Object.runInThisContext (vm.js:311:10)
at Object.<anonymous> ([eval]-wrapper:10:26)
at Module._compile (internal/modules/cjs/loader.js:1151:30)
at evalScript (internal/process/execution.js:94:25)
at internal/main/eval_string.js:23:3
Use esm-hook as a workaround for now.
require("esm-hook"); // must come before requiring esm modules
d3 = require('d3'); // import esm modules
More can be found in the documentation for issue #210.
Google Colab
You can very easily use iJavaScript and jupyter-ijavascript-utils within Google Colab.
See the excellent writeup from Alexey Okhimenko
And the shortlink to run your own: https://tinyurl.com/tf-js-colab
Steps Overview:
- Clone the Google Colab Document
- Run the first cell
  - If you notice an error unrecognized runtime "JavaScript", that's expected
- Refresh your browser (see Alexey's writeup for more)
- Run a cell to install modules, ex:
sh('npm install jupyter-ijavascript-utils')
- Then continue to run your code past this point.
Running on Binder
mybinder.org is a great place to run a Jupyter Notebook online.
It means you can run Jupyter Notebooks with additional kernels without having to install anything, and can try right in your browser.
For Example
Get Sample Data
(See the DataSets module for more on sample datasets)
(See the ijs module for helpers to use async/await)
//-- get the data
utils.ijs.await(async ($$, console) => {
barley = await utils.datasets.fetch('barley.json');
//-- continue to use the barley dataset, or wait until the next cell
});
Group By
Then we can group using a process similar to d3js
(See the Group Module for more on grouping)
//-- group the barley records by variety, then by site
barleyByVarietySite = utils.group.by(barley, 'variety', 'site')
// SourceMap(10) [Map] {
// 'Manchuria' => SourceMap(6) [Map] {
// 'University Farm' => [ [Object], [Object] ],
// 'Waseca' => [ [Object], [Object] ],
// 'Morris' => [ [Object], [Object] ],
// 'Crookston' => [ [Object], [Object] ],
// 'Grand Rapids' => [ [Object], [Object] ],
// 'Duluth' => [ [Object], [Object] ],
// source: 'site'
// },
// 'Glabron' => SourceMap(6) [Map] {
// 'University Farm' => [ [Object], [Object] ],
// 'Waseca' => [ [Object], [Object] ],
// 'Morris' => [ [Object], [Object] ],
// 'Crookston' => [ [Object], [Object] ],
// 'Grand Rapids' => [ [Object], [Object] ],
// 'Duluth' => [ [Object], [Object] ],
// source: 'site'
// },
// ...
// }
//-- now group by variety and year
barleyByVarietyYear = utils.group.by(barley, 'variety', 'year')
// SourceMap(10) [Map] {
// 'Manchuria' => SourceMap(2) [Map] {
// 1931 => [ [Object], [Object], [Object], [Object], [Object], [Object] ],
// 1932 => [ [Object], [Object], [Object], [Object], [Object], [Object] ],
// source: 'year'
// },
// 'Glabron' => SourceMap(2) [Map] {
// 1931 => [ [Object], [Object], [Object], [Object], [Object], [Object] ],
// 1932 => [ [Object], [Object], [Object], [Object], [Object], [Object] ],
// source: 'year'
// },
// ...
// }
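To make the nested output above easier to read, here is a minimal plain-JavaScript sketch of the same idea: group by the first field, then group each bucket by the next field. This is not the library's implementation. The groupBy helper is hypothetical, it returns plain Map objects rather than the SourceMap shown above, and the four sample records are a small illustrative subset.

```javascript
// Hypothetical helper (not part of jupyter-ijavascript-utils):
// groups records into nested Maps, one level per field name.
function groupBy(records, ...fields) {
  if (fields.length === 0) return records;
  const [field, ...rest] = fields;
  const groups = new Map();
  for (const record of records) {
    const key = record[field];
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(record);
  }
  // Group each bucket again by the remaining fields.
  for (const [key, members] of groups) {
    groups.set(key, groupBy(members, ...rest));
  }
  return groups;
}

// Illustrative sample only; the real dataset comes from utils.datasets.fetch.
const sampleRecords = [
  { variety: 'Manchuria', site: 'University Farm', year: 1931, yield: 27 },
  { variety: 'Manchuria', site: 'University Farm', year: 1932, yield: 26.9 },
  { variety: 'Manchuria', site: 'Waseca', year: 1931, yield: 48.86667 },
  { variety: 'Manchuria', site: 'Waseca', year: 1932, yield: 33.46667 }
];

const byVarietySite = groupBy(sampleRecords, 'variety', 'site');
console.log(byVarietySite.get('Manchuria').get('Waseca').length); // 2
```

Each extra field name adds one more level of nesting, which is why grouping by 'variety' and 'year' instead gives two year buckets per variety.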
Aggregating
(See the Aggregation module for more)
utils.group.by(barley, 'variety', 'site')
.reduce((collection) => ({
years: utils.aggregate.extent(collection, 'year'),
numRecords: utils.aggregate.length(collection),
yield_sum: utils.aggregate.sum(collection, 'yield'),
yield_min: utils.aggregate.min(collection, 'yield'),
yield_max: utils.aggregate.max(collection, 'yield'),
yield_diff: utils.aggregate.difference(collection, 'yield')
}));
returns
[
{
variety: 'Manchuria',
site: 'University Farm',
years: { min: 1931, max: 1932 },
numRecords: 2,
yield_sum: 53.9,
yield_min: 26.9,
yield_max: 27,
yield_diff: 0.100
},
{
variety: 'Manchuria',
site: 'Waseca',
years: { min: 1931, max: 1932 },
numRecords: 2,
yield_sum: 82.33333,
yield_min: 33.46667,
yield_max: 48.86667,
yield_diff: 15.39999
},
...
];
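As a rough illustration of what each aggregate in the reducer computes, the sketch below summarizes one group in plain JavaScript. The summarize helper is hypothetical and is not the library's API. The two records are the Manchuria / University Farm yields from the output above, with the year assignment assumed for illustration.

```javascript
// Hypothetical helper (not part of jupyter-ijavascript-utils):
// summarizes one numeric field across a collection of records.
function summarize(collection, field) {
  const values = collection.map((record) => record[field]);
  const min = Math.min(...values);
  const max = Math.max(...values);
  return {
    numRecords: values.length,
    sum: values.reduce((total, value) => total + value, 0),
    min,
    max,
    diff: max - min
  };
}

// One group: a single variety at a single site across two years.
const universityFarm = [
  { year: 1931, yield: 27 },
  { year: 1932, yield: 26.9 }
];

const yieldSummary = summarize(universityFarm, 'yield');
console.log(yieldSummary.numRecords, yieldSummary.min, yieldSummary.max); // 2 26.9 27
```

Sums and differences are floating-point values, so results such as yield_diff may carry small rounding error before formatting.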
Render as a Table
(See the TableGenerator class for more)
new utils.TableGenerator(barley)
.sort('-yield')
.formatter({ year: (v) => `${v}`})
.limit(10)
.render()
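Before rendering, the chain above appears to describe three data steps: sort descending by yield, format the year column, and keep the first rows. A minimal plain-JavaScript sketch of those steps follows. The topRows and formatRows helpers are hypothetical and are not how TableGenerator is implemented, and the rows are an illustrative subset.

```javascript
// Hypothetical helpers (not part of jupyter-ijavascript-utils).

// Sort a copy of the rows descending by a numeric field and keep the first `limit`.
function topRows(rows, field, limit) {
  return [...rows].sort((a, b) => b[field] - a[field]).slice(0, limit);
}

// Apply per-column formatter functions, leaving other columns untouched.
function formatRows(rows, formatters) {
  return rows.map((row) => {
    const formatted = { ...row };
    for (const [column, format] of Object.entries(formatters)) {
      if (column in formatted) formatted[column] = format(formatted[column]);
    }
    return formatted;
  });
}

// Illustrative sample only.
const tableRows = [
  { site: 'University Farm', year: 1931, yield: 27 },
  { site: 'University Farm', year: 1932, yield: 26.9 },
  { site: 'Waseca', year: 1931, yield: 48.86667 },
  { site: 'Waseca', year: 1932, yield: 33.46667 }
];

const preview = formatRows(topRows(tableRows, 'yield', 2), { year: (v) => `${v}` });
console.log(preview.map((row) => row.yield)); // [ 48.86667, 33.46667 ]
```

Sorting a copy keeps the original array intact, which matters in a notebook where later cells reuse the same variable.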
Show a Graph
(See the vegaLite1 tutorial or the Vega module for more)
//-- make a point chart
utils.vega.svg((vl) => vl.markPoint()
//-- data as an array of items
.data(barley)
.title('Barley Yield by Site')
.width(600)
.encode(
//-- x position is Nominal - not a number
vl.x().fieldN('site'),
//-- y position is Quantitative - a number
vl.y().fieldQ('yield'),
//-- Color is based on the year field
vl.color().fieldN('year')
)
)
Turning it into a bar chart, to understand the proportions of varieties grown, is simply a matter of changing the mark type:
// change from markPoint to markBar
utils.vega.svg((vl) => vl.markBar()
//-- data as an array of items
.data(barley)
.title('Barley Yield by Site Variety')
.width(600)
.encode(
//-- x position is Nominal - not a number
vl.x().fieldN('site').title('Site'),
//-- y position is Quantitative - a number
vl.y().fieldQ('yield').title('Yield'),
//-- Color is based on the variety field
vl.color().fieldN('variety').title('Variety')
)
)
There are further options to zoom, pan, or set up interactive sliders.
Or try your hand at the Vega Lite Examples
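For readers comparing with the Vega-Lite examples, the fluent calls above correspond roughly to a Vega-Lite JSON specification. The sketch below builds such a specification as a plain object. The buildSpec function is hypothetical, and the exact specification that utils.vega.svg generates may differ (the schema version here is an assumption).

```javascript
// Hypothetical helper: builds a Vega-Lite specification as a plain object.
// This approximates the fluent examples above; it is not the library's output.
function buildSpec({ markType, colorField, title, values }) {
  return {
    $schema: 'https://vega.github.io/schema/vega-lite/v5.json', // assumed version
    title,
    width: 600,
    mark: markType,
    data: { values },
    encoding: {
      // x position is nominal (a category, not a number)
      x: { field: 'site', type: 'nominal' },
      // y position is quantitative (a number)
      y: { field: 'yield', type: 'quantitative' },
      color: { field: colorField, type: 'nominal' }
    }
  };
}

// Illustrative sample only.
const chartValues = [
  { site: 'Waseca', variety: 'Manchuria', year: 1931, yield: 48.86667 }
];

const pointSpec = buildSpec({
  markType: 'point', colorField: 'year', title: 'Barley Yield by Site', values: chartValues
});
const barSpec = buildSpec({
  markType: 'bar', colorField: 'variety', title: 'Barley Yield by Site Variety', values: chartValues
});
console.log(pointSpec.mark, barSpec.mark); // point bar
```

Only the mark and the color field change between the two charts, which mirrors the change from markPoint to markBar.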
Create a Data Driven Map
(See the Data Driven Maps Tutorial for more)
Render Maps
(See the Leaflet module for more)
Generate Text Driven Diagrams
(See the PlantUML module for more)
Render Other Libraries
(See the Html Script Tutorial for more)
utils.ijs.htmlScript({
scripts: ['https://cdnjs.cloudflare.com/ajax/libs/qrcodejs/1.0.0/qrcode.min.js'],
height: '100%',
onReady: ({rootEl}) => {
new QRCode(rootEl, "https://jupyter-ijavascript-utils.onrender.com/");
}
});
Create Animations
(See the Noise Visualization Tutorial or SVG Module for more)
License
See License (MIT License).
Issues
If you have any questions, please file them as issues first before contacting the authors.
Troubleshooting
iJavaScript does not currently support ESM modules, due to open issues with Node.js. See iJavaScript issue #210 for more.
Contributions
Your contributions are welcome, either by reporting issues on GitHub or by submitting pull requests with patches.
If you want to implement additional features to be added to our master branch (which may or may not be merged), please first check the currently open issues with milestones and confirm whether the feature is on the roadmap.
If your implementation adds a brand-new feature or fixes bugs not covered by the library's test cases, please include additional tests in the src/__tests__/ directory.