cz-pipcook-boa-no-leak
v2.2.1
Use Python modules seamlessly in Node.js
Introduction to Boa
Boa is the Python bridge layer in Pipcook. It lets you call Python functions seamlessly from Node.js, making any Python module available to Node.js developers at a low cost to learn and use.
Quick Start
Install Boa from npm:
$ npm install @pipcook/boa
Here is a quick look at how to call a Python function:
const boa = require('@pipcook/boa');
const os = boa.import('os');
console.log(os.getpid()); // prints the pid from python.
// using keyword arguments, namely `kwargs`
os.makedirs('..', boa.kwargs({
  mode: 0o777,
  exist_ok: true, // '..' already exists, so do not raise FileExistsError
}));
// using built-in functions
const { range, len } = boa.builtins();
const list = range(0, 10); // create a range array
console.log(len(list)); // 10
console.log(list[2]); // 2
Install Python Package
By default, Boa installs a conda virtual environment under the path of the Boa package. To make it easier to install Python libraries, you can run:
$ ./node_modules/.bin/bip install <package-name>
`bip` is an alias of pip that points to the correct Python environment.
API References
Bridging two languages (and their ecosystems) takes a lot of work, even though this package only works in one direction, exposing Python to JavaScript. The hardest part for developers is understanding how the two languages and ecosystems correspond. A good design principle therefore reduces the learning cost.
boa
`require('@pipcook/boa')` returns the root object, which is your entry point to all Python functions. It provides these methods:
.builtins()
Gets Python's built-in functions, for example:
const { len, range } = boa.builtins();
len([1, 2, 3]); // 3
len(range(0, 10)); // 10
.import(mod)
Imports a Python module into your current environment. The module can be:
- a system module such as `os`, `string` and `re`.
- a third-party module such as `numpy` and `requests`, installed via pip.
To call the function, pass a `mod` argument naming the module that you want to import.
const os = boa.import('os');
const str = boa.import('string');
const numpy = boa.import('numpy');
This returns an instance of `PythonObjectWrapper`, or a JavaScript primitive value in some special cases.
.kwargs(map)
Creates Python keyword arguments. Python lets you pass arguments by name:
os.makedirs('./a-dir-to-create', mode=0o777)
Correspondingly, this function represents keyword arguments in JavaScript:
const os = boa.import('os');
os.makedirs('./a-dir-to-create', boa.kwargs({ mode: 0o777 }));
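JavaScript has no keyword arguments, so a bridge needs some way to tell a plain object argument apart from a set of named arguments. The following is a hypothetical sketch of that idea in plain JavaScript, not Boa's actual implementation; `markKwargs`, `splitArguments` and `KWARGS_TAG` are made-up names used only for illustration.

```javascript
// Hypothetical sketch: tagging an object so a bridge can treat it as keyword arguments.
const KWARGS_TAG = Symbol('kwargs');

// Wraps a plain object and tags it as keyword arguments.
function markKwargs(map) {
  return { [KWARGS_TAG]: true, values: { ...map } };
}

// Splits a JavaScript argument list into positional and keyword parts.
function splitArguments(args) {
  const positional = [];
  let keywords = {};
  for (const arg of args) {
    if (arg !== null && typeof arg === 'object' && arg[KWARGS_TAG] === true) {
      keywords = { ...keywords, ...arg.values };
    } else {
      positional.push(arg);
    }
  }
  return { positional, keywords };
}

const call = splitArguments(['./a-dir', markKwargs({ mode: 0o777, exist_ok: true })]);
console.log(call.positional); // [ './a-dir' ]
console.log(call.keywords);   // { mode: 511, exist_ok: true }
```

Using a symbol as the tag means an ordinary object that happens to have a `values` key is still treated as a positional argument.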
.with(ctx, fn)
Equivalent to the with-statement in Python. The first argument, `ctx`, is an object that supports the context management protocol (that is, it has `__enter__()` and `__exit__()` methods). The second argument, `fn`, is the execution block. A simple example:
const decimal = boa.import('decimal');
boa.with(decimal.localcontext(), (ctx) => {
  // execution
  // the ctx is localcontext().__enter__().
});
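To make the context management protocol concrete, here is a minimal sketch of with-statement semantics in plain JavaScript. It is an illustration under simplifying assumptions, not Boa's implementation: `withContext` and `fakeContext` are hypothetical names, and Python's `(type, value, traceback)` triple is reduced to `(constructor, error, null)`.

```javascript
// Hypothetical sketch of with-statement semantics; not Boa's implementation.
function withContext(ctx, fn) {
  const entered = ctx.__enter__();
  let result;
  try {
    result = fn(entered);
  } catch (err) {
    // Python passes exception info to __exit__; a truthy return value suppresses the error.
    if (ctx.__exit__(err && err.constructor, err, null)) return undefined;
    throw err;
  }
  ctx.__exit__(null, null, null);
  return result;
}

// A stand-in context manager that records the calls it receives.
const log = [];
const fakeContext = {
  __enter__() { log.push('enter'); return 'resource'; },
  __exit__() { log.push('exit'); return false; },
};

const value = withContext(fakeContext, (res) => {
  log.push(`body:${res}`);
  return 42;
});
console.log(value, log); // 42 [ 'enter', 'body:resource', 'exit' ]
```

The key property is that `__exit__()` runs whether or not the block throws.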
.eval(str)
Evaluates a Python expression in the specified context. Here is a simple call:
boa.eval('len([10, 20])');
// 2
Alternatively, developers can use a tagged template literal to pass variables that have been defined in JavaScript:
const np = boa.import('numpy');
const elem = np.array([[1, 2, 3], [4, 5, 6]], np.int32);
boa.eval`${elem} + 100`; // element-wise addition on the matrix
boa.eval`len(${elem})`; // equivalent to `len(elem)`
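For readers unfamiliar with tagged template literals, the sketch below shows what a tag function receives: the literal string chunks and the interpolated values, kept separate. How Boa turns these into Python code is not described here; `inspectTemplate` and the `__argN` placeholder names are hypothetical and only illustrate one plausible approach.

```javascript
// Shows what a tag function receives; `inspectTemplate` is a hypothetical name.
function inspectTemplate(strings, ...values) {
  // Replace each interpolated value with a generated placeholder name.
  const names = values.map((_, i) => `__arg${i}`);
  const source = strings.reduce(
    (acc, chunk, i) => acc + chunk + (i < names.length ? names[i] : ''),
    ''
  );
  // Keep the real JavaScript values next to their placeholder names.
  const bindings = Object.fromEntries(names.map((name, i) => [name, values[i]]));
  return { source, bindings };
}

const sample = [[1, 2, 3], [4, 5, 6]];
const parsed = inspectTemplate`len(${sample}) + ${100}`;
console.log(parsed.source);          // len(__arg0) + __arg1
console.log(parsed.bindings.__arg1); // 100
```

Because the values arrive as real JavaScript values rather than as text, a bridge can pass wrapped objects through without serializing them into the expression string.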
For multi-line code, Python 3 does not provide a mechanism to return a value, so the `eval` function can only handle single-line Python expressions.
.bytes(str)
A shortcut to create a Python bytes literal from a JavaScript string. It's equivalent to `b'foobar'` in Python.
const { bytes } = boa;
bytes('foobar'); // "b'foobar'"
The `bytes(str)` function simply creates a plain object that is used to pass a string to a Python function as a bytes literal; it does not correspond to any Python object itself. Alternatively, you can use Python's built-in class `bytes` to create a real object:
const { bytes } = boa.builtins();
const foobar = Buffer.from('foobar');
bytes.fromhex(foobar.toString('hex'));
// "b'foobar'"
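The hex round trip used above can be checked in plain Node.js, without Boa. This small example only uses the standard `Buffer` API and shows the string that would be handed to Python's `bytes.fromhex()`.

```javascript
// Plain Node.js: the hex string handed to Python's bytes.fromhex() above.
const original = Buffer.from('foobar'); // UTF-8 encoded bytes
const hex = original.toString('hex');
console.log(hex); // 666f6f626172

// Decoding the hex string restores the original bytes.
const restored = Buffer.from(hex, 'hex');
console.log(restored.equals(original)); // true
```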
Class PythonObjectWrapper
This class wraps the corresponding object in the Python runtime. Instances are only returned from `boa` methods such as `boa.builtins` and `boa.import`.
creation of instance
So that developers can use Python objects seamlessly, creating a `PythonObjectWrapper` involves a few steps.
First, the type of the underlying Python object is checked. If it is one of the following types, it is converted to the corresponding primitive type.
| python type | primitive |
|---------------|--------------|
| `int`, `float` | `number` |
| `int64` | `bigint` |
| `float64` | `bigdecimal` |
| `bool` | `boolean` |
| `str` | `string` |
| `NoneType` | `null` |
If the type of the object to be wrapped is not one of the types above, a temporary object is created, and its methods and properties are defined through `Object.defineProperties`.
On an instance of `PythonObjectWrapper`, developers can read values directly as properties, just as they would in Python. This works because the last step creates an ES6 `Proxy` object configured with three trap handlers (`get`, `set` and `apply`) and returns that proxy.
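The steps above can be pictured with a small plain-JavaScript sketch. It is hypothetical and does not reflect Boa's internals: `wrap`, the `pyType` tag and the `attrs`/`call` fields are invented stand-ins for whatever the native layer exposes.

```javascript
// Hypothetical sketch of the wrapping steps; names and `pyType` tags are illustrative.
const PRIMITIVE_TYPES = new Set(['int', 'float', 'bool', 'str']);

function wrap(pyValue) {
  // Step 1: primitive-like Python values are returned as JavaScript primitives.
  if (pyValue.pyType === 'NoneType') return null;
  if (PRIMITIVE_TYPES.has(pyValue.pyType)) return pyValue.value;

  // Step 2: everything else is hidden behind a Proxy with get, set and apply traps.
  // The target is a function so that the apply trap can fire.
  const target = () => {};
  return new Proxy(target, {
    get: (_target, name) => pyValue.attrs[name],
    set: (_target, name, value) => { pyValue.attrs[name] = value; return true; },
    apply: (_target, _thisArg, args) => pyValue.call(...args),
  });
}

console.log(wrap({ pyType: 'int', value: 7 })); // 7
console.log(wrap({ pyType: 'NoneType' }));      // null

const adder = wrap({
  pyType: 'function',
  attrs: { __name__: 'add' },
  call: (a, b) => a + b,
});
adder.__doc__ = 'adds two numbers';            // goes through the set trap
console.log(adder.__name__, adder(2, 3));      // add 5
```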
property accessor
In Python, an object has attr and item accessors, and they use different expressions:
- `x.y` is the attr accessor
- `m[n]` is the item accessor
Unfortunately, ES6 Proxy does not distinguish between the two. An algorithm is therefore needed to decide their priority for each access.
- given a `name` variable which is passed by ES6 Proxy's `get` handler.
- check whether the `name` property is owned by the JavaScript object via `.hasOwnProperty()`.
  - return the property if it's truthy.
- check whether the `name` property is owned by the object's class via `.constructor.prototype.hasOwnProperty()`.
  - return the property if it's truthy.
- check if `name` is a numeric representation.
  - if it's truthy, call the internal method `.__getitem__(i)` for item accessor.
  - otherwise
    - try to access the attr via the internal method `.__getattr__()`.
      - if no exceptions, create the new instance from the value and return.
    - try to access the item via the internal method `.__getitem__()`.
      - if no exceptions, create the new instance from the value and return.
    - otherwise, return `undefined`.
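The lookup order can be sketched with a plain ES6 Proxy. This is an illustration under stated assumptions rather than Boa's source: `FakePyObject` is a hypothetical stand-in for a Python object, `resolve` is a made-up name, and values are returned as-is instead of being wrapped in a new instance.

```javascript
// Hypothetical sketch of the lookup order; `FakePyObject` stands in for a Python object.
class FakePyObject {
  constructor(attrs, items) {
    this.attrs = attrs; // reachable through x.y in Python
    this.items = items; // reachable through m[n] in Python
  }
  __getattr__(name) {
    if (!(name in this.attrs)) throw new Error(`AttributeError: ${name}`);
    return this.attrs[name];
  }
  __getitem__(key) {
    if (!(key in this.items)) throw new Error(`KeyError: ${key}`);
    return this.items[key];
  }
  describe() { return 'defined on the JavaScript class'; }
}

function resolve(target, name) {
  if (typeof name === 'symbol') return target[name];
  // 1. Own property of the JavaScript object.
  if (Object.prototype.hasOwnProperty.call(target, name)) return target[name];
  // 2. Property defined on the object's class.
  if (Object.prototype.hasOwnProperty.call(target.constructor.prototype, name)) {
    return target[name];
  }
  // 3. Numeric names go straight to the item accessor.
  if (/^\d+$/.test(name)) return target.__getitem__(Number(name));
  // 4. Otherwise try the attr accessor, then the item accessor, then give up.
  try { return target.__getattr__(name); } catch (e) { /* fall through */ }
  try { return target.__getitem__(name); } catch (e) { /* fall through */ }
  return undefined;
}

const obj = new Proxy(
  new FakePyObject({ shape: 'round' }, { 0: 'first', color: 'red' }),
  { get: resolve }
);
console.log(obj[0], obj.shape, obj.color, obj.missing); // first round red undefined
console.log(obj.describe()); // defined on the JavaScript class
```

Note that `obj.color` only reaches the item accessor because the attr lookup failed first, which is the priority the list describes.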
To better understand the algorithm above, let's look at some examples:
const boa = require('@pipcook/boa');
const { abs, tuple } = boa.builtins();
{
  console.log(abs(-100)); // 100
  console.log(abs(100)); // 100
}
{
  const re = boa.import('re');
  const m = re.search('(?<=abc)def', 'abcdef');
  console.log(m.group(0)); // 'def'
}
{
  // make sure that `numpy` is installed in your current Python env.
  const np = boa.import('numpy');
  const x0 = np.array([[1, 2, 3], [4, 5, 6]], np.int32);
  const x1 = np.arange(15).reshape(3, 5);
  const x2 = np.zeros(tuple([3, 4]));
}
As mentioned above, in addition to dynamically obtaining objects from the Python runtime, the class `PythonObjectWrapper` also defines the following public methods built into JavaScript.
.prototype.toString()
Returns a string representing the object. Internally it calls CPython's `PyObject_Str`.
console.log(boa.import('os').toString());
// "<module 'os' from '/usr/local/opt/python/Frameworks/Python.framework/Versions/3.7/lib/python3.7/os.py'>"
.prototype.slice(start, stop, step)
Returns a new wrapped slice object. It's equivalent to `s[start:stop:step]`. For example:
const { range } = boa.builtins();
const mlist = range(0, 10); // [0...10]
const s = mlist.slice(2, 10, 1); // [2...10]
Note: a new TC39 proposal, slice notation, attempts to add this kind of syntax; it will be adopted once it lands in the V8 engine. Alternatively, use `eval` with Python's slice syntax:
boa.eval`${mlist}[0:10]`
boa.eval`${mlist}[1:10:2]`
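To illustrate what `start`, `stop` and `step` mean, here is a plain-JavaScript sketch of Python's slice semantics on an ordinary array. `pySlice` is a hypothetical helper, not part of Boa, and it deliberately covers only non-negative indices and positive steps.

```javascript
// Hypothetical helper mirroring Python's s[start:stop:step]
// for non-negative indices and positive steps only.
function pySlice(list, start, stop, step = 1) {
  if (step <= 0) throw new RangeError('this sketch only supports step > 0');
  const end = Math.min(stop, list.length); // Python clamps stop to the length
  const out = [];
  for (let i = start; i < end; i += step) out.push(list[i]);
  return out;
}

const numbers = Array.from({ length: 10 }, (_, i) => i); // like range(0, 10)
console.log(pySlice(numbers, 2, 10, 1)); // the values 2 through 9
console.log(pySlice(numbers, 1, 10, 2)); // [ 1, 3, 5, 7, 9 ]
```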
.prototype.__hash__()
Returns the hash value of this object. Internally it calls CPython's `PyObject_Hash`.
Magic methods: there are others, such as `__getitem__`, `__setitem__`, `__getattr__` and `__setattr__`, which are used internally by this library. Calling them directly from user-land code is not recommended.
.prototype[Symbol.toPrimitive](hint)
Returns the corresponding primitive value for this object. See `Symbol.toPrimitive` on MDN for more details.
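As a reminder of how `Symbol.toPrimitive` behaves in plain JavaScript, the sketch below defines it on an ordinary object. `fakeInt` is a hypothetical stand-in; how Boa chooses the primitive for each hint is not specified here.

```javascript
// Plain JavaScript illustration of Symbol.toPrimitive; `fakeInt` is hypothetical.
const fakeInt = {
  value: 42,
  [Symbol.toPrimitive](hint) {
    // hint is 'number', 'string' or 'default' depending on how the object is used.
    return hint === 'string' ? `<int ${this.value}>` : this.value;
  },
};

console.log(+fakeInt);     // 42 (hint: 'number')
console.log(`${fakeInt}`); // <int 42> (hint: 'string')
console.log(fakeInt + 1);  // 43 (hint: 'default')
```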
Working with ECMAScript Modules
Requires Node.js >= `v12.11.1`.
Use a Node.js custom loader to get a more natural import statement.
// app.mjs
import { range, len } from 'py:builtins';
import os from 'py:os';
import {
  array as NumpyArray,
  int32 as NumpyInt32,
} from 'py:numpy';
console.log(os.getpid()); // prints the pid from python.
const list = range(0, 10); // create a range array
console.log(len(list)); // 10
console.log(list[2]); // 2
const arr = NumpyArray([1, 2, 3], NumpyInt32); // Create an array of int32 using the ndarray constructor
console.log(arr[0]); // 1
In Node.js `v14.x`, you only need to specify `--experimental-loader` to launch your application:
$ node --experimental-loader @pipcook/boa/esm/loader.mjs app.mjs
In Node.js versions earlier than `v14.x`, you also need to add the `--experimental-modules` option:
$ node --experimental-modules --experimental-loader @pipcook/boa/esm/loader.mjs app.mjs
Generators
The package is able to handle the Python generator in JavaScript directly:
# Write a Python Generator in count_down.py
def count_down(count):
    while count >= 0:
        yield count
        count -= 1
The above code will take a number and keep yielding and decreasing the value until 0.
const boa = require('@pipcook/boa');
const countDownModule = boa.import('count_down');
// call the count_down function defined in the module
const generator = countDownModule.count_down(3); // Generator
// You can use typical next syntax
let curr = generator.next();
while (!curr.done) {
  console.log(curr.value); // 3 2 1 0
  curr = generator.next();
}
Or use the syntactic sugar:
const boa = require('@pipcook/boa');
const countDownModule = boa.import('count_down');
// call the count_down function defined in the module
const generator = countDownModule.count_down(3); // Generator
// Or access data via the following syntax
for (const element of generator) {
  console.log(element); // 3 2 1 0
}
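For comparison, the same countdown can be written as a native JavaScript generator. This example does not use Boa; it only shows the iterator protocol (`next()`, `done`, `value`, and `for...of`) that the wrapped Python generator is consumed through.

```javascript
// A native JavaScript generator with the same behaviour as count_down.
function* countDown(count) {
  while (count >= 0) {
    yield count;
    count -= 1;
  }
}

// Explicit next() calls: loop until the iterator reports done.
const collected = [];
const iterator = countDown(3);
let step = iterator.next();
while (!step.done) {
  collected.push(step.value);
  step = iterator.next();
}
console.log(collected); // [ 3, 2, 1, 0 ]

// for...of and spread drive the same protocol implicitly.
const spread = [...countDown(3)];
console.log(spread); // [ 3, 2, 1, 0 ]
```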
Python functions in worker_threads
The `@pipcook/boa` package calls Python functions in a blocking way. This is because the object model of Python (CPython) is not thread-safe, so Python restricts running Python functions in different threads.
However, the `@pipcook/boa` package allows running a Python function in another thread with Node.js `worker_threads`. A non-blocking example:
const { Worker, isMainThread, workerData, parentPort } = require('worker_threads');
const boa = require('@pipcook/boa');
const pybasic = boa.import('tests.base.basic'); // a Python example
const { SharedPythonObject, symbols } = boa;
class Foobar extends pybasic.Foobar {
  hellomsg(x) {
    return `hello <${x}> on ${this.test}(${this.count})`;
  }
}
if (isMainThread) {
  const foo = new Foobar();
  const worker = new Worker(__filename, {
    workerData: {
      foo: new SharedPythonObject(foo),
    },
  });
  // poll the ownership state of `foo` while the worker is using it
  let alive = setInterval(() => {
    const ownership = foo[symbols.GetOwnershipSymbol]();
    console.log(`ownership is ${ownership}.`);
  }, 1000);
  worker.on('message', state => {
    if (state === 'done') {
      console.log('task is completed');
      setTimeout(() => {
        clearInterval(alive);
        console.log(foo.ping('x'));
      }, 1000);
    }
  });
} else {
  const { foo } = workerData;
  console.log(`worker: get an object ${foo} and sleep 5s in Python`);
  foo.sleep(); // this is a blocking function, implemented in Python, that sleeps 1s
  console.log('python sleep is done, and sleep in nodejs(thread)');
  setTimeout(() => parentPort.postMessage('done'), 1000);
}
In a sub-thread created by `worker_threads`, `@pipcook/boa` does not create a new interpreter, which means that all Node.js threads share the same Python interpreter to avoid the Python GIL.
To ensure thread safety, we introduce a `SharedPythonObject` class for sharing Python objects between threads, as follows:
// main thread
const foo = new Foobar(); // Python object
const worker = new Worker(__filename, {
  workerData: {
    foo: new SharedPythonObject(foo),
  },
});
// worker thread
const { workerData } = require('worker_threads');
const boa = require('@pipcook/boa');
console.log(workerData.foo);
`SharedPythonObject` accepts a Python object created by `@pipcook/boa`. Once it is created, the original object cannot be used until the worker thread exits. This ensures the thread safety of shared objects: an object can be used by either the worker thread or the main thread at any given time, but not by both.
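The exclusive-ownership rule can be pictured with a small plain-JavaScript sketch. It is purely hypothetical: `OwnedResource`, `transferTo` and `use` are invented names, and whether Boa throws, blocks or does something else when the non-owner touches the object is not stated here; the sketch throws only to make the rule visible.

```javascript
// Hypothetical sketch of exclusive ownership; not Boa's implementation.
class OwnedResource {
  constructor(value) {
    this.value = value;
    this.owner = 'main';
  }
  // Hands the resource to another owner, e.g. a worker thread.
  transferTo(owner) {
    this.owner = owner;
  }
  // Only the current owner may use the resource.
  use(caller) {
    if (caller !== this.owner) {
      throw new Error(`object is owned by ${this.owner}, not ${caller}`);
    }
    return this.value;
  }
}

const shared = new OwnedResource('python object');
shared.transferTo('worker');
console.log(shared.use('worker')); // python object
try {
  shared.use('main');
} catch (err) {
  console.log(err.message); // object is owned by worker, not main
}
shared.transferTo('main'); // e.g. after the worker exits
console.log(shared.use('main')); // python object
```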
Build from source
# clone this project first.
$ npm install
$ npm run build
Verify that the generated library is linked to the correct Python version
When the build has finished, use `objdump -macho -dylibs-used ./build/Release/boa.node` to check that your linked libs are correct, for example:
/build/Release/boa.node:
/usr/local/opt/python/Frameworks/Python.framework/Versions/3.7/Python (compatibility version 3.7.0, current version 3.7.0)
/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 400.9.4)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.250.1)
See ./tests for more testing details.
Build Tests
To run the full tests:
$ npm test
Virtual Environment for Python
If you are using virtualenv or conda, you can just set the PYTHONPATH environment variable to point to your site-packages folder. For instance:
$ export PYTHONPATH=/Users/venv/lib/python3.7/site-packages