@nqminds/crop-doc-proc-databot
v2.2.2
A databot class for the crop doc data processing chain
This package provides a base class for crop doc process databots. Extending it gives a databot several utility functions, error handling, and logging.
Installation and usage
npm i @nqminds/crop-doc-proc-databot
In the entry point of your databot:
const input = require("@nqminds/nqm-databot-utils").input;
const MyDatabot = require("./path/to/databot");
const databot = function (input, output, context) {
  const myDatabotInstance = new MyDatabot(input, output, context);
  myDatabotInstance.start();
};
input.pipe(databot);
In your databot code:
const ProcessDatabot = require("@nqminds/crop-doc-proc-databot");
class MyDatabot extends ProcessDatabot {
  async main() {
    // Code of databot
  }
}
Package parameters
Databots that extend the process databot must provide the following data in their packageParams, which are included on context.
{
  "name": "string", // The unique name of this databot definition
  "manifest": [
    // See manifest below for more details
    {
      "inputName": "string",
      "inputType": "string",
      "ttl": "number",
      "timeKey": "string",
      "owner": "string"
    }
  ],
  // python installation options (pick one only)
  "pythonPackages": ["string"], // names of python3 packages to install
  "condaEnv": "string", // path of a conda environment.yml
  "usePoetry": "bool", // install using poetry (pyproject.toml must be defined, poetry.lock optional)
  "javaSubProcess": { // optional
    "port": "number", // The port on which to launch the websocket server for communication
    "executablePath": "string" // The location of the java executable to run
  }
}
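Since only one python installation option should be set, it can be useful to check packageParams before deploying. The sketch below is a hypothetical helper (`checkPythonOptions` is not part of this package) and assumes only the three option names listed above. It also assumes that setting none of them is allowed, which the source does not state.

```javascript
// Hypothetical helper, not part of @nqminds/crop-doc-proc-databot.
const PYTHON_OPTIONS = ["pythonPackages", "condaEnv", "usePoetry"];

// Returns the names of the python installation options that are set.
function selectedPythonOptions(packageParams) {
  return PYTHON_OPTIONS.filter((key) => {
    const value = packageParams[key];
    // An empty package list counts as "not set"
    if (Array.isArray(value)) return value.length > 0;
    return Boolean(value);
  });
}

// Throws if more than one option is set; otherwise returns the chosen
// option name, or null when no python installation is requested.
function checkPythonOptions(packageParams) {
  const selected = selectedPythonOptions(packageParams);
  if (selected.length > 1) {
    throw new Error(
      `Pick one python installation option only, found: ${selected.join(", ")}`
    );
  }
  return selected[0] || null;
}
```

For example, `checkPythonOptions({condaEnv: "environment.yml"})` returns `"condaEnv"`, while passing both `pythonPackages` and `usePoetry` throws.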
Manifest
The manifest for a process databot details information on the inputs required for the databot to run. Before a process databot begins it will verify that all of these inputs are available.
{
  inputName: "input1", // A unique name for this input
  inputType: "geotiff", // The type of input; this determines how the input is loaded by the databot
  // Usually one of "geotiff", "dataset"
  ttl: 60, // The maximum age, in minutes, of the most recent input for it to be considered valid
  timeKey: "timestamp", // If loading a dataset, the field name containing the creation time for the record
  owner: "[email protected]", // Email address of the party responsible for this input
}
// Dataset example with time sensitivity
{inputName: "dataInputA", inputType: "dataset", ttl: 60, timeKey: "timestamp", owner: "[email protected]"}
// Geotiff example
{inputName: "dataInput", inputType: "geotiff", ttl: 60, owner: "[email protected]"}
An input takes the form of a resource on the TDX. Inputs should be tagged with both their input name and their input type. The resource creation time is used to verify the TTL.
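The TTL rule can be illustrated with a small sketch. The function below is a hypothetical helper, not the check the base class actually runs. It assumes the creation time is available as a date or ISO string, and that an input exactly `ttl` minutes old still counts as valid.

```javascript
// Hypothetical helper illustrating the TTL rule from the manifest.
// createdAt: Date or ISO string (resource creation time, or the timeKey value)
// ttlMinutes: the manifest "ttl" value, in minutes
// now: injectable clock so the check can be tested deterministically
function isInputFresh(createdAt, ttlMinutes, now = new Date()) {
  const ageMs = now.getTime() - new Date(createdAt).getTime();
  return ageMs <= ttlMinutes * 60 * 1000;
}

const now = new Date("2020-01-01T12:00:00Z");
console.log(isInputFresh("2020-01-01T11:30:00Z", 60, now)); // 30 minutes old
console.log(isInputFresh("2020-01-01T10:30:00Z", 60, now)); // 90 minutes old
```

With `ttl: 60`, the first input is accepted and the second is rejected as stale.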
Conda environment
To specify your desired python environment, create an environment.yml with your dependencies. Below is what a YAML environment file might look like:
channels:
  - conda-forge
  - defaults
  - mro
dependencies:
  - python=3.7.*
  - scikit-learn=0.20.*
  - scipy=1.2.*
  - matplotlib=3.0.*
  - pandas=0.24.*
  - pymongo=3.7.*
  - pytest=4.4.*
  - pip=19.0.*
  - pip:
      - pytest-mpl==0.10.*
Communication with Java
If the javaSubProcess option is set in packageParams, the process databot will instantiate the Java communicator class. This can be used to communicate with a Java process (see the java stub package for an example of the Java code). Usage is as follows:
this.javaCom.on("ready", () => { // The java process is ready to receive inputs
  this.javaCom.sendData([{inputType: "file", path: "home"}]); // Send a json array of input values
});
this.javaCom.on("data", (data) => {
  // Do something with the data received from java
});
this.javaCom.on("end", (code) => {
  // The java code has disconnected from the web socket
});
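To see the order in which these handlers fire without a running Java process, the sketch below uses a hypothetical stand-in, `FakeJavaCom`, built on Node's `EventEmitter`. It is only assumed to resemble the real communicator, which exposes `on` and `sendData` as shown above. The echo behaviour and the exit code `0` are invented for illustration.

```javascript
const { EventEmitter } = require("events");

// Hypothetical stand-in for this.javaCom, used only to show the event order.
class FakeJavaCom extends EventEmitter {
  sendData(inputs) {
    // Pretend java processed the inputs, replied once, then disconnected
    this.emit("data", { received: inputs.length });
    this.emit("end", 0);
  }
}

const javaCom = new FakeJavaCom();
const events = [];

javaCom.on("ready", () => {
  events.push("ready");
  javaCom.sendData([{ inputType: "file", path: "home" }]);
});
javaCom.on("data", (data) => events.push(`data:${data.received}`));
javaCom.on("end", (code) => events.push(`end:${code}`));

// With the real communicator, the java process triggers "ready" itself
javaCom.emit("ready");
console.log(events.join(" -> ")); // prints "ready -> data:1 -> end:0"
```

Registering all three handlers before the process becomes ready avoids missing an early `data` or `end` event.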
Functionality
API Reference
process-databot
ProcessDatabot ⏏
Process databot base class
Kind: Exported class
processDatabot.getDatasetId(dataset) ⇒ string
Returns the dataset id of a resource created by the app
Kind: instance method of ProcessDatabot
Returns: string - datasetId
| Param | Type | Description |
| ------- | ------ | ---------------------------------------------------- |
| dataset | string | Id of the schema for the dataset (e.g. serviceUsers) |
processDatabot.main()
The main function of this databot. You must override this function in your own databot class, as it will be called by start().
Kind: instance method of ProcessDatabot
processDatabot.start()
Starts the databot
Kind: instance method of ProcessDatabot
processDatabot.log(message)
Adds a timestamped message to the process log
Kind: instance method of ProcessDatabot
| Param | Type | Description |
| ------- | ------ | --------------------------- |
| message | string | Text content of the message |
processDatabot.installPythonPackages()
Installs required python3 packages
Kind: instance method of ProcessDatabot
processDatabot.python(scriptName, ...args)
Runs a python script using python3 and given arguments
Kind: instance method of ProcessDatabot
| Param | Type | Description |
| ---------- | ------ | ------------------------------------------------------------------------------------------- |
| scriptName | string | The name of the python script; the code looks for this file in the current working directory |
| ...args | any | Arguments to pass to the python script |
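As a rough illustration of what a call such as `this.python("model.py", "--tile", 42)` might translate to, the sketch below builds a command line. `buildPythonCommand` is a hypothetical helper and the script name and arguments are made up. How the package actually spawns python3 or converts arguments is not documented here, so stringifying each argument is an assumption.

```javascript
const path = require("path");

// Hypothetical helper, not part of the package: resolves the script in the
// current working directory and converts every argument to a string.
function buildPythonCommand(scriptName, ...args) {
  return {
    command: "python3",
    args: [path.join(process.cwd(), scriptName), ...args.map(String)],
  };
}

const { command, args } = buildPythonCommand("model.py", "--tile", 42);
console.log(command, args.join(" "));
```

The resulting `command` and `args` have the shape that Node's `child_process.spawn` accepts.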
processDatabot.finish()
Called when the databot exits
Kind: instance method of ProcessDatabot
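The relationship between start(), main(), log() and finish() can be sketched with a stand-in class. `StubProcessDatabot` below is hypothetical and is not the real base class, which also verifies inputs and installs python packages. The log format and the choice to call finish() even when main() throws are assumptions.

```javascript
// Hypothetical stand-in for the ProcessDatabot base class, showing only the
// start() -> main() -> finish() order described in the API reference.
class StubProcessDatabot {
  constructor() {
    this.messages = [];
  }

  log(message) {
    // Assumed format: ISO timestamp followed by the message text
    this.messages.push(`${new Date().toISOString()} ${message}`);
  }

  async main() {
    throw new Error("main() must be overridden");
  }

  finish() {
    this.log("finished");
  }

  async start() {
    try {
      await this.main();
    } finally {
      // Assumption: finish() runs whether or not main() succeeded
      this.finish();
    }
    return this.messages;
  }
}

class MyDatabot extends StubProcessDatabot {
  async main() {
    this.log("processing inputs");
  }
}

new MyDatabot().start().then((messages) => {
  console.log(messages.join("\n"));
});
```

A subclass only supplies main(); the base class decides when it runs and what happens afterwards.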