@openfn/runtime

v1.5.1

Job processing runtime.

Downloads

4,328

Readme

Runtime

A runtime for running openfn workflows and reporting on their status. The runtime will return a serializable state object when the workflow is completed.

A single expression can be passed to the runtime, which will be wrapped into a single-node workflow.

The runtime is designed to be wrapped into a Runtime Manager, which is responsible for compiling expressions into executable code, triggering runs (ie calling the runtime), and returning state.

Each job/expression in a workflow must export an array of operations, which will be executed in series.

An operation is a function which takes state as input and returns state (or a promise resolving to state) as output.

run([(state) => state]);
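To illustrate what "executed in series" means for operations, here is a minimal sketch. The `runOperations` helper and the `addOne` and `double` operations are hypothetical names made up for this example; they are not part of the runtime's API, and the runtime's real implementation may differ.

```javascript
// Hypothetical helper (NOT part of @openfn/runtime): runs operations in series,
// feeding the state returned by one operation into the next.
async function runOperations(operations, initialState) {
  let state = initialState;
  for (const operation of operations) {
    // An operation may return state directly or a promise resolving to state;
    // `await` handles both cases.
    state = await operation(state);
  }
  return state;
}

// Two example operations: one synchronous, one asynchronous.
const addOne = (state) => ({ ...state, data: { count: state.data.count + 1 } });
const double = async (state) => ({ ...state, data: { count: state.data.count * 2 } });
```

Calling `runOperations([addOne, double], { data: { count: 1 } })` resolves to a state whose `data.count` is 4, because each operation receives the state returned by the previous one.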

The compiler can be used to convert the javascript-like "job DSL" into a compatible ESM module (the runtime does NOT do this automatically).

Basic Usage

The runtime should be passed a workflow JSON object, or a single job (as a string or as a module which exports an array of functions).

import { readFile } from 'node:fs/promises';
import run from '@openfn/runtime';

// Load the job source as a string
const job = await readFile('expression.js', 'utf8');
const initialState = {};
// `logger` must be created by the caller before this point
const options = { logger };
const { data } = await run(job, initialState, options);

See the test folder for more usage examples.

The runtime provides no CLI. Use packages/cli (devtools) for this.

Experimental VM Args

For the runtime to work, the parent process needs to be started with the --experimental-vm-modules flag. You may also want to pass --no-warnings to suppress annoying console warnings.
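A runtime manager might want to fail early if the flag is missing. The sketch below is one way to check for it; `hasVmModulesFlag` is a hypothetical helper, not something the runtime exports. It relies only on `process.execArgv` and the `NODE_OPTIONS` environment variable, both of which are standard Node.js.

```javascript
// Hypothetical helper: reports whether the flag was supplied, either directly
// on the node command line (execArgv) or through NODE_OPTIONS.
function hasVmModulesFlag(
  execArgv = process.execArgv,
  nodeOptions = process.env.NODE_OPTIONS || ''
) {
  const flag = '--experimental-vm-modules';
  return execArgv.includes(flag) || nodeOptions.split(/\s+/).includes(flag);
}
```

For a process started with something like `node --experimental-vm-modules --no-warnings manager.js` (where `manager.js` is a placeholder name), `hasVmModulesFlag()` returns true.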

Module Caching

If running in a long-lived process (such as inside ws-worker), the runtime may import cached modules.

This can be a problem for isolation even within the sandbox, because state can be shared by two workflows using the same adaptor. This is a security and stability concern.

To address this, the runtime accepts a cacheKey on the linker options. If set, this will be appended to the linker's imports (ie, top-level job imports). All jobs in the same workflow will use the same cacheKey, so a module is cached between jobs, but NOT between workflows.

Long-running worker processes should pass a unique cache key with each run.

IMPORTANT: This will leak memory, because loaded but "stale" modules will NOT be garbage collected.

It is expected that long-running runtimes will have some kind of purge functionality to reclaim memory (for example, engine-multi will regularly burn worker threads).
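As a rough sketch of how a long-running worker could supply a fresh cache key for every run: the helper names below are hypothetical, and the `options.linker.cacheKey` shape is an assumption based on the description of "a cacheKey on the linker options", so check src/types.ts for the real definition.

```javascript
// Hypothetical sketch: generate one cache key per workflow run.
let runCounter = 0;

function createCacheKey() {
  // A timestamp plus a counter keeps keys unique within a single process.
  runCounter += 1;
  return `${Date.now()}-${runCounter}`;
}

// ASSUMPTION: cacheKey lives at options.linker.cacheKey.
function createRunOptions(baseOptions = {}) {
  return {
    ...baseOptions,
    linker: { ...(baseOptions.linker || {}), cacheKey: createCacheKey() },
  };
}
```

Each call to `createRunOptions` yields a different key, so every job in one workflow can share the options object (and therefore the module cache), while the next workflow gets a new one.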

Execution Plans

The runtime can accept an Execution Plan (or workflow) as an input.

This defines a graph of jobs (expressions) to run in sequence. Each node in the graph is a job, and contains a set of edges which tell the runtime what to execute next.

The runtime will return the final state when there is nothing left to execute.

An execution plan looks like this:

{
  workflow: {
    jobs: [{
      id: 'a',
      expression: "source or path",
      state: { /* default state */ },
      configuration: { /* credentials */ },
      next: {
        'b': true, // edge to another job
        'c': { condition: "state.data.age > 18" }, // conditional edge to another job
      },
      adaptor: "common", // it's complicated
    }]
  },
  options: {
    start: 'a',
  }
}

State and start node can be passed to the runtime as inputs.

If no start node is provided, the first job in the jobs array will run first.

Ids are technically optional, but needed if an edge (or start) wants to refer to a node.
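The following sketch shows one way the start node and outgoing edges of such a plan could be resolved. Both helpers are hypothetical. In particular, evaluating a condition string with `new Function` is an assumption made for brevity; it is not a sandbox, and the runtime may evaluate conditions quite differently.

```javascript
// Hypothetical helper: picks the start job, falling back to the first job.
function findStartJob(plan) {
  const { jobs } = plan.workflow;
  const start = plan.options && plan.options.start;
  return start ? jobs.find((job) => job.id === start) : jobs[0];
}

// Hypothetical helper: returns the ids of the jobs that should run after `job`.
function nextJobIds(job, state) {
  return Object.entries(job.next || {})
    .filter(([, edge]) => {
      if (edge === true) return true; // unconditional edge
      if (edge && typeof edge.condition === 'string') {
        // ASSUMPTION: a condition is a JS expression evaluated against state.
        const test = new Function('state', `return (${edge.condition});`);
        return Boolean(test(state));
      }
      return false;
    })
    .map(([id]) => id);
}
```

With the example plan, a state of `{ data: { age: 21 } }` would follow the edges to both 'b' and 'c', while `{ data: { age: 10 } }` would only follow the edge to 'b'.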

The runtime itself does not use the adaptor key, as it expects jobs to be compiled with imports. As with expressions, it's the runtime manager's job to compile expressions and ensure dependencies are available.

See src/types.ts for a full definition of an execution plan, and test/runtime.test.ts for examples.

At the time of writing, execution plans have some restrictions:

  • Jobs execute in series (but parallelisation can be simulated)
  • A job can only have one input node (a -> z <- b is not allowed)
  • Jobs cannot have circular references (a -> b -> a is not allowed)

Support for more complex plans will be introduced later.
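A runtime manager could check a plan against the restrictions listed above before running it. The validator below is a hypothetical sketch, not the runtime's own validation. It assumes every job has an id, and it reports a cycle once for each job it is reached from.

```javascript
// Hypothetical validator: returns a list of error messages (empty if valid).
function validatePlan(jobs) {
  const errors = [];

  // Collect the upstream job ids for every edge target.
  const inputs = {};
  for (const job of jobs) {
    for (const target of Object.keys(job.next || {})) {
      (inputs[target] = inputs[target] || []).push(job.id);
    }
  }
  for (const [id, sources] of Object.entries(inputs)) {
    if (sources.length > 1) {
      errors.push(`job ${id} has multiple inputs: ${sources.join(', ')}`);
    }
  }

  // Depth-first walk along edges, tracking the current path to spot cycles.
  const byId = Object.fromEntries(jobs.map((job) => [job.id, job]));
  const visit = (id, path) => {
    if (path.includes(id)) {
      errors.push(`circular reference: ${[...path, id].join(' -> ')}`);
      return;
    }
    const job = byId[id];
    if (!job) return; // edge points at an unknown job; ignored in this sketch
    for (const target of Object.keys(job.next || {})) {
      visit(target, [...path, id]);
    }
  };
  for (const job of jobs) visit(job.id, []);

  return errors;
}
```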

Building

To build a js package into dist/, run:

$ pnpm build

To watch and re-build whenever the js changes, run:

$ pnpm build:watch

Note: The watch throws an error on first run but seems to work.

You can test or watch tests with:

$ pnpm test
$ pnpm test:watch

Runtime Design

The runtime's job is to take one or more jobs and execute them in series.

Each job, in turn, is a pipeline of operations.

The runtime's responsibilities include:

  • Accepting a workflow as JSON or a job as a string
  • Executing all jobs in a safe environment (with some utilities and overrides provided)
  • Ensuring that the state object is not mutated between jobs
  • Emitting lifecycle events for the job pipeline
  • Maintaining a repo of node modules which are available to jobs
  • Loading runtime dependencies from explicit paths (passed in) or a local repo
  • Resolving to a serializable state object
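Two of the responsibilities above, protecting state between jobs and resolving to a serializable state object, can be sketched with a JSON round-trip. This is only an illustration under the assumption that state is plain JSON data; `cloneState` and `runJobsInSeries` are hypothetical names, and the runtime may use a different cloning strategy.

```javascript
// Hypothetical helper: a JSON round-trip both deep-copies the state and
// drops anything that is not serializable (functions, undefined, etc.).
function cloneState(state) {
  return JSON.parse(JSON.stringify(state));
}

// Hypothetical sketch: each job is an array of operations. State is cloned
// before the first job and after every job, so no job can mutate an object
// that an earlier job (or the caller) still holds.
async function runJobsInSeries(jobs, initialState) {
  let state = cloneState(initialState);
  for (const operations of jobs) {
    for (const operation of operations) {
      state = await operation(state);
    }
    state = cloneState(state);
  }
  return state;
}
```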

The runtime should not:

  • Compile its input jobs (although it may validate using the compiler)
  • Do any disk I/O
  • Do any thread/process management
  • Auto install any dependencies

These are all the responsibilities of a runtime manager (like the CLI).

Module Loading

When a job calls import to import a dependent module, the runtime must resolve the import statement into executable code.

It does this through a linker function, which takes a package specifier, a vm context, and an options object as arguments. It loads the module using a dynamic import and proxies the interface through a vm.SyntheticModule, using the experimental vm.SourceTextModule API.

Modules can be loaded from:

  • An explicit path (pass as a dictionary of name: path strings into the options)
  • The current working repo (see below)
  • The current working node_modules (should we somehow disallow this?)
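The three sources above could be turned into an ordered list of lookup candidates roughly as follows. This sketch assumes the list order is also the lookup priority, which the text does not state explicitly. The `modules` and `repo` option names and the `resolveCandidates` helper are hypothetical.

```javascript
// Hypothetical sketch: returns the places to try for a specifier, in order.
function resolveCandidates(specifier, options = {}) {
  const { modules = {}, repo } = options;
  const candidates = [];
  // 1. An explicit path from the name -> path dictionary.
  if (modules[specifier]) candidates.push(modules[specifier]);
  // 2. The managed repo (a folder containing package.json and node_modules).
  if (repo) candidates.push(`${repo}/node_modules/${specifier}`);
  // 3. Fall back to normal resolution against the current node_modules.
  candidates.push(specifier);
  return candidates;
}
```

A linker built on this would attempt a dynamic import of each candidate in turn and use the first one that loads.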

The repo is a managed folder which the runtime uses to install and load modules from/to. It is just an arbitrary private npm package (ie, a folder containing a package.json and node_modules). Generally, it is expected that linked modules are loaded from this folder.

The runtime is self-managing and won't do any installs itself; that's up to the runtime manager to sort out.

A whitelist can be passed (as an array of regexes) to the linker's options, so that only matching modules are allowed to be loaded.
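A whitelist check of this kind might look like the sketch below. `isAllowed` is a hypothetical helper, and treating a missing or empty whitelist as "allow everything" is an assumption rather than documented behaviour.

```javascript
// Hypothetical helper: true if the specifier matches any whitelist regex.
function isAllowed(specifier, whitelist) {
  // ASSUMPTION: no whitelist means no restriction.
  if (!whitelist || whitelist.length === 0) return true;
  // Avoid regexes with the g or y flag here: test() is stateful for those.
  return whitelist.some((pattern) => pattern.test(specifier));
}
```

For example, with a whitelist of `[/^@openfn\//]`, the specifier `@openfn/language-common` is allowed while `fs` is rejected.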

Right now, it's expected that the runtime manager (ie the CLI) will manage the installation of dependencies into the repo before executing the runtime. Later, we may allow the runtime to auto-install dependencies from directives at the top of the job source.