yakka

v0.0.0

Published

10 months ago

sam-goodwin

Refinery

Refinery is a Python library and platform for building data pipelines that clean datasets and train ML models with human supervision and feedback.

It automatically provisions all required infrastructure and guarantees a least-privilege, privacy-compliant data architecture.

Features

Train transformation functions (using AI) that are supervised by humans and continually improved with feedback and corrections
Orchestrate transformations with dependency graphs (DAGs)
Compute data sets when new data arrives or when their dependencies change
Re-compute data sets when a transformation function changes or improves through learning
Auto-provision all required cloud infrastructure
Auto-configured to be compliant with privacy regulations such as HIPAA and GDPR
Least-privilege IAM policies with auto-generated reports for regulators
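
To make the dependency-graph features above more concrete, here is a minimal sketch of how "re-compute when a dependency changes" could work in principle. It uses only Python's standard-library graphlib and is not Refinery's API; the dataset names and the stale_datasets helper are hypothetical, chosen for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical dataset graph: each key depends on the datasets in its set.
# These names are illustrative and not part of Refinery.
dependencies = {
    "videos": set(),
    "transcribed_videos": {"videos"},
    "summaries": {"transcribed_videos"},
    "thumbnails": {"videos"},
}


def stale_datasets(changed, deps):
    """Return the datasets to recompute, in dependency order, after `changed` is updated."""
    stale = {changed}
    order = []
    # static_order() yields every node after all of its dependencies,
    # so staleness propagates downstream in a single pass.
    for name in TopologicalSorter(deps).static_order():
        if name in stale or deps[name] & stale:
            stale.add(name)
            order.append(name)
    return order


print(stale_datasets("transcribed_videos", dependencies))
# ['transcribed_videos', 'summaries']
```

Only the changed data set and everything downstream of it is recomputed; unrelated branches such as thumbnails are left untouched.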

Example

🔧 Note: Refinery is in active development. Not all features are implemented. Check back to see the following example grow.

Below is the simplest Refinery application: a Bucket with a Function that writes to it.

Your application's infrastructure is declared in code. The Refinery compiler analyzes it to auto-provision cloud resources (in this case, an AWS S3 Bucket and a Lambda Function) and to infer least-privilege IAM policies.

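# Assumes `asset` is exported from refinery alongside `function`.
from refinery import Bucket, asset, function

# Declares a bucket; the compiler provisions it as an AWS S3 Bucket.
videos = Bucket("videos")

# Deployed as a Lambda Function that writes to the bucket.
@function()
async def upload_video():
    await videos.put("key", "value")

# Placeholder for a derived data set; the body is not implemented yet.
@asset()
async def transcribed_videos():
    ...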
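
For a sense of what "least privilege" could mean for the example above, here is a rough sketch of the kind of IAM policy document that might be inferred for upload_video, which only ever calls put on the videos bucket. This is an assumption about the shape of the output, not something Refinery is documented to emit: the put_only_policy helper is hypothetical, and the physical bucket name would be whatever gets provisioned rather than the logical name used here.

```python
import json


def put_only_policy(bucket_name):
    """Build an IAM policy document allowing only s3:PutObject on one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The function only writes, so no read, list, or delete actions.
                "Action": ["s3:PutObject"],
                # Objects live under the bucket ARN, hence the trailing /*.
                "Resource": [f"arn:aws:s3:::{bucket_name}/*"],
            }
        ],
    }


# "videos" is the logical name from the example; a real bucket name would differ.
print(json.dumps(put_only_policy("videos"), indent=2))
```

The point of the sketch is the narrow scope: one action on one resource, derived from what the function body actually does.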

Research

Inspired by (and integrating with):

[ ] https://dagster.io/
[ ] https://www.llamaindex.ai/
[ ] https://unstructured.io/
[ ] https://docs.modular.com/mojo/roadmap.html

Naming Options

Smelt is available on pip
Refinery is not available on npm or pip
I may have access to alchemy on npm, but it's taken on pip
