tableloader
v0.0.3
Published
DataLoader + SQL Tables, with adapters Knex.js & Kysely
Downloads
9
Maintainers
Readme
TableLoader
DataLoader + Table Associations = TableLoader (From the original creator of Knex.js)
A JavaScript (Node + ESM) library for relational database querying, focused on optimal type-safety, batching (avoiding N+1 queries), consistency, and correctness.
Ships with adapters for both Knex.js & Kysely, and instructions for building or contributing a new adapter if you're using something else to construct your queries.
Aims for queries that are lightweight, simple to modify via filters or scopes, easy to inspect, and with as good or better performance than the equivalent hand written sql queries, due to proper batching and optional column selection API.
It's especially well suited to avoid the N+1 query problem in GraphQL, but is useful in any situations where you'd use a query builder or ORM to handle your select statements.
Table of Contents
- Overview / Guiding Principles
- Getting Started
- API
- Constraining Queries
- Relation Naming Conventions
- Errors
- TODO Before Launch
- Roadmap:
- License
Overview / Guiding Principles
Database as the Source of Truth: The database has a well defined schema, and stable guarantees around uniqueness and identity. Introspecting and pulling this information into our runtime layer gives us better information about what we should expect when querying.
No Magic: It should be simple to trace the logic used to generate a query, and the query should represent what you would have written by hand. No obfuscation of what's happening behind the scenes, logging out the queries, or adding a debug breakpoint and stepping through the code should feel familiar.
Intentional APIs: The API design should be consistent, and large enough to support the most common use cases, but no larger. There should be a single, obvious approach to accomplish a given task, and it shouldn't try to handle every edge case - you can handle those using a query builder or raw SQL.
Making use of Batching: The batch loading pattern is incredibly helpful in reducing the number of queries to the database, but requires a lot of effort to do correctly in all situations. Tableloader aims to address these cases as much as possible.
Favor Readability over Micro-Optimizations:
The difference between a for
loop and a for of
loop is meaningless when compared with the latency of a network request. This library is designed to optimize for reducing latter.
Keep Dependencies to a Minimum:
- DataLoader (Batching & Caching)
- json-stable-stringify (DataLoader instance cache keys)
- Minimist (For the CLI Helper)
Don't try to do everything: Goes hand-in-hand with "intentional APIs". This library doesn't aim to replace a full-featured query builder layer, such as Knex or Kysely. Instead it should compliment it, providing well thought out abstractions for common data-fetching use cases.
How It Works
A TableLoader instance is created with knowledge of both type information and runtime metadata for the columns & keys in the target database:
// Type Info:
const adapter = new Adapter<DBTypes>(); // KnexAdapter or KyselyAdapter
const tableLoader = new TableLoader(adapter, {
// Runtime Info:
// Introspection typically generated via the Tableloader CLI (see section below)
introspection: {
primaryKeys: {
user: "id",
project: "id",
membership: ["project_id", "user_id"],
},
columns: {
user: ["id", "name", "created_at"],
userInfo: ["user_id", "info"],
project: ["id", "slug", "project_name", "created_at", "updated_at"],
membership: ["project_id", "user_id", "role", "created_at"],
},
uniqueKeys: {
project: ["slug"],
},
},
// Written by hand, to represent the relations you wish to support in the API,
// regardless of whether these are true ForeignKey relationships
relations: {
user: (t) => ({
project: t.manyToMany("project", "membership"),
userInfo: t.hasOne("userInfo").orThrow,
}),
project: (t) => ({
user: t.manyToMany("user", "membership"),
}),
},
});
The concrete introspection metadata allows us to validate all of the defined relations, and keep a proper identity cache of all rows returned, via their primary and secondary (unique) keys.
To use TableLoader, we create a new context:
const ctx = tableLoader.makeContext();
The context acts as an identity cache of row results and internal DataLoaders. You'll generally create a new context per request/response, or similar unit of work, such as a queue job.
Each time a table or relation is queried, a DataLoader is either used or created & cached if it doesn't exist yet. Loaders are cached based on the constraints to the query:
const [user1, user2] = await Promise.all([
// Both access the same dataloader, due to having the same filter
ctx.table("user").filter({ status: "active" }).loadPk(1),
ctx.table("user").filter({ status: "active" }).loadPk(2),
//
]);
await Promise.all([
// Again, both access the same dataloader, since they have the same relation
ctx.relation("user:userInfo").loadFor(user1),
ctx.relation("user:userInfo").loadFor(user2),
]);
SQL:
select `user`.* from `user` where `status` = 'active' and id in (1, 2)
select `user_info`.* from `user_info` where `user_id` in (1, 2)
Goals / Non-Goals
- Goals:
- Full type safety while dealing with relations
- Observability: logs all queries executed within a given context, for simple inspection or snapshotting in tests
- Works from a stable introspection of the database, catching obvious association errors at startup rather than runtime
- Hooks to selectively load columns, if the selection set of columns are known ahead of time, such as when executing a GraphQL Document traversal
- Handle composite keys as well as we can, don't make them an afterthought
- For consideration:
- Polymorphic associations. Not strictly against these, just didn't take the time to add them. Feel free to open an issue if you'd like to see them
- Non-Goals:
- Models / Model Files
- Managing any sort of inserts / updates / deletes
Non-Goal explanation:
Models / Model Files
Tools like GraphQL already provide good separation between reads & writes. There's a lot of opinions around what a Model can / should do, and TableLoader aims to steer clear of that. We are concerned with mapping as close to the database as possible, and to focus on wins that address optimal query loading.
Feel free to build a more robust ORM on top of TableLoader and open a PR to link to it here!
Managing any sort of inserts / updates / deletes
Something for this might be added eventually, but it's not in the short term roadmap. Keeping an identity map of tables can be tricky with mutations, and it's often simpler to just do you your logic, and then issue a new tableLoader.makeContext()
Knex / Kysely provide good helpers for dealing with these sorts of things in the query builder directly, and we recommend you just use those.
Getting Started
In order to maintain type-safety both in TypeScript and at runtime, TableLoader requires a bit of initial configuration to get our types synced up with the database definition.
Generate Introspection Metadata
TableLoader requires a introspection, incuding the primary keys, unique keys, and columns for each table in a target database.
export interface IntrospectionShape<DBShape> {
// A list of all columns associated with all tables
tables: {
[K in keyof DBShape]: Arr<keyof DBShape[K]>;
};
// A list of all primary keys associated with the tables
primaryKeys: Partial<{
[K in keyof DBShape]: keyof DBShape[K] | Arr<keyof DBShape[K]>;
}>;
// A list of all non-primary unique keys for a given table.
uniqueKeys: Record<string, Arr<string | string[]>>;
}
Using the CLI
TableLoader comes with a utility for generating this introspection from the CLI:
# Shows info about available options
tableloader --help
tableloader --adapter [knex,kysely] --connnection $DATABASE_URL
If you'd prefer not to use the CLI, and use the programmatic API, the introspectionBuilder
can be imported from tableloader/dist/introspection
, and can be invoked to generate the introspection types at runtime.
See the source for the introspection CLI for an example of how you might create a runtime tool to generate the introspection data.
Generate Types & Create Adapter
With Kysely
Types should be generated using kysely-codegen
import { Kysely /* PostgresDialect, SqliteDialect, etc. */ } from "kysely";
import { AdapterKysely } from "tableloader/adapters/kysely";
// change to wherever the kysely-codegen types are generated, if the output path is specified
import type { DB } from "kysely-codegen";
import { introspection } from "./path/to/introspection";
const kysely = new Kysely<DB>({
// ... config for specific kysely adapter
});
const adapter = new AdapterKysely(kysely, {
// camelCase: true
});
const tableLoader = new TableLoader(adapter, {
introspection,
relations: {
// ... relations defined here
},
});
With Knex
Types should be generated using @tgriesser/schemats
v9+
npm run schemats generate --prettier --skipPrefix knex_ -o $OUTPUT_PATH -c $DATABASE_URL
import knex from "knex";
import { AdapterKnex } from "tableloader/adapters/knex";
import type { DBTables } from "./path/to/schemats-output";
import { introspection } from "./path/to/introspection";
const db = knex({
// ... config for your Knex instance
});
const adapter = new AdapterKnex<DBTables>(db, {
// camelCase: true
});
const tableLoader = new TableLoader(adapter, {
introspection,
relations: {
// ...relation definitions
},
});
API
TableLoader
tableLoader.makeContext()
Creates a new "context", where the context is the entry point for all APIs to query for tables & relations. Typically you'll do this once per-request, in a middleware or similar location where you setup state for your
const ctx = tableLoader.makeContext();
const userOne = await ctx.table("users").loadPk(1);
TableLoaderContext
ctx.table()
Given a table name, exposes the TableApi for querying tables.
Promise.all([
ctx.table("users").loadPk(1),
ctx.table("users").filter({ status: "inactive" }).getRows(),
]);
ctx.relation()
Returns the "relation" api, where the relation name is defined as the concatenation of the table name and object property
ctx.setSelections()
The setSelections
API takes object, containing a mapping of table names, to columns that should be selected for that table. If not specified, the default is to select *
for the table when loaded.
Primary & unique keys are always loaded, regardless of the setSelections
configuration. This is to ensure that we're able to keep proper row identity, and reduce the number of subsequent row lookups.
ctx.setSelections({
users: ["name"],
projects: ["status"],
});
// select id, name from users where id in (?)
const user = await ctx.table("users").loadPk(1);
// select id, slug, status from project where id in (?)
const project = await ctx.table("project").loadPk(1);
ctx.queries
A getter, returning all of the queries issued
TableApi
table.loadPk()
Loads a row for a table by its Primary Key. The introspection metadata is utilized to infer the correct types for the argument.
ctx.table("user").loadPk(1);
If the primary key is composite (multiple columns), these should be provided as an object:
ctx.table("membership").loadPk({ user_id: 1, organization_id: 2 });
The loadPk
utilizes the batching via DataLoader, so multiple parallel calls to loadPk
will be sent in a single query:
// select "user".* from "user" where "id" in (1, 2, 3)
Promise.all([
ctx.table("user").loadPk(1),
ctx.table("user").loadPk(2),
ctx.table("user").loadPk(3),
]);
Returns the LoadedRow
table.loadPkOrThrow()
Like the loadPk method, except it throw a RowNotFoundError
if the record is not found
table.loadPkBatch()
table.loadPkBatchOrThrow()
table.loadPkBatchOrError()
table.loadUniq()
table.loadBy()
table.filter()
table.orderBy()
table.limit()
table.findAll()
table.asConnection()
table.findOne()
table.interpolate()
Takes a plain object, and coerces it into a LoadedRow
for the table, caching it in the
LoadedRow
A LoadedRow
object is a class which wraps the raw database object, providing getters for the individual fields. It should feel functionally similar to the POJO (plain-old JavaScript Object) returned from the query library, however it has a few unique characteristics specific to TableLoader.
Relation Api (one)
relation.loadFor()
relation.loadForBatch()
Relation Api (many)
relation.orderBy()
relation.filter()
relation.scope()
relation.loadFor()
relation.loadOneFor()
relation.loadForBatch()
relation.asConnectionFor()
Returns an object properly implementing the GraphQL Cursor Connections specification, exposing a "edges", "pageInfo", and "nodes" properties.
const connection = ctx.relation("user:posts").asConnectionFor(user, args);
const [edges, pageInfo, nodes] = await Promise.all([
connection.edges,
connection.pageInfo,
connection.nodes,
]);
relation.asPageFor()
Constraining Queries
Filters
Filters add a where
clause to the query.
Scopes
Table Scopes
TableLoader comes with a concept of "scopes", a way to predefine a set of constraints for a query, to be invoked at runtime. Scopes are added to tables, using the adapter's scopes()
on a relation definition, adding the scope options as metadata to the relation:
new TableLoader(adapter, {
introspection,
relations: {
user: (t) => ({
posts: t.hasMany("post"),
}),
},
scopes: adapter.scopes({
post: {
public: (qb) => qb.where("published", true).where("hidden", false),
},
}),
});
Later when using the relation:
ctx.relation("user:posts").scope("public").loadFor(user);
Global Scopes
TableLoader also comes with a concept of "global scopes". Global scopes are defined using the same adapter.scopes
method, and they are automatically used whenever a table is accessed:
new TableLoader(adapter, {
// ...
globalScopes: adapter.scopes({
project: {
notDeleted: (qb) => qb.whereNull("deleted_at"),
},
}),
});
It's possible to remove global scopes, by calling withougGlobalScopes
:
ctx.table("project").withoutGlobalScopes().getRows();
Scopes with arguments
Scopes can define arguments:
scopes: {
user: adapter.scopes({
active: (qb, excludeUsers: number[]) => {
return qb.where("activated", true).andWhereNotIn("id", excludeUsers);
},
});
}
TableLoader will track the definition of these types and include them in the types for the scope invocation:
// OK
ctx.relation("organization:users").scope("active", [1, 2, 3]).loadFor(org);
// TS error, excludeUsers expected:
ctx.relation("organization:users").scope("active").loadFor(org);
Dynamic scope creation
If you need to create the scopes dynamically, based on the
Aggregates
When aggregating, we assume we're grouping based on the relation column
Tracking Column Usage
Relation Naming Conventions
Naming conventions are similar to other libraries / frameworks, such as Rails, Laravel, or Django. The foreign key of a table is assumed to be the concatenation of the source table's name and the primary key of the referenced table, unless otherwise specified by the .setKeys
method.
user
- id
- name
- created_at
post
- id
- title
- user_id
user -> hasMany(post)
post -> belongsTo(user)
One-To-One Association
hasOne
A hasOne
relationship defines a one-to-one relationship between the source table and the target table, where the source table is assumed to have exactly one reference to the target table.
({
user: (t) => ({
userInfo: t.hasOne("user_info").orThrow,
optionalMeta: t.hasOne("user_meta"),
}),
});
select user_info.* from user_info where user_id in (...) -- (user.id)
belongsTo
The belongsTo
defines a scenario where the source table refers to another model, by the primary key reference of the target table, via the primary key reference in its own table.
({
userInfo: (t) => ({
user: t.belongsTo("user"),
}),
});
hasOneThrough
The "hasOneThrough" relationship isn't too common, it defines a one-to-one relationship with between a source table and a target table, joined through a third joining table.
({
session: (t) => ({
userInfo: t.hasOneThrough("userInfo", "user"),
}),
});
select user_info.*, user.id as user$$id
inner join user on user.id = user_info.user_id
where user_info.user_id in (...) -- session.user_id
One-To-Many Association
hasMany
A hasMany
is a fairly common one-to-many relationship, where the source table has zero or more of the target table. This is defined by default convention where the target table having the source table, suffixed by the source table's primary key.
({
user: (t) => ({
posts: t.hasMany("post"),
}),
});
select post.* from post where user_id in (...) -- (user.id)
hasManyThrough
The hasManyThrough
relationship provides a way to access a table through an intermediate table, where the intermediate table is its own entity.
({
user: (t) => ({
payments: t.hasManyThrough("payment", "invoice"),
}),
});
ctx.relation("user:payments").loadFor(user);
select payment.* from payment
inner join invoice on invoice.id = payment.invoice_id
where invoice.user_id in (...) -- (user.id)
Many-To-Many Association
manyToMany
Similar to the belongsToMany
in Laravel, or the has_and_belongs_to_many
in Rails, the manyToMany
defines a relationship between two tables, through a third table, where the intermediate table contains the primary key references to the two tables it's joining.
({
user: (t) => ({
projects: t.manyToMany("project", "projectUser"),
}),
});
Modifying Relation Keys
For any relation, it is possible to modify the keys to suit the relation, such as in a situation where the relation is self referential, or the keys don't map exactly to the standard naming assumptions
({
project: {
// a project can have a template, which refers to the same table
template: t.belongsTo("project").setKeys({ sourceKey: "template_id" }),
},
});
select project.* from project where id in (...) -- source.template_id
Errors
TableLoaderError
RelationNotFoundError
RowNotFoundError
ColumnNotLoadedError
TODO Before Launch
- [x] CamelCase handling
- [x] Pluralization handling
- [ ] Aggregation API
- [ ] Pagination API Configuration
- [ ] Example of setSelections use
- [ ] Api for detecting column usage
- [x] Decide on relation types
- [ ] Opt-out for union all
- [ ] Scopes (global, local)
- [ ] Cleanup cache key structure to read a bit better
- [ ] Finish off Docs
- [ ] Benchmarks
Roadmap:
- [ ] Consider pulling more introspection info, nullability & foreign keys
- [ ] Windowing
- [ ] Configuring when to use union vs. not
License
MIT