keyql
v0.4.2
Published
A language for querying datasets using key-value pairs
Downloads
866
Readme
KeyQL
KeyQL is a language, specification and package for querying datasets using key-value pairs. It is heavily inspired by the simplicity and ease-of-use of Django and similar ORMs. The provided Node.js package can be used to filter large JSON datasets from within any codebase, but the primary purpose of KeyQL is to be used with FunctionScript APIs, where JSON or HTTP Query Parameter key-value pairs can be used to encode query requests to underlying datasets.
KeyQL is meant for easy querying of JSON datasets, spreadsheet data, information retrieved from APIs such as Airtable and more. It can be used to add robust querying capabilities to existing APIs without a massive architectural lift and shift.
The motivation for KeyQL differs from that of GraphQL. KeyQL is intended to provide a simple querying interface to existing imperative APIs and relatively flat datasets. The operators (comparators) are the most important feature and are meant to be easily interpretable by even the newest developer. KeyQL and GraphQL can, in theory, coexist within a single codebase or API implementation. KeyQL is not intended to be used to define an entire backend architecture and provides no opinions on the graph-based structure of output data (you do not define schemas with it).
Quick Example
A quick example of using KeyQL with a FunctionScript API would look like:
Filename: /dataset.json
[
{
"id": 1,
"fields": {
"name": "Alice",
"birthdate": "12/01/1988",
"pets": 2
}
},
{
"id": 2,
"fields": {
"name": "Bernard",
"birthdate": "11/11/1972",
"pets": 5
}
},
{
"id": 3,
"fields": {
"name": "Christine",
"birthdate": "01/05/1991",
"pets": 0
}
}
]
Filename: /functions/__main__.js
const KeyQL = require('keyql');
const dataset = require('../dataset.json');
// Searching through the "fields" object in each row
const kqlDataset = new KeyQL(dataset, row => row.fields);
/**
* Query a dataset based on an Array of Objects
* @param {object.keyql.query} where A list of fields to query for
* @returns {array} result The result list
*/
module.exports = async (where = {}) => {
return kqlDataset.query()
.select([where]) // Wrap in array if provided a raw object
.values();
};
An HTTP POST request containing:
{
"where": {
"pets__gt": 3
}
}
Would return:
[
{
"id": 2,
"fields": {
"name": "Bernard",
"birthdate": "11/11/1972",
"pets": 5
}
}
]
Table of Contents
Introduction
By adhering to the KeyQL specification, your developers and users will have a significantly easier time learning how to work with your APIs and datasets.
For example, you may have an HTTP API built on Autocode.
In this project, you have set up a users/select
endpoint and want to query your
user dataset for every username that contains "ke"
in a case-insensitive
fashion.
HTTP POST https://$user.api.stdlib.com/project@dev/users/select
{
"query": {
"username__icontains": "ke"
}
}
With the intended response being something like:
[
{
"username": "Kelly",
"profile_image": "boop.jpg"
},
{
"username": "Kevin",
"profile_image": "snoot.jpg"
}
]
The KeyQL specification removes the cognitive overhead of choosing how to structure your query requests.
Specification
The KeyQL specification is heavily inspired by Django's ORM and over five years of work manipulating datasets on both the front and back-end of web projects, primarily working with JSON and SQL queries. It's the culmination of best practices learned implementing DataCollection.js and Nodal's ORM.
Writing Queries
Writing KeyQL Queries is as simple as preparing a JSON Object. For example, in a dataset that has records that look like...
// Example dataset in JavaScript
[
{
first_name: 'Dolores',
last_name: 'Abernathy',
is_host: true,
eye_color: 'blue',
hair_color: 'blonde',
location_in_park: null,
age: 250
}
]
You could write a query against it that returns...
Query: All entries with first_name
= Dolores
[
{
"first_name": "Dolores"
}
]
Query: first_name
= Dolores
AND eye_color
in blue
, green
[
{
"first_name": "Dolores",
"eye_color__in": ["blue", "green"]
}
]
Query: first_name
= Dolores
OR first_name
= Teddy
[
{
"first_name": "Dolores"
},
{
"first_name": "Teddy"
}
]
Supported Operators
All operators in KeyQL queries are preceded by a __
delimiter. To reiterate
from the previous section, this means you can query the field first_name
with;
"first_name" // (default to "is" operator)
"first_name__is"
"first_name__startswith"
"first_name__gte"
Full List of Supported Operators
The following table assumes that queryValue
is the value you're searching for
provided a specified key, and entryValue
is the matching entry in a dataset.
| Operator | Behavior |
| -------- | -------- |
| is | Finds all matching entries. Returns entryValue === queryValue
(exact match, type included). |
| not | Finds all non-matching entries. Returns entryValue !== queryValue
(exact match, type included). |
| gt | Finds all entries greater than specified value. Returns entryValue > queryValue
. |
| gte | Finds all entries greater than or equal to specified value. Returns entryValue >= queryValue
. |
| lt | Finds all entries less than specified value. Returns entryValue < queryValue
. |
| lte | Finds all entries less than or equal to specified value. Returns entryValue <= queryValue
. |
| contains | Finds all entries containing the exact provided value. Works when entryValue
is a string
or an array
. |
| icontains | Finds all entries containing the provided value, case-insensitive. Works when entryValue
is a string
or an array
. |
| startswith | Finds all entries starting with the exact provided value. Works when entryValue
is a string
.
| istartswith | Finds all entries starting with the provided value, case-insensitive. Works when entryValue
is a string
. |
| endswith | Finds all entries ending with the exact provided value. Works when entryValue
is a string
. |
| iendswith | Finds all entries ending with the provided value, case-insensitive. Works when entryValue
is a string
. |
| is_null | Finds all entries where entryValue === null
, queryValue
is ignored. |
| is_true | Finds all entries where entryValue === true
, queryValue
is ignored. |
| is_false | Finds all entries where entryValue === false
, queryValue
is ignored. |
| not_null | Finds all entries where entryValue !== null
, queryValue
is ignored. |
| not_true | Finds all entries where entryValue !== true
, queryValue
is ignored. |
| not_false | Finds all entries where entryValue !== false
, queryValue
is ignored. |
| in | Finds all entries within the provided value, intended to match when queryValue
is an array
but works with string
input. |
| not_in | Finds all entries not in the provided value, intended to match when queryValue
is an array
but works with string
input. |
| recency_lt | Finds all entries where DATE(entryValue)
is recent within less than queryValue
in number of seconds. i.e. "field__recency__lt": 3600
would look for entries that have field
as a date/timestamp within the past hour (exclusive). ISO8601 Timestamps suggested, if no timezone entered UTC will be assumed. |
| recency_lte | Finds all entries where DATE(entryValue)
is recent within less than or equal to queryValue
in number of seconds. i.e. "field__recency__lte": 3600
would look for entries that have field
as a date/timestamp within the past hour (inclusive). ISO8601 Timestamps suggested, if no timezone entered UTC will be assumed. |
| recency_gt | Finds all entries where DATE(entryValue)
has a recency greater than queryValue
in number of seconds. i.e. "field__recency__gt": 3600
would look for entries that have field
as a date/timestamp outside the past hour (exclusive). ISO8601 Timestamps suggested, if no timezone entered UTC will be assumed. |
| recency_gte | Finds all entries where DATE(entryValue)
has a recency greater than or equal to queryValue
in number of seconds. i.e. "field__recency__gte": 3600
would look for entries that have field
as a date/timestamp outside the past hour (inclusive). ISO8601 Timestamps suggested, if no timezone entered UTC will be assumed. |
| upcoming_lt | Finds all entries where DATE(entryValue)
is going to occur within less than queryValue
in number of seconds. i.e. "field__upcoming_lt": 3600
would look for entries that have field
as a date/timestamp within the next hour (exclusive). ISO8601 Timestamps suggested, if no timezone entered UTC will be assumed. |
| upcoming_lte | Finds all entries where DATE(entryValue)
is going to occur within less than or equal to queryValue
in number of seconds. i.e. "field__upcoming_lte": 3600
would look for entries that have field
as a date/timestamp within the next hour (inclusive). ISO8601 Timestamps suggested, if no timezone entered UTC will be assumed. |
| upcoming_gt | Finds all entries where DATE(entryValue)
is going to occur within greater than queryValue
in number of seconds. i.e. "field__upcoming_gt": 3600
would look for entries that have field
as a date/timestamp outside the next hour (exclusive). ISO8601 Timestamps suggested, if no timezone entered UTC will be assumed. |
| upcoming_gte | Finds all entries where DATE(entryValue)
is going to occur within greater than or equal to queryValue
in number of seconds. i.e. "field__upcoming_gte": 3600
would look for entries that have field
as a date/timestamp outside the next hour (inclusive). ISO8601 Timestamps suggested, if no timezone entered UTC will be assumed. |
| date_lt | Finds all entries where DATE(entryValue)
is less than DATE(queryValue)
, i.e. '12-06-1988' < '01-01-2019'. ISO8601 Timestamps suggested, if no timezone entered UTC will be assumed. |
| date_lte | Finds all entries where DATE(entryValue)
is less than or equal to DATE(queryValue)
, i.e. '12-06-1988' <= '12-06-1988'. ISO8601 Timestamps suggested, if no timezone entered UTC will be assumed. |
| date_gt | Finds all entries where DATE(entryValue)
is greater than DATE(queryValue)
, i.e. '12-06-1988' > '01-01-1980'. ISO8601 Timestamps suggested, if no timezone entered UTC will be assumed. |
| date_gte | Finds all entries where DATE(entryValue)
is greater than or equal to DATE(queryValue)
, i.e. '12-06-1988' >= '12-06-1988'. ISO8601 Timestamps suggested, if no timezone entered UTC will be assumed. |
Installation and Usage
The KeyQL implementation provided as part of this GitHub repository is intended
for use in the Node.js ecosystem, using the package keyql
. Right now, it can
be used to automatically filter JSON datasets (arrays of objects) based on a specified
query. In the future, we intend to migrate the Nodal Query Composer (ORM)
to be able to automatically generate SQL queries from a provided KeyQL statement.
You can install the package simply using Node.js (10 or higher) and NPM:
$ npm i keyql --save
And use it in your Node.js project with:
const KeyQL = require('keyql');
let dataset = [/* my dataset */]; // Your array of objects
const myDataset = new KeyQL(dataset);
myDataset.query()
.select({value__gte: 5})
.values(); // gets all records where "value" is > 5
Node.js Examples
const KeyQL = require('keyql');
const kqlDataset = new KeyQL(
[
{
"id": 1,
"fields": {
"name": "Alice",
"birthdate": "12/01/1988",
"pets": 2
}
},
{
"id": 2,
"fields": {
"name": "Bernard",
"birthdate": "11/11/1972",
"pets": 5
}
},
{
"id": 3,
"fields": {
"name": "Christine",
"birthdate": "01/05/1991",
"pets": 0
}
}
],
row => row.fields // mapping
);
// Basic query
kqlDataset.query()
.select([{pets__gt: 3}])
.values(); // gives Bernard
// OR query is adding additional array elements
kqlDataset.query()
.select([{pets__gt: 3}, {pets__lt: 2}])
.values(); // gives Bernard, Christine
// AND query is additional parameters in a single object
kqlDataset.query()
.select([{name__in: ['Bernard', 'Christine'], pets: 5}])
.values(); // gives Bernard
// Chaining will continue to filter datasets
kqlDataset.query()
.select([{name__in: ['Bernard', 'Christine']}])
.select([{pets__lt: 5})
.values(); // gives Christine
// Updating can modify the base dataset
kql.query()
.select([{birthdate__date_gt: '01-01-1980'}])
.update({pets: 100}); // Sets Alice and Christine to have 100 pets
KeyQL
The main KeyQL constructor. Used to wrap datasets to be able to query them.
KeyQL#constructor
constructor (dataset = [], mapFunction = v => v)
Initializes a KeyQL Dataset
dataset
is anarray
of objects you wish to parse throughmapFunction
gives us information on how to query each object in a dataset
By default, mapFunction
is a no-op. This works when your dataset looks like:
[
{"id": 1, "name": "Jane", "age": 27},
{"id": 2, "name": "Stewart", "age": 43}
]
However, your dataset may not be this straightforward. Some APIs return nested field values. For example, in the case of something like:
[
{
"id": 1,
"fields": {"name": "Jane", "age": 27}
},
{
"id": 2,
"fields": {"name": "Stewart", "age": 43}
}
]
We would provide the mapFunction
as v => v.fields
instead.
KeyQL#keys
keys ()
Returns the keys for the Dataset, extracted from the first object provided.
KeyQL#rows
rows ()
Returns all rows in the Dataset as initially provided (mapFunction
applied).
KeyQL#dataset
dataset ()
Returns all entries in the Dataset as initially provided (no mapFunction
applied).
KeyQL#changeset
changeset ()
Returns all entries in the dataset that have been updated as a result of KeyQLQueryCommand#update
method calls. To create a copy of your dataset with all new changes committed
(reset the updated rows tracker), use KeyQL#commit
KeyQL#commit
commit ()
Returns a copy of your KeyQL instance with the same dataset, but the changeset will have been reset.
KeyQL#query
query ()
Instantiates a KeyQLQueryCommand
instance, which will create an immutable
history of all query commands.
KeyQLQueryCommand
Created from KeyQL#query
, an immutable record of a query history. Can be
chained indefinitely without overwriting previous KeyQLQueryCommand
data.
For example:
let kqlDataset = new KeyQL([/* my dataset */]);
let q1 = kqlDataset.query().select({first_name: 'Tim'});
let q2 = q1.select({age__gt: 20});
q1 === q2; // false
q1.values(); // Everybody named "Tim"
q2.values(); // Everybody named "Tim" with age > 20
KeyQLQueryCommand#select
Returns a new KeyQLQueryCommand
instance with a select
command added.
Used to select values given a KeyQLQuery
.
select (keyQLQuery = [])
keyQLQuery
is anarray
ofobjects
intended to be used as the query
KeyQLQueryCommand#limit
Returns a new KeyQLQueryCommand
instance with a limit
command added.
Used to select values given a KeyQLLimit
.
select (keyQLLimit = {offset: 0, count: 0})
keyQLLimit
is anobject
containing theoffset
andcount
of records to return
KeyQLQueryCommand#values
Executes a query and returns a subset of your primary dataset
based
on previous KeyQLQueryCommand
s in the chain.
values ()
Will return an array
of objects
from your primary dataset
.
KeyQLQueryCommand#update
Executes a query and returns a subset of your primary dataset
based
on previous KeyQLQueryCommand
s in the chain. Updates all values with the fields
provided.
update (fields = {})
fields
is anobject
containing key-value pairs you wish to update for match entries
Will update and return an array
of objects
from your primary dataset
.
Comparison to GraphQL
KeyQL is not meant to act as a stand-in or replacement for GraphQL. You can think of it more like a lightweight cousin: a midpoint between the wild-west of loosely opinionated SOAP and REST requests and the highly-structured, opinionated and complex world of GraphQL.
Whereas GraphQL provides an interface and opinions around manipulating large, graph-structured datasets and can define an entire backend architecture, KeyQL takes a more minimalistic and piecemeal approach -- providing a simple structure for writing queries using JSON.
Acknowledgements
Thanks for checking out KeyQL. There's a lot more to come as the API is improved! There have been a number of helpful supporters and contributors along the way, and both KeyQL and FunctionScript would not be possible without any of them.
Core Contributors
Thanks to Airtable
A special thanks to Airtable for providing a wonderful product and service. It has allowed our team to focus more on making backend tooling and development accessible to a larger number of developers and developers-to-be. Without it, this project would be unlikely to exist in present form.
Roadmap
- (High Priority) Support type coercion of
entryValue
andqueryValue
- (Low Priority) PostgreSQL Support (re: Nodal)
KeyQL is (c) 2021 Polybit Inc.