clickhouse-query
v1.10.2
Published
ClickHouse Query is intuitive query builder to overcome the complexity of ClickHouse SQL syntax.
Downloads
158
Readme
ClickHouse Query
ClickHouse Query is intuitive query builder to simplify usage of ClickHouse.
Table of Contents
Features
- Sub-queries
- Table aliases
- Query aliases
- SELECT
- INSERT
- WITH clause
- JOINs (all types)
- WHERE/grouped WHERE
- GROUP BY
- ORDER BY
- LIMIT
- OFFSET
- HAVING
- Helper functions, e.g.
fx.avg()
,fx.countIf()
, etc - Custom SQL expressions with
expr()
Usage
Installation
Using yarn:
yarn add clickhouse-query
Using npm:
npm install clickhouse-query
Once the package is installed, you can import the library as follows:
import {QueryBuilder, fx, expr, schema} from 'clickhouse-query';
Quick start
import {ClickHouse} from 'clickhouse';
import winston from 'winston';
import {QueryBuilder} from 'clickhouse-query';
const clickhouse = new ClickHouse({
url: 'http://localhost',
port: 8123,
basicAuth: {username: 'user', password: 'password'},
format: 'json',
raw: false,
});
const logger = winston.createLogger(); // not required, you can pass as null
const builder = new QueryBuilder(clickhouse, logger);
const users = await builder.query()
.select('email')
.from('users')
.execute();
// Executes: SELECT email FROM users
TypeScript example:
// ...
type User = {
email: string;
};
const users = await builder.query()
.select('email')
.from('users')
.execute<User[]>();
// Executes: SELECT email FROM users
CREATE TABLE
Creating tables as simple as this:
import {schema} from 'clickhouse-query';
await builder.createTable()
.table('table_name')
.column('column1', schema.string())
.engine('Memory')
.execute();
// Executes: CREATE TABLE table_name(column1 String) ENGINE = Memory
Also, you can provide multiple columns to create:
import {schema} from 'clickhouse-query';
await builder.createTable()
.table('table_name')
.column('column1', schema.string())
.column('column_date', schema.dateTime())
.engine('Memory')
.execute();
// Executes: CREATE TABLE table_name(column1 String, column_date DateTime) ENGINE = Memory
Create table with ORDER BY
:
import {schema} from 'clickhouse-query';
await builder.createTable()
.table('table_name')
.column('column1', schema.string())
.column('column_date', schema.dateTime())
.engine('MergeTree()')
.orderBy(['column1', 'column_date'])
.execute();
// Executes: CREATE TABLE table_name(column1 String, column_date DateTime) ENGINE = MergeTree() ORDER BY (column1, column_date)
Create table with IF NOT EXISTS
:
import {schema} from 'clickhouse-query';
await builder.createTable()
.table('table_name')
.ifNotExists()
.column('column1', schema.string())
.engine('Memory')
.execute();
// Executes: CREATE TABLE IF NOT EXISTS table_name(column1 String) ENGINE = Memory
ALTER TABLE
Builder has special method called alterTable()
to handle ALTER TABLE queries. Below you may find a couple of examples.
Add column:
import {schema} from 'clickhouse-query';
import {AddColumn} from 'clickhouse-query/AlterTable/AddColumn';
await builder.alterTable()
.table('table_name')
.addColumn((new AddColumn()).name('column1').type(schema.string()))
.execute();
// Executes: ALTER TABLE table_name ADD COLUMN column1 String
Drop column:
import {DropColumn} from 'clickhouse-query/AlterTable/DropColumn';
await builder.alterTable()
.table('table_name')
.dropColumn((new DropColumn()).name('column1'))
.execute();
// Executes: ALTER TABLE table_name DROP COLUMN column1
Rename column:
import {RenameColumn} from 'clickhouse-query/AlterTable/RenameColumn';
await builder.alterTable()
.table('table_name')
.renameColumn((new RenameColumn()).from('column1').to('column2'))
.execute();
// Executes: ALTER TABLE table_name RENAME COLUMN column1 TO column2
Modify column:
import {schema} from 'clickhouse-query';
import {ModifyColumn} from 'clickhouse-query/AlterTable/ModifyColumn';
await builder.alterTable()
.table('table_name')
.modifyColumn((new ModifyColumn()).modify().name('column1').type(schema.string()))
.execute();
// Executes: ALTER TABLE table_name MODIFY COLUMN column1 String
Modify column with AFTER
:
import {schema} from 'clickhouse-query';
import {ModifyColumn} from 'clickhouse-query/AlterTable/ModifyColumn';
await builder.alterTable()
.table('table_name')
.modifyColumn((new ModifyColumn()).modify().name('column1').type(schema.string()).after('column2'))
.after('column2')
.execute();
// Executes: ALTER TABLE table_name MODIFY COLUMN column1 String AFTER column2
Modify column with FIRST
:
import {schema} from 'clickhouse-query';
import {ModifyColumn} from 'clickhouse-query/AlterTable/ModifyColumn';
await builder.alterTable()
.table('table_name')
.modifyColumn((new ModifyColumn()).modify().name('column1').type(schema.string()).first())
.first()
.execute();
// Executes: ALTER TABLE table_name MODIFY COLUMN column1 String FIRST
INSERT
Builder has special method called insert()
to handle INSERT queries. Below you may find a couple of examples.
Insert single row:
await builder.insert()
.into('metrics')
.columns(['id', 'ip', 'created_date'])
.values({id: 1, ip: '127.0.0.1', created_date: '2022-12-20'})
.execute();
// Executes: INSERT INTO metrics (id, ip, created_date) VALUES (1, '127.0.0.1', '2022-12-20')
Definition of columns()
is optional, you can use values()
without it. values()
will use the first row to determine
the columns.
await builder.insert()
.into('metrics')
.values({id: 1, ip: '127.0.0.1', created_date: '2022-12-20'})
.execute();
// Executes: INSERT INTO metrics (id, ip, created_date) VALUES (1, '127.0.0.1', '2022-12-20')
You can chain multiple rows using values()
:
await builder.insert()
.into('metrics')
.columns(['id', 'ip', 'created_date'])
.values({id: 1, ip: '127.0.0.1', created_date: '2022-12-20'})
.values({id: 2, ip: '127.0.0.2', created_date: '2022-12-21'})
.execute();
// Executes: INSERT INTO metrics (id, ip, created_date) VALUES (1, '127.0.0.1', '2022-12-20'), (2, '127.0.0.2', '2022-12-21')
You can write bulk rows (same as above):
await builder.insert()
.into('metrics')
.columns(['id', 'ip', 'created_date'])
.values([
{id: 1, ip: '127.0.0.1', created_date: '2022-12-20'},
{id: 2, ip: '127.0.0.2', created_date: '2022-12-21'}
])
.execute();
// Executes: INSERT INTO metrics (id, ip, created_date) VALUES (1, '127.0.0.1', '2022-12-20'), (2, '127.0.0.2', '2022-12-21')
DELETE
Builder has special method called delete()
to handle DELETE queries. Below you may find a couple of examples.
await builder.delete()
.table('metrics')
.where('created_date', '>', '2022-12-20')
.execute();
// Executes: ALTER TABLE metrics DELETE WHERE created_date > '2022-12-20'
If you want to delete everything from table use it as following:
await builder.delete()
.table('metrics')
.all()
.execute();
// Executes: ALTER TABLE metrics DELETE WHERE 1 = 1
Alternatively, you write example above as following:
import {expr} from 'clickhouse-query';
await builder.delete()
.table('metrics')
.where(expr('1'), '=', 1)
.execute();
// Executes: ALTER TABLE metrics DELETE WHERE 1 = 1
UPDATE
Builder has special method called update()
to handle UPDATE queries. Below you may find a couple of examples.
Update single column:
await builder.update()
.table('metrics')
.value('ip', '127.0.0.1')
.where('ip', '=', '127.0.0.0')
.execute();
// Executes: ALTER TABLE metrics UPDATE ip = '127.0.0.1' WHERE ip = '127.0.0.0'
Update multiple columns chained:
await builder.update()
.table('metrics')
.value('ip', '127.0.0.1')
.value('user_agent', 'Googlebot/2.1')
.where('ip', '=', '127.0.0.0')
.execute();
// Executes: ALTER TABLE metrics UPDATE ip = '127.0.0.1', user_agent = 'Googlebot/2.1' WHERE ip = '127.0.0.0'
Alternatively, you can use values()
to update multiple columns:
await builder.update()
.table('metrics')
.values([
['ip', '127.0.0.1'],
['user_agent', 'Googlebot/2.1'],
])
.where('ip', '=', '127.0.0.0')
.execute();
// Executes: ALTER TABLE metrics UPDATE ip = '127.0.0.1', user_agent = 'Googlebot/2.1' WHERE ip = '127.0.0.0'
You can pass array as value, it would be properly converted:
await builder.update()
.table('metrics')
.value('ips', ['127.0.0.1', '127.0.0.2'])
.where('id', '=', 1)
.execute();
// Executes: ALTER TABLE metrics UPDATE ips = ['127.0.0.1', '127.0.0.2'] WHERE id = 1
SELECT
Builder has special method called query()
which allows you to build SELECT queries. Below you may find a couple of
examples.
Select single column:
await builder.query()
.select('id')
.from('users')
.execute();
// Executes: SELECT id FROM users
Select multiple columns:
await builder.query()
.select(['id', 'email'])
.from('users')
.execute();
// Executes: SELECT id, email FROM users
Select from sub-query:
await builder.query()
.select(['ip'])
.from(
builder.query()
.select('ip')
.from('metrics')
)
.execute();
// Executes: SELECT ip FROM (SELECT ip FROM metrics)
Select with alias:
await builder.query()
.select(['ip'])
.from(
builder.query()
.select('ip')
.from('metrics')
)
.as('m')
.execute();
// Executes: (SELECT ip FROM metrics) AS m
FINAL
Generates SQL query with FINAL
keyword.
Useful for tables which use ReplacingMergeTree engine to get rid-off duplicate entries.
await builder.query()
.select(['id', 'email'])
.final()
.from('users')
.where('id', '=', 1)
.execute();
// Executes: SELECT id, email FROM users FINAL WHERE id = 1
FROM
Select with table alias:
await builder.query()
.select('id')
.from('users', 'u')
.execute();
// Executes: SELECT id FROM users u
WHERE
The following operators are supported:
=
<
>
>=
<=
!=
BETWEEN
IN
NOT IN
LIKE
NOT LIKE
IS NULL
IS NOT NULL
Simple condition:
await builder.query()
.select(['email'])
.from('users')
.where('status', '=', 10)
.execute();
// Executes: SELECT email FROM users WHERE status = 10
Where with AND condition:
await builder.query()
.select(['email'])
.from('users')
.where('status', '>', 10)
.andWhere('email', '=', '[email protected]')
.execute();
// Executes: SELECT email FROM users WHERE status > 10 AND email LIKE '[email protected]'
Numeric BETWEEN
condition:
await builder.query()
.select(['email'])
.from('users')
.where('id', 'BETWEEN', [1, 100])
.execute();
// Executes: SELECT email FROM users WHERE created_date BETWEEN 1 AND 100
Date BETWEEN
condition:
await builder.query()
.select(['email'])
.from('users')
.where('created_date', 'BETWEEN', ['2022-01-01', '2022-01-31'])
.execute();
// Executes: SELECT email FROM users WHERE created_date BETWEEN '2022-01-01' AND '2022-01-31'
IN
/NOT IN
condition:
await builder.query()
.select(['email'])
.from('users')
.where('id', 'IN', [1, 2, 3])
.execute();
// Executes: SELECT email FROM users WHERE id IN (1, 2, 3)
Sub-query example:
await builder.query()
.select(['email'])
.from('users')
.where('id', 'IN', builder.query().select(['id']).from('test2'))
.execute();
// Executes: SELECT email FROM users WHERE id IN (SELECT id FROM test2)
LIKE
/NOT LIKE
condition:
await builder.query()
.select(['email'])
.from('users')
.where('email', 'LIKE', '%@gmail.com')
.execute();
// Executes: SELECT email FROM users WHERE email LIKE '%@gmail.com'
HAVING
await await builder
.select([
'repo_name',
fx.sum(expr("event_type = 'ForkEvent'")).as('forks'),
fx.sum(expr("event_type = 'WatchEvent'")).as('stars'),
fx.round(expr('stars / forks'), 2).as('ratio'),
])
.from('github_events')
.where('event_type', 'IN', ['ForkEvent', 'WatchEvent'])
.groupBy(['repo_name'])
.orderBy([['ratio', 'DESC']])
.having('stars', '>', 100)
.andHaving('forks', '>', 100)
.limit(50)
.execute();
// Executes: SELECT repo_name, sum(event_type = 'ForkEvent') AS forks, sum(event_type = 'WatchEvent') AS stars, round(stars / forks, 2) AS ratio FROM github_events WHERE event_type IN ('ForkEvent', 'WatchEvent') GROUP BY repo_name HAVING stars > 100 AND forks > 100 ORDER BY ratio DESC LIMIT 50
JOIN
By default, if you provide JOIN
, INNER JOIN
would be used.
You may chain as multiple joins if needed.
await builder.query()
.select(['id', 'first_name'])
.from('users', 'u')
.join(
'INNER JOIN',
getQuery()
.select(['user_id'])
.from('posts')
.where('id', '>', 1),
'p',
'p.user_id = u.user_id'
)
.execute();
// Executes: SELECT id, first_name FROM users AS u INNER JOIN (SELECT user_id FROM posts WHERE id > 1) AS p ON p.user_id = u.user_id
LIMIT/OFFSET
await builder.query()
.select(['id', 'first_name'])
.from('users')
.limit(10)
.offset(0)
.generateSql();
// Executes: SELECT id, first_name FROM users OFFSET 0 ROW FETCH FIRST 10 ROWS ONLY
WITH
await builder.query()
.with([
expr("toStartOfDay(toDate('2021-01-01'))").as('start'),
expr("toStartOfDay(toDate('2021-01-02'))").as('end'),
])
.select([
fx.arrayJoin(
expr('arrayMap(x -> toDateTime(x), range(toUInt32(start), toUInt32(end), 3600))')
)
])
.execute();
// Executes: WITH toStartOfDay(toDate('2021-01-01')) AS start, toStartOfDay(toDate('2021-01-02')) AS end SELECT arrayJoin(arrayMap(x -> toDateTime(x), range(toUInt32(start), toUInt32(end), 3600)))
Using constant expression as "variable":
import {expr} from 'clickhouse-query';
await builder.query()
.with('2019-08-01 15:23:00', 'ts_upper_bound')
.select('*')
.from('hits')
.where('EventDate', '=', expr('toDate(ts_upper_bound)'))
.andWhere('EventTime', '<', expr('ts_upper_bound'))
.execute();
// Executes: WITH '2019-08-01 15:23:00' AS ts_upper_bound SELECT * FROM hits WHERE EventDate = toDate(ts_upper_bound) AND EventTime < ts_upper_bound
Using results of a scalar subquery:
import {fx, expr} from 'clickhouse-query';
await builder.query()
.with([
q2.select([fx.sum(expr('bytes'))])
.from('system.parts')
.where('active', '=', 1)
.as('total_disk_usage')
])
.select([expr('(sum(bytes) / total_disk_usage) * 100 AS table_disk_usage'), expr('table')])
.from('system.parts')
.groupBy(['table'])
.orderBy([['table_disk_usage', 'DESC']])
.limit(10)
.execute();
// Executes: WITH (SELECT sum(bytes) FROM system.parts WHERE active = 1) AS total_disk_usage SELECT (sum(bytes) / total_disk_usage) * 100 AS table_disk_usage, table FROM system.parts GROUP BY table ORDER BY table_disk_usage DESC LIMIT 10
Helper Functions
fx
Use fx
helper to access ClickHouse functions.
All helpers are simply wrappers which add extra syntax sugaring to help your IDE hint function arguments.
import {fx} from 'clickhouse-query';
Usage example:
import {fx} from 'clickhouse-query';
await builder.query()
.select([
'user_id',
fx.sum('trade_volume').as('volume')
])
.from('user_spending')
.groupBy(['user_id'])
.execute();
// Executes: SELECT user_id, sum(trade_volume) AS volume FROM user_spending GROUP BY user_id
List of available helpers:
anyLast
anyLastPos
groupArray
arrayJoin
indexOf
empty
notEmpty
min
max
avgIf
sum
count
countDistinct
countIf
abs
if
round
subtractDays
positionCaseInsensitive
translateUTF8
schema
Use schema
helper to access ClickHouse data types. This helper can be used when creating tables or altering changes.
import {schema} from 'clickhouse-query';
await builder.createTable()
.table('table_name')
.column('column1', schema.string())
.engine('Memory')
.execute();
// Executes: CREATE TABLE table_name(column1 String) ENGINE = Memory
More examples
For further query examples you can check __tests__
folder.
Tests
Tests could be found in __tests__
folder.
Run tests:
yarn tests