@windsor/wql

v0.0.9

Published

3 years ago

The compilers and the spec for the windsor query language

Downloads

0High
0Medium
0Low

atfaust2

pranaygp

WQL

Purpose

The purpose of wql is to provide a platform agnostic means to query Windsor data.

Why not use an already built query language (SQL, ES queries, ...)?

Windsor data is stored across many platforms. For instance the bulk of Windsor data is stored in postgres which faciliates quick ingestion and allows for complex queries. A large amount of data is also stored in elasticsearch for quick search and indexing. Data may also be temporarily stored in memory where it might be processed further. By building a query language which is platform agnostic we are able to use the same spec to represent a set of data no matter where it is located. This is especially useful when the most efficient solution involves queries across multiple data sources.

Example

Because of the way that windsor stores users, finding the intersection of two user sets involves a complex and costly sql query. For teams with less than 100,000 users is more efficient to pull the data into memory and calculate the intersection. With WQL, one simply defines the spec and the interpreters will decide which parts of the filter should be implemented in SQL, and which parts should be impleneted in memory for optimal performance.

Language

WQL is defined by a very simple grammar.

Root -> BoolExpr | CountExpr | Selector

BoolExpr -> [BoolExpr | CountExpr | Selector]

CountExpr -> [BoolExpr | CountExpr | Selector]

Selector -> [PropertyFilter]

Root

Expr
- values
  - BoolExpr
  - CountExpr
  - Selector
Optimization
- An array describing what type of optimization should be enabled.
- values
  - restrictMultisets: only use multisets as a return value if one of the parents of the current node is a count
rtnType
- Determines what the root should return.This will usually fundamentally change how the spec is executed.
- values
  - User * Returns a list of users that match the spec. In this case event filters are interpreted as "Users that have these events which match this filter"
  - Event * Returns a list of events that match the spec. In this case user filters are taken to be "Consider events from users that match this filter"
  - BooleanUser * Along with a user-id, returns whether or not the user matches the spec
  - BooleanEvent * Along with an event-id, returns whether or not the user matches the spec

BoolExpr

Op
- What operation we should perform to combine the children.
  - values
    - !
      - Expects one child.
      - For a user return type, returns all users not returned by the child
      - For an event return type returns all events that aren't returned by the child.
      - In the case of multisets for either previous case, returns a multiset with count: 1 for all Events/Users not in the child's return value (You can't 'not' have an event more than once)
      - For a boolean type, does a boolean not on the child's return value
    - |
      - For User and Event return types, returns a union of the children's return values.
      - In the case of a multiset, a Union is an addition of the counts of each element in each child set - For example, if an element 'foo' appears 1 time in 3 children, 3 times in another child and not at all in the remaining children, this node will have { 'foo': 6 } in it's return
    - &
      - For User and Event return types, returns an intersection of the children.
      - In the case of a multiset, an intersection is a set of objects with the count on each object being the Min(count of that object in all children)
Children * The children to perform the operation on - For example, if there are two children, and 'foo' appears 1 time in one child, and 6 times in another, the return of this node will contain { 'foo': 1}. - Note that if 'foo' does not appear at all in even one of the children foo will not be returend at all by this node as it's count will be 0.

CountExpr

This operation Looks through all the children and counts the total amount of times that an object appears. Using the operator a filter is then applied based on the count of the object.

Op
- What is the comparison operation for count?
- values
  - <
  - >
  - =
  - <=
  - >=
  - !=
Children
- The children to perform the operation on

Selector

Subtype
- values
  - Event
  - Means that this selector is to be interpreted as a filter on events
  - User * Means that this selector is to be interpreted as a filter on users
Properties * A list of property filters
Op
- And
- Interpret the Selector as an and of all of the property filters
- Or
- Interpret the Selector as an or of all the property filters

PropertyFilter

Op
- values
  - =
    - Takes a value of a type that is castable to the type of property
    - Checks for equality between the property and value
  - !=
    - Takes a value of a type that is castable to the type of property
    - Checks for inequality between the property and value
  - >
    - Takes a value of a type that is castable to the type of property
    - Checks if the property is greater than the value
  - <
    - Takes a value of a type that is castable to the type of property
    - Checks if the property is less than the value
  - beforeTime
    - Take a String of the form "[+|-][a] [seconds, minutes, hours, days, weeks, months, years]"
    - Checks whether the property is before now() plus or minus the offset specified in the value field
    - Expects Property to be a time field, or castable to a time
  - afterTime
    - Take a String of the form "[+|-][a] [seconds, minutes, hours, days, weeks, months, years]"
    - Checks whether the property is after now() plus or minus the offset specified in the value field
    - Expects Property to be a time field, or castable to a time
  - range
    - Takes a string of the form "a:b" as the value
    - Checks if the value is between a and b
  - like
    - Takes a string as the value
    - Checks if the value contains the string provided
Prop
- The name of the property that we are filtering by
- Examples: ts, name, user_info__name, email, phone
Value The value that we are using to filter the property

Sql Interpreter Implementation

compileMixed

Takes in a spec and returns a tree where the Selectors are replaced with sql queries

processSqlTree

Takes in the tree returned from compiled mixed and fires all the sql queries without waiting on them

getResults

Awaits for the results on all of the sql queries fired in processSql tree and uses the Boolean Expression and Count Expression nodes to combine the results in the proper way, returning a list of events, a list of users, or a boolean depending on the rtnType specified on root

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme

WQL

Purpose

Why not use an already built query language (SQL, ES queries, ...)?

Example

Language

Root

Expr

values

Optimization

values

rtnType

values

BoolExpr

Op

values

CountExpr

Op

values

Children

Selector

Subtype

values

Properties * A list of property filters

Op

PropertyFilter

Op

values

Sql Interpreter Implementation

compileMixed

processSqlTree

getResults