db-linter

v1.0.9

Published

4 years ago

Generate markdown for your database, and enforce documentation & conventions on it.

Do you wish:

- your codebase came with some helpful github-flavored markdown that provided a canonical, easily linkable place to store textual descriptions of database tables and columns?
- team members were required to update those descriptions as new tables and columns were added?
- the schema was checked against certain conventions?
- your code had a dynamic definition of your database schema?

Then this is for you.

Setup

Run this during your test suite (or just extractDbSchema in your code if you want the db object for your use):

const {extractDbSchema,run} = require('db-linter')
extractDbSchema({
	//sql flavor
	lang: 'postgres',//or 'mysql' (if using mariadb, say 'mysql')
	//db creds
	host: '127.0.0.1',
	//port: 5432,//optional; if empty, assumes 3306 if mysql, 5432 if postgres
	user: 'postgres',//note this user will need access to information_schema
	password: '',
	database:'test',
})
.then(db=>{
	//or write your own convention checks once you have db
	return run(db,{
		path:'./readme.md',//markdown file in which to place/update the generated tables
		rules:'all',//or array of rule name strings from readme
		//rule options
		boolPrefixes:['is','allow'],
		isObviousColumn:(columnName,tableName,db)=>{
			//custom reasons a column does not need describing in your setup
			//maybe columns that are everywhere, like created_at?
			return false
		}
	})
})
.then( passedConventionCheck => //if this is false, at least one convention was not followed
	process.exit(passedConventionCheck ? 0 : 1)//or however you want to handle success / failure
)

Failed rules are logged so the dev can fix them.

Rules

Below is the full list of built-in rules, but feel free to create your own and assess the json schema directly:

require_table_description_in_readme - all tables need explanations for why they exist. Even describing table x_y as "one x can have many y's" will be appreciated going forward.
require_column_description_in_readme - all non-obvious columns need explanations for why they exist. ("Non-obvious" is customizable with the isObviousColumn() in setup)
require_lower_snake_case_table_name - some instances, collations, & OSes are case insensitive, making this the only reliable naming style for tables and columns
require_lower_snake_case_column_name - see above.
disallow_bare_id - columns named id have repeatedly been found to create footgun-level ambiguity downstream, and they make sql more verbose & confusing by eliminating the utility of the using keyword
require_primary_key - each row should always be individually fetchable from each table; otherwise the data structure & author needs may be at odds
require_unique_primary_keys - tables with identical primary keys should probably be the same table
require_singular_table_name - the table name should describe each row, not the table as a whole. A table holds multiple records, otherwise it would be called a pedestal; clarity is never added when a table is pluralized, it only makes remembering which part to pluralize harder when join tables inevitably have singular qualifiers. Also, if a singular name collides with an SQL keyword, consider choosing a different name or quoting it rather than pluralizing.
require_all_foreign_keys - every column titled x_id (when x is another table) should have a foreign key to table x. In composite primary key scenarios, this may require denormalizing properties to retain the link.
require_same_name_columns_share_type - columns with the same name in different tables should have the same type; this reduces confusion in conversation & promotes more distinct names
require_bool_prefix_on_only_bools - is_, allow_ should always refer to boolean columns. (Prefix list customizable with the boolPrefixes in setup)
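
A custom rule can be a plain function over the db object. The following is a minimal sketch and not part of db-linter: the rule name require_created_at and the helper findTablesMissingColumn are hypothetical, and sampleDb is hand-written in the shape documented in the `db` section (the type values are illustrative).

```javascript
// Hypothetical custom rule: every table must have a given column.
// Returns the names of the tables that break the rule.
function findTablesMissingColumn(db, columnName) {
	return Object.entries(db.tables)
		.filter(([, table]) => !(columnName in table.columns))
		.map(([tableName]) => tableName)
}

// Hand-written sample in the shape extractDbSchema() provides (values are illustrative)
const sampleDb = {
	name: 'test',
	tables: {
		user: {
			columns: {
				user_id: { type: 'int' },
				created_at: { type: 'timestamp' },
			},
			primary_key: ['user_id'],
			foreign_keys: [],
			target_of_foreign_keys: [],
		},
		user_email: {
			columns: {
				user_email_id: { type: 'int' },
				email: { type: 'text' },
			},
			primary_key: ['user_email_id'],
			foreign_keys: [],
			target_of_foreign_keys: [],
		},
	},
}

const offenders = findTablesMissingColumn(sampleDb, 'created_at')
offenders.forEach(tableName =>
	console.error(`require_created_at: table ${tableName} has no created_at column`)
)
```

A check like this could run in the same .then() as run(), with both results combined before deciding the exit code.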

`db`

The db object extractDbSchema() creates has this structure:

const db = {
	name:db_name,
	tables:{
		[table_name]:{
			columns:{
				[column_name]:{
					type,
					ordinal_position,
					default,
					is_nullable
				},...
			},
			primary_key:[column_name,...],
			foreign_keys:[
				{
					constraint_name,
					table_name, //will be parent table_name
					column_names,
					foreign_table_name,
					foreign_column_names,
				},...
			],
			target_of_foreign_keys:[
				{
					constraint_name,
					table_name,
					column_names,
					foreign_table_name,//will be parent table_name
					foreign_column_names,
				},...
			]
		},...
	}
}
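
To make the shape concrete, here is a small hand-written db object and a sketch that walks it in the spirit of require_all_foreign_keys. This is an illustration of one possible approach, not db-linter's actual implementation: findUnlinkedIdColumns is a hypothetical helper, the column values are illustrative, and column_names is assumed to be an array.

```javascript
// Hand-written example following the structure above (values are illustrative)
const exampleDb = {
	name: 'shop',
	tables: {
		customer: {
			columns: {
				customer_id: { type: 'int', ordinal_position: 1, default: null, is_nullable: false },
			},
			primary_key: ['customer_id'],
			foreign_keys: [],
			target_of_foreign_keys: [],
		},
		purchase: {
			columns: {
				purchase_id: { type: 'int', ordinal_position: 1, default: null, is_nullable: false },
				customer_id: { type: 'int', ordinal_position: 2, default: null, is_nullable: false },
			},
			primary_key: ['purchase_id'],
			foreign_keys: [], // the link to customer is left out on purpose
			target_of_foreign_keys: [],
		},
	},
}

// Hypothetical helper: list x_id columns (where x is another table)
// that are not covered by a foreign key to table x
function findUnlinkedIdColumns(db) {
	const problems = []
	for (const [tableName, table] of Object.entries(db.tables)) {
		for (const columnName of Object.keys(table.columns)) {
			if (!columnName.endsWith('_id')) continue
			const target = columnName.slice(0, -'_id'.length)
			// skip the table's own id column and names that match no table
			if (target === tableName || !(target in db.tables)) continue
			const linked = table.foreign_keys.some(fk =>
				fk.foreign_table_name === target && fk.column_names.includes(columnName)
			)
			if (!linked) problems.push(`${tableName}.${columnName}`)
		}
	}
	return problems
}

const unlinked = findUnlinkedIdColumns(exampleDb)
console.log(unlinked) // [ 'purchase.customer_id' ]
```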

How

This is done in a few steps:

extractDbSchema(opts) queries information_schema to provide a json schema representation of your mysql or postgres db (which you can also use in your code)
run(db,opts):
- extractDescriptionsFromMarkdown(path) extracts descriptions from the markdown file.
- makeMarkdownHtml(db,descriptions) rebuilds a github-flavored markdown readme of your db from the extracted json & descriptions, preserving descriptions across rebuilds, with each table and column deep-linkable
- checkConventions(db,descriptions,opts) checks whether the current state of the db follows the desired rules

Why

Documentation - Being able to see an overview is desirable. Being able to point at something in conversation is helpful. Things not committed become folklore.
Total Freedom Is Not Always Desirable - Dev teams, especially those with high turnover, often allow too much freedom in databases, which leads to local contradictions and ever-increasing mental overhead. A few reasonable rules can reduce that overhead and increase reliability.
Given the levels of restrictions and rigor placed on executed code, there are curiously few placed on everything else. Such freedom in a space can send the signal that equivalent rigor is not worthwhile here, when of course it still is.

Caveats

stored procedures, views, and enums are currently not considered, because they are not recommended.
the natural language processing module compromise, which is used to detect plurals, is not perfect. Sometimes you might have to give it hints at the top of your setup file so that words which could also act as verbs are interpreted as nouns, like template in this case:

let nlp=require('compromise')//will be available if `db-linter` is installed
nlp('',{
	//word:'Noun'
	template:'Noun',
	//...
})

Example

Looks best & links all work if viewed as github-flavored markdown.

Automatically rebuilt with updates, retaining descriptions devs provide. Note all links are deep-linkable for referencing in conversation.

A table of contents, at most 3 columns wide, sits on top for dbs with many tables.

Note you can place anything outside the <!--DB-LINTER--> markers surrounding the added markup. Inside the markers, only edit descriptions, as everything else between them is regenerated.
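
The marker behaviour can be pictured with a sketch like the one below. It is not db-linter's code: replaceBetweenMarkers is a hypothetical helper, it assumes the same <!--DB-LINTER--> string both opens and closes the generated region, and appending a new region when no markers exist is an assumption made for this sketch.

```javascript
const MARKER = '<!--DB-LINTER-->'

// Hypothetical helper: swap whatever sits between the two markers
// for freshly generated markup, leaving everything outside untouched
function replaceBetweenMarkers(readme, generated) {
	const start = readme.indexOf(MARKER)
	const end = start === -1 ? -1 : readme.indexOf(MARKER, start + MARKER.length)
	// no marker pair yet: append a new generated region (assumption for this sketch)
	if (start === -1 || end === -1) {
		return `${readme}\n${MARKER}\n${generated}\n${MARKER}\n`
	}
	return readme.slice(0, start + MARKER.length) + '\n' + generated + '\n' + readme.slice(end)
}

const before = `# My app\n${MARKER}\nold tables\n${MARKER}\nFooter notes\n`
const after = replaceBetweenMarkers(before, 'new tables')
console.log(after)
```

Text before the first marker and after the second survives untouched. As the How section describes, db-linter extracts the descriptions found in the markdown first and carries them into the regenerated markup; this sketch leaves that step out.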
