@wmfs/supercopy
v1.52.0
Published
Takes a specifically-named directory structure of CSV files and conjures bulk insert, update and delete statements and applies them to a PostgreSQL database.
Downloads
4,424
Maintainers
Readme
supercopy
Takes a specifically-named directory structure of CSV files and conjures bulk insert, update and delete statements and applies them to a PostgreSQL database.
Install
$ npm install supercopy --save
Usage
const pg = require('pg')
const supercopy = require('supercopy')
// Make a new Postgres client
const client = new pg.Client('postgres://postgres:postgres@localhost:5432/my_test_db')
supercopy(
{
sourceDir: '/dir/that/holds/deletes/inserts/updates/and/upserts/dirs',
headerColumnNamePkPrefix: '.',
topDownTableOrder: ['departments', 'employees'],
client: client,
schemaName: 'my_schema',
truncateTables: true,
debug: true,
multicopy: false,
directoryNames: { ... }
},
function (err) {
// Done!
}
)
supercopy(options
, callback
)
Options
| Property | Type | Notes |
| -------- | ---- | ------ |
| sourceDir
| function
| An absolute path pointing to a directory containing action folders. See the File Structure section for more details.
| headerColumnNamePkPrefix
| string
| When conjuring an update
statement, Supercopy will need to know which columns in the CSV file constitute a primary key. It does this by expecting the first line of each file to be a header containing ,
delimited column names. However, column names prefixed with this value should be deemed a primary-key column. Only use in update CSV-file headers.|
| topDownTableOrder
| [string]
| An array of strings, where each string is a table name. Table inserts will occur in this order and deletes in reverse - use to avoid integrity-constraint errors. If no schema prefix is supplied to a table name, then it's inferred from schemaName
.
| client
| client
| Either a pg client or pool (something with a query()
method) that's already connected to a PostgreSQL database.
| schemaName
| string
| Identifies a PostgreSQL schema where the tables that are to be affected by this copy be found.
| truncateTables
| boolean
| A flag to indicate whether or not to truncate tables before supercopying into them
| debug
| boolean
| Show debugging information on the console
| multicopy
| boolean
| Enables 'sourceDir' to house many typical Supercopy 'sourceDir' shaped directories. Defaults to false.
| quote
| string
| Override the the default quote character, ". It isn't necessary to quote fields but occasionally (especially when importing JSON fields) you need to, and this option will help.
| directoryNames
| object
| Overrides the default directory names - see below.
File structure
The directory identified by the sourceDir
option should be structured in the following way:
/someDir
/inserts
table1.csv
table2.csv
/updates
table1.csv
table2.csv
/upserts
table1.csv
table2.csv
/deletes
table1.csv
OR IF USING MULTICOPY
/manyDirs
/someDir
/inserts
table1.csv
table2.csv
/someDir
/inserts
table1.csv
table2.csv
Notes
- The sub-directories here refer to the type of action that should be performed using CSV data files contained in it. Supported directory names are
insert
,update
,upsert
(try to update, failing that insert) anddelete
. - Directories are optional. A directory maybe missing or empty.
- The
directoryNames
option can be used to apply actions to directories if the names don't meet the above structure. EgdirectoryName : { 'inserts': 'new', 'deletes': 'old' }
would insert the contents of the directory namednew
and remove the contents of theold
directory. - The filename of each file should refer to a table name in the schema identified by the
schemaName
option. - The expected format of the .csv files is:
- One line per record
- The first line to be a comma delimited list of column names (i.e. a header record)
- For update and upsert files, ensure columns-names in the header record that are part of the primary key are identified with a
headerColumnNamePkPrefix
character. - All records to be comma delimited, and any text columns containing a
,
should be quoted with a"
. The csv-string package might help.
- Note that only primary key values should be provided in a 'delete' file.
Testing
Before running these tests, you'll need a test PostgreSQL database available and set a PG_CONNECTION_STRING
environment variable to point to it, for example:
PG_CONNECTION_STRING=postgres://postgres:postgres@localhost:5432/my_test_db
$ npm test