etl-toolkit
v0.2.0
Published
general components for managing the retrieval and processing of data
Downloads
1
Readme
ETL toolkit
General components for managing the retrieval and processing of data
Components
Queues
populateIds
may be used to add IDs to the etlStore from a paged source.
populateRecordsFromIds
may be used to populate records from the IDs in the etlStore.
ETL Store
The etlStore
manages the state of the ETL including IDs, loaded records, and a list of errored IDs.
The store can persist state to the local file system during queue processing enabling an ETL to continue after interruption.
Environment variables
Environment variables are expected to be managed by the environment in which the application is being run. This is best practice as described by twelve-factor.
Environment variables are used to set application level settings for each environment.
| Variable | Description | Default | Required |
| :--------------------------------- | :--------------------------------------------------------- | ---------------------- | :------- |
| LOG_LEVEL
| log level | Depends on NODE_ENV
| |
| NODE_ENV
| node environment | development | |
| OUTPUT_FILE
| Filename saved to azure | etl-data | |
| HITS_PER_HOUR
| Maximum number of times to call a queue operation per hour | 20000 | |
| ETL_NAME
| Name used in Bunyan logger | etl-toolkit | |