cht-couch2pg
v3.6.2
Published
Replicate CHT data from CouchDB to postgres database
Downloads
2
Maintainers
Keywords
Readme
CHT couch2pg
Create read-only replicas of CouchDB data inside PostgresSQL.
The focus is specifically on CHT application data currently stored in CouchDB. If you are looking to have a read-only replica of CouchDB data for your application that isn't the CHT, consider couch2pg.
This version is built for medic/cht-core#3.0.0 and above. For replicating data from earlier versions, see the 2.0.x branch and associated tags.
Prerequisites
Clone
All steps below require you to have a local clone of the repo.
git clone https://github.com/medic/cht-couch2pg.git
Node and npm
You will need to install the following to run locally, but not for docker:
- Node.js 8.11.x up to 12.x.x. Must be an LTS release. LTS is designated with an even major version number.
- npm 6.x.x above
NOTE: Currently, cht-couch2pg only runs in node versions 8, 10 and 12. Later versions of node have been known to fail.
Database setup
cht-couch2pg
supports PostgreSQL 9.4 and greater. The user passed in POSTGRESQL_URL
needs to have full creation rights on the database in POSTGRES_DB_NAME
.
Running
Locally with environment variables
cd
into it this repo's directory where you cloned it.- Install dependencies:
npm ci
- Export these four variables with the values you need.:
export POSTGRESQL_URL=postgres://postgres:postgres@localhost:15432/postgres export COUCHDB_URL=https://admin:pass@localhost:5984/medic export COUCH2PG_DOC_LIMIT=1000 export COUCH2PG_RETRY_COUNT=5
- Run:
node .
If you want to set and save all possible variables:
cd
into it this repo's directory where you cloned it.- Copy
sample.env
tocouch2pg.env
- Edit
couch2pg.env
to have all the variables you need. Note thatPOSTGRESQL_URL
shouldn't be edited as it's defined by the variables above it. Be sure to changePOSTGRES_SERVER_NAME
to where ever your postgress server is running. If it's local, then uselocalost
. The default value ofpostgres
won't work. - Run:
. ./couch2pg.env&&node .
In docker-compose
The simplest way to run with docker-compose
is to specify the CouchDB instance that your CHT is using. The compose file will then create a dockerized PostgresSQL instance, connect to the CouchDB server and proceed to download all the data to the PostgresSQL instance:
cd
into it this repo's directory where you cloned it.- When starting the docker compose services, you need to set the URL for CouchDB in the
COUCHDB_URL
env variable. This URL needs to be reachable the docker container (ie notlocalhost
). Ensuring you're in the same directory where you ran thecurl
call in the prior step, run:export COUCHDB_URL=https://medic:[email protected]:8442/medic docker-compose up
- Connect to the PostgresSQL instance with login
cht_couch2pg
, passwordcht_couch2pg_password
and databasecht
. As these are insecure, do not use with production data. See below for how to harden these.
If you want to set all possible variables, or be able to store the variables in configuration file:
cd
into it this repo's directory where you cloned it.- Copy
sample.env
tocouch2pg.env
- Edit
couch2pg.env
to have all the variables you need. Note thatPOSTGRESQL_URL
shouldn't be edited as it's defined by the variables above it. If you're using the built-in PostgresSQL server, be sure to keep thePOSTGRES_SERVER_NAME
set topostgres
as this is the correct internal service name in docker. Be sure to also set secure passwords for all PostgresSQL accounts. - Run docker and specify the environment file you just edited:
docker-compose --env-file couch2pg.env up
- To connect to the PostgresSQL instance, use the server from
POSTGRES_SERVER_NAME
, use login fromCOUCH2PG_USER
, password fromCOUCH2PG_USER_PASSWORD
and the database fromPOSTGRES_DB_NAME
.
Interactive
Run it locally in interactive mode with node . -i
and you will see the ASCII art:
____ _ _ _____ ____ _ ____ ____
/ ___| | | | | |_ _| / ___| ___ _ _ ___ | |__ |___ \ | _ \ __ _
| | | |_| | | | _____ | | / _ \ | | | | / __| | '_ \ __) | | |_) | / _` |
| |___ | _ | | | |_____| | |___ | (_) | | |_| | | (__ | | | | / __/ | __/ | (_| |
\____| |_| |_| |_| \____| \___/ \__,_| \___| |_| |_| |_____| |_| \__, |
|___/
Instead of environment variables, you will be prompted to answer the following questions. For each question, you will be given suggestions for an answer:
- Enter CHT's couch url
- Enter cht-couch2pg postgres url
- Select the number of minutes interval between checking for updates
- Select the number of documents to grab concurrently. Increasing this number will cut down on HTTP GETs and may improve performance, decreasing this number will cut down on node memory usage, and may increase stability.
- Select the number of document ids to grab per change limit request. Increasing this number will cut down on HTTP GETs and may improve performance, decreasing this number will cut down on node memory usage slightly, and may increase stability.
- Select whether or not to have verbose logging.
- Select how many times to internally retry continued unsuccessful runs before exiting. If unset cht-couch2pg will retry indefinitely. If set it will retry N times, and then exit with status code 1 indefinitely
- Select the number of documents to grab concurrently from the users-meta database. Increasing this number will cut down on HTTP GETs and may improve performance, decreasing this number will cut down on node memory usage, and may increase stability. These documents are larger so set a limit lower than the docLimit
Supported environment variables
All three methods of running cht-couch2pg listed above use these variables:
COUCHDB_URL
- CouchDB instance URL with no trailing slash after/medic
, format:https://[user]:[password]@localhost:[port]/medic
COUCH2PG_SLEEP_MINS
- Number of minutes between synchronization. It defaults to60
.COUCH2PG_DOC_LIMIT
- Number of documents cht-couch2pg fetches from CouchDB everytime. Suggested:1000
COUCH2PG_RETRY_COUNT
- Number of times cht-couch2pg will retry synchronizing documents from CouchDB after experiencing an errorCOUCH2PG_USERS_META_DOC_LIMIT
- Number of documents to grab concurrently from the users-meta database. These documents are larger so set a limit lower than the docLimit. It defaults to50
.COUCH2PG_CHANGES_LIMIT
- The number of document ids to fetch per change limit request. Suggested:100
COUCH2PG_USER
- The user that couch2pg will use to login in to the CouchDB server. Suggestedcht_couch2pg
COUCH2PG_USER_PASSWORD
- The password that couch2pg will use to login in to the CouchDB server.POSTGRES_SERVER_NAME
- The server or IP where the postgres server is. This should be set topostgres
when using docker.POSTGRES_USER_NAME
- The admin user for postgres in docker. Suggested:postgres_root
POSTGRES_PASSWORD
- The admin password for postgres in docker.POSTGRES_DB_NAME
- The name of the PostgreSQL database to sync to.. Suggested:cht
POSTGRES_PORT
- Port where PostgresSQL can be found. Suggested:5432
POSTGRESQL_URL
- PostgresSQL instance URL, format:postgres://[user]:[password]@localhost:[port]/[database name]
SYNC_DB_MEDIC
- Whether to sync the content of themedic
database. Suggested:true
SYNC_DB_SENTINEL
- Whether to sync the content of themedic-sentinel
database. Suggested:true
SYNC_DB_USER_META
- Whether to sync the content of themedic-users-meta
database. Suggested:true
SYNC_DB_LOGS
- Whether to sync the content ofmedic-logs
database. Suggested:true
SYNC_DB_USERS
- Whether to sync the CouchDB_users
database without security information. Suggested:true
Known issues
Error "Checksum failed for migration ..." when upgrading from 3.2.0 to latest
An SQL migration file was changed in version 3.2.0. This made upgrades from 3.1.x impossible, with the process crashing upon startup after the upgrade. See more details about the error.
This was fixed in version 3.2.1, by reverting the changes made to the migration file. Fresh installations of 3.2.0 should execute this SQL before upgrading:
UPDATE xmlforms_migrations
SET md5 = 'e0535c9fe3faef6e66a31691deebf1a8'
WHERE version = '201606200952' AND
md5 = '40187aa5ee95eda0e154ecefd7512cda';
See more details about the error in #78.
Error installing deps ERR! ... node-pre-gyp install --fallback-to-build
When installing Node.js dependencies locally or building the docker image, you might get an error like:
...
npm ERR! code ELIFECYCLE
npm ERR! errno 1
npm ERR! [email protected] install: `node-pre-gyp install --fallback-to-build`
It is probably related to a gcc library that is failing with some versions of Node and npm, try with Node 10 without updating the npm
version that comes with it.
Tests
Run tests with docker-compose:
docker-compose -f docker-compose.test.yml build cht-couch2pg
docker-compose -f docker-compose.test.yml run cht-couch2pg grunt test
Run tests in interactive watch mode with: docker-compose -f docker-compose.test.yml run cht-couch2pg npm run watch
Run entrypoint script tests with
docker-compose -f docker-compose.test.yml run cht-couch2pg ./tests/bash/bats/bin/bats /app/tests/bash/test.bats