@360-l/mongo-bulk-data-migration
v1.4.5
MongoDB bulk data migration for node scripts
MongoDB bulk data migration for NodeJs
Mongodb schema migration utility
About
MongoBulkDataMigration is a one-liner MongoDB migration utility for your one-shot schema migrations.
It is fast and resumable, and it handles rollback automatically.
Why MongoBulkDataMigration?
🚀 fast: built on MongoDB bulk operations to put less stress on the Mongo socket. Bulk operations suit massive updates because they group write operations.
🏔️ scalable: working on 1 million / 1 billion documents is not an issue. Backup is done in a dedicated collection. Bulks can be throttled.
🔒 safe and easy: backup and restore are done progressively, bulk by bulk. Whatever is fetched and projected is what is saved as backup. A script built on MBDM is explicit and focused.
🔄 resumable: a script can be resumed if it crashes (disconnection). By design, it is safe to run it twice in update or rollback mode.
💝 minimal: a "unified" extension script is handled by the platform, so you write less code. Also, the test util doRollbackAndAssertForInitialState simplifies test writing for the Jest and Chai frameworks.
🏔️ aggregate-ready: you can fetch using a simple query or an aggregation pipeline.
🚫 Prerequisite to use MBDM
MBDM fits most needs, but not all needs. Prerequisite:
- MBDM can't insert new documents (you can only update or delete documents)
- MBDM can only update 1 collection at a time
- MBDM can only update a document once for the same migration ID (unless you clean the backup manually in between)
- queries should ideally not match an already updated document (for resume-ability)
✅ MBDM capabilities
Biggest features making MBDM powerful:
- MBDM can automatically roll back the projected values to their exact previous state:
  - for $set operations
  - for $unset operations
  - for document deletion operations
  - ... for anything more advanced, you can still roll back using a custom rollback function
- MBDM accepts basic Mongo queries or aggregation pipelines
- MBDM can compute a document update operation in an async callback, with a defined maxConcurrentUpdateCalls, allowing you to call complex asynchronous code
- MBDM makes chore script writing fast 🚀
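To make the automatic rollback idea concrete, here is a minimal sketch of how a reverse operation could be derived from backed-up projected values: fields that existed before the update are set back, and fields that did not exist are unset. The names `buildRollback`, `RollbackOp`, and `touchedFields` are hypothetical and only illustrate the principle; this is not MBDM's actual implementation.

```typescript
type Doc = Record<string, unknown>;

// Hypothetical shape of a reverse operation.
interface RollbackOp {
  $set?: Doc;
  $unset?: Record<string, 1>;
}

// Given the backed-up (projected) values and the fields an update touched,
// restore the fields that existed and unset the fields that did not.
function buildRollback(backup: Doc, touchedFields: string[]): RollbackOp {
  const set: Doc = {};
  const unset: Record<string, 1> = {};
  for (const field of touchedFields) {
    if (Object.prototype.hasOwnProperty.call(backup, field)) {
      set[field] = backup[field];
    } else {
      unset[field] = 1;
    }
  }
  const op: RollbackOp = {};
  if (Object.keys(set).length > 0) op.$set = set;
  if (Object.keys(unset).length > 0) op.$unset = unset;
  return op;
}
```

In this sketch, a document backed up as `{ scoreA: 1 }` whose update touched `scoreA` and `total` gets `scoreA` restored and `total` removed.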
Support logs
MBDM
📘 Getting Started
Install MBDM
Using npm
npm install --save-dev @360-l/mongo-bulk-data-migration
Run migration update() and rollback()
MBDM expects a connection to your database, an id (string), and a target collection to process.
import { MongoBulkDataMigration } from "@360-l/mongo-bulk-data-migration";
// connect to db
// handle script input
const migration = new MongoBulkDataMigration({
db: mongoClient, // MongoClient established instance
collectionName: "myCol", // Collection to update
id: "uid_migration", // Required: identifies the backup storage used for rollback
// ... see below for query/update
});
// Do the update
await migration.update();
// Or revert updated properties (only) in all updated documents
await migration.rollback();
MBDM does not provide a CLI tool. Consider writing a small wrapper script around it, invoked for example like this:
node ./myscript.ts --action [update|rollback]
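A wrapper for the command above could look like the following sketch. The `Migration` interface is a hypothetical minimal shape covering only the `update()` and `rollback()` methods shown earlier, and `parseAction` and `runAction` are illustrative helper names, not part of MBDM.

```typescript
// Hypothetical minimal shape of a migration: only the two methods used here.
interface Migration {
  update(): Promise<unknown>;
  rollback(): Promise<unknown>;
}

type Action = "update" | "rollback";

// Parse `--action update` or `--action rollback` from an argv-style array.
function parseAction(argv: string[]): Action {
  const index = argv.indexOf("--action");
  const value = index >= 0 ? argv[index + 1] : undefined;
  if (value === "update" || value === "rollback") {
    return value;
  }
  throw new Error("Usage: --action [update|rollback]");
}

// Dispatch to the matching migration method.
async function runAction(migration: Migration, action: Action): Promise<unknown> {
  return action === "update" ? migration.update() : migration.rollback();
}
```

In a real script you would pass the command-line arguments (for instance `process.argv.slice(2)`) to `parseAction`, then call `runAction` with your MongoBulkDataMigration instance.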
Simple $set example
This migration will set { total: 0 } for every doc that has no total field.
Note: rollback will automatically unset total.
new MongoBulkDataMigration<Score>({
db,
id: 'scores_set_total',
collectionName: 'scores',
projection: {},
query: { total: { $exists: false } },
update: { $set: { total: 0 } },
});
⚙️ Options
new MongoBulkDataMigration({ ..., options: { ... } })
- maxBulkSize (default: 1000): number of updates to execute per bulk. 1000 is a good default. If your updates are large, consider decreasing it to limit memory usage. If documents are tiny, 10k might slightly improve performance on huge databases.
- dontCount (default: false): skips the initial filter count() or aggregate $count. Logs won't print the total progression when the count is skipped.
- projectionBackupFilter (default: none): filters the properties to save and auto-rollback. Necessary if your update needs virtual fields.
- bypassRollbackValidation and bypassUpdateValidation (default: false): will set validationLevel to "off", then to "moderate".
- throttle (default: 0): amount of time to sleep between bulk updates. Use this to decrease database stress.
- continueOnBulkWriteError (default: false): continues the migration when a bulk write returns an error.
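To illustrate how a batch size and a throttle interact, here is a conceptual sketch of a batched, throttled write loop. It is not MBDM's implementation: `chunk`, `sleep`, `processInBatches`, and `writeBatch` are hypothetical names, and the throttle is assumed to be expressed in milliseconds, which the text above does not specify.

```typescript
// Split items into batches of at most `size` elements.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new Error("size must be >= 1");
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical loop: write each batch, pausing `throttleMs` between batches.
// Returns the number of batches written.
async function processInBatches<T>(
  items: T[],
  maxBulkSize: number,
  throttleMs: number,
  writeBatch: (batch: T[]) => Promise<void>,
): Promise<number> {
  let written = 0;
  for (const batch of chunk(items, maxBulkSize)) {
    // Sleep only between batches, never before the first one.
    if (written > 0 && throttleMs > 0) await sleep(throttleMs);
    await writeBatch(batch);
    written++;
  }
  return written;
}
```

A smaller batch size lowers the memory held per bulk, while a larger throttle spreads the load on the database over time at the cost of a longer migration.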
📕 Advanced usages
MBDM supports async update callbacks, custom rollback functions, and aggregation pipelines.
Simple $set with update callback and projection
This migration will sum 2 projected fields, scoreA and scoreB, for all documents (no filter).
Note: rollback will automatically set back scoreA and scoreB and reset total.
new MongoBulkDataMigration<Score>({
db,
id: 'scores_total_new_field',
collectionName: 'scores',
projection: { scoreA: 1, scoreB: 1 },
query: {},
// Parentheses are required so the arrow function returns an object literal
update: (doc) => ({
$set: {
total: doc.scoreA + doc.scoreB,
},
}),
});
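Because an update callback is a plain function from a document to an update operation, it can be extracted and checked without a database. The sketch below assumes a `Score` shape with numeric `scoreA` and `scoreB` fields; the interface and the `buildTotalUpdate` name are illustrative only.

```typescript
// Hypothetical document shape for the `scores` collection.
interface Score {
  scoreA: number;
  scoreB: number;
  total?: number;
}

// Build the $set operation for one projected document.
function buildTotalUpdate(
  doc: Pick<Score, "scoreA" | "scoreB">,
): { $set: { total: number } } {
  return { $set: { total: doc.scoreA + doc.scoreB } };
}
```

The function could then be passed as the `update` value of the migration, keeping the arithmetic in one testable place.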
Delete documents (update: DELETE_OPERATION)
This migration will delete docs having a negative total.
Note: rollback will automatically restore the full document.
import { MongoBulkDataMigration, DELETE_OPERATION } from "@360-l/mongo-bulk-data-migration";
...
new MongoBulkDataMigration<Score>({
db,
id: "delete_negative_total",
collectionName: "scores",
projection: {}, // Everything needs to be projected
query: { total: { $lt: 0 } },
update: DELETE_OPERATION,
});
Using aggregation pipeline (query)
query expects a query (object), or an aggregation pipeline (array).
Note: projection won't do anything.
Example: update totalGames of a corresponding table
import { MongoBulkDataMigration } from "@360-l/mongo-bulk-data-migration";
...
new MongoBulkDataMigration<Score>({
db,
id: "scores_set_total_games", // use a distinct id for each migration
collectionName: "scores",
projection: { "games.value": 1, totalGames: 1 },
query: [
{ $lookup: { as: "games", ... } },
{ $match: { "games.value": "xxx" } },
],
update: (doc) => ({
$set: {
totalGames: doc.games.value
}
}),
options: {
// Necessary to save only `totalGames`, and not auto-restore the nonexistent `games` field
projectionBackupFilter: ["totalGames"]
}
});
Delete a collection (operation: DELETE_COLLECTION)
The collection will be renamed to the backup collection. When you roll back, the collection will simply be renamed back.
import { MongoBulkDataMigration, DELETE_COLLECTION } from "@360-l/mongo-bulk-data-migration";
...
const migration = new MongoBulkDataMigration<Score>({
db,
id: "delete_collection_scores",
collectionName: "scores",
operation: DELETE_COLLECTION,
});
// Run the migration and read its status
const status = await migration.update();
console.log(status); // { ok: 1 }
🧩 Ease testing
MBDM provides a util, doRollbackAndAssertForInitialState, to simplify your test files. Just write what you expect, and the util will ensure the database is back to its initial state.
It supports the Jest and Chai testing libraries (expect).
Note: expectations are checked on docs sorted by _id.
it('should not modify pages with existing translations', async () => {
const docs = [{ _id: new ObjectId(), value: 'invalid' }];
await collection.insertMany(docs);
await dataMigration.update();
// Test what you expect (reuse the inserted _id so the comparison can match)
const updatedDocs = await collection.find({}).toArray();
expect(updatedDocs).toEqual([{ _id: docs[0]._id, value: 'valid' }]);
// Test rollback will work
await doRollbackAndAssertForInitialState(dataMigration, docs, { expect });
});
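The note about sorting by _id matters because MongoDB does not guarantee the order of returned documents without an explicit sort. As a rough illustration of an order-insensitive comparison, and not the util's actual code, the sketch below sorts both lists by `_id` before comparing them. It assumes string ids for brevity (real documents typically use ObjectId), and `sortById` and `sameDocs` are hypothetical names.

```typescript
// Assumption: ids are plain strings in this sketch.
interface WithId {
  _id: string;
}

// Return a copy of the docs sorted by _id, leaving the input untouched.
function sortById<T extends WithId>(docs: T[]): T[] {
  return docs.slice().sort((a, b) => (a._id < b._id ? -1 : a._id > b._id ? 1 : 0));
}

// Compare two lists of docs regardless of their original order.
// JSON.stringify is key-order sensitive, which is acceptable for a sketch.
function sameDocs<T extends WithId>(actual: T[], expected: T[]): boolean {
  return JSON.stringify(sortById(actual)) === JSON.stringify(sortById(expected));
}
```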