nosnaplet
v1.0.2
A tool similar to Snaplet for MongoDB that anonymizes production data for development environments, ensuring GDPR compliance. Ideal for securely replicating production databases for development and testing purposes.
noSnaplet (Snaplet for MongoDB)
Introduction
I wanted a tool similar to Snaplet for MongoDB to anonymize production data for development environments in order to comply with GDPR. This package allows you to import, anonymize, and manage your MongoDB data in a way that ensures privacy and data protection across different environments.
This package is still under development, so there may be bugs, especially in the inter-collection link functionality.
Documentation
nosnaplet documentation site - COMING SOON.
Getting started
To get started with noSnaplet, follow these steps:
- Install the MongoDB Database Tools, which provide mongoexport.
- Install noSnaplet:
npm install -g nosnaplet
- Run the noSnaplet CLI with the following command:
npx nosnaplet fakesnap
Commands
Test your connection to database
npx nosnaplet tryconnect
You will be prompted to insert your MongoDB connection URI.
Example prompt:
Enter the connection URI of the mongo database to replicate (production data). Please include '?authSource=admin' in the URI. Here's an example: mongodb://root:example@localhost:27017/?authSource=admin.
Enter URI : mongodb://userdb:[email protected]:27017/?authSource=admin
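Since the prompt asks for '?authSource=admin' in the URI, it can help to check a URI before pasting it. The following is a minimal sketch of such a check; hasAuthSourceAdmin is a hypothetical helper, not part of nosnaplet, and it only inspects the query string rather than validating the whole URI.

```typescript
// Hypothetical helper, not part of nosnaplet: checks that a MongoDB
// connection URI carries the authSource=admin query option.
function hasAuthSourceAdmin(uri: string): boolean {
  const queryStart = uri.indexOf('?');
  if (queryStart === -1) {
    return false; // no query string at all
  }
  // Split "a=1&b=2" into pairs and look for authSource=admin.
  return uri
    .slice(queryStart + 1)
    .split('&')
    .some((pair) => {
      const [key, value] = pair.split('=');
      return key === 'authSource' && value === 'admin';
    });
}

// The example URI from the prompt passes the check.
const exampleUri = 'mongodb://root:example@localhost:27017/?authSource=admin';
const uriLooksComplete = hasAuthSourceAdmin(exampleUri); // true
```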
Run the export from MongoDB only
npx nosnaplet snapshot
Insert your MongoDB connection URI when prompted.
Upon completion, you will see: Schemas and links written to .schema-directory
Two folders are created: .output-directory and .schema-directory.
For more information about these folders, see the File Structure section.
Import and anonymize into MongoDB only
npx nosnaplet faked
You will be prompted to insert your MongoDB connection URI. Upon completion, you will see the following message:
All document links have been updated.
Folder .schema-directory deleted.
Folder .output-directory deleted.
Closing MongoDB connection.
The logs will also indicate which databases received anonymized data.
If you change directory after taking a snapshot of the production database:
Copy the .output-directory and .schema-directory folders generated by the snapshot command into your current directory.
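A quick way to confirm that both folders were copied is to check for them before running the import. This is a minimal sketch under stated assumptions: missingSnapshotFolders is a hypothetical helper, not part of nosnaplet, and the existence check is passed in as a function so that a real script could supply fs.existsSync while this example uses a stub.

```typescript
// Folder names taken from the snapshot output described above.
const SNAPSHOT_FOLDERS = ['.output-directory', '.schema-directory'];

// Hypothetical helper, not part of nosnaplet. `exists` is injected so the
// check can be backed by fs.existsSync in a real script or stubbed here.
function missingSnapshotFolders(exists: (path: string) => boolean): string[] {
  return SNAPSHOT_FOLDERS.filter((folder) => !exists(folder));
}

// Stubbed example: only .output-directory was copied over.
const copied = new Set(['.output-directory']);
const missing = missingSnapshotFolders((path) => copied.has(path));
// missing is ['.schema-directory']
```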
Run all
You can execute both the snapshot and anonymization processes in one step using the following command:
npx nosnaplet fakesnap
This command runs both npx nosnaplet snapshot (which exports the data) and npx nosnaplet faked (which imports and anonymizes the data) sequentially. It's a convenient way to streamline the process without having to run the commands separately.
File Structure
.
├── .output-directory/
│   ├── db1/
│   │   ├── collection1.json
│   │   └── collection2.json
│   ├── db2/
│   │   └── collection1.json
│   └── ...
└── .schema-directory/
    ├── interfaces.ts
    └── links.json
Inside .output-directory, each database has its own folder, and each collection is exported to a JSON file containing its data.
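The layout above maps a database and collection name to a predictable file path. As a small illustration, here is a sketch that builds that path; exportPath is a hypothetical helper, not part of nosnaplet.

```typescript
// Hypothetical helper, not part of nosnaplet: builds the relative path
// where a collection's export is expected, following the tree above.
function exportPath(database: string, collection: string): string {
  return ['.output-directory', database, `${collection}.json`].join('/');
}

const examplePath = exportPath('db1', 'collection1');
// examplePath is '.output-directory/db1/collection1.json'
```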
interfaces.ts
If you need to customize the types or structure of your data, you can modify the interfaces.ts file located in the .schema-directory. This file defines the schema of your MongoDB collections.
Example
Here is a section of the original interfaces.ts file:
Original:
import { ObjectId } from 'mongodb';

interface MS_nosnapletcode_prospectpulse {
  entrepriseprospects: {
    _id: ObjectId;
    nom: string;
  }
}
Customization
You can customize the schema to fit your needs. For instance, if you want to change the type of the nom field from string to number, you would modify the interface as shown below.
import { ObjectId } from 'mongodb';

interface MS_nosnapletcode_prospectpulse {
  entrepriseprospects: {
    _id: ObjectId;
    nom: number;
  }
}
Supported Types
The following types are supported within the interfaces.ts file:
- string: Represents text data.
- number: Represents numeric values, including integers and floats.
- boolean: Represents true/false values.
- Date: Represents date and time values.
- ObjectId: Represents a MongoDB ObjectId, used as a unique identifier for documents.
- Array: Represents an array of values, which can be of any type (e.g., string[], number[]).
- Nested Objects: You can define nested objects with their own properties and types.
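To show the supported types side by side, here is a sketch that extends the entrepriseprospects example. Every field other than _id and nom is hypothetical and chosen only for illustration. The local ObjectId class is a stand-in so the snippet compiles on its own; the generated interfaces.ts imports ObjectId from 'mongodb' instead.

```typescript
// Stand-in so this sketch compiles on its own. The generated interfaces.ts
// imports the real ObjectId from 'mongodb' instead.
class ObjectId {
  readonly hex: string;
  constructor(hex: string) {
    this.hex = hex;
  }
}

interface MS_nosnapletcode_prospectpulse {
  entrepriseprospects: {
    _id: ObjectId;        // ObjectId
    nom: string;          // string
    effectif: number;     // number (hypothetical field)
    actif: boolean;       // boolean (hypothetical field)
    createdAt: Date;      // Date (hypothetical field)
    tags: string[];       // Array (hypothetical field)
    adresse: {            // nested object (hypothetical field)
      rue: string;
      ville: string;
    };
  };
}

// A value that satisfies the interface, using made-up data.
const sampleProspect: MS_nosnapletcode_prospectpulse['entrepriseprospects'] = {
  _id: new ObjectId('000000000000000000000001'),
  nom: 'Example SARL',
  effectif: 12,
  actif: true,
  createdAt: new Date('2024-01-01T00:00:00Z'),
  tags: ['lead', 'b2b'],
  adresse: { rue: '1 rue Exemple', ville: 'Paris' },
};
```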
links.json
The links.json file located in the .schema-directory defines the relationships between different collections across databases in your MongoDB setup. It serves as a mapping that helps you maintain referential integrity when importing or anonymizing data. Essentially, it specifies how documents in one collection are linked to documents in another collection through fields that act as references, typically using ObjectId values.
Structure of links.json
The links.json file contains an object with a links array. Each link object in that array describes a relationship between a field in one collection and a corresponding collection, which may be in another database.
Here is a breakdown of the fields within each link object:
- field: The name of the field in the source collection that holds the reference to another collection.
- fromDatabase: The name of the source database where the reference field is located.
- fromCollection: The name of the source collection where the reference field is located.
- toDatabase: The name of the target database that contains the collection you are linking to.
- toCollection: The name of the target collection that the field is referencing.
- type: The type of the reference, typically ObjectID, indicating that the field holds MongoDB ObjectId values that reference documents in the target collection.
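The field list above can be written down as a TypeScript type, which is handy if you edit links.json by hand and want to check entries before running the import. This is a minimal sketch: the Link interface mirrors the documented fields, while isLink is a hypothetical validator, not part of nosnaplet, that only checks that each field is a non-empty string.

```typescript
// Shape of one entry in links.json, following the field list above.
interface Link {
  field: string;
  fromDatabase: string;
  fromCollection: string;
  toDatabase: string;
  toCollection: string;
  type: string;
}

const LINK_KEYS = [
  'field',
  'fromDatabase',
  'fromCollection',
  'toDatabase',
  'toCollection',
  'type',
] as const;

// Hypothetical validator, not part of nosnaplet: true when every documented
// field is present as a non-empty string.
function isLink(value: unknown): value is Link {
  if (typeof value !== 'object' || value === null) {
    return false;
  }
  const record = value as Record<string, unknown>;
  return LINK_KEYS.every(
    (key) => typeof record[key] === 'string' && record[key] !== '',
  );
}

// An entry with a missing toCollection is rejected.
const incompleteEntry = { field: 'createdBy', fromDatabase: 'database1' };
const incompleteIsValid = isLink(incompleteEntry); // false
```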
Example of links.json
Here is an example of the content of a links.json file:
{
  "links": [
    {
      "field": "createdBy",
      "fromDatabase": "database1",
      "fromCollection": "entrepriseprospects",
      "toDatabase": "database2",
      "toCollection": "users",
      "type": "ObjectID"
    },
    {
      "field": "organisation",
      "fromDatabase": "database1",
      "fromCollection": "entrepriseprospects",
      "toDatabase": "database2",
      "toCollection": "organisations",
      "type": "ObjectID"
    }
  ]
}
Explanation of the first link:
- Field: createdBy
- From Database: database1
- From Collection: entrepriseprospects
- To Database: database2
- To Collection: users
- Type: ObjectID
This link indicates that the createdBy field in the entrepriseprospects collection of the database1 database references a document in the users collection of the database2 database. The reference is an ObjectId.
How links.json Is Used
After anonymizing and importing data, the script can use this file to update the ObjectId fields in the source collection to point to the correct documents in the target collection.
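To make the idea concrete, here is a sketch of what such an update could look like in memory. It is not nosnaplet's implementation. It assumes, without confirmation from this README, that a mapping from old target ids to new target ids is available after import. relinkField is a hypothetical helper, and ids are modeled as plain strings rather than ObjectId values to keep the example self-contained.

```typescript
// Documents are modeled as plain records and ids as strings to keep the
// sketch self-contained; real data would use ObjectId values.
type Doc = Record<string, unknown>;

// Hypothetical helper, not part of nosnaplet: returns copies of `docs` in
// which `field` is rewritten through `idMap` (old target id -> new target id).
// Documents whose value has no mapping are returned unchanged.
function relinkField(docs: Doc[], field: string, idMap: Map<string, string>): Doc[] {
  return docs.map((doc) => {
    const current = doc[field];
    if (typeof current !== 'string') {
      return doc;
    }
    const replacement = idMap.get(current);
    return replacement === undefined ? doc : { ...doc, [field]: replacement };
  });
}

// Made-up data: prospects reference users through createdBy.
const prospects: Doc[] = [
  { _id: 'p1', createdBy: 'u1' },
  { _id: 'p2', createdBy: 'u9' },
];
const userIdMap = new Map([['u1', 'u100']]);

const relinked = relinkField(prospects, 'createdBy', userIdMap);
// relinked[0].createdBy is 'u100'; relinked[1].createdBy stays 'u9'
```

Returning copies instead of mutating the input keeps the original documents available for comparison, which is a design choice of this sketch rather than something the tool is known to do.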
Customization
You can add or modify entries in links.json if you need to define additional relationships between collections in your databases.
Ensure that the field, fromDatabase, fromCollection, toDatabase, toCollection, and type properties correctly reflect the schema and relationships in your MongoDB setup.
License
This project is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0).
You are free to share and redistribute the material in any medium or format under the following terms:
- Attribution: You must give appropriate credit, provide a link to the license, and indicate if changes were made.
- NonCommercial: You may not use the material for commercial purposes.
- NoDerivatives: If you remix, transform, or build upon the material, you may not distribute the modified material.
For the full license text, please visit CC BY-NC-ND 4.0 License.