cache_feeder
v2.0.1
Published
This script helps to efficiently set a huge number (> 1 million) of key-value pairs from CSV file to cache (Redis)
Downloads
3
Maintainers
Readme
cache_feeder
This node package reads key-value pairs from a CSV file and inserts it into Redis using streams.
Application
Use this package if you want to set key-value pairs from a CSV file which contains huge number of rows (N_rows > 100,000). The script makes use of streams and a csv parser to read the file and make requests to cache in batches to reduce round trip time.
Why ?
At the start of a project or even in an ongoing project, sometimes it is required to insert a lot of key-value pairs into cache for the program to work properly. Inserting these (> 1 million) keys into cache by loading it into memory at once is not a good option and mostly results in app crashes. The better way to approach this problem is through streams.
Supported cache
- Redis
- Aerospike (In progress)
Installation
npm install --save cache_feeder
Github repository
https://github.com/anuj3918/cache_feeder.git
Code example
// Setup options for library and cache
const options = {
cacheConfig: {
type: 'redis',
host: '127.0.0.1',
port: 6379
},
libSettings: {
commandsPerBatch: 40,
logTime: true,
logProcesses: true,
keyColName: 'key',
valueColName: 'value'
}
};
// Requires cache_feeder and initialise it, cf here is a Promise object
const cf = require('./index')(options);
// Starts setting key value pairs to cache, can use async-await here as well
cf('./MOCK_DATA.csv')
.then(result => {
console.log(result);
});
Details of 'options' to be passed
Below are the default values of options with details
const options = {
cacheConfig: {
type: 'redis', // Type of cache
host: '127.0.0.1', // Host of cache
port: 6379 // Port of cache
},
libSettings: {
commandsPerBatch: 40, // Number of set commands to send in one batch request
logTime: true, // Logs the total time taken on terminal
logProcesses: true, // Logs the tasks performed on terminal
keyColName: 'key', // Name of column header to be set as key name
valueColName: 'value' // Name of column header to be set as value name
}
};
Sample CSV file
first_name,email
anuj,[email protected]
gupta,[email protected]
To do
- Support for Aerospike.
- Adding more flexibility while creating connections to cache.
- Log any failures while inserting.
- Retry for only the failures keys encountered previously.
- Load testing metrics to be added in Readme.md
- Pausing and resuming stream while reconnecting to cache.
Contributions
You can add any of the above features into this package and create a pull request. For any further queries, write to [email protected]