s3-file-scan-cat
v1.2.8
Published
A utility to recursively scan a specified folder in S3 and concat JSON files within a single folder into a single gzip file.
Downloads
58
Readme
s3-file-scan-cat
A utility library that exists to concatenate multiple small JSON files into a single compressed gunzip file.
Install
This is a Node.js module available through the npm registry.
Before installing, download and install Node.js. Node.js 0.6 or higher is required.
Installation is done using the
npm install
command:
$ npm install s3-file-scan-cat
Example
Configuration
Configuration is divided between three main sections, AWS S3 configuration (bucket
, etc.), AWS secrets (accessKeyId
, secretAccessKey
), and the scanner.
Breaking these three sections into two JSON files is recommended. One files contains the credentials for AWS and should never be committed to a repository and the second file contains everything else and can be committed to a repository depending on your deployment strategy.
Example: secrets.json
{
"aws": {
"accessKeyId": "_secret_key_",
"secretAccessKey": "_secret_access_key_"
}
}
Example: appConfig.json
{
"aws" : {
"s3": {
"bucket": "bucket-name",
"scannerPrefix": "src-prefix",
"destinationPrefix": "dest-prefix"
}
},
"scanner": {
"logLevel": "info",
"partitionStack" : [
"year",
"month",
"day",
"part-04",
"part-05"
],
"limits" : {
"scanPrefixForPartitionsProcessLimit": 10
"s3ObjectBodyProcessInProgressLimit": 500
"maxFileSizeBytes": 134217728
},
"bounds": {
"startDate": "2020-01-01",
"endDate": "2020-01-01"
}
}
}
Performing the scan
import * as fs from 'fs';
import { AWSSecrets, S3FileScanCat, ScannerConfig } from 's3-file-scan-cat';
const scannerConfig = JSON.parse(fs.readFileSync('./config/manager_config.json').toString('utf8')) as ScannerConfig
const awsSecrets = JSON.parse(fs.readFileSync('./config/private/secrets.json').toString('utf8')).aws as AWSSecrets
const s3Scanner = new S3FileScanCat(scannerConfig.aws.s3.useAccelerateEndpoint, scannerConfig.scanner, awsSecrets)
s3Scanner
.scanAndProcessFiles(scannerConfig.aws.s3.bucket, scannerConfig.aws.s3.scannerPrefix, scannerConfig.aws.s3.destinationPrefix)
.then(() => {
process.exit(0)
})
.catch((error) => {
console.error(`Failed: ${error}`)
process.exit(-1)
})