sparse-geonames-ingest
v1.0.0
Published
scripts to generate sparse p2p archives for geonames data
Downloads
3
Readme
sparse-geonames-ingest
scripts to generate sparse p2p archives for geonames data
to search the data generated by this tool, use sparse-geonames-search
Sorting and loading results are written to disk to save memory.
This tool is capable of building an archive from allCountries.txt
, a 1.5GB archive (as of 2022)
with all geonames data. On my laptop this took 38 minutes and used several gigabytes of RAM.
cities500.txt
takes 10 seconds.
usage
usage: sparse-geonames-ingest COMMAND {OPTIONS}
sparse-geonames-ingest ingest {-i INFILE} -o OUTDIR
Load, sort, and build in one command.
sparse-geonames-ingest load {-i INFILE} -o OUTDIR
Read geonames newline-delimited text file from INFILE or "-" (stdin),
writing output to OUTDIR.
sparse-geonames-ingest sort -o OUTDIR
Sort loaded data from OUTDIR, writing results to OUTDIR.
sparse-geonames-ingest build -o OUTDIR (--size=SIZE)
Build sorted data from OUTDIR, writing results to OUTDIR.
Each file in the output is at most SIZE bytes.
api
var sgingest = require('sparse-geonames-ingest')
var ingest = sgingest(opts)
- opts.outdir - directory to use for output and temporary storage
- opts.recordsLimit - number of records to buffer before flushing to disk
- opts.lookupLimit - number of lookups to buffer before flushing to disk
- opts.limit - number of records and lookups to buffer before flushing to disk
ingest.write(line, cb)
Process one line of input from a geonames text file, calling cb(err)
when complete.
ingest.flush(cb)
Flush buffered records in memory to disk, calling cb(err)
when complete.
ingest.sort(cb)
Sort
ingest.build(opts={}, cb)
install
npm install sparse-geonames-ingest
license
bsd