pelias-blacklist-stream
v1.3.0
Pelias document blacklist stream
This repository is part of the Pelias project. Pelias is an open-source, open-data geocoder originally sponsored by Mapzen. Our official user documentation is here.
Pelias Blacklist Stream
This package provides a configuration-driven approach to removing specific records from an import stream.
It's particularly helpful where algorithmic deduplication fails, or when data is erroneous and needs to be replaced with an alternative.
Installation
$ npm install pelias-blacklist-stream
Usage
The blacklist stream is intended to be used by pipelines passing objects generated by pelias/model.
You can specify which records are omitted from the build by providing their globally-unique-id (GID).
GIDs can be found in the results served by pelias/api or by calling getGid() on a Pelias Model object.
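To make the shape of a GID concrete, here is a minimal sketch that splits one into its parts. It assumes the usual Pelias form `source:layer:id`; the `parseGid` helper is hypothetical and is not part of this package.

```javascript
// Hypothetical helper: split a GID into its parts, assuming the form source:layer:id.
function parseGid(gid) {
  const first = gid.indexOf(':');
  const second = gid.indexOf(':', first + 1);
  if (first === -1 || second === -1) {
    throw new Error(`invalid GID: ${gid}`);
  }
  return {
    source: gid.slice(0, first),
    layer: gid.slice(first + 1, second),
    id: gid.slice(second + 1) // the id may itself contain ':' characters
  };
}

const parts = parseGid('openaddresses:address:us/tx/libery:377e64dd81884dbe');
console.log(parts);
```

Only the first two colons are treated as separators, since the id portion of the example GIDs in this README contains a colon of its own.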
Using a JavaScript map as the blacklist
const blacklistStream = require('pelias-blacklist-stream');
const blacklist = {
  "openaddresses:address:us/tx/libery:377e64dd81884dbe": "1800 Mlk, Liberty, TX, USA",
  "openaddresses:address:us/fl/statewide:bee100ffcc77c699": undefined
};
const stream = blacklistStream( blacklist );
The stream will now remove any documents which match either of the GIDs openaddresses:address:us/tx/libery:377e64dd81884dbe or openaddresses:address:us/fl/statewide:bee100ffcc77c699.
The values are optional; you can use them to hold a human-readable comment for debugging.
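The filtering behaviour described above can be sketched with Node's built-in `stream.Transform` in object mode. This is an illustration only, not the package's actual implementation; `isBlacklisted` and `createFilterStream` are hypothetical names, and documents are assumed to expose `getGid()` as Pelias Model objects do.

```javascript
const { Transform } = require('stream');

// Hypothetical helper: true when the GID is a key of the blacklist,
// even if its value (the optional comment) is undefined.
function isBlacklisted(blacklist, gid) {
  return Object.prototype.hasOwnProperty.call(blacklist, gid);
}

// Sketch of an object-mode filter; NOT the package's actual implementation.
function createFilterStream(blacklist) {
  return new Transform({
    objectMode: true,
    transform(doc, _encoding, callback) {
      if (isBlacklisted(blacklist, doc.getGid())) {
        return callback(); // drop the document
      }
      callback(null, doc); // pass it downstream unchanged
    }
  });
}

const exampleBlacklist = {
  'openaddresses:address:us/fl/statewide:bee100ffcc77c699': undefined
};
const filter = createFilterStream(exampleBlacklist);
```

Because comments are optional, the lookup checks for the presence of the key rather than the truthiness of its value.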
Using blacklist files specified from Pelias Config
If no arguments are provided when calling blacklistStream(), it will load your local pelias/config.
If your config contains entries in the imports.blacklist.files array, then each file will be loaded from disk, merged and used as the blacklist.
const blacklistStream = require('pelias-blacklist-stream');
const stream = blacklistStream(); // no arguments specified
The relevant parts of the pelias config file, usually located at ~/pelias.json:
{
  "imports": {
    "blacklist": {
      "files": [
        "/tmp/blacklist_file_one",
        "/tmp/blacklist_file_two"
      ]
    }
  }
}
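As a rough sketch of the "load each file and merge" step, the following reads every path listed under `imports.blacklist.files` and combines the results into one map. It takes the config as a plain object rather than loading it through pelias/config, and both `readBlacklistFile` and `loadBlacklist` are hypothetical helpers; the package's real loader may differ.

```javascript
const fs = require('fs');

// Hypothetical helper: read one blacklist file into a { gid: undefined } map.
// Comments after '#' are discarded here for brevity.
function readBlacklistFile(path) {
  const map = {};
  fs.readFileSync(path, 'utf8').split('\n').forEach((line) => {
    const gid = line.split('#')[0].trim();
    if (gid) { map[gid] = undefined; }
  });
  return map;
}

// Merge every file listed under imports.blacklist.files into one map.
// A missing section is treated as an empty list of files.
function loadBlacklist(config) {
  const files = (config.imports && config.imports.blacklist &&
    config.imports.blacklist.files) || [];
  return Object.assign({}, ...files.map((file) => readBlacklistFile(file)));
}
```

With the config above, `loadBlacklist(config)` would return a single object whose keys are the GIDs from both files.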
Blacklist file format
Blacklist files stored on disk can have any file extension (or none).
Each line of the file should contain one GID and optionally one comment; lines are separated by a '\n' newline character.
An example of a blacklist file without comments:
openaddresses:address:us/tx/libery:377e64dd81884dbe
openaddresses:address:us/fl/statewide:bee100ffcc77c699
An example of a blacklist file with debugging comments:
openaddresses:address:us/tx/libery:377e64dd81884dbe # 1800 Mlk, Liberty, TX, USA
openaddresses:address:us/fl/statewide:bee100ffcc77c699
If the line contains a '#' symbol then anything after the '#' will be considered a comment. Using another '#' in your comment string is not supported.
The parser will String.trim() surrounding whitespace, but it does not change letter casing, so you must take care to provide the GID with the correct casing.
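The line rules above can be illustrated with a small parser for a single line. This is a sketch under the stated rules (one GID, an optional comment after the first '#', whitespace trimmed, casing preserved); `parseLine` is a hypothetical name and not the package's own parser.

```javascript
// Hypothetical parser for a single blacklist line.
// Returns { gid, comment } or null when the line holds no GID.
function parseLine(line) {
  const hash = line.indexOf('#');
  // Everything before the first '#' is the GID; everything after is the comment.
  const gid = (hash === -1 ? line : line.slice(0, hash)).trim();
  const comment = hash === -1 ? undefined : line.slice(hash + 1).trim();
  return gid ? { gid, comment } : null;
}

const withComment = parseLine(
  'openaddresses:address:us/tx/libery:377e64dd81884dbe # 1800 Mlk, Liberty, TX, USA'
);
const withoutComment = parseLine('  openaddresses:address:us/fl/statewide:bee100ffcc77c699  ');
```

The sketch never lowercases or uppercases the GID, which is why the casing in the file has to match the document's GID exactly.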
NPM Module
The pelias-blacklist-stream npm module can be found here:
https://npmjs.org/package/pelias-blacklist-stream
Contributing
Please fork and pull request against upstream master on a feature branch.
Pretty please, provide unit tests and script fixtures in the test directory.
Running Unit Tests
$ npm test
Continuous Integration
Travis tests every release against all supported Node.js versions.
Versioning
We rely on semantic-release and Greenkeeper to maintain our module and dependency versions.