csvgeocode

v2.3.1

Bulk geocode addresses in a CSV.

csvgeocode

For when you have a CSV with addresses and you want a lat/lng for every row. Bulk geocode addresses in a CSV with a few lines of code.

The defaults are configured for Google's geocoder, but it can be configured to work with any similar geocoding service. There are built-in response handlers for Google, Mapbox, OSM Nominatim, Mapzen, and Texas A&M's geocoders (details below).

Make sure that you use this in compliance with the relevant API's terms of service.

Basic command line usage

Install globally via npm:

npm install -g csvgeocode

Use it:

$ csvgeocode path/to/input.csv path/to/output.csv --url "https://maps.googleapis.com/maps/api/geocode/json?address={{MY_ADDRESS_COLUMN_NAME}}&key=MY_API_KEY"

If you don't specify an output file, the output will stream to stdout instead, so you can stream the result as an HTTP response or do something like:

$ csvgeocode path/to/input.csv [options] | grep "greppin for somethin"

Options

You can add extra options when running csvgeocode. For example:

$ csvgeocode input.csv output.csv --url "http://someurl.com/" --lat CALL_MY_LATITUDE_COLUMN_THIS_SPECIAL_NAME --delay 1000 --verbose

The only required option is url. All others are optional.

--url [url] (REQUIRED)

A URL template with column names as Mustache tags, like:

http://api.tiles.mapbox.com/v4/geocode/mapbox.places/{{address}}.json?access_token=MY_API_KEY

https://maps.googleapis.com/maps/api/geocode/json?address={{address}}&key=MY_API_KEY

http://geoservices.tamu.edu/Services/Geocode/WebService/GeocoderWebServiceHttpNonParsed_V04_01.aspx?apiKey=MY_API_KEY&version=4.01&streetAddress={{address}}&city={{city}}&state={{state}}

https://search.mapzen.com/v1/search?api_key=MY_API_KEY&text={{address}}

If your addresses are broken up into multiple columns (e.g. a street_address column, a city column, and a state column), you can use them all together in a URL template:

https://maps.googleapis.com/maps/api/geocode/json?address={{street_address}},{{city}},{{state}}&key=MY_API_KEY
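To make the template idea concrete, here is a minimal sketch of how column values could be substituted into a URL template for one row. The `fillTemplate` helper is hypothetical and is not csvgeocode's internal code; in particular, URL-encoding each value with `encodeURIComponent` is an assumption about what a sensible implementation would do.

```javascript
// Hypothetical helper: fills {{column}} tags in a URL template from one CSV row.
// This illustrates the idea only; it is not csvgeocode's internal code.
function fillTemplate(template, row) {
  return template.replace(/\{\{\s*([^}]+?)\s*\}\}/g, function (match, column) {
    // Leave unknown tags untouched so a typo in a column name is easy to spot
    if (!Object.prototype.hasOwnProperty.call(row, column)) {
      return match;
    }
    // Assumption: encode the value so spaces are safe inside a query string
    return encodeURIComponent(row[column]);
  });
}

var template = "https://maps.googleapis.com/maps/api/geocode/json?address={{street_address}},{{city}},{{state}}&key=MY_API_KEY";

var url = fillTemplate(template, {
  street_address: "160 Varick St",
  city: "New York",
  state: "NY"
});

console.log(url);
```

A tag whose name doesn't match any column is left as-is in this sketch, which makes a misspelled column name visible in the resulting URL.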

--handler [handler]

Which handler function to use to process the API response. Current built-in handlers are "google", "mapbox", "mapzen", "osm", and "tamu". Contributions of handlers for other geocoders are welcome! You can define a custom handler when using this as a Node module (see below).

Examples:

$ csvgeocode input.csv --url "http://api.tiles.mapbox.com/v4/geocode/mapbox.places/{{MY_ADDRESS_COLUMN_NAME}}.json?access_token=123ABC" --handler mapbox

$ csvgeocode input.csv --url 'https://search.mapzen.com/v1/search?api_key=123ABC&text={{MY_ADDRESS_COLUMN_NAME}}' --handler mapzen

$ csvgeocode input.csv --url "http://geoservices.tamu.edu/Services/Geocode/WebService/GeocoderWebServiceHttpNonParsed_V04_01.aspx?version=4.01&streetAddress={{ADDR}}&city={{CITY}}&state={{STATE}}&apiKey=123ABC" --handler tamu

Default: "google"

--lat [latitude column name]

The name of the column that should contain the resulting latitude. If this column doesn't exist in the input CSV, it will be created in the output.

Default: Tries to automatically detect if there is a relevant existing column name in the input CSV, like lat or latitude. If none is found, it will use lat.

--lng [longitude column name]

The name of the column that should contain the resulting longitude. If this column doesn't exist in the input CSV, it will be created in the output.

Default: Tries to automatically detect if there is a relevant existing column name in the input CSV, like lng or longitude. If none is found, it will use lng.
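The auto-detection described for --lat and --lng could look roughly like the sketch below. The `detectColumn` helper, the candidate name lists, and the case-insensitive matching are all assumptions made for illustration; the module's real detection rules may differ.

```javascript
// Hypothetical helper: return the first header that matches a candidate name,
// or the fallback name if none match. Matching rules here are assumptions.
function detectColumn(headers, candidates, fallback) {
  for (var i = 0; i < headers.length; i++) {
    var normalized = headers[i].trim().toLowerCase();
    if (candidates.indexOf(normalized) !== -1) {
      // Return the header exactly as it appears in the CSV
      return headers[i];
    }
  }
  return fallback;
}

var headers = ["name", "address", "Latitude", "Longitude"];

var latColumn = detectColumn(headers, ["lat", "latitude"], "lat");
var lngColumn = detectColumn(headers, ["lng", "longitude"], "lng");

console.log(latColumn, lngColumn);
```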

--delay [milliseconds]

The number of milliseconds to wait between geocoding calls. Setting this to 0 is probably a bad idea because most geocoders limit how fast you can make requests.

Default: 250
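As a rough illustration of what a delay between calls means, the sketch below processes a list one item at a time with a pause in between. The `eachWithDelay` helper is hypothetical and the worker here just logs; the real module makes an HTTP request per row, so its timing will also include response time.

```javascript
// Hypothetical helper, not part of csvgeocode: call `worker` once per item,
// pausing `delay` milliseconds between consecutive calls.
function eachWithDelay(items, delay, worker, done) {
  var index = 0;

  function next() {
    if (index >= items.length) {
      return done();
    }
    worker(items[index]);
    index++;
    // Only wait if there is another item left to process
    if (index < items.length) {
      setTimeout(next, delay);
    } else {
      done();
    }
  }

  // The first call happens immediately
  next();
}

var addresses = ["160 Varick St,New York,NY", "1600 Pennsylvania Ave,Washington,DC"];

eachWithDelay(addresses, 250, function (address) {
  console.log("would geocode: " + address);
}, function () {
  console.log("done");
});
```

With a fixed delay, the waiting time alone grows linearly with the number of rows: 1,000 rows at the default 250 milliseconds means at least about 250 seconds of pauses.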

--force

By default, if a lat/lng is already found in an input row, that will be kept. If you want to re-geocode every row no matter what and replace any lat/lngs that already exist, add --force. This means you'll hit API limits faster and the process will take longer.
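The keep-or-replace decision can be sketched as a small predicate. The `needsGeocoding` helper is hypothetical, and treating a row as "already geocoded" only when both values parse as finite numbers is an assumption; the module may use a different check.

```javascript
// Hypothetical helper: decide whether a row should be sent to the geocoder.
// Assumption: a row counts as already geocoded only if both values are finite numbers.
function needsGeocoding(row, latColumn, lngColumn, force) {
  if (force) {
    return true;
  }
  var lat = parseFloat(row[latColumn]);
  var lng = parseFloat(row[lngColumn]);
  return !(isFinite(lat) && isFinite(lng));
}

var geocoded = { address: "160 Varick St, New York NY", lat: "40.7267926", lng: "-74.0053737" };
var blank = { address: "1600 Pennsylvania Ave, Washington DC", lat: "", lng: "" };

console.log(needsGeocoding(geocoded, "lat", "lng", false));
console.log(needsGeocoding(blank, "lat", "lng", false));
console.log(needsGeocoding(geocoded, "lat", "lng", true));
```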

--verbose

See extra output while csvgeocode is running.

$ csvgeocode input.csv --url "MY_API_URL" --verbose
160 Varick St,New York,NY
SUCCESS

1600 Pennsylvania Ave,Washington,DC
SUCCESS

123 Fictional St,Noncity,XY
NO MATCH

Rows geocoded: 2
Rows failed: 1
Time elapsed: 1.8 seconds

Using as a Node module

Install via npm:

npm install csvgeocode

Use it:

var csvgeocode = require("csvgeocode");

//stream to stdout
csvgeocode("path/to/input.csv",{
  url: "MY_API_URL"
});

//write to a file
csvgeocode("path/to/input.csv","path/to/output.csv",{
  url: "MY_API_URL"
});

You can add all the same options in a script, except for verbose.

var options = {
  "url": "MY_API_URL",
  "lat": "MY_SPECIAL_LATITUDE_COLUMN_NAME",
  "lng": "MY_SPECIAL_LONGITUDE_COLUMN_NAME",
  "delay": 1000,
  "force": true,
  "handler": "mapbox"
};

//stream to stdout
csvgeocode("input.csv",options);

//write to a file
csvgeocode("input.csv","output.csv",options);

csvgeocode runs asynchronously, but you can listen for two events: row and complete.

row is triggered each time a row is processed. The listener receives a string error message if geocoding the row failed, followed by the row itself.

csvgeocode("input.csv",options)
  .on("row",function(err,row){
    if (err) {
      console.warn(err);
    }
    /*
      `row` is an object like:
      {
        first: "John",
        last: "Keefe",
        address: "160 Varick St, New York NY",
        employer: "WNYC",
        lat: 40.7267926,
        lng: -74.00537369999999
      }
    */
  });

complete is triggered when all geocoding is done. It passes a summary object with three properties: failures, successes, and time.

csvgeocode("input.csv",options)
  .on("complete",function(summary){
    /*
      `summary` is an object like:
      {
        failures: 1, //1 row failed
        successes: 49, //49 rows succeeded
        time: 8700 //it took 8.7 seconds
      }
    */
  });
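The two events can be combined, for example to gather the rows that failed and report them once everything is done. This is a sketch: `collectFailures` is a hypothetical helper, and a plain Node `EventEmitter` stands in for the object returned by csvgeocode so that the example runs without network access or an API key.

```javascript
var EventEmitter = require("events").EventEmitter;

// Hypothetical helper: remember every failed row, then hand the list
// and the summary to `callback` when the "complete" event fires.
function collectFailures(geocoder, callback) {
  var failed = [];

  geocoder.on("row", function (err, row) {
    if (err) {
      failed.push({ error: err, row: row });
    }
  });

  geocoder.on("complete", function (summary) {
    callback(failed, summary);
  });

  return geocoder;
}

// Stand-in emitter for illustration. With the real module this would be
// something like: collectFailures(csvgeocode("input.csv", options), ...)
var fake = new EventEmitter();

collectFailures(fake, function (failed, summary) {
  var total = summary.failures + summary.successes;
  console.log(failed.length + " of " + total + " rows failed");
});

fake.emit("row", null, { address: "160 Varick St, New York NY", lat: 40.7267926, lng: -74.00537369999999 });
fake.emit("row", "NO MATCH", { address: "123 Fictional St, Noncity XY" });
fake.emit("complete", { failures: 1, successes: 1, time: 500 });
```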

Using a custom geocoder

You can use any basic geocoding service from within a Node script by supplying a custom handler.

The easiest way to see what a handler should look like is to look at handlers.js.

The handler function is passed the body of an API response and should either return a string error message or an object with lat and lng properties.


csvgeocode("input.csv",{
  url: "MY_API_URL",
  handler: customHandler
});

function customHandler(body) {
  //success, return a lat/lng
  if (body.result) {
    return {
      lat: body.result.lat,
      lng: body.result.lng
    };
  }

  //failure, return a string
  return "NO MATCH";
}
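A handler for a real service usually needs a few more checks than the example above. The sketch below is written for a made-up response shape (`status`, `matches`, `location.latitude`, `location.longitude`); none of those field names come from an actual geocoder, so adjust them to whatever your API returns. It also assumes the body arrives as an already-parsed object, as in the example above.

```javascript
// Hypothetical response shape, invented for this example:
// { status: "OK", matches: [ { location: { latitude: 40.7, longitude: -74.0 } } ] }
function firstMatchHandler(body) {
  // No usable response: return a string error message
  if (!body || body.status !== "OK") {
    return (body && body.status) || "NO RESPONSE";
  }

  // Response was fine but contained no results
  if (!Array.isArray(body.matches) || body.matches.length === 0) {
    return "NO MATCH";
  }

  // Success: return the first result as an object with lat and lng
  var location = body.matches[0].location;
  return {
    lat: location.latitude,
    lng: location.longitude
  };
}

console.log(firstMatchHandler({
  status: "OK",
  matches: [{ location: { latitude: 40.7267926, longitude: -74.0053737 } }]
}));
console.log(firstMatchHandler({ status: "OK", matches: [] }));
console.log(firstMatchHandler({ status: "OVER_LIMIT" }));
```

It would be passed to csvgeocode the same way as customHandler above, via the handler option.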

Contributing/tests

The tests for the Mapbox and TAMU geocoders both require API keys. To run those tests, you need those API keys in a .env file in the project's root folder that defines two environment variables like so:

MAPBOX_API_KEY=123ABC
TAMU_API_KEY=123ABC

Some Alternatives

To Do

  • Add the NYC geocoder as a built-in handler.
  • Support a CSV with no header row where lat and lng are numerical indices instead of column names.
  • Support both POST and GET requests somehow.

Credits/License

By Noah Veltman

Available under the MIT license.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.