openaq-quality-checks

v1.1.4

CLI for adding flags to OpenAQ data

OpenAQ Quality Checks

OpenAQ Quality Checks is a command line interface for flagging potentially invalid air quality measurements.

Have an OpenAQ data quality concern or experience you would like to share? Please add it to the "OpenAQ Community: What is your OpenAQ data quality experience?" issue!

Use

Prerequisites

Install

nvm use 8.9.4
npm install openaq-quality-checks -g

Develop

git clone https://github.com/openaq/openaq-quality-checks
cd openaq-quality-checks
nvm use
yarn install
yarn test

Configuration

openaq-quality-checks expects a list of items as input, in either JSON or CSV format.

There are two modes of configuration: a config file and command line arguments.

1. Config File

The config file configures the flags. It defines which checks should be run, what values should be flagged, and what string to use for each flag.

A set of default flags is configured in config.yml. The default flags are:

  • E flags the value -999
  • N flags negative values
  • R flags repeating values, grouped by coordinates and ordered by date.
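
As a rough illustration of what the three default flags mean, here is a minimal sketch in JavaScript. It is not the code in lib/flagger.js. The function name flagDefaults, the flags array it attaches, and the item shape (value, coordinates.latitude, coordinates.longitude, date.utc) are all assumptions made for the example, as is the choice to flag only the later item of a repeated pair.

```javascript
// Hypothetical sketch of the three default checks, not the package's real code.
// Assumed item shape: { value, coordinates: { latitude, longitude }, date: { utc } }
function flagDefaults(items) {
  // Copy each item and give it an empty list of flags.
  const flagged = items.map((item) => Object.assign({}, item, { flags: [] }));

  for (const item of flagged) {
    if (item.value === -999) item.flags.push("E"); // exact match on -999
    if (item.value < 0) item.flags.push("N"); // negative (so -999 also gets N here)
  }

  // Group by coordinates so repeats are only compared within one location.
  const groups = new Map();
  for (const item of flagged) {
    const key = item.coordinates.latitude + "," + item.coordinates.longitude;
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(item);
  }

  for (const group of groups.values()) {
    // Order by date, then flag a value that equals the one just before it.
    group.sort((a, b) => new Date(a.date.utc) - new Date(b.date.utc));
    for (let i = 1; i < group.length; i++) {
      if (group[i].value === group[i - 1].value) group[i].flags.push("R");
    }
  }
  return flagged;
}
```

Whether the real tool also applies N to a value of -999, or flags both members of a repeated pair, is not stated here; check lib/flagger.js for the actual behavior.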

This default config file can be overridden using the --config <file.yml> argument, which should point to a YAML file with the following structure:

keyOne: # Arbitrary identifier for the flag, e.g. 'errors'. Useful for merging with the default configuration.
  flag: Any string, e.g. E
  type: One of exact|set|range|repeats
  # Depending on the type, other values may be included. See lib/flagger.js for what can be configured.
keyTwo:
  # ...

This configuration is merged with the default configuration, overriding fields that exist and adding fields that do not exist.
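
The merge described above can be pictured with a small JavaScript sketch. This is only an illustration under stated assumptions: the function name mergeConfig is made up, and it assumes the merge works one level deep (per key, then per field), which this README does not spell out.

```javascript
// Hypothetical sketch of merging a custom config into the default config.
// Assumes both configs are plain objects keyed by an identifier such as 'errors'.
function mergeConfig(defaults, overrides) {
  const merged = {};
  const keys = new Set(Object.keys(defaults).concat(Object.keys(overrides)));
  for (const key of keys) {
    // Fields from the override win; fields missing from it keep the default.
    merged[key] = Object.assign({}, defaults[key], overrides[key]);
  }
  return merged;
}

// Example: change the flag string of an existing check and add a new check.
const exampleMerged = mergeConfig(
  { errors: { flag: "E", type: "exact" } },
  { errors: { flag: "ERR" }, custom: { flag: "X", type: "set" } }
);
console.log(JSON.stringify(exampleMerged));
```

In this sketch the 'errors' check keeps its default type but gets the new flag string, and 'custom' is added alongside it.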

2. Command Line Arguments

Command line arguments configure:

  • data input (defaults to STDIN)
  • input and output data format (defaults to json)
  • flags to skip (defaults to none)
  • which flags should be used to remove data from the output (defaults to none)

$ quality-check --help
Usage: index.js [options]

Options:
  --version         Show version number                                [boolean]
  --infile          Input file. Should be the same format as input-format (which
                    default to json).
  --input-format    Input format, can be csv or json. Defaults to json.
  --ouptput-format  Output format, can be csv or json. Defaults to json.
  --skip            Comma-separated list of flags to skip.
  --remove          Comma-separated list of flags to use in removing data from
                    output.
  --remove-all      Removes all flagged data.
  --config          Config file to override default config.
  -h, --help        Show help                                          [boolean]

Examples:
  index.js --infile foo.json  Flags the contents of a file and writes to stdout.
  cat foo.json | index.js     Flags the contents of stdin and writes to stdout.

copyright 2018

Example Commands

Read and output JSON

Note: The commands below pipe their output through jq, which is used only to pretty-print JSON. If you don't have jq installed, remove the trailing | jq . from each command.

cat examples/addis-ababa-20180202.json | quality-check | jq .
# or
quality-check --infile examples/addis-ababa-20180202.json | jq .

Read and output CSV

cat examples/addis-ababa-20180202.csv | quality-check --input-format csv --output-format csv
# or
quality-check --infile examples/addis-ababa-20180202.csv --input-format csv --output-format csv

Override the default configuration

quality-check --infile examples/addis-ababa-20180202.json --config tests/test-config.yml | jq .

Skip the 'N' and 'R' flags

quality-check --infile examples/addis-ababa-20180202.json --skip N,R | jq .

Remove all items flagged 'E'

quality-check --infile examples/addis-ababa-20180202.json --remove E | jq .

Remove all flagged items

quality-check --infile examples/addis-ababa-20180202.json --remove-all | jq .
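
To make the difference between --remove and --remove-all concrete, here is a minimal JavaScript sketch of the filtering step. It is an assumption-laden illustration, not the tool's implementation: the function name applyRemoval is invented, and it assumes each flagged item carries its flags in an array property named flags, which may not match the real output format.

```javascript
// Hypothetical sketch of how --remove and --remove-all could filter output.
// Assumes every item has (or lacks) a `flags` array such as ["E", "N"].
function applyRemoval(items, options) {
  const remove = (options && options.remove) || []; // e.g. ["E"] for --remove E
  const removeAll = Boolean(options && options.removeAll); // --remove-all

  return items.filter((item) => {
    const flags = item.flags || [];
    if (removeAll) return flags.length === 0; // keep only unflagged items
    return !flags.some((flag) => remove.includes(flag)); // drop listed flags only
  });
}
```

With remove set to ["E"], an item flagged only N stays in the output, while removeAll drops it.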

Using data from the OpenAQ API

curl 'https://api.openaq.org/v1/measurements?location=US%20Diplomatic%20Post:%20Addis%20Ababa%20School&date_from=2018-02-02&date_to=2018-02-06&limit=10' | jq '.results' | quality-check | jq .

Using a different data source

The tool was built with OpenAQ in mind, but it is flexible enough to work with other data sources. For example, if you want to analyze aggregated world news using reddit's worldnews subreddit, you might want to flag posts from unknown news organizations.

Using a config like the one in examples/worldnews-config.yml, e.g.:

# examples/worldnews-config.yml
unknown_sources:
  flag: UKNOWN_SOURCE
  type: set
  values: ["theguardian.com", "bbc.co.uk", "bloomberg.com", "bbc.com", "reuters.com", "npr.org", "independent.co.uk", "cnn.com"]
  includes: 'false'
  valueField: 'data.domain'

We can flag all unknown news organizations:

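# Quote the URL so the shell does not try to expand the '?', and use -s to silence curl's progress meter.
curl -s -H "User-Agent: laptopterminal" 'https://www.reddit.com/r/worldnews.json?limit=50' | \
  jq '.data.children' | \
  quality-check --config examples/worldnews-config.yml | \
  jq '.'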
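
For readers wondering how a set check with includes: 'false' and a dotted valueField might behave, here is a minimal JavaScript sketch. It is one reading of the worldnews config, not the code in lib/flagger.js: the function names getPath and flagBySet and the flags array on each result are assumptions, as is treating the string 'false' as the boolean false.

```javascript
// Hypothetical sketch of a 'set' check, based on the worldnews example config.

// Resolve a dotted path such as 'data.domain' against a nested object.
function getPath(obj, path) {
  return path
    .split(".")
    .reduce((current, key) => (current == null ? undefined : current[key]), obj);
}

function flagBySet(items, check) {
  // The example config stores includes as the string 'false'.
  const shouldInclude = String(check.includes) === "true";
  return items.map((item) => {
    const value = getPath(item, check.valueField);
    const inSet = check.values.includes(value);
    // includes 'false': flag items whose value is NOT in the set.
    const flags = inSet === shouldInclude ? [check.flag] : [];
    return Object.assign({}, item, { flags: flags });
  });
}
```

Under this reading, a post from bbc.com is left unflagged, while a post from a domain outside the list (or with no data.domain at all) receives the configured flag string.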