dpm2

v0.4.0

Like npm but for data packages!

Usage:

CLI

$ dpm --help
Usage: dpm <command> [options] where command is:
  - cat       <datapackage name>[@<version>]
  - get       <datapackage name>[@<version>] [-f, --force] [-c, --cache]
  - clone     <datapackage name>[@<version>] [-f, --force]
  - install   <datapackage name 1>[@<version>] <datapackage name 2>[@<version>] ... [-c, --cache] [-s, --save] [-f, --force]
  - publish
  - unpublish <datapackage name>[@<version>]
  - adduser
  - owner <subcommand> where subcommand is:
    - ls  <datapackage name>
    - add <user> <datapackage name>
    - rm  <user> <datapackage name>[@<version>]
  - search [search terms]

Publishing and getting data packages

Given a data package:

$ cat package.json

{
  "name": "mydpkg",
  "description": "my datapackage",
  "version": "0.0.0",
  "keywords": ["test", "datapackage"],

  "resources": [
    {
      "name": "inline",
      "schema": { "fields": [ {"name": "a", "type": "string"}, {"name": "b", "type": "integer"}, {"name": "c", "type": "number"} ] },
      "data": [ {"a": "a", "b": 1, "c": 1.2}, {"a": "x", "b": 2, "c": 2.3}, {"a": "y", "b": 3, "c": 3.4} ]
    },
    {
      "name": "csv1",
      "format": "csv",
      "schema": { "fields": [ {"name": "a", "type": "integer"}, {"name": "b", "type": "integer"} ] },
      "path": "x1.csv"
    },
    {
      "name": "csv2",
      "format": "csv",
      "schema": { "fields": [ {"name": "c", "type": "integer"}, {"name": "d", "type": "integer"} ] },
      "path": "x2.csv"
    }
  ]
}

stored on the disk as

$ tree
.
├── package.json
├── scripts
│   └── test.r
├── x1.csv
└── x2.csv

we can:

$ dpm publish
dpm http PUT http://registry.standardanalytics.io/mydpkg/0.0.0
dpm http 201 http://registry.standardanalytics.io/mydpkg/0.0.0
+ mydpkg@0.0.0

and reclone it:

$ dpm clone mydpkg
dpm http GET http://registry.standardanalytics.io/mydpkg?clone=true
dpm http 200 http://registry.standardanalytics.io/mydpkg?clone=true
dpm http GET http://registry.standardanalytics.io/mydpkg/0.0.0/debug
dpm http 200 http://registry.standardanalytics.io/mydpkg/0.0.0/debug
dpm http GET http://registry.standardanalytics.io/mydpkg/0.0.0/csv1
dpm http GET http://registry.standardanalytics.io/mydpkg/0.0.0/csv2
dpm http 200 http://registry.standardanalytics.io/mydpkg/0.0.0/csv1
dpm http 200 http://registry.standardanalytics.io/mydpkg/0.0.0/csv2
.
└─┬ mydpkg
  ├── package.json
  ├─┬ scripts
  │ └── test.r
  ├── x1.csv
  └── x2.csv

To save space, or because you only need one resource, you can instead ask for just a package.json in which all the resource data have been replaced by a URL.

$ dpm get mydpkg
dpm http GET http://registry.standardanalytics.io/mydpkg
dpm http 200 http://registry.standardanalytics.io/mydpkg
.
└─┬ mydpkg
  └── package.json

For instance (using jsontool)

$ cat mydpkg/package.json | json resources | json -c 'this.name === "csv1"' | json 0.url

returns:

http://registry.standardanalytics.io/mydpkg/0.0.0/csv1
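The same lookup can be done in Node without jsontool. This is a minimal sketch, assuming the package.json written by dpm get has the shape shown above (a resources array whose entries carry a name and a url). The helper name resourceUrl is hypothetical and not part of dpm.

```javascript
// Hypothetical helper (not part of dpm): return the url of a named resource
// from a parsed package.json, or undefined if there is no such resource.
function resourceUrl(pkg, name) {
  var matches = (pkg.resources || []).filter(function (r) {
    return r.name === name;
  });
  return matches.length ? matches[0].url : undefined;
}

// Sample data mirroring what `dpm get mydpkg` is described as producing.
var pkg = {
  name: 'mydpkg',
  version: '0.0.0',
  resources: [
    { name: 'csv1', url: 'http://registry.standardanalytics.io/mydpkg/0.0.0/csv1' },
    { name: 'csv2', url: 'http://registry.standardanalytics.io/mydpkg/0.0.0/csv2' }
  ]
};

console.log(resourceUrl(pkg, 'csv1'));
```

In a real project, pkg would come from JSON.parse(fs.readFileSync('mydpkg/package.json', 'utf8')).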

Then you can consume the resources you want with the module data-streams.

Conversely, you can also cache all the resource data (including external URLs) in a standard directory structure, available for all the data packages stored on the registry.

$ dpm get mydpkg --cache
dpm http GET http://registry.standardanalytics.io/mydpkg
dpm http 200 http://registry.standardanalytics.io/mydpkg
dpm http GET http://registry.standardanalytics.io/mydpkg/0.0.0/inline
dpm http GET http://registry.standardanalytics.io/mydpkg/0.0.0/csv2
dpm http GET http://registry.standardanalytics.io/mydpkg/0.0.0/csv1
dpm http 200 http://registry.standardanalytics.io/mydpkg/0.0.0/inline
dpm http 200 http://registry.standardanalytics.io/mydpkg/0.0.0/csv1
dpm http 200 http://registry.standardanalytics.io/mydpkg/0.0.0/csv2
.
└─┬ mydpkg
  ├── package.json
  └─┬ data
    ├── inline.json
    ├── csv1.csv
    └── csv2.csv

Each resource in package.json now has a path property. For instance

$ cat mydpkg/package.json | json resources | json -c 'this.name === "csv1"' | json 0.path

returns

data/csv1.csv
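Once the data is cached, a resource can be read straight from disk and typed using its schema. This is a rough sketch, assuming a simple comma-separated file with a header row and no quoted fields; real CSV files may need a proper parser. The parseCsv helper is hypothetical and the CSV content is made up for illustration.

```javascript
// Hypothetical helper: parse simple CSV text (header row, no quoting) and
// cast each column according to the types in the resource's schema.fields.
function parseCsv(text, schema) {
  var types = {};
  schema.fields.forEach(function (f) { types[f.name] = f.type; });

  var lines = text.trim().split(/\r?\n/);
  var header = lines[0].split(',');

  return lines.slice(1).map(function (line) {
    var row = {};
    line.split(',').forEach(function (value, i) {
      var name = header[i];
      var type = types[name];
      // Only integer and number columns are converted; others stay strings.
      row[name] = (type === 'integer' || type === 'number') ? Number(value) : value;
    });
    return row;
  });
}

// Schema of the csv1 resource from the package.json above.
var schema = { fields: [ { name: 'a', type: 'integer' }, { name: 'b', type: 'integer' } ] };

// Illustrative content only; the real x1.csv is not shown in this README.
var rows = parseCsv('a,b\n1,2\n3,4\n', schema);
console.log(rows);
```

With a cached package, the text would be read with something like fs.readFileSync(path.join('mydpkg', resource.path), 'utf8'), where resource.path is the property shown above.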

Installing data packages as dependencies of your project

Given a package.json with

{
  "name": "test",
  "version": "0.0.0",
  "dataDependencies": {
    "mydpkg": "0.0.0"
  }
}

one can run

$ dpm install
dpm http GET http://registry.standardanalytics.io/versions/mydpkg
dpm http 200 http://registry.standardanalytics.io/versions/mydpkg
dpm http GET http://registry.standardanalytics.io/mydpkg/0.0.0
dpm http 200 http://registry.standardanalytics.io/mydpkg/0.0.0
.
└─┬ data_modules
  └─┬ mydpkg
    └── package.json
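Judging only from the log above, dpm install appears to query a /versions/<name> endpoint and then fetch /<name>/<version> for each entry in dataDependencies. The sketch below simply rebuilds those URLs. The registry API is not documented here, so the paths are inferred, and the helper name dependencyUrls is hypothetical.

```javascript
// URL layout inferred from the `dpm install` log; not a documented API.
var REGISTRY = 'http://registry.standardanalytics.io';

// Hypothetical helper: list the URLs that appear to be requested for each
// entry of dataDependencies in a parsed package.json.
function dependencyUrls(pkg) {
  var deps = pkg.dataDependencies || {};
  return Object.keys(deps).map(function (name) {
    return {
      name: name,
      versions: REGISTRY + '/versions/' + name,
      pkg: REGISTRY + '/' + name + '/' + deps[name]
    };
  });
}

var project = {
  name: 'test',
  version: '0.0.0',
  dataDependencies: { mydpkg: '0.0.0' }
};

console.log(dependencyUrls(project));
```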

Combined with the --cache option, you get:

$ dpm install --cache
dpm http GET http://registry.standardanalytics.io/versions/mydpkg
dpm http 200 http://registry.standardanalytics.io/versions/mydpkg
dpm http GET http://registry.standardanalytics.io/mydpkg/0.0.0
dpm http 200 http://registry.standardanalytics.io/mydpkg/0.0.0
dpm http GET http://registry.standardanalytics.io/mydpkg/0.0.0/inline
dpm http GET http://registry.standardanalytics.io/mydpkg/0.0.0/csv2
dpm http GET http://registry.standardanalytics.io/mydpkg/0.0.0/csv1
dpm http 200 http://registry.standardanalytics.io/mydpkg/0.0.0/inline
dpm http 200 http://registry.standardanalytics.io/mydpkg/0.0.0/csv1
dpm http 200 http://registry.standardanalytics.io/mydpkg/0.0.0/csv2
.
└─┬ data_modules
  └─┬ mydpkg
    ├── package.json
    └─┬ data
      ├── inline.json
      ├── csv1.csv
      └── csv2.csv

dpm aims to bring all the goodness of the npm workflow to your data needs. Run dpm --help to see the available options.

Using dpm programmatically

You can also use dpm programmatically.

var Dpm = require('dpm2');
// conf is assumed to be a configuration object; see bin/dpm
var dpm = new Dpm(conf);

See bin/dpm for examples.

Using dpm with npm

dpm uses the dataDependencies property of package.json and stores the dependencies in a data_modules/ directory, so it can safely be run as an npm post-install script without conflicting with npm.
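One way to wire this up is to make dpm install the postinstall script in package.json, assuming dpm is installed and on the PATH when npm runs its scripts. The sketch below adds that entry to a parsed package.json; the helper name withDpmPostinstall is hypothetical.

```javascript
// Hypothetical helper: return a copy of a parsed package.json whose npm
// postinstall script runs `dpm install`. Other scripts are kept; an
// existing postinstall entry is overwritten.
function withDpmPostinstall(pkg) {
  var copy = JSON.parse(JSON.stringify(pkg));
  copy.scripts = copy.scripts || {};
  copy.scripts.postinstall = 'dpm install';
  return copy;
}

var project = {
  name: 'test',
  version: '0.0.0',
  dataDependencies: { mydpkg: '0.0.0' }
};

console.log(JSON.stringify(withDpmPostinstall(project), null, 2));
```

With this entry in place, running npm install would also run dpm install afterwards and populate data_modules/ next to node_modules/.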

Registry

By default, dpm uses our CouchDB-powered data registry hosted on Cloudant.

Why dpm2?

There is already another dpm being developed, but it leverages npm and the npm registry.

License

MIT