node-idman

v1.0.3

Identity Management for Git Repositories

Downloads

9

Readme

idman

Map inconsistent git developer metadata to real identities

const idman = require('node-idman');
// Run the default identity merge on a local repository.
const repoStats = idman('/path/to/repo/');
// => see sample-output.json

The above performs the default identity merge for a repo. An alternative merge algorithm may be specified with the optional second argument. For a non-default algorithm, additional arguments may optionally be provided as varargs. The call returns the author and committer of every commit in that repository according to the merged identities, along with the identities themselves (example output).

This is actually just a fork of the original idman with some wrapper code added and some unimportant files removed. When the original idman changes, this repo should still be compatible and able to pull the changes (except maybe for this readme).

Requirements

  • git (obviously)
  • perl 5.16 or higher (git depends on perl, so you probably have it already)
  • node (obviously)
  • some algorithms may require additional things

Output

See sample-output.json for an example. It's the output for this here repository.

The idman output will be a JSON object. It contains the following keys:

identities

An array of identities, each representing an individual contributor. Each identity is a list of [name, e-mail address] tuples.
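As a rough illustration of that shape, the sketch below builds a small hand-written identities array and looks up which identity an e-mail address belongs to. The names and addresses are invented (they are not taken from sample-output.json), and the helper findIdentityIndex is hypothetical, not part of node-idman.

```javascript
// Hand-written example data; real data would come from the `identities`
// key of idman's output.
const identities = [
  [['Jane Doe', 'jane@example.com'], ['jdoe', 'jane@users.example.org']],
  [['Bob', 'bob@example.com']],
];

// Hypothetical helper: return the index of the identity that contains the
// given e-mail address, or -1 if no identity contains it.
function findIdentityIndex(identities, mail) {
  return identities.findIndex((identity) =>
    identity.some(([, address]) => address === mail));
}
```

The returned index is the same kind of integer that the author and committer keys of a commit refer to.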

commits

An object representing all commits in the repository, keyed by their hash. Each individual commit contains the following keys:

  • author, committer: These values are integers referring to indexes in the identities array, or null if no such association exists. Use these to tell who authored or committed this particular commit.
  • author_name, author_mail, committer_name, committer_mail: These are the raw names and e-mails from git. Don't use these for identification: they are raw and the identities aren't merged. Use author and committer instead.
  • repo: The path to the repository's local folder. If you want to run further git commands on it, you might need to append /.git to it.
  • hash: The commit's sha-1 hash.
  • author_date: The date that the commit was authored as a Unix timestamp. Note that this is a string of digits, not an integer.
  • committer_date: The date that the commit was committed.
  • subject: The commit message subject line.
  • body: The rest of the commit message.
  • notes: The notes attached to the commit. Basically a message in addition to the regular commit message.
  • signer: The name or e-mail or whatever else the person who signed the commit put here.
  • signer_key: The signature key of who signed the commit.
  • touched_files, insertions, deletions: The number of modified files, inserted lines and deleted lines in the commit, respectively. Renamed files are taken into account properly, so a rename on its own counts as a single changed file with zero inserted or deleted lines.

See the --find-renames and --find-copies options in git log --help for details.
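To show how the commit keys above fit together, here is a minimal sketch that totals commits and changed lines per merged author identity and converts author_date into a Date. The sample object is hand-written and only fills in the keys it uses. The helper names statsByAuthor and authorDate are hypothetical, and wrapping the counts in Number() is a precaution, since this readme only states the type of author_date.

```javascript
// Hand-written stand-in for idman's output, not real data.
const sampleOutput = {
  identities: [
    [['Jane Doe', 'jane@example.com'], ['jdoe', 'jane@example.com']],
    [['Bob', 'bob@example.com']],
  ],
  commits: {
    aaa111: { author: 0, committer: 0, author_date: '1400000000', insertions: 10, deletions: 2 },
    bbb222: { author: 1, committer: 0, author_date: '1400003600', insertions: 3, deletions: 1 },
    ccc333: { author: 0, committer: null, author_date: '1400007200', insertions: 0, deletions: 5 },
  },
};

// Hypothetical helper: sum commits and changed lines per author identity index.
function statsByAuthor(output) {
  const stats = new Map();
  for (const commit of Object.values(output.commits)) {
    if (commit.author === null) continue; // no identity association
    const entry = stats.get(commit.author) || { commits: 0, insertions: 0, deletions: 0 };
    entry.commits += 1;
    entry.insertions += Number(commit.insertions);
    entry.deletions += Number(commit.deletions);
    stats.set(commit.author, entry);
  }
  return stats;
}

// author_date is a string of digits (seconds since the epoch), so convert it
// to a number and then to milliseconds before building a Date.
function authorDate(commit) {
  return new Date(Number(commit.author_date) * 1000);
}
```

Keying the totals by the author index rather than by author_name or author_mail is what makes the merged identities count as one contributor.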

Structure

idman is the controller that executes all the pieces in lib and pipes them together properly.

parseman gathers all commit information from git and spews out a JSON object for each of them on stdout.

graphman does the identity merging from the commit information it receives from parseman. It can pick from various identity merging algorithms.

assocman receives the results from graphman and the raw commit information from parseman and associates the two, producing the final output. If the algorithm is bad and results in ambiguous associations, assocman will die.

Most of that code has embedded documentation at the bottom of the respective files. You can see it nicely formatted by running perldoc FILE.

Algorithms

default

Like occurrence, but ignores case and strips off .(none) at the end of e-mail addresses, which git seems to randomly attach and remove if the e-mail doesn't contain a dot.

As the name implies, this is the default algorithm.
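The normalization described here could look roughly like the following. This is a JavaScript sketch for illustration only, not idman's actual Perl code, and the function name normalizeArtifact is made up.

```javascript
// Hypothetical helper: lowercase an artifact (name or e-mail address) and
// drop a trailing ".(none)" so that otherwise identical artifacts compare equal.
function normalizeArtifact(artifact) {
  return artifact.toLowerCase().replace(/\.\(none\)$/, '');
}
```

With this, 'Jane@Localhost.(none)' and 'jane@localhost' normalize to the same string.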

occurrence

Merges identities if they contain identical artifacts (names or e-mail addresses).
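One common way to implement this kind of merge is a union-find structure over the raw [name, e-mail] pairs. The sketch below uses that technique in JavaScript. It is not graphman's implementation, the function name mergeByOccurrence is hypothetical, and it assumes that names and e-mail addresses are compared separately (a name is never matched against an address).

```javascript
// Sketch: merge [name, mail] pairs that share an identical name or mail.
function mergeByOccurrence(pairs) {
  // Union-find: parent[i] points toward the representative of pair i.
  const parent = pairs.map((_, i) => i);
  const find = (i) => {
    while (parent[i] !== i) {
      parent[i] = parent[parent[i]]; // path halving
      i = parent[i];
    }
    return i;
  };
  const union = (a, b) => { parent[find(a)] = find(b); };

  // Remember the first pair seen for each artifact; any later pair that
  // shares the artifact is joined to it.
  const firstSeen = new Map();
  pairs.forEach(([name, mail], i) => {
    for (const key of [`name:${name}`, `mail:${mail}`]) {
      if (firstSeen.has(key)) union(i, firstSeen.get(key));
      else firstSeen.set(key, i);
    }
  });

  // Collect the pairs of each connected group into one identity.
  const groups = new Map();
  pairs.forEach((pair, i) => {
    const root = find(i);
    if (!groups.has(root)) groups.set(root, []);
    groups.get(root).push(pair);
  });
  return [...groups.values()];
}
```

Note that merging is transitive: two pairs with nothing in common still end up in the same identity if a third pair shares an artifact with each of them.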

similarity

Like occurrence, but merges identities if their normalized Levenshtein distance is less than a predefined threshold. You must specify a threshold, where 0 < threshold <= 1. You can do this either by passing --threshold NUMBER as a command-line argument or by defining the GRAPHMAN_THRESHOLD environment variable.

This algorithm requires the Text::Levenshtein::XS Perl module. Install it via sudo cpan Text::Levenshtein::XS.
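For a feel of what the threshold means, here is a small JavaScript sketch of a normalized Levenshtein comparison. It is an illustration only and does not use Text::Levenshtein::XS. Dividing the distance by the length of the longer string is an assumption, since this readme does not say how graphman normalizes, and the function names are hypothetical.

```javascript
// Classic dynamic-programming Levenshtein distance, keeping one row at a time.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost, // substitution or match
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Assumption: normalize by the length of the longer string, giving 0..1.
function normalizedDistance(a, b) {
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 0 : levenshtein(a, b) / longest;
}

// Two artifacts count as similar if their distance is below the threshold.
function isSimilar(a, b, threshold) {
  if (!(threshold > 0 && threshold <= 1)) {
    throw new RangeError('threshold must satisfy 0 < threshold <= 1');
  }
  return normalizedDistance(a, b) < threshold;
}
```

Under this normalization, a small threshold only merges near-identical artifacts, while a threshold of 1 merges any two artifacts that have at least one character in a matching position.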

bird

An extension of the similarity algorithm above; the same requirements apply. It implements the algorithm used by Bird et al. in the paper “Mining Email Social Networks”. It does a whole bunch of pre-processing on the identities and pays attention to the difference between usernames and real first and last names.

Papers