enamdict

v0.1.8

Efficiently query ENAMDICT using Node.js.

node-enamdict

A module for efficiently querying name records from ENAMDICT (a Japanese-English mapping of proper names). Specifically, this module is designed for finding a good English/Kana/Kanji mapping for given names and surnames. Finding these mappings can be especially challenging, and ENAMDICT appears to have the best available mapping. At this time, all other entries in ENAMDICT (place names, full names, company names, etc.) are ignored.

This utility was created to correct artist names in the romaji-name library, which is used in the Ukiyo-e.org service, created by John Resig. All code is available under an MIT license.

Example Usage:

var enamdict = require("./enamdict");

// The dictionary must finish loading before any lookups are made.
enamdict.init(function() {
    // Lookup by Romaji: romaji() and kana() are strings, kanji() is an array.
    var entries = enamdict.find("utagawa");
    console.log("Romaji:", entries.romaji());
    console.log("Kana:", entries.kana());
    console.log("Kanji:", entries.kanji());
    console.log("Type:", entries.type());

    // Lookup by Kanji: romaji() and kana() are arrays, kanji() is a string.
    entries = enamdict.findKanji("曷川");
    console.log("Romaji:", entries.romaji());
    console.log("Kana:", entries.kana());
    console.log("Kanji:", entries.kanji());
    console.log("Type:", entries.type());
});

Sample Output:

# From `find()`
Romaji: Utagawa
Kana: うたがわ
Kanji: [ '哥川', '唄川', '宇多川', '宇田川', '歌川', '詩川', '雅楽川' ]
Type: surname

# From `findKanji()`
Romaji: [ 'katsugawa', 'katsukawa' ]
Kana: [ 'かつがわ', 'かつかわ' ]
Kanji: 曷川
Type: surname

Installation

This package can be installed by running:

npm install enamdict

ENAMDICT Pre-Processing

When this package is installed, a copy of ENAMDICT is downloaded from ftp.edrdg.org if one doesn't already exist. Several optimizations are then performed to speed up search time and to reduce the file size of the dictionary.

  • To start, ENAMDICT is converted from EUC-JP encoding to the more widely used UTF-8 encoding.
  • All entries that aren't "surname", "given", "male" (given), "female" (given), or "unknown" are removed.
  • Extraneous non-name details are stripped from the entry (such as the years in which the individual lived).
  • All entries that aren't an individual name part are removed. (e.g. "hiroshige" is kept but "utagawa hiroshige" is removed)
  • Only the "romaji", "kana", "kanji", and "type" fields are preserved; everything else is removed.
  • All the entries are then sorted by their romaji name (to improve lookup performance).

This is all placed into a new enamdict.gz file in the same directory as the enamdict.js script itself. For comparison, the original ENAMDICT file is 7.2MB, whereas the reduced one is only 2.8MB.
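
The filtering and sorting steps listed above can be sketched roughly as follows. This is only an illustration, not the package's actual pre-processing script: the record objects, the "person" and "place" type labels, the extra field, and the reduceEntries helper are all assumptions made for the example, and the download, EUC-JP conversion, and gzip steps are left out.

```javascript
// Hypothetical, already-parsed records (the raw ENAMDICT line format is not shown here).
var rawEntries = [
    { romaji: "utagawa", kana: "うたがわ", kanji: "歌川", type: "surname", extra: "dropped" },
    { romaji: "hiroshige", kana: "ひろしげ", kanji: "広重", type: "given" },
    { romaji: "utagawa hiroshige", kana: "うたがわひろしげ", kanji: "歌川広重", type: "person" },
    { romaji: "toukyou", kana: "とうきょう", kanji: "東京", type: "place" }
];

// Types that survive the filter, per the list above.
var KEPT_TYPES = ["surname", "given", "male", "female", "unknown"];

// Hypothetical helper that applies the filter, trim, and sort steps in order.
function reduceEntries(entries) {
    return entries
        // Drop place names, company names, full names, and so on.
        .filter(function(entry) { return KEPT_TYPES.indexOf(entry.type) >= 0; })
        // Drop anything that is more than a single name part.
        .filter(function(entry) { return entry.romaji.indexOf(" ") < 0; })
        // Keep only the four preserved fields.
        .map(function(entry) {
            return { romaji: entry.romaji, kana: entry.kana, kanji: entry.kanji, type: entry.type };
        })
        // Sort by romaji, which is what makes an ordered lookup possible later.
        .sort(function(a, b) {
            return a.romaji < b.romaji ? -1 : a.romaji > b.romaji ? 1 : 0;
        });
}

var reduced = reduceEntries(rawEntries);
console.log(reduced.map(function(entry) { return entry.romaji; }));
```

In this sketch only the two single-part personal names remain, ordered by their romaji.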

Methods

.init(callback)

Asynchronously loads the previously-generated reduced ENAMDICT. Must be called before attempting to call .find() or .findKanji().

.find(romajiName)

Finds matching entries by Romaji name (English name). This is the default search mechanism; the search index is optimized for this particular method. Returns an Entries object.
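
To illustrate why a romaji-sorted index helps here, the sketch below uses a binary search to find the run of entries sharing a romaji key. Whether the module itself uses a binary search is an assumption; the index data, the findByRomaji helper, and the lowercasing of the query are hypothetical and exist only for this example.

```javascript
// Hypothetical reduced index, already sorted by romaji.
var index = [
    { romaji: "hiroshige", kana: "ひろしげ", kanji: "広重", type: "given" },
    { romaji: "utagawa", kana: "うたがわ", kanji: "宇田川", type: "surname" },
    { romaji: "utagawa", kana: "うたがわ", kanji: "歌川", type: "surname" },
    { romaji: "yoshida", kana: "よしだ", kanji: "吉田", type: "surname" }
];

// Hypothetical helper: returns every entry whose romaji equals `name`.
function findByRomaji(sorted, name) {
    var key = name.toLowerCase(); // Assumption: keys are stored in lowercase.
    var low = 0;
    var high = sorted.length;

    // Binary search for the first position whose romaji is >= key.
    while (low < high) {
        var mid = (low + high) >> 1;
        if (sorted[mid].romaji < key) {
            low = mid + 1;
        } else {
            high = mid;
        }
    }

    // Collect the run of equal keys that starts at that position.
    var matches = [];
    while (low < sorted.length && sorted[low].romaji === key) {
        matches.push(sorted[low]);
        low += 1;
    }
    return matches;
}

var found = findByRomaji(index, "Utagawa");
console.log(found.map(function(entry) { return entry.kanji; }));
```

Because the entries are sorted, all records for one romaji name sit next to each other, so a single search locates the whole group.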

.findKanji(kanjiName)

Finds matching entries by Kanji name (Japanese name). The search index is NOT optimized for this particular method and may be slow. Returns an Entries object.
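
A Kanji lookup cannot take advantage of an index ordered by romaji, which is one plausible reason it is slower. The sketch below assumes a plain linear scan; the index data and the findByKanji helper are hypothetical and are not taken from the module's source.

```javascript
// Hypothetical reduced index, sorted by romaji (not by kanji).
var kanjiIndex = [
    { romaji: "katsugawa", kana: "かつがわ", kanji: "曷川", type: "surname" },
    { romaji: "katsukawa", kana: "かつかわ", kanji: "曷川", type: "surname" },
    { romaji: "utagawa", kana: "うたがわ", kanji: "歌川", type: "surname" }
];

// Hypothetical helper: every entry has to be inspected, since the
// ordering of the index says nothing about where a given kanji lives.
function findByKanji(entries, kanji) {
    return entries.filter(function(entry) {
        return entry.kanji === kanji;
    });
}

var readings = findByKanji(kanjiIndex, "曷川").map(function(entry) {
    return entry.romaji;
});
console.log(readings);
```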

Models

Entries

The result object returned from the .find() and .findKanji() methods. Holds a collection of entries that are then used in aggregate.

.entries()

Returns an array of objects representing matching entries. The objects have the following properties:

  • romaji: A string holding an English (Romaji) representation of a name.
  • kana: A string holding a Kana representation of a name.
  • kanji: A string holding a Kanji representation of a name.
  • type: A string that represents the type of the name. Possible values are: "surname", "given", or "unknown".

.type()

Returns the most common type of the name, aggregated from all matching entries. For example, if 5 entries were found, 3 of which were "surname", 1 of which was "given", and 1 of which was "unknown", then this method would return "surname". Returns the same possible values as the type property itself.
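
The aggregation described above amounts to a simple tally. A minimal sketch, assuming entry objects shaped like those returned by .entries(); the mostCommonType helper, its tie-breaking (the first type to reach the highest count wins), and its null result for an empty list are assumptions rather than documented behavior.

```javascript
// Hypothetical helper: returns the type that occurs most often.
function mostCommonType(entries) {
    var counts = {};
    var best = null;
    var bestCount = 0;

    entries.forEach(function(entry) {
        counts[entry.type] = (counts[entry.type] || 0) + 1;
        // Only a strictly larger count replaces the current leader.
        if (counts[entry.type] > bestCount) {
            best = entry.type;
            bestCount = counts[entry.type];
        }
    });

    return best;
}

// The example from the text: 3 surnames, 1 given name, 1 unknown.
var sample = [
    { type: "surname" },
    { type: "given" },
    { type: "surname" },
    { type: "unknown" },
    { type: "surname" }
];
console.log(mostCommonType(sample));
```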

.kana()

If a query was done with .find() then this will return a string representing the Kana reading of the name.

If a query was done with .findKanji() then this will return an array of all the possible Kana readings of the Kanji.

.romaji()

If a query was done with .find() then this will return a string representing the Romaji reading of the name.

If a query was done with .findKanji() then this will return an array of all the possible Romaji readings of the Kanji.

.kanji()

If a query was done with .find() then this will return an array of all the possible Kanji versions of the name.

If a query was done with .findKanji() then this will return a string representing the Kanji version of the name.
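
Since .kana(), .romaji(), and .kanji() return either a string or an array depending on which lookup was used, calling code may want to normalize the result before looping over it. The toList helper below is hypothetical and is not part of the module.

```javascript
// Hypothetical helper: wraps a lone string in an array, leaves arrays as-is.
function toList(value) {
    return Array.isArray(value) ? value : [value];
}

// Values shaped like the sample output: a string from .find(),
// an array from .findKanji().
var fromFind = toList("うたがわ");
var fromFindKanji = toList(["かつがわ", "かつかわ"]);

fromFind.concat(fromFindKanji).forEach(function(reading) {
    console.log("Reading:", reading);
});
```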