yitizi

v0.1.2

Yitizi

Input a Chinese character. Output all the variant characters of it. 輸入一個漢字,輸出它的全部異體字。 输入一个汉字,输出它的全部异体字。

Usage

Python

pip install yitizi
>>> import yitizi
>>> yitizi.get('和')
['咊', '龢']

JavaScript (Node.js)

npm install yitizi
> const Yitizi = require('yitizi');
> Yitizi.get('和');
[ '咊', '龢' ]

JavaScript (browser)

<script src="https://cdn.jsdelivr.net/npm/yitizi@0.1.2"></script>
> Yitizi.get('和');
[ '咊', '龢' ]

Design

Connections between variant characters can be modeled as a graph with characters as vertices, where two characters are variants of each other if and only if they are directly connected by an edge.

To reduce data redundancy, only a few types of basic connections are stored in the data tables under data/. The full graph, yitizi.json, is computed from them by running build/main.py.
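The graph model can be sketched in a few lines of Python. This is only an illustration of the idea, not the project's code: the edge list, the `adjacency` helper and the in-memory layout are assumptions, and the actual structure of yitizi.json may differ.

```python
from collections import defaultdict


def adjacency(edges):
    """Turn undirected (a, b) edges into {char: sorted list of neighbours}."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {char: sorted(chars) for char, chars in neighbours.items()}


# Illustrative edges only; the real tables live in data/.
edges = [("和", "咊"), ("和", "龢"), ("咊", "龢"), ("閒", "閑"), ("閒", "間")]
graph = adjacency(edges)


def get(char):
    """Variants are the direct neighbours, not the whole connected component."""
    return graph.get(char, [])


print(get("和"))  # ['咊', '龢']
print(get("閑"))  # ['閒']; 間 is two edges away, so it is not listed
```

Looking up only direct neighbours is what keeps 閑 and 間 apart even though both are connected to 閒.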

Basic connections

A basic connection between two variant characters is classified as one of three types: equivalent, intersecting, or simplification.

  • Equivalent "全等": Two characters are equivalent only if they are interchangeable in most texts without a change in meaning. When computing the full graph, this relation is treated as both commutative and transitive, i.e.

    • If A is an equivalent variant of B, then B is an equivalent variant of A;
    • If A is an equivalent variant of B, and B is an equivalent variant of C, then A is an equivalent variant of C.
  • Intersecting "語義交疊": Two characters are intersecting variants if they are interchangeable only in certain cases. This relation is also commutative, but not necessarily transitive. Characters with intersecting variants are arranged in groups (rows in the data files), and each group carries the specific meanings shared by the characters it lists. A character can belong to multiple groups.

    Example: "閒" has two intersecting variants: "閑" and "間", listed in two groups:

    閒閑  # meaning "vacant"
    閒間  # meaning "in the middle"
    閑>闲  # simplified form (same below)
    間>间

    Then in the computed yitizi.json:

    • 閒 and 閑 (闲) are variants of each other;
    • 閒 and 間 (间) are variants of each other;
    • 閑 (闲) and 間 (间) are unrelated.

    Example I-1

    A more complex (though abstract) example:

    =AB  # "=" means equivalent variants
    ACD
    AEFG

    Then:

    • A, B, C and D are variants of one another;
    • A, B, E, F and G are variants of one another;
    • No connections between C (or D) and E (or F/G).

    Example I-2

  • Simplification "簡體": A non-transitive and asymmetric connection. A simplified character is associated only with its traditional form.

    Example 1: "么" is 1) a simplified form of "麼", 2) an equivalent variant of "幺"; "麼" has an equivalent variant "麽", then:

    • 麼, 麽 and 么 are variants of one another;
    • 幺 and 么 are variants of each other;
    • 麼 or 麽 is unrelated to 幺.

    Example S-1

    Example 2: "苧" is 1) a simplified form of "薴", 2) a traditional form of "苎", then:

    • 苧 is a variant of 薴 and 苎;
    • 薴 and 苎 are unrelated.

    Example S-2

    Example 3: "芸" is a simplified form of "藝" (Japanese Shinjitai) and "蕓" (Chinese), and "艺" is also a simplified form of "藝" (Chinese), then:

    • 藝, 芸 and 艺 are variants of one another;
    • 蕓 and 芸 are variants of each other;
    • 藝 or 艺 is unrelated to 蕓.

    Example S-3
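The three connection types can be combined into a full graph roughly as follows. This is a minimal sketch of one reading that reproduces Examples I-2, S-1, S-2 and S-3 above. It is not the logic of build/main.py, and the function name `build_graph` and its input shapes are hypothetical. How simplified forms interact with intersecting groups (as in Example I-1) is deliberately left out.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(equivalent=(), intersecting=(), simplified=()):
    """Hypothetical builder returning {char: set of variant chars}.

    equivalent:   strings of mutually equivalent characters (no "=" prefix)
    intersecting: strings, each one intersecting group
    simplified:   (traditional, simplified) pairs
    """
    # Equivalence is symmetric and transitive, so rows that share a
    # character are merged into one class with a small union-find.
    parent = {}

    def find(c):
        parent.setdefault(c, c)
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path halving
            c = parent[c]
        return c

    for row in equivalent:
        for c in row[1:]:
            parent[find(c)] = find(row[0])

    classes = defaultdict(set)
    for c in list(parent):
        classes[find(c)].add(c)

    def eq_class(c):
        return set(classes[find(c)]) if c in parent else {c}

    graph = defaultdict(set)

    def add_clique(chars):
        for a, b in combinations(chars, 2):
            graph[a].add(b)
            graph[b].add(a)

    for members in classes.values():
        add_clique(members)

    # Intersecting: each group is its own clique, widened by the equivalents
    # of its members. Different groups are never merged (no transitivity).
    for group in intersecting:
        add_clique(set().union(*(eq_class(c) for c in group)))

    # Simplification: a traditional character, its equivalents and its own
    # simplified forms are linked. Nothing is inherited through a chain of
    # simplifications or through the equivalents of the simplified form.
    forms_of = defaultdict(set)
    for trad, simp in simplified:
        forms_of[trad].add(simp)
    for trad, forms in forms_of.items():
        add_clique(eq_class(trad) | forms)

    return dict(graph)


# Example S-1: 么 is linked to 麼, 麽 and 幺, but 幺 stays apart from 麼 and 麽.
s1 = build_graph(equivalent=["幺么", "麼麽"], simplified=[("麼", "么")])
print(sorted(s1["么"]), sorted(s1["幺"]))
```

Keeping each group and each traditional form as a separate clique, rather than taking connected components, is what prevents unrelated characters such as 薴 and 苎 from being linked through a shared neighbour.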

Data source

Note for developers

You need to substitute all the occurrences of the version string before publishing a new release.