npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2024 – Pkg Stats / Ryan Hefner

@iceylan/huffmanjs

v0.0.2

Published

A javascript library that compress and uncompress given string using Huffman algorithm.

Downloads

7

Readme

HuffmanJS

This is a port of the Huffman algorithm in pure JavaScript.

It can compress given string and put huffman code into it. It can also decompress original data using just the compressed binary.

If we want to export huffman codes, it can also do this. So we can store it on the cloud and then compressed part can circulate around with the light form of it.

This library can work on browsers too. So the sky is the limit of the possibilities.

Installation

npm install @iceylan/huffmanjs

Usage

Compression

import { compress } from "@iceylan/huffmanjs";

const [ compressedBinary, huffmanCodes ] = compress( "Hello world!" );

Explanation of the Binary Data

compressedBinary holds the compressed binary data.

íce9$ !Hdelo
rwíSºí
Magic Bytes

The first 4 bytes íce9 of the binary data is the magic bytes of the library which we can easily change to any length of something else we want from lib/constants.js file.

Zero Padding

Next byte indicates the length of the zero padding that added to the encoded data's latest byte to complete it to 8 bits. It was a number from 0 to 7 and we used it as a decimal character code and put there its corresponding character. So, we can easily decode it with "".charCodeAt(0) method.

Meta Datas

Next 2 bytes are encoded form of the length of the serialized (encoded) huffman codes. 16 bit can hold 1 to 65536 numbers. That means as long as the huffman table does not exceed the 64 kilobyte limit, there is no problem.

Huffman Codes

It uses fixed length byte allocation technique to encode the Huffman codes. Every {decimal:binary string} pair lives in 32 bit capacity.

It converts all the character that used in the original data to the decimal numbers. So multibyte characters like emojis or chinese characters are also supported.

It allocates 16 bit for the decimal representation of the character which it can be 0 to 65535. It encodes them into binary form and adds the missing bits at the start as zeros and convert them into characters. So first 2 byte will give us a char code.

After that, it allocates 16 bit for the huffman code that calculated for the character. Huffman codes mostly uses 2-13 bits which 16 was fair enough. It also adds missed bits at the start as zeros to complete it to 16 bits. After that, it converts them into characters.

Compressed Data

And the latest 5 bytes íSºí represents the compressed form of Hello world!. To decode it, we need to convert characters to decimal numbers. Then we need to convert decimal numbers to binary form. Then we need to remove the padding zeros. And finally, we can start search the Huffman codes in that bit stack. If we find the huffman code, we can replace it with the character. At the end, we would get the original data.

Serialization and Deserialization of the Huffman Codes

Huffman codes is a map to convert characters into ones and the zeros as string and backwards. After the library calculates the huffman code for all the characters in the original data, it converts characters into short huffman codes. Then it divide them into groups of 8 and reencode them into decimals and the decimals into characters. The easiest way to get the original data from the compressed data is use this map.

Calculated Huffman codes will look like this:

{
	108: "110",
	101: "1111",
	111: "1010",
	114: "1110",
	119: "1011",
	32: "10011",
	33: "10010",
	72: "10001",
	100: "10000"
}

Even if we remove the whitespaces it could use 120 bytes. But the library provides Huffman code serialization and deserialization ability. If we want to store the code table in the somewhere else rather than the binary data, we can serialize it to save some space.

import { serializeHuffmanCode } from "@iceylan/huffmanjs";

const serialized = serializeHuffmanCode( huffmanCodes );
 !Hdelo
rw

and the length will be (9 entry x 32 bits = 288 bits = 36 bytes).

We can deserialize the code table and use the result to pass decompress method to decompress the headless compressed binary data.

import { unserializeHuffmanCode } from "@iceylan/huffmanjs";

const unserialized = unserializeHuffmanCode( serialized );

Result will be an object that looks the same as the original huffman code and decompress method will accept it to decompress the binary data.

Decompression

We can easily decompress the compressed binary data.

Standalone Decompression

With this scenario, we can decompress the compressed binary data without huffman code map. The binary has it's own Huffman code in it so it can decompress itself.

import { decompress } from "@iceylan/huffmanjs";

const decompressed = decompress( compressedBinary );
// Hello world!

Decompression with External Code Table

With this scenario, we can decompress the compressed binary data with external code table. The binary will be allowed to contain its own Huffman code in it but this time, externally given Huffman code map will be used to decompress the data. If it fails then it will warn and try to decompress the data with the Huffman code that is stored in the binary. If the binary doesn't have it then it will throw an error.

import { decompress } from "@iceylan/huffmanjs";

const decompressed = decompress( compressedBinary, unserialized );
// Hello world!