utf8-decoder
v1.3.0
A simple UTF-8 decoder that stays aligned with the native TextDecoder as closely as possible.
Most of the code is adapted from V8's Utf8DfaDecoder, which is itself a modified version of the Flexible and Economical UTF-8 Decoder.
- Handles malformed UTF-8 better than most solutions.
- Produces the same output as Node's TextDecoder.
- Keeps the default replacement character � (U+FFFD) as-is instead of throwing an error.
- Processes surrogate pairs correctly for emojis.
- Designed with performance in mind; provides performance comparable to the native TextDecoder.
You can run the test cases against other UTF-8 decoders to see the difference, especially for malformed input.
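As a rough illustration of that kind of comparison, the sketch below checks a candidate decoder against the native TextDecoder on a few byte sequences. The names `samples`, `naiveDecode`, and `compareWithNative` are hypothetical helpers written for this example and are not part of this package. The candidate here is a deliberately naive decoder built on `decodeURIComponent`, which throws on malformed input rather than substituting U+FFFD.

```javascript
// Hypothetical sample inputs: two well-formed and two malformed sequences.
const samples = [
  { name: "ascii", bytes: [0x68, 0x69] },
  { name: "emoji", bytes: [0xf0, 0x9f, 0x98, 0x80] }, // U+1F600
  { name: "stray byte", bytes: [0x61, 0xff, 0x62] }, // 0xFF never occurs in UTF-8
  { name: "truncated", bytes: [0xe2, 0x82] }, // first two bytes of a 3-byte sequence
];

// Naive candidate (hypothetical): percent-encode every byte and let
// decodeURIComponent do the work. It throws a URIError on malformed UTF-8.
function naiveDecode(bytes) {
  let encoded = "";
  for (const b of bytes) {
    encoded += "%" + b.toString(16).padStart(2, "0");
  }
  return decodeURIComponent(encoded);
}

// Run a candidate decoder over the samples and report whether its output
// matches the native TextDecoder (non-fatal mode, the default).
function compareWithNative(candidate) {
  const native = new TextDecoder();
  return samples.map(({ name, bytes }) => {
    const input = Uint8Array.from(bytes);
    const expected = native.decode(input);
    let actual;
    try {
      actual = candidate(input);
    } catch (err) {
      actual = `threw ${err.name}`;
    }
    return { name, matches: actual === expected };
  });
}

const report = compareWithNative(naiveDecode);
console.log(report);
```

The naive candidate matches on the well-formed samples and diverges on the malformed ones. To check this package the same way, you could pass its `decode` export as the candidate, assuming it accepts a Uint8Array and returns a string (the README does not spell out the signature).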
Usage
Install package:
# ✨ Auto-detect
npx nypm install utf8-decoder
# npm
npm install utf8-decoder
# yarn
yarn add utf8-decoder
# pnpm
pnpm install utf8-decoder
# bun
bun install utf8-decoder
Import:
ESM (Node.js, Bun)
import { decode } from "utf8-decoder";
CommonJS (Legacy Node.js)
const { decode } = require("utf8-decoder");
CDN (Deno, Bun and Browsers)
import { decode } from "https://esm.sh/utf8-decoder";
Development
- Clone this repository
- Install the latest LTS version of Node.js
- Enable Corepack using `corepack enable`
- Install dependencies using `pnpm install`
- Run interactive tests using `pnpm dev`
Reference
Unicode/UTF-8 is a complex topic. Here are some references for further reading:
I recommend reading chapter 3 of the Unicode Standard for a better understanding of the encoding, invalid sequences, and error handling.
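To make the error handling concrete, here is a minimal sketch of a replacement-based decoder. It is not this package's implementation (which is based on a DFA). It follows the step-by-step UTF-8 decode algorithm from the WHATWG Encoding Standard, which substitutes one U+FFFD per maximal invalid subpart. `sketchDecode` is a hypothetical name, and unlike TextDecoder this sketch does not strip a leading byte order mark.

```javascript
// Hypothetical helper, not part of utf8-decoder: a plain state machine that
// replaces each malformed subsequence with U+FFFD instead of throwing.
function sketchDecode(bytes) {
  let out = "";
  let codePoint = 0;
  let needed = 0; // continuation bytes still expected
  let lower = 0x80; // allowed range for the next continuation byte
  let upper = 0xbf;

  for (let i = 0; i < bytes.length; i++) {
    const b = bytes[i];

    if (needed === 0) {
      if (b <= 0x7f) {
        out += String.fromCharCode(b);
      } else if (b >= 0xc2 && b <= 0xdf) {
        needed = 1;
        codePoint = b & 0x1f;
      } else if (b >= 0xe0 && b <= 0xef) {
        if (b === 0xe0) lower = 0xa0; // rejects overlong 3-byte forms
        if (b === 0xed) upper = 0x9f; // rejects encoded surrogates
        needed = 2;
        codePoint = b & 0x0f;
      } else if (b >= 0xf0 && b <= 0xf4) {
        if (b === 0xf0) lower = 0x90; // rejects overlong 4-byte forms
        if (b === 0xf4) upper = 0x8f; // rejects values above U+10FFFF
        needed = 3;
        codePoint = b & 0x07;
      } else {
        out += "\ufffd"; // invalid lead byte
      }
      continue;
    }

    if (b < lower || b > upper) {
      // Invalid continuation: emit one U+FFFD for the partial sequence,
      // then reprocess this byte as a potential lead byte.
      codePoint = 0;
      needed = 0;
      lower = 0x80;
      upper = 0xbf;
      out += "\ufffd";
      i--;
      continue;
    }

    lower = 0x80;
    upper = 0xbf;
    codePoint = (codePoint << 6) | (b & 0x3f);
    needed--;
    if (needed === 0) {
      // fromCodePoint emits a surrogate pair for code points above U+FFFF.
      out += String.fromCodePoint(codePoint);
      codePoint = 0;
    }
  }

  // Input ended in the middle of a sequence.
  if (needed !== 0) out += "\ufffd";
  return out;
}

console.log(sketchDecode(Uint8Array.from([0x61, 0xff, 0x62]))); // "a\uFFFDb"
```

The `lower` and `upper` bounds carry the restrictions on the second byte of a sequence, which is how overlong forms and encoded surrogates are rejected without a separate validation pass.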
This project is heavily inspired by the following projects:
License
Published under the MIT license. Made by community 💛
🤖 auto updated with automd