decancer
v3.2.8
A library that removes common unicode confusables/homoglyphs from strings.
- Its core is written in Rust and uses a form of binary search to keep lookups fast!
- By default, it can filter 221,529 different Unicode codepoints (19.88% of all codepoints), including:
  - All whitespace characters
  - All diacritics, which also eliminates all forms of Zalgo text
  - Most leetspeak characters
  - Most homoglyphs
  - Several emojis
- Unlike other packages, this package is Unicode bidi-aware: it interprets right-to-left characters the same way an application would render them!
- Its behavior is also highly customizable to your liking!
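To give a rough idea of how a binary search can map confusable codepoints to plain characters, here is a minimal sketch. It is not decancer's actual implementation: the `TABLE` entries and the `lookup` and `cure` names are made up for illustration, and the real library handles far more cases (diacritics, bidi reordering, leetspeak matching, and so on).

```javascript
// Hypothetical table of [confusable, replacement] pairs, converted to codepoints
// and sorted so it can be binary searched. Illustrative only, not decancer's data.
const TABLE = [
  ['ⓡ', 'r'],
  ['Ň', 'n'],
  ['ℕ', 'n'],
  ['ţ', 't'],
  ['乇', 'e'],
  ['𝔽', 'f'],
]
  .map(([ch, replacement]) => [ch.codePointAt(0), replacement])
  .sort((a, b) => a[0] - b[0])

// Binary search for a codepoint; returns its replacement or undefined.
function lookup(codepoint) {
  let lo = 0
  let hi = TABLE.length - 1

  while (lo <= hi) {
    const mid = (lo + hi) >>> 1
    const [key, replacement] = TABLE[mid]

    if (key === codepoint) return replacement
    if (key < codepoint) lo = mid + 1
    else hi = mid - 1
  }

  return undefined
}

// Replace every known confusable; lowercase everything else.
function cure(input) {
  let output = ''

  // for...of iterates by codepoint, so astral characters like 𝔽 stay intact.
  for (const ch of input) {
    output += lookup(ch.codePointAt(0)) ?? ch.toLowerCase()
  }

  return output
}

console.log(cure('ⓡŇℕ'))
```

Each lookup takes O(log n) comparisons against the sorted table, which is why a table with hundreds of thousands of entries can still be searched quickly.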
Installation
In your shell:
npm install decancer
In your code (CommonJS):
const decancer = require('decancer')
In your code (ESM):
import decancer from 'decancer'
Examples
const assert = require('assert')
const cured = decancer('vEⓡ𝔂 𝔽𝕌Ňℕy ţ乇𝕏𝓣 wWiIiIIttHh l133t5p3/-\\|<')
assert(cured.equals('very funny text with leetspeak'))
// WARNING: it's NOT recommended to coerce this output to a JavaScript string
// and process it manually from there, as decancer has its own
// custom comparison measures, including leetspeak matching!
assert(cured.toString() !== 'very funny text with leetspeak')
console.log(cured.toString())
// => very funny text wwiiiiitthh l133t5p3/-\|<
assert(cured.contains('funny'))
cured.censor('funny', '*')
console.log(cured.toString())
// => very ***** text wwiiiiitthh l133t5p3/-\|<
cured.censorMultiple(['very', 'text'], '-')
console.log(cured.toString())
// => ---- ***** ---- wwiiiiitthh l133t5p3/-\|<
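Since comparisons are best left to the cured object rather than a coerced string, a small wrapper can keep that rule in one place. The sketch below assumes a hypothetical helper named `findBlockedWord`; it relies only on the `contains` method shown above and is not part of decancer itself.

```javascript
// Hypothetical helper: returns the first blocked word found in a cured string,
// or null if none match. `cured` is assumed to be the object returned by
// decancer(); only its contains() method is used.
function findBlockedWord(cured, blockedWords) {
  for (const word of blockedWords) {
    if (cured.contains(word)) return word
  }

  return null
}

// Usage (requires the decancer package to be installed):
// const decancer = require('decancer')
// const hit = findBlockedWord(decancer(userInput), ['funny', 'text'])
// if (hit !== null) console.log(`blocked word found: ${hit}`)
```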
Donations
If you want to support my eyes for manually looking at thousands of unicode characters, consider donating! ❤