mini-censor

v1.0.7

Published

5 months ago

Sensitive-word filtering with support for custom word lists, based on the Aho–Corasick algorithm.

harrypoint

mini-censor censor sensitive word filter sensitive filter

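🎇 Description

A sensitive-word filter built on the Aho–Corasick algorithm. Aho–Corasick is a string-searching algorithm invented by Alfred V. Aho and Margaret J. Corasick that locates occurrences of a finite set of "dictionary" strings within an input text. Unlike ordinary string matching, it matches against all dictionary strings simultaneously. Its amortized running time is approximately linear: roughly the length of the input plus the total number of matches.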
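The idea described above can be illustrated with a minimal sketch of an Aho–Corasick automaton. This is not mini-censor's actual source; the names `AcNode`, `Match`, `buildAutomaton` and `findAll` are hypothetical and exist only for this illustration. The sketch builds a trie from the dictionary, adds failure links with a breadth-first pass, then scans the text once while reporting every dictionary word that ends at each position.

```typescript
// Illustrative Aho–Corasick sketch (hypothetical names, not the mini-censor API).
interface AcNode {
  next: { [ch: string]: number | undefined }; // child state per character
  fail: number;                               // state for the longest proper suffix
  out: string[];                              // dictionary words ending at this state
}

interface Match {
  word: string;
  start: number; // index (in UTF-16 code units) where the word begins
}

function buildAutomaton(words: string[]): AcNode[] {
  const nodes: AcNode[] = [{ next: {}, fail: 0, out: [] }];
  // Step 1: insert every word into the trie.
  for (const word of words) {
    if (word.length === 0) continue;
    let state = 0;
    for (let i = 0; i < word.length; i++) {
      const ch = word.charAt(i);
      let child = nodes[state].next[ch];
      if (child === undefined) {
        child = nodes.length;
        nodes.push({ next: {}, fail: 0, out: [] });
        nodes[state].next[ch] = child;
      }
      state = child;
    }
    nodes[state].out.push(word);
  }
  // Step 2: breadth-first pass computing failure links (depth-1 states keep fail = 0).
  const queue: number[] = [];
  for (const ch of Object.keys(nodes[0].next)) queue.push(nodes[0].next[ch] as number);
  for (let head = 0; head < queue.length; head++) {
    const state = queue[head];
    for (const ch of Object.keys(nodes[state].next)) {
      const child = nodes[state].next[ch] as number;
      let f = nodes[state].fail;
      while (f !== 0 && nodes[f].next[ch] === undefined) f = nodes[f].fail;
      const target = nodes[f].next[ch];
      nodes[child].fail = target === undefined ? 0 : target;
      // Inherit the words recognised by the suffix state.
      nodes[child].out.push(...nodes[nodes[child].fail].out);
      queue.push(child);
    }
  }
  return nodes;
}

function findAll(nodes: AcNode[], text: string): Match[] {
  const matches: Match[] = [];
  let state = 0;
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    // Follow failure links until a transition on `ch` exists or the root is reached.
    while (state !== 0 && nodes[state].next[ch] === undefined) state = nodes[state].fail;
    const nextState = nodes[state].next[ch];
    state = nextState === undefined ? 0 : nextState;
    for (const word of nodes[state].out) {
      matches.push({ word, start: i - word.length + 1 });
    }
  }
  return matches;
}

const automaton = buildAutomaton(["he", "she", "hers"]);
const found = findAll(automaton, "ushers").map((m) => m.word + "@" + m.start);
console.log(found.join(", ")); // she@1, he@2, hers@2
```

The text is scanned exactly once regardless of how many words are in the dictionary, which is where the near-linear behaviour comes from.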

💪 Supported platforms

This package supports both Node.js and browsers.

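Performance

Average time to instantiate with 20,000 random sensitive words: < 96 ms

The test strings consist of randomly generated Chinese characters, letters, and digits. All tests below were run against a tree built from 20,000 random sensitive words; each case was run 6 times and the average taken: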

| No. | String length | Without replacement [replace: false] | With replacement |
| :--: | :--------: | :-------------------------: | :--------: |
| 1 | 1000 | < 1.35 ms | < 1.55 ms |
| 2 | 5000 | < 3.60 ms | < 3.60 ms |
| 3 | 10000 | < 8.10 ms | < 9.81 ms |
| 4 | 20000 | < 15.03 ms | < 16.03 ms |
| 5 | 50000 | < 20.83 ms | < 21.18 ms |
| 6 | 100000 | < 29.02 ms | < 34.45 ms |

Note that in a real production environment the filter is expected to run faster than the test figures above.
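A measurement of this kind could be reproduced roughly as follows. This is only a sketch of the methodology described above, not the author's benchmark script: the helpers `randomChar`, `randomString` and `averageMs` are hypothetical, and the word length of 2 to 6 characters and the 50/50 mix of Chinese and ASCII characters are assumptions, since the original does not state them.

```typescript
// Hypothetical benchmark helpers; none of these names come from mini-censor.
const ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789";

// Returns one random character: a CJK ideograph, a lowercase letter, or a digit.
function randomChar(): string {
  if (Math.random() < 0.5) {
    // U+4E00..U+9FA5 covers the commonly used CJK Unified Ideographs.
    const code = 0x4e00 + Math.floor(Math.random() * (0x9fa5 - 0x4e00 + 1));
    return String.fromCharCode(code);
  }
  return ALPHABET.charAt(Math.floor(Math.random() * ALPHABET.length));
}

function randomString(length: number): string {
  let s = "";
  for (let i = 0; i < length; i++) s += randomChar();
  return s;
}

// Runs `fn` `runs` times and returns the mean wall-clock time in milliseconds.
function averageMs(fn: () => void, runs: number): number {
  let total = 0;
  for (let i = 0; i < runs; i++) {
    const start = Date.now();
    fn();
    total += Date.now() - start;
  }
  return total / runs;
}

// 20,000 random words of 2 to 6 characters (the length range is an assumption).
const words: string[] = [];
for (let i = 0; i < 20000; i++) {
  words.push(randomString(2 + Math.floor(Math.random() * 5)));
}
const sample = randomString(10000);

// Stand-in workload so the sketch runs without the package installed;
// with mini-censor available the callback body would be `censor.filter(sample)`.
const meanMs = averageMs(() => {
  sample.indexOf(words[0]);
}, 6);
console.log("mean over 6 runs: " + meanMs + " ms");
```

`Date.now()` only has millisecond resolution, so for the shorter strings in the table a higher-resolution timer such as `performance.now()` would give more meaningful numbers.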

📦 Installation

npm i -S mini-censor

or

yarn add mini-censor

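🎉 Usage

CommonJS import

const Censor = require("mini-censor").default;
// The constructor takes an array of sensitive words ("敏感词数组" is a placeholder meaning "sensitive word array").
const censor = new Censor(["敏感词数组"]);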

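TypeScript / ES Module import

import Censor from "mini-censor";
// The constructor takes an array of sensitive words ("敏感词数组" is a placeholder meaning "sensitive word array").
const censor = new Censor(["敏感词数组"]);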

Methods

filter(text, options)

The type signature is:

  filter(text: string, options?: {
      replace: boolean;
      replaceWidth?: string;
  }): {
      text: string;
      words: string[];
      pass: boolean;
  };

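This method returns the filtered text, the sensitive words that were matched, and a pass flag, which is false in each example below because the text contains a sensitive word.

import Censor from "mini-censor";
// Dictionary: "敏感词" ("sensitive word") and "数组" ("array").
const censor = new Censor(["敏感词", "数组"]);

// Input text: "这是一个敏感词字符串" ("This is a sensitive-word string").
// Default behavior: each character of a matched word is replaced with "*".
censor.filter("这是一个敏感词字符串");
/**
 * {
 *   text: "这是一个***字符串",
 *   words: ["敏感词"],
 *   pass: false
 * }
 */

// Custom replacement character via replaceWidth.
censor.filter("这是一个敏感词字符串", { replaceWidth: "😊" });
/**
 * {
 *   text: "这是一个😊😊😊字符串",
 *   words: ["敏感词"],
 *   pass: false
 * }
 */

// replace: false reports the matches but leaves the text unchanged.
censor.filter("这是一个敏感词字符串", { replace: false });
/**
 * {
 *   text: "这是一个敏感词字符串",
 *   words: ["敏感词"],
 *   pass: false
 * }
 */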
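As an illustration of how the result of filter() might be consumed, here is a small sketch that rejects a comment when it contains sensitive words. The `FilterResult` and `CensorLike` interfaces simply mirror the documented signature so the example runs without the package installed. The `checkComment` helper and the `stubCensor` object are hypothetical, and the sketch assumes that `pass` is `true` when no sensitive word is found, which the examples above imply but do not show.

```typescript
// Shape of the value returned by filter(), mirroring the documented signature.
interface FilterResult {
  text: string;
  words: string[];
  pass: boolean;
}

// Structural stand-in for a Censor instance (hypothetical name).
interface CensorLike {
  filter(
    text: string,
    options?: { replace: boolean; replaceWidth?: string }
  ): FilterResult;
}

// Hypothetical helper: accept or reject a comment, reporting the words that were hit.
function checkComment(
  censor: CensorLike,
  comment: string
): { ok: boolean; message: string } {
  // replace: false keeps the text untouched; only the match list is needed here.
  const result = censor.filter(comment, { replace: false });
  if (result.pass) {
    return { ok: true, message: "accepted" };
  }
  return { ok: false, message: "blocked words: " + result.words.join(", ") };
}

// Stub that imitates the documented output for a single word, for demonstration only.
const stubCensor: CensorLike = {
  filter: (text) => {
    const hit = text.indexOf("敏感词") !== -1;
    return { text, words: hit ? ["敏感词"] : [], pass: !hit };
  },
};

const verdict = checkComment(stubCensor, "这是一个敏感词字符串");
console.log(verdict.message); // blocked words: 敏感词
```

With the real package, an instance created by `new Censor([...])` would be passed in place of `stubCensor`.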

LICENSE

MIT
