obscenity v0.4.3

Robust, extensible profanity filter.

Published: 2 months ago
Downloads: 157,684
Vulnerabilities: 0 high, 0 medium, 0 low
Maintainer: jo3-l
Keywords: profanity, profane, obscenities, obscenity, obscene, filter, curse, swear, swearing, vulgar, vulgarity, bad-words, badwords, cuss, cussing

Obscenity

Robust, extensible profanity filter for Node.js.

Why Obscenity?

Accurate: Though Obscenity is far from perfect (as with all profanity filters), it makes reducing false positives as simple as possible: adding whitelisted phrases is as easy as adding a new string to an array, and using word boundaries is equally simple.
Robust: Obscenity's transformer-based design lets it match variants of phrases that other libraries typically miss, e.g. fuuuuuuuckkk, ʃṳ𝒸𝗄, wordsbeforefuckandafter, and so on. There's no need to write out the variants by hand either: adding the single pattern fuck matches all of the cases above by default.
Extensible: With Obscenity, you aren't locked into anything: removing phrases you disagree with from the default set of words is trivial, as is disabling any transformation you don't like (perhaps you find leet-speak decoding too error-prone).
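
To make the "transformer-based" idea above concrete, here is a minimal, self-contained sketch of a normalization pipeline. It is not Obscenity's implementation: the transformer functions, the tiny leet-speak table, and the mild placeholder term darn are all assumptions chosen for illustration. It only shows why one pattern can cover many spellings, and why dropping a transformer changes what matches.

```javascript
// Illustrative only: a toy transformer pipeline, NOT Obscenity's real code.
// Every name below is hypothetical.

// Each transformer maps a string to a normalized string.
const toLowerCase = (text) => text.toLowerCase();

// A deliberately tiny leet-speak table.
const LEET = { '@': 'a', '$': 's', '0': 'o', '1': 'i', '3': 'e' };
const decodeLeet = (text) => [...text].map((ch) => LEET[ch] ?? ch).join('');

// Collapse runs of the same character: "daaarn" -> "darn".
const collapseDuplicates = (text) => text.replace(/(.)\1+/gu, '$1');

// Apply the transformers left to right.
const normalize = (text, transformers) =>
	transformers.reduce((acc, fn) => fn(acc), text);

// Substring search on the normalized text (no word boundaries).
// The term is normalized too, so terms with double letters still compare equal.
function containsTerm(text, term, transformers) {
	return normalize(text, transformers).includes(normalize(term, transformers));
}

const allTransformers = [toLowerCase, decodeLeet, collapseDuplicates];

console.log(containsTerm('D@@@RNNN it', 'darn', allTransformers)); // true
console.log(containsTerm('wordsbeforedarnandafter', 'darn', allTransformers)); // true
// With leet decoding and duplicate collapsing disabled, the variant is missed.
console.log(containsTerm('D@@@RNNN it', 'darn', [toLowerCase])); // false
```

The trade-off is the one the list describes: each extra transformer widens what a pattern matches, which catches more variants but also raises the risk of false positives.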

Installation

$ npm install obscenity
$ yarn add obscenity
$ pnpm add obscenity

Example usage

First, import Obscenity:

const {
	RegExpMatcher,
	TextCensor,
	englishDataset,
	englishRecommendedTransformers,
} = require('obscenity');

Or, in TypeScript/ESM:

import {
	RegExpMatcher,
	TextCensor,
	englishDataset,
	englishRecommendedTransformers,
} from 'obscenity';

Now, we can create a new matcher using the English preset.

const matcher = new RegExpMatcher({
	...englishDataset.build(),
	...englishRecommendedTransformers,
});

Now, we can use our matcher to search for profanities in the text. Here are two examples of what you can do:

Check if there are any matches in some text:

if (matcher.hasMatch('fuck you')) {
	console.log('The input text contains profanities.');
}
// The input text contains profanities.

Output the positions of all matches along with the original word used:

// Pass "true" as the "sorted" parameter so the matches are sorted by their position.
const matches = matcher.getAllMatches('ʃ𝐟ʃὗƈｋ ỹоứ 𝔟ⁱẗ𝙘ɦ', true);
for (const match of matches) {
	const { phraseMetadata, startIndex, endIndex } =
		englishDataset.getPayloadWithPhraseMetadata(match);
	console.log(
		`Match for word ${phraseMetadata.originalWord} found between ${startIndex} and ${endIndex}.`,
	);
}
// Match for word fuck found between 0 and 6.
// Match for word bitch found between 12 and 18.
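
Judging by the sample output above (a seven-unit word reported as 0 to 6), endIndex appears to be inclusive. Under that assumption, the matched text can be recovered from the original input with slice. The payloads below are hand-written for illustration rather than produced by a real matcher, and matchedText is a hypothetical helper, not part of Obscenity.

```javascript
// Hand-written payloads for illustration; in real use they would come from
// matcher.getAllMatches(input, true).
const input = 'fuck you little bitch';
const payloads = [
	{ startIndex: 0, endIndex: 3 },
	{ startIndex: 16, endIndex: 20 },
];

// Assumption: endIndex is inclusive, so add 1 for String.prototype.slice,
// whose end argument is exclusive.
const matchedText = (text, { startIndex, endIndex }) =>
	text.slice(startIndex, endIndex + 1);

for (const payload of payloads) {
	console.log(matchedText(input, payload));
}
// fuck
// bitch
```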

Censoring matched text:

To censor text, use the TextCensor class, which was imported above. The matcher is the one created earlier, so its setup is not repeated here.

// Assumes TextCensor and matcher are already in scope (see the import and matcher setup above).
const censor = new TextCensor();
const input = 'fuck you little bitch';
const matches = matcher.getAllMatches(input);
console.log(censor.applyTo(input, matches));
// %@$% you little **%@%
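
For intuition about what censoring by match position involves, here is a small standalone sketch that masks inclusive index ranges with asterisks. It is not TextCensor's implementation and does not reproduce its default output: censorRanges is a hypothetical helper, and the ranges are hand-written rather than taken from a matcher.

```javascript
// Illustrative sketch only; NOT how TextCensor works internally.
// Replaces each inclusive [startIndex, endIndex] range with mask characters.
function censorRanges(text, ranges, mask = '*') {
	// Sort by start so the output can be built in one left-to-right pass,
	// even if the caller passes unsorted matches.
	const sorted = [...ranges].sort((a, b) => a.startIndex - b.startIndex);
	let result = '';
	let cursor = 0; // index of the first character not yet emitted
	for (const { startIndex, endIndex } of sorted) {
		if (endIndex < cursor) continue; // already covered by an earlier range
		const start = Math.max(startIndex, cursor); // trim overlap with earlier range
		result += text.slice(cursor, start) + mask.repeat(endIndex - start + 1);
		cursor = endIndex + 1;
	}
	return result + text.slice(cursor);
}

const ranges = [
	{ startIndex: 16, endIndex: 20 },
	{ startIndex: 0, endIndex: 3 },
];
console.log(censorRanges('fuck you little bitch', ranges));
// **** you little *****
```

Building the result in a single pass keeps the output the same length as the input, so later ranges never need their indices adjusted.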

This is just a small slice of what Obscenity can do: for more, check out the documentation.

Accuracy

Note: As with all swear filters, Obscenity is not perfect (nor will it ever be). Use its output as a heuristic, and not as the sole judge of whether some content is appropriate or not.

With the English preset, Obscenity (correctly) finds matches in all of the following texts:

you are a little fucker
fk you
ffuk you
i like a$$es

...and it does not match on the following:

the pen is mightier than the sword
i love bananas so yeah
this song seems really banal
grapes are really yummy
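
The second list is interesting because each sentence hides a blacklisted-looking substring (for example, banal contains anal). The sketch below shows, under simplified assumptions, how a whitelist can suppress such false positives. It is not Obscenity's algorithm: findAll and hasMatch are hypothetical helpers, and the rule "drop a hit that lies entirely inside a whitelisted term" is this example's own simplification.

```javascript
// Illustrative sketch only; NOT how Obscenity is implemented.

// Find every occurrence of `term` in `text` as inclusive index ranges.
function findAll(text, term) {
	const ranges = [];
	for (let i = text.indexOf(term); i !== -1; i = text.indexOf(term, i + 1)) {
		ranges.push({ startIndex: i, endIndex: i + term.length - 1 });
	}
	return ranges;
}

// A blacklisted hit is ignored when a whitelisted range fully contains it.
function hasMatch(text, blacklist, whitelist) {
	const lower = text.toLowerCase();
	const safe = whitelist.flatMap((term) => findAll(lower, term));
	return blacklist
		.flatMap((term) => findAll(lower, term))
		.some(
			(hit) =>
				!safe.some(
					(w) => w.startIndex <= hit.startIndex && hit.endIndex <= w.endIndex,
				),
		);
}

const blacklist = ['anal'];
const sentence = 'this song seems really banal';

console.log(hasMatch(sentence, blacklist, [])); // true (a false positive)
console.log(hasMatch(sentence, blacklist, ['banal'])); // false
console.log(hasMatch('anal', blacklist, ['banal'])); // true
```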

Documentation

For a step-by-step guide on how to use Obscenity, check out the guide.

Otherwise, refer to the auto-generated API documentation.

Contributing

Issues can be reported using the issue tracker. If you'd like to submit a pull request, please read the contribution guide first.

Author

GitHub @jo3-l
