regexator
v1.1.2
String to a regex that is latin script and diacritic insensitive
Regexator
Turns a string into a regex by reversing transliteration: each letter expands to its accented variants. What? Basically, a regex that is diacritic insensitive.
Why?
Sometimes you are looking for déjà vu, but your database is dumb and doesn't understand collations or diacritic insensitivity. It can compare stuff using regex, though, so there ya go.
How?
Suppose you have the word résumé, but it is stored improperly in the database as resume. The user is clever and types it correctly into the search box, and gets nothing. How do you search for all the weird ways people mistype accents? In the same way, you may be looking for Charles de Gaulle or Dont'a Hightower but can't remember where the spaces are. You'll be able to find them even if you search for Charles degaulle or Donta Hightower.
The i flag is enabled by default, unless flags are set manually or the caseinsensitive option is set to false.
import { stringToRegex } from 'regexator';
stringToRegex()('résumé'); // => /r[eEÉéÈèÊêëË]s[úùÚÙüÜuU]m[eEÉéÈèÊêëË]/i;
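To see what that output buys you, the sketch below takes the regex shown above as a plain literal (so it runs without the package installed) and filters a list of stored values. The `rows` array is invented for illustration.

```javascript
// Pattern copied from the example output above for 'résumé'.
const pattern = /r[eEÉéÈèÊêëË]s[úùÚÙüÜuU]m[eEÉéÈèÊêëË]/i;

// Hypothetical database values, made up for this example.
const rows = ['resume', 'Résumé', 'RESUME', 'résume', 'consume', 'rosumé'];

// Keep every row the pattern matches, accents or not.
const matches = rows.filter((row) => pattern.test(row));

console.log(matches); // [ 'resume', 'Résumé', 'RESUME', 'résume' ]
```

Note that the pattern is unanchored, so it also matches inside longer strings such as presumed. Wrap it with ^ and $ yourself if you need whole-value matches.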
Options
options.strong
Type: Boolean
Default: undefined
Converts all characters, including consonants, using extended mappings
stringToRegex({ strong: true })('résumé');
// => /[RrŔŕŖŗŘřȐȑȒȓṘṙṚṛṜṝṞṟ][EeÈèÉéÊêËëĒēĔĕĖėĘęĚěȄȅȆȇȨȩḔḕḖḗḘḙḚḛḜḝẸẹẺẻẼẽẾếỀềỂểỄễỆệ][SsŚśŜŝŞşŠšȘșṠṡṢṣṤṥṦṧṨṩ][UuÙùÚúÛûÜüŨũŪūŬŭŮůŰűŲųǓǔǕǖǗǘǙǚǛǜȔȕȖȗṲṳṴṵṶṷṸṹṺṻỤụỦủỨứỪừỬửỮữỰự][MmḾḿṀṁṂṃ][EeÈèÉéÊêËëĒēĔĕĖėĘęĚěȄȅȆȇȨȩḔḕḖḗḘḙḚḛḜḝẸẹẺẻẼẽẾếỀềỂểỄễỆệ]/i
options.spaces
Type: Boolean
Default: undefined
Allows any number of whitespace characters, dashes (-) or single quotation marks (') between characters
stringToRegex({ spaces: true })('résumé');
// => /r(?:\s|'|-)*[EeÈèÉéÊêËë](?:\s|'|-)*s(?:\s|'|-)*[UuÙùÚúÛûÜü](?:\s|'|-)*m(?:\s|'|-)*[EeÈèÉéÊêËë]/i
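The separator group in that output can be reproduced with a few lines of plain JavaScript. This is a minimal sketch of the idea, not regexator's implementation: `looseSeparators` is a hypothetical helper, and stripping separators from the query before joining is an assumption made here so that Charles degaulle can find Charles de Gaulle.

```javascript
// Same optional-separator group that appears in the output above.
const SEPARATOR = "(?:\\s|'|-)*";

// Hypothetical helper (not part of regexator): drops separators from the
// query, escapes regex metacharacters, then joins the remaining characters
// with the optional-separator group.
function looseSeparators(text, flags = 'i') {
  const chars = Array.from(text).filter((ch) => !/[\s'-]/.test(ch));
  const escaped = chars.map((ch) => ch.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'));
  return new RegExp(escaped.join(SEPARATOR), flags);
}

console.log(looseSeparators('Charles degaulle').test('Charles de Gaulle')); // true
console.log(looseSeparators('Donta Hightower').test("Dont'a Hightower")); // true
console.log(looseSeparators('Donta Hightower').test('Donna Hightower')); // false
```

This sketch ignores diacritics entirely; the real option combines the separator group with the character classes, as the output above shows.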
options.mappings
Type: Object
Overrides the mappings for the given characters; other characters keep their defaults
stringToRegex({
mappings: {
e: 'eéÉ',
},
})('résumé'); // => /r[eéÉ]s[úùÚÙüÜuU]m[eéÉ]/i;
If you want to change the mappings for all instances:
import { charCodes } from 'regexator';
charCodes['*'] = ['[\\S\\s]+'];
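How a per-call override might combine with the defaults can be sketched without the package. Everything below is an assumption for illustration: `sketchStringToRegex`, `DEFAULT_VARIANTS` and the two-entry table are hypothetical, and finding the base letter through Unicode normalization is a swapped-in technique, not necessarily what regexator does internally.

```javascript
// Illustrative defaults with only two entries, copied from the outputs above.
const DEFAULT_VARIANTS = { e: 'eEÉéÈèÊêëË', u: 'úùÚÙüÜuU' };

// Strip combining marks to find the unaccented, lowercase base letter.
function baseLetter(ch) {
  return ch.normalize('NFD').replace(/[\u0300-\u036f]/g, '').toLowerCase();
}

// Escape characters that are special in a regex source.
function escapeRegex(ch) {
  return ch.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical stand-in for stringToRegex: overrides replace single keys,
// every other key keeps its default class.
function sketchStringToRegex(options = {}) {
  const table = { ...DEFAULT_VARIANTS, ...(options.mappings || {}) };
  return (text) => {
    const source = Array.from(text.normalize('NFC'))
      .map((ch) => {
        const variants = table[baseLetter(ch)];
        return variants ? `[${variants}]` : escapeRegex(ch);
      })
      .join('');
    return new RegExp(source, 'i');
  };
}

const overridden = sketchStringToRegex({ mappings: { e: 'eéÉ' } })('résumé');
console.log(overridden.source); // r[eéÉ]s[úùÚÙüÜuU]m[eéÉ]
```

Only the e class changes; u still uses its default variants, which matches the behaviour shown in the options.mappings example.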
options.flags
Type: String
Default: i
Replaces the default flags
stringToRegex({ flags: 'mu' })('résumé');
// => /r[eEÉéÈèÊêëË]s[úùÚÙüÜuU]m[eEÉéÈèÊêëË]/mu;
options.caseinsensitive
Type: Boolean
Default: true
If false, disables the i flag
stringToRegex({ caseinsensitive: false })('résumé');
// => /r[eEÉéÈèÊêëË]s[úùÚÙüÜuU]m[eEÉéÈèÊêëË]/;
options.multiline
Type: Boolean
Default: undefined
Enables the m flag
stringToRegex({ multiline: true })('résumé');
// => /r[eEÉéÈèÊêëË]s[úùÚÙüÜuU]m[eEÉéÈèÊêëË]/im;
options.unicode
Type: Boolean
Default: undefined
Enables the u flag
stringToRegex({ unicode: true })('résumé');
// => /r[eEÉéÈèÊêëË]s[úùÚÙüÜuU]m[eEÉéÈèÊêëË]/iu;
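The four flag-related options interact, so it can help to see them resolved in one place. The function below is a guess at the rules as the option descriptions state them, not regexator's source: `resolveFlags` is a hypothetical name, and letting an explicit flags string win over the boolean options is an assumption.

```javascript
// Hypothetical helper that mirrors the documented flag rules.
function resolveFlags(options = {}) {
  // Assumption: an explicit flags string replaces everything else.
  if (typeof options.flags === 'string') return options.flags;

  // The i flag is on by default unless caseinsensitive is false.
  let flags = options.caseinsensitive === false ? '' : 'i';
  if (options.multiline) flags += 'm';
  if (options.unicode) flags += 'u';
  return flags;
}

console.log(resolveFlags()); // i
console.log(resolveFlags({ flags: 'mu' })); // mu
console.log(resolveFlags({ caseinsensitive: false }) === ''); // true
console.log(new RegExp('resume', resolveFlags({ multiline: true })).flags); // im
console.log(new RegExp('resume', resolveFlags({ unicode: true })).flags); // iu
```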
Caveats
Be aware of RegExp.prototype.exec with the g flag being stateful
The i flag is appended to the RegExp flags if you don't pass any flags to stringToRegex
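The first caveat is easy to trip over when one regex is reused across many values. The snippet below uses a hand-written literal with the g flag rather than a regexator result, and `safeTest` is a hypothetical helper showing one way around the problem.

```javascript
// A global regex remembers where the last match ended in lastIndex.
const globalPattern = /r[eEÉé]s[uU]m[eEÉé]/gi;
const value = 'resume';

const first = globalPattern.test(value); // true, lastIndex is now 6
const second = globalPattern.test(value); // false, search started at index 6
const third = globalPattern.test(value); // true, lastIndex was reset to 0

console.log(first, second, third); // true false true

// Hypothetical helper: reset lastIndex before every call.
function safeTest(regex, text) {
  regex.lastIndex = 0;
  return regex.test(text);
}

console.log(safeTest(globalPattern, value), safeTest(globalPattern, value)); // true true
```

RegExp.prototype.test and RegExp.prototype.exec share the same lastIndex, so the same applies to both. If you only need a yes or no answer per value, leaving out the g flag avoids the issue altogether.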
Compatibility
Works in Node and the browser, but needs polyfills for Array.prototype.reduce, Array.prototype.map and Object.keys depending on how old your target browser is
License
MIT