wordsoap-regex
v0.1.1
Regular expressions for cleaning up dirty HTML output from Microsoft Word.
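module.exports = {
  // from http://tim.mackey.ie/CleanWordHTMLUsingRegularExpressions.aspx

  // Matches opening and closing Word-specific tags (font, span, xml, del, ins, and namespaced tags such as <o:p>).
  msoTags: /<[\/]?(font|span|xml|del|ins|[ovwxp]:\w+)[^>]*?>/,

  // Matches a tag containing a Word-specific attribute; $1 and $2 capture the text before and after that attribute.
  // The double-quoted alternative is "[^"]*"; the doubled quotes in the original were C# verbatim-string escapes.
  msoAttributes: /<([^>]*)(?:class|lang|style|size|face|[ovwxp]:\w+)=(?:'[^']*'|"[^"]*"|[^\s>]+)([^>]*)>/,
}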
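Neither pattern carries the `g` flag, so a plain `String.prototype.replace` call only touches the first match. The sketch below shows one way the two patterns might be applied to a whole fragment. It is not part of the package: `cleanWordHtml` is a hypothetical helper, the patterns are inlined so the snippet runs standalone (with the double-quoted alternative written as `"[^"]*"`), and the final whitespace tidy-up is an extra step added for readability.

```javascript
// Local copies of the two patterns so the snippet runs without installing the package.
const msoTags = /<[\/]?(font|span|xml|del|ins|[ovwxp]:\w+)[^>]*?>/;
const msoAttributes = /<([^>]*)(?:class|lang|style|size|face|[ovwxp]:\w+)=(?:'[^']*'|"[^"]*"|[^\s>]+)([^>]*)>/;

// Hypothetical helper (an assumption, not an API of wordsoap-regex).
function cleanWordHtml(html) {
  // Rebuild each pattern with the global and case-insensitive flags.
  const tagPattern = new RegExp(msoTags.source, 'gi');
  const attrPattern = new RegExp(msoAttributes.source, 'gi');

  // 1. Drop Word-specific tags entirely, keeping their text content.
  let out = html.replace(tagPattern, '');

  // 2. Each pass removes at most one attribute per tag, so repeat until nothing changes.
  let previous;
  do {
    previous = out;
    out = out.replace(attrPattern, '<$1$2>');
  } while (out !== previous);

  // 3. Extra tidy-up (not from the package): collapse whitespace left inside bare tags.
  return out.replace(/<(\w+)\s+>/g, '<$1>');
}

const sample =
  '<p class="MsoNormal" style="margin: 0cm"><span lang="EN-US">Hello</span><o:p></o:p></p>';
console.log(cleanWordHtml(sample)); // prints "<p>Hello</p>"
```

The loop is needed because `msoAttributes` consumes the whole tag in a single match, so a tag with several Word attributes loses only one of them per pass.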
License
ISC © Raine Lourie