@vereign/lib-mime
v1.1.9
A utility library for parsing MIME with additional vendor-specific features (Outlook/Gmail)
MIME parser
Prerequisites
Node.js v12
yarn package manager
Installation
- Set up authentication to the company NPM account
$ yarn add @vereign/lib-mime
or
$ npm install @vereign/lib-mime
Usage
Basic sample
import MIMEParser from "@vereign/lib-mime";
const mimeParser = new MIMEParser(mimeString);
const html = await mimeParser.getHTML();
const plain = await mimeParser.getPlain();
const attachments = mimeParser.getAttachments();
const from = mimeParser.getGlobalHeaderValue("from");
const to = mimeParser.getGlobalHeaderValue("to");
const cc = mimeParser.getGlobalHeaderValue("cc");
const bcc = mimeParser.getGlobalHeaderValue("bcc");
const subject = mimeParser.getGlobalHeaderValue("subject");
Custom DOM parser. The example below uses @vereign/dom.
import MIMEParser from "@vereign/lib-mime";
import { DOM } from "@vereign/dom";
const mimeParser = new MIMEParser(mimeString);
mimeParser.parseHTML = (htmlString: string) => {
  return new DOM(htmlString).window.document;
};
Separation of the replied/forwarded fragments from the HTML part of the MIME body
Gmail
Gmail wraps forwarded/replied parts in <div class="gmail_quote"/>.
The goal is to extract only the top-level ones, ignoring quotes nested inside other quotes.
HTML sample:
<div dir="ltr">
Hello!
<br />
<!-- Consider everything within as the quoted fragment -->
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">
---------- Forwarded message ---------<br />
From:
<strong class="gmail_sendername" dir="auto">Sender Name</strong>
<span dir="auto">
<a href="mailto:[email protected]">[email protected]</a>
</span>
<br />
Date: Tue, Feb 9, 2021 at 6:29 PM<br />
Subject: Fwd: Gmail-Gmail-Forward<br />
To: Recipient Name
<a href="mailto:[email protected]">[email protected]</a>
</div>
</div>
</div>
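The top-level extraction described above could be sketched as follows. This is a minimal illustration, not the library's actual implementation. It walks a simplified tree and stops descending once it meets a gmail_quote element, so quotes nested inside another quote are not reported separately. The NodeLike interface and the hasClass and findTopLevelQuotes functions are hypothetical names introduced for this example.

```typescript
// Hypothetical minimal stand-in for a DOM element, so the sketch runs without a DOM library.
interface NodeLike {
  className: string;
  children: NodeLike[];
}

// True when the space-separated class list contains the given class name.
function hasClass(node: NodeLike, name: string): boolean {
  return node.className.split(/\s+/).indexOf(name) !== -1;
}

// Depth-first walk in document order. Once a quote element is found,
// its subtree is not searched, so only top-level quotes are returned.
function findTopLevelQuotes(root: NodeLike, quoteClass: string = "gmail_quote"): NodeLike[] {
  const found: NodeLike[] = [];
  for (const child of root.children) {
    if (hasClass(child, quoteClass)) {
      found.push(child);
    } else {
      found.push(...findTopLevelQuotes(child, quoteClass));
    }
  }
  return found;
}
```

With a real DOM document, a similar result could probably be obtained by selecting all .gmail_quote elements and keeping only those without a .gmail_quote ancestor.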
Outlook
Outlook.com
The Outlook.com engine uses an empty <div/> element with the id appendonsend as a separator.
Sometimes the id might be prefixed with x_ as part of their anti-XSS protection.
Everything below this separator can be considered the forwarded/replied part.
HTML sample
<body>
<div>
<span>Hello!</span>
<div style="margin: 0px; background-color: rgb(255, 255, 255)">and</div>
</div>
<div>
<!-- Consider everything below as the quoted fragment -->
<div id="appendonsend"></div>
<hr tabindex="-1" style="display: inline-block; width: 98%" />
<div id="divRplyFwdMsg" dir="ltr">
<font face="Calibri, sans-serif" color="#000000" style="font-size: 11pt"
><b>From:</b> Igor Markin <[email protected]><br /><b
>Sent:</b
>
Tuesday, February 9, 2021 6:32 PM<br /><b>To:</b> Stephan Morphis
<[email protected]><br /><b>Subject:</b> Fw:
Outlook-Outlook-Forwarded</font
>
<div> </div>
</div>
</div>
</body>
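The separator lookup described above could be sketched like this. It is an illustration under stated assumptions, not the library's code: the ElementLike interface is a hypothetical subset of a DOM element, the prefixed id is assumed to be x_appendonsend, and the sketch returns only the siblings that follow the (empty) separator.

```typescript
// Hypothetical subset of a DOM element, enough for the sketch to run without a DOM library.
interface ElementLike {
  id: string;
  children: ElementLike[];
  nextElementSibling: ElementLike | null;
}

// Plain id plus the variant assumed to be produced by the anti-XSS prefixing.
const SEPARATOR_IDS: string[] = ["appendonsend", "x_appendonsend"];

// Returns the first element, in document order, whose id marks the separator.
function findSeparator(root: ElementLike): ElementLike | null {
  if (SEPARATOR_IDS.indexOf(root.id) !== -1) return root;
  for (const child of root.children) {
    const hit = findSeparator(child);
    if (hit) return hit;
  }
  return null;
}

// Collects every element sibling that follows the separator.
function extractOutlookComQuote(root: ElementLike): ElementLike[] {
  const separator = findSeparator(root);
  const quoted: ElementLike[] = [];
  let current = separator ? separator.nextElementSibling : null;
  while (current) {
    quoted.push(current);
    current = current.nextElementSibling;
  }
  return quoted;
}
```

An empty result means no separator was found, in which case a caller would likely treat the whole body as original content.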
Outlook for Office 365, version 16.48 (Build 21041102) [MacOS]
This version does not provide an explicit identifier for the forwarded/replied content. However, the content has a specific structure and style. The algorithm looks for the first occurrence of the structure shown below and treats all subsequent siblings as belonging to the forwarded/replied fragment.
<div
style="border: none; border-top: solid #b5c4df 1pt; padding: 3pt 0cm 0cm 0cm;"
/>
Outlook Desktop allows the end user to completely alter the contents and style of the forwarded/replied parts, which can make them unrecognizable to the MIME parser. In that case, the library will not be able to detect them.
HTML sample
<body lang="en-RU" link="#0563C1" vlink="#954F72" style="word-wrap: break-word">
<div class="WordSection1">
<p class="MsoNormal">
<span lang="EN-US">Hello!<o:p></o:p></span>
</p>
<p class="MsoNormal"><o:p> </o:p></p>
<!-- Consider everything below as the quoted fragment -->
<div
style="border: none; border-top: solid #b5c4df 1pt; padding: 3pt 0cm 0cm 0cm;"
>
<p class="MsoNormal">
<b><span style="font-size: 12pt; color: black">From: </span></b
><span style="font-size: 12pt; color: black"
>Stephan Morphis <[email protected]><br /><b>Date: </b
>Monday, 26 April 2021, 18:31<br /><b>To: </b>Stephan Morphis
<[email protected]><br /><b>Subject: </b>Re:
macos-outlook-outlook-direct</span
><span
style="font-size: 12pt; color: black; mso-fareast-language: EN-GB"
><o:p></o:p
></span>
</p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<p class="MsoNormal"><span lang="EN-US">Back to you!</span><o:p></o:p></p>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
</body>
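Because this version is recognized by style rather than by id, the match has to tolerate differences in whitespace and letter case. One way this could be approached is to normalize the inline style before comparing it, as in the sketch below. The function names are hypothetical, the declaration splitting is deliberately naive (it assumes no semicolons inside values), and the library may match the structure differently.

```typescript
// Splits an inline style string into lower-cased property/value pairs
// with runs of whitespace collapsed to a single space.
function parseInlineStyle(style: string): Record<string, string> {
  const declarations: Record<string, string> = {};
  for (const part of style.split(";")) {
    const colon = part.indexOf(":");
    if (colon === -1) continue; // skip empty or malformed fragments
    const property = part.slice(0, colon).trim().toLowerCase();
    const value = part.slice(colon + 1).trim().toLowerCase().replace(/\s+/g, " ");
    if (property) declarations[property] = value;
  }
  return declarations;
}

// True when the style matches the macOS Outlook separator shown in this section.
function isMacOutlookSeparator(style: string | null): boolean {
  if (!style) return false;
  const d = parseInlineStyle(style);
  return (
    d["border"] === "none" &&
    d["border-top"] === "solid #b5c4df 1pt" &&
    d["padding"] === "3pt 0cm 0cm 0cm"
  );
}
```

Comparing parsed declarations instead of raw strings keeps the check stable if a mail client reorders the properties or drops the trailing semicolon.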
Outlook for Office 365, version 2103 (Build 13901.20462) [Windows]
As with the macOS version, this version does not provide an explicit identifier for the forwarded/replied content.
It has a similar structure and style, with slight differences.
The algorithm looks for the first occurrence of the structure shown below and treats all subsequent siblings
as belonging to the forwarded/replied fragment.
<div>
<div
style="border: none; border-top: solid #e1e1e1 1pt; padding: 3pt 0in 0in 0in;"
>
...
</div>
</div>
Outlook Desktop allows the end user to completely alter the contents and style of the forwarded/replied parts, which can make them unrecognizable to the MIME parser. In that case, the library will not be able to detect them.
HTML sample
<body lang="EN-US" link="#0563C1" vlink="#954F72" style="word-wrap: break-word">
<div class="WordSection1">
<p class="MsoNormal">Hello!<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<!-- Consider everything below as the quoted fragment -->
<div>
<div
style="border: none; border-top: solid #e1e1e1 1pt; padding: 3pt 0in 0in 0in;"
>
<p class="MsoNormal">
<b>From:</b> Rosen Georgiev <[email protected]> <br />
<b>Sent:</b> Tuesday, April 27, 2021 10:46 AM<br />
<b>To:</b> Rosen Georgiev <[email protected]><br />
<b>Subject:</b> FW: 2 forwards simple text<o:p></o:p>
</p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Forward 1<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
</body>
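The Windows variant differs mainly in the extra wrapper <div> and the border color. A rough sketch of the structural match follows. It is not the library's implementation: StyledElement is a hypothetical subset of a DOM element, and treating the wrapper itself as the start of the fragment is an assumption based on the comment in the sample above.

```typescript
// Hypothetical subset of a DOM element; `style` holds the raw inline style attribute.
interface StyledElement {
  style: string | null;
  children: StyledElement[];
  nextElementSibling: StyledElement | null;
}

// Border declaration used by the Windows separator in the sample above.
const WINDOWS_BORDER_TOP = /border-top:\s*solid\s+#e1e1e1\s+1pt/i;

// True when the element's first child carries the separator style.
function wrapsWindowsSeparator(el: StyledElement): boolean {
  const inner = el.children[0];
  if (!inner || inner.style === null) return false;
  return WINDOWS_BORDER_TOP.test(inner.style);
}

// First wrapper in document order, or null when the structure is absent.
function findWindowsQuoteStart(root: StyledElement): StyledElement | null {
  for (const child of root.children) {
    if (wrapsWindowsSeparator(child)) return child;
    const nested = findWindowsQuoteStart(child);
    if (nested) return nested;
  }
  return null;
}

// The wrapper plus all of its following siblings form the quoted fragment.
function extractWindowsQuote(root: StyledElement): StyledElement[] {
  const fragment: StyledElement[] = [];
  let current: StyledElement | null = findWindowsQuoteStart(root);
  while (current) {
    fragment.push(current);
    current = current.nextElementSibling;
  }
  return fragment;
}
```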