@dmitryrechkin/text-email-body-parser
v1.0.3
**Text Email Body Parser is a TypeScript library inspired by Email Reply Parser to parse plain-text email bodies and extract meaningful fragments.**
Text Email Body Parser
This library handles the most common email reply and signature formats in around 10 locales.
Author: Dmitry Rechkin [email protected]
Installation
Install the package using pnpm:
pnpm add @dmitryrechkin/text-email-body-parser
Features
- Strips email reply headers like On DATE, NAME wrote:
- Supports around 10 locales, including English, French, Spanish, Portuguese, Italian, Japanese, Chinese.
- Removes signatures like Sent from my iPhone
- Removes signatures like Best wishes
Usage
import { TextEmailBodyParser } from "@dmitryrechkin/text-email-body-parser";
const parser = new TextEmailBodyParser();
const email = parser.parse(MY_EMAIL_STRING);
console.log(email.map(fragment => fragment.text).join('\n'));
Defining Custom Patterns
The TextEmailBodyParser class allows you to define custom patterns for headers and signatures. These patterns can be passed to the constructor using the TextEmailBodyPatterns interface.
export interface TextEmailBodyPatterns {
readonly HEADER_REGEX: RegExp[];
readonly SIGNATURE_REGEX: RegExp[];
}
Example of Custom Patterns
import { TextEmailBodyParser, TextEmailBodyPatterns } from "@dmitryrechkin/text-email-body-parser";
const customPatterns: TextEmailBodyPatterns = {
HEADER_REGEX: [
// Custom header patterns
/CustomHeaderPattern/,
],
SIGNATURE_REGEX: [
// Custom signature patterns
/CustomSignaturePattern/,
]
};
const parser = new TextEmailBodyParser(customPatterns);
const email = parser.parse(MY_EMAIL_STRING);
console.log(email.map(fragment => fragment.text).join('\n'));
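Before passing custom patterns to the parser, it can help to check that each regex matches the lines you expect. The sketch below is only an illustration: the German-style patterns and the sample email are hypothetical, the interface is copied locally so the snippet compiles on its own, and how the parser applies the patterns internally (per line or against the whole body) is not shown here.

```typescript
// Local copy of the interface shown above, so this sketch is self-contained.
interface TextEmailBodyPatterns {
	readonly HEADER_REGEX: RegExp[];
	readonly SIGNATURE_REGEX: RegExp[];
}

// Hypothetical patterns, written for illustration only.
const germanPatterns: TextEmailBodyPatterns = {
	HEADER_REGEX: [
		// e.g. "Am 12.03.2024 um 10:15 schrieb Max Mustermann:"
		/^Am .+ schrieb .+:$/m,
	],
	SIGNATURE_REGEX: [
		// e.g. "Mit freundlichen Grüßen" alone on a line
		/^Mit freundlichen Grüßen,?[ \t]*$/m,
	],
};

// Hand-written sample email body.
const sample = [
	"Danke, das passt.",
	"",
	"Mit freundlichen Grüßen",
	"Erika",
	"",
	"Am 12.03.2024 um 10:15 schrieb Max Mustermann:",
	"> Passt Ihnen Dienstag?",
].join("\n");

// Check that at least one pattern of each kind matches the sample.
const headerMatches = germanPatterns.HEADER_REGEX.some((re) => re.test(sample));
const signatureMatches = germanPatterns.SIGNATURE_REGEX.some((re) => re.test(sample));

console.log(headerMatches, signatureMatches); // prints "true true"
```

The patterns deliberately omit the `g` flag: a global regex keeps state in `lastIndex` between `test()` calls, which can produce surprising results when the same regex is reused.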
Fragment Structure
The parsed email is broken down into fragments, each represented by the TypeTextEmailBodyFragment interface.
TypeTextEmailBodyFragment Interface
export interface TypeTextEmailBodyFragment {
headerText?: string; // The header text, if any, associated with this fragment
text: string; // The main content of the fragment
depth: number; // The quote depth of the fragment, indicating how deeply nested the quote is
signature: boolean; // Indicates whether the fragment is recognized as a signature
}
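A common use of these fields is to keep only the newly written part of a reply. The following is a minimal sketch, assuming that a depth of 0 means unquoted text; `extractNewReply` is a hypothetical helper, not part of the library, and the fragments are hand-written rather than actual parser output.

```typescript
// Local copy of the interface shown above, so this sketch is self-contained.
interface TypeTextEmailBodyFragment {
	headerText?: string;
	text: string;
	depth: number;
	signature: boolean;
}

// Hypothetical helper: keep top-level, non-signature text only.
// Assumes depth 0 means "not quoted".
function extractNewReply(fragments: TypeTextEmailBodyFragment[]): string {
	return fragments
		.filter((fragment) => fragment.depth === 0 && !fragment.signature)
		.map((fragment) => fragment.text.trim())
		.filter((text) => text.length > 0)
		.join("\n");
}

// Hand-written fragments for illustration; not actual parser output.
const sampleFragments: TypeTextEmailBodyFragment[] = [
	{ text: "Tuesday works for me.", depth: 0, signature: false },
	{ text: "Sent from my iPhone", depth: 0, signature: true },
	{
		headerText: "On Mon, Mar 11, 2024, Alice wrote:",
		text: "Does Tuesday work?",
		depth: 1,
		signature: false,
	},
];

console.log(extractNewReply(sampleFragments)); // prints "Tuesday works for me."
```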
Example of a Parsed Fragment
const parser = new TextEmailBodyParser();
const email = parser.parse(MY_EMAIL_STRING);
email.forEach(fragment => {
console.log(`Header: ${fragment.headerText}`);
console.log(`Text: ${fragment.text}`);
console.log(`Depth: ${fragment.depth}`);
console.log(`Is Signature: ${fragment.signature}`);
});
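Because `headerText` is optional, the loop above prints `Header: undefined` for fragments without a header. One way to handle this is a small formatting helper with an explicit fallback. This is a sketch only: `describeFragment` and its output format are hypothetical and not part of the library.

```typescript
// Local copy of the interface shown above, so this sketch is self-contained.
interface TypeTextEmailBodyFragment {
	headerText?: string;
	text: string;
	depth: number;
	signature: boolean;
}

// Hypothetical helper: one-line summary of a fragment for logging.
function describeFragment(fragment: TypeTextEmailBodyFragment): string {
	const kind = fragment.signature ? "signature" : "text";
	// Fall back to a placeholder when the fragment has no header.
	const header = fragment.headerText ?? "(none)";
	return `[depth=${fragment.depth}] ${kind} header=${header} :: ${fragment.text}`;
}

console.log(describeFragment({ text: "Hi", depth: 0, signature: false }));
// prints "[depth=0] text header=(none) :: Hi"
```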
Inspired By
This library was inspired by Email Reply Parser, which requires RE2 and does not work in non-Node.js environments such as Cloudflare Workers.
Text Email Body Parser has been completely rewritten in TypeScript, retaining only the regex patterns and the core idea of fragments from the original. It has been thoroughly tested, and all unit tests pass.
Contributing
Feel free to fork this project and submit fixes. We may adapt your code to fit the codebase.
You can run unit tests using:
pnpm test