@sohailalam2/markdown-extractor

v0.2.3

Published

3 years ago

Your one solution to extract markdown metadata and content

sohailalam2

Markdown Extractor 🔣

📝 Description

Your one stop for parsing Markdown content. Give your Markdown superpowers by adding metadata.

🔧 Features

Parse Markdown content. We use marked.
Extract YAML metadata about your Markdown content. We use js-yaml.
Convert Markdown to HTML and extract data from DOM nodes by passing selectors. We use cheerio.
Support for both Node.js and the browser.
A standalone browser bundle is also available at dist/bundle.min.js.

💡 How does it work?

For example,

---
title: Awesome Markdown Extractor
id: 101
---

# Abstract

Lorem ipsum dolor...

would be parsed as

{
  metadata: {
    title: "Awesome Markdown Extractor",
    id: 101
  },
  content: {
    abstract: 'Abstract'
  },
  html: "<h1>Abstract</h1>...."
}
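The split between the metadata block and the Markdown body can be sketched in plain JavaScript. This is only an illustration of the idea, not the package's implementation: `splitFrontMatter` is a hypothetical helper, and it handles only flat `key: value` pairs, whereas the package relies on js-yaml for full YAML support.

```javascript
// Hypothetical helper, not part of the package's API.
// Splits a document into its front matter block and its Markdown body.
function splitFrontMatter(source) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(source);
  if (!match) {
    // No front matter found: treat the whole input as the body.
    return { metadata: {}, body: source };
  }

  const metadata = {};
  for (const line of match[1].split(/\r?\n/)) {
    const idx = line.indexOf(':');
    if (idx === -1) continue;
    const key = line.slice(0, idx).trim();
    const raw = line.slice(idx + 1).trim();
    // Simplification: numeric values become numbers, everything else stays a string.
    metadata[key] = raw !== '' && !Number.isNaN(Number(raw)) ? Number(raw) : raw;
  }

  return { metadata, body: match[2].trim() };
}

const sample = [
  '---',
  'title: Awesome Markdown Extractor',
  'id: 101',
  '---',
  '',
  '# Abstract',
  '',
  'Lorem ipsum dolor...',
].join('\n');

const result = splitFrontMatter(sample);
console.log(result.metadata);
console.log(result.body);
```

In the real package the body would then be rendered to HTML and queried with selectors; the sketch stops at the split.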

📝 Prerequisites

Node.js v12 or above. It may work on lower versions, but this has not been tested.

💻 Installation

$ npm install @sohailalam2/markdown-extractor

✅ Usage

See the example for more information.

Given the sample Markdown content below, let's see how we can parse it and extract the metadata:

---
title: Backend Engineer
id: 101
locations: [India, Remote]
department: Engineering
publishDate: 2020-06-27T13:53:26.714Z
tags: [NodeJs, AWS, Serverless, TypeScript]
isDraft: true
---

# Backend Engineer

## Abstract

This is the awesome _abstract_ for the **backend engineering** role. Visit https://github.com to checkout our brand and some amazing content.

## Preferred Qualifications

- AWS experience
- Serverless experience

## Perks

- Industry standard salary
- Awesome team
- Freedom and responsibilities

## Other Details

This is an **amazing** opportunity for _budding engineers_. Apply now!!

const fs = require('fs');
const path = require('path');

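// Inside the repository the build output is loaded from ../dist.
// When the package is installed from npm, the require path would presumably be
// '@sohailalam2/markdown-extractor' instead.
const { parseMarkdown } = require('../dist');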

const markdown = fs.readFileSync(path.join(__dirname, 'job-backend-engineer.md'), 'utf8');

const options = {
  selectors: [
    { selector: '#abstract', parseHtml: true },
    { selector: '#preferred-qualifications' },
    { selector: '#perks', parseHtml: true },
  ],
};

const { metadata, content, html } = parseMarkdown(markdown, options);

// metadata:
//
// {
//   title: 'Backend Engineer',
//   id: 101,
//   locations: [ 'India', 'Remote' ],
//   department: 'Engineering',
//   publishDate: 2020-06-27T13:53:26.714Z,
//   tags: [ 'NodeJs', 'AWS', 'Serverless', 'TypeScript' ],
//   isDraft: true
// }

const abstract = content['#abstract'];
// <p>This is the awesome <em>abstract</em> for the <strong>backend engineering</strong> role. Visit <a href="https://github.com">https://github.com</a> to checkout our brand and some amazing content.</p>

const preferredQualifications = content['#preferred-qualifications'].split('\n');
// [ 'AWS experience', 'Serverless experience' ]

const perks = content['#perks'];

// <ul>
// <li>Industry standard salary</li>
// <li>Awesome team</li>
// <li>Freedom and responsibilities</li>
// </ul>
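The selectors in the example (`#abstract`, `#preferred-qualifications`, `#perks`) mirror the heading text, lowercased and hyphenated. A minimal sketch of building the `options` object from heading names follows, assuming ids are derived this way; the exact id rules of the underlying renderer may differ. `headingToSelector` and `buildOptions` are hypothetical helpers, not part of the package.

```javascript
// Hypothetical helper: derive an id selector from heading text.
// Assumption: ids are the lowercased heading with punctuation removed
// and whitespace replaced by hyphens.
function headingToSelector(headingText) {
  const slug = headingText
    .trim()
    .toLowerCase()
    .replace(/[^\w\s-]/g, '')
    .replace(/\s+/g, '-');
  return `#${slug}`;
}

// Hypothetical helper: build an options object in the shape used above.
// Headings listed in htmlHeadings get parseHtml: true.
function buildOptions(headings, htmlHeadings = []) {
  return {
    selectors: headings.map((heading) => {
      const entry = { selector: headingToSelector(heading) };
      if (htmlHeadings.includes(heading)) entry.parseHtml = true;
      return entry;
    }),
  };
}

const generatedOptions = buildOptions(
  ['Abstract', 'Preferred Qualifications', 'Perks'],
  ['Abstract', 'Perks']
);
console.log(JSON.stringify(generatedOptions, null, 2));
```

The resulting object has the same shape as the hand-written `options` in the example and could be passed to `parseMarkdown` in its place.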

Standalone library in browser

<script src="../dist/bundle.min.js"></script>

<script>
  // ...

  const { metadata, content, html } = MarkdownExtractor.parseMarkdown(markdown, options);

  // ...
</script>
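Since the same `parseMarkdown` function is reachable through the `MarkdownExtractor` global in the browser and through `require` in Node.js, shared code can look it up in one place. The sketch below is an illustration only: `resolveParseMarkdown` is a hypothetical helper, and it assumes the npm package exports `parseMarkdown` under the name `@sohailalam2/markdown-extractor`.

```javascript
// Hypothetical helper: find parseMarkdown in either environment.
// globalObject is typically `window` in the browser; requireFn is `require` in Node.js.
function resolveParseMarkdown(globalObject, requireFn) {
  if (globalObject && globalObject.MarkdownExtractor) {
    // Browser: the standalone bundle exposes a MarkdownExtractor global.
    return globalObject.MarkdownExtractor.parseMarkdown;
  }
  if (typeof requireFn === 'function') {
    // Node.js: assumes the installed package exports parseMarkdown.
    return requireFn('@sohailalam2/markdown-extractor').parseMarkdown;
  }
  throw new Error('markdown-extractor is not available in this environment');
}

// Demonstration with a stand-in global, so the sketch runs without the real bundle.
const fakeWindow = {
  MarkdownExtractor: {
    parseMarkdown: (md) => ({ metadata: {}, content: {}, html: md }),
  },
};
const parseFromBrowser = resolveParseMarkdown(fakeWindow, undefined);
console.log(parseFromBrowser('# Hi').html);
```

Passing the global and the `require` function as arguments keeps the helper free of environment checks and easy to exercise with stand-ins.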
