to-pure-markdown
v3.0.1
Turn HTML into text formatted with Markdown -- no HTML tags allowed in the output.
to-pure-markdown
An HTML to Markdown converter written in JavaScript.
The API is as follows:
toMarkdown(stringOfHTML, options);
Note: to-markdown v2 runs on Node 4+. For a version compatible with Node 0.10 to 0.12, please use to-markdown v1.x.
Installation
Browser
Download the compiled script located at dist/to-pure-markdown.js.
<script src="PATH/TO/to-pure-markdown.js"></script>
<script>toMarkdown('<h1>Hello world!</h1>')</script>
Node.js
Install the to-pure-markdown module:
$ npm install to-pure-markdown
Then you can use it like below:
var toMarkdown = require('to-pure-markdown');
toMarkdown('<h1>Hello world!</h1>');
(Note that, as of v1, it is no longer necessary to call .toMarkdown on the required module.)
Options
converters (array)
to-pure-markdown can be extended by passing in an array of converters to the options object:
toMarkdown(stringOfHTML, { converters: [converter1, converter2, …] });
A converter object consists of a filter and a replacement. This example from the source replaces code elements:
{
  filter: 'code',
  replacement: function(content) {
    return '`' + content + '`';
  }
}
filter (string|array|function)
The filter property determines whether or not an element should be replaced. DOM nodes can be selected by tag name, using either a string or an array of strings:
filter: 'p' will select p elements
filter: ['em', 'i'] will select em or i elements
Alternatively, the filter can be a function that returns a boolean depending on whether a given node should be replaced. The function is passed a DOM node as its only argument. For example, the following will match any span element with an italic font style:
filter: function (node) {
  return node.nodeName === 'SPAN' && /italic/i.test(node.style.fontStyle);
}
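The three filter forms can be compared side by side with a small standalone sketch. The helper name matchesFilter is hypothetical and is not part of to-pure-markdown; it only illustrates how a string, an array, or a function filter could be checked against a node. Plain objects stand in for DOM nodes so the example runs without a browser.

```javascript
// Hypothetical helper (not the library's implementation): evaluates the
// three documented filter forms against a node-like object.
function matchesFilter(filter, node) {
  var tag = node.nodeName.toLowerCase();
  if (typeof filter === 'string') {
    return filter === tag;
  }
  if (Array.isArray(filter)) {
    return filter.indexOf(tag) !== -1;
  }
  if (typeof filter === 'function') {
    return Boolean(filter(node));
  }
  throw new TypeError('filter must be a string, an array, or a function');
}

// Stand-ins for DOM nodes; only the properties the filters read are present.
var em = { nodeName: 'EM', style: {} };
var italicSpan = { nodeName: 'SPAN', style: { fontStyle: 'italic' } };

// The function filter from the example above, given a name for reuse.
function isItalicSpan(node) {
  return node.nodeName === 'SPAN' && /italic/i.test(node.style.fontStyle);
}

console.log(matchesFilter('p', em));                  // false
console.log(matchesFilter(['em', 'i'], em));          // true
console.log(matchesFilter(isItalicSpan, italicSpan)); // true
```

The string and array forms only look at the tag name, so anything that depends on attributes or styles needs the function form.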
replacement (function)
The replacement function determines how an element should be converted. It should return the markdown string for a given node. The function is passed the node’s content, as well as the node itself (used in more complex conversions). It is called in the context of toMarkdown, and therefore has access to the methods detailed below.
The following converter replaces heading elements (h1-h6):
{
  filter: ['h1', 'h2', 'h3', 'h4', 'h5', 'h6'],
  replacement: function(innerHTML, node) {
    var hLevel = node.tagName.charAt(1);
    var hPrefix = '';
    for(var i = 0; i < hLevel; i++) {
      hPrefix += '#';
    }
    return '\n' + hPrefix + ' ' + innerHTML + '\n\n';
  }
}
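To see what this replacement returns, it can be called directly with a stand-in node. In the sketch below the function is pulled out under the hypothetical name headingReplacement, and a plain object with a tagName property takes the place of a DOM node. The only change from the converter above is an explicit Number() conversion of the heading level, which the original leaves to implicit coercion in the loop comparison.

```javascript
// Same logic as the heading converter's replacement, extracted so it can
// be run on its own. The name headingReplacement is an assumption.
function headingReplacement(innerHTML, node) {
  // 'H2'.charAt(1) is the string '2'; convert it to a number explicitly.
  var hLevel = Number(node.tagName.charAt(1));
  var hPrefix = '';
  for (var i = 0; i < hLevel; i++) {
    hPrefix += '#';
  }
  return '\n' + hPrefix + ' ' + innerHTML + '\n\n';
}

// A plain object stands in for an <h2> element.
var headingResult = headingReplacement('Hello world!', { tagName: 'H2' });

// JSON.stringify makes the surrounding newlines visible.
console.log(JSON.stringify(headingResult)); // "\n## Hello world!\n\n"
```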
gfm (boolean)
to-pure-markdown has beta support for GitHub Flavored Markdown (GFM). Set the gfm option to true:
toMarkdown('<del>Hello world!</del>', { gfm: true });
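GFM writes strikethrough text between double tildes, so the del element in the call above is the kind of tag this option is meant to handle. As a rough illustration only, a converter for it could take the shape below. The name strikethroughConverter and the list of tags are assumptions, not code taken from the library.

```javascript
// Assumed converter shape, following GFM strikethrough syntax ("~~text~~").
// Not copied from to-pure-markdown's source.
var strikethroughConverter = {
  filter: ['del', 's', 'strike'],
  replacement: function (content) {
    return '~~' + content + '~~';
  }
};

// Call the replacement directly with the content of the <del> element.
var struck = strikethroughConverter.replacement('Hello world!');
console.log(struck); // ~~Hello world!~~
```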
Methods
The following methods can be called on the toMarkdown object.
isBlock(node)
Returns true/false depending on whether the element is block level.
isVoid(node)
Returns true/false depending on whether the element is void.
trim(string)
Returns the string with leading and trailing whitespace removed.
outer(node)
Returns the content of the node along with the element itself.
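Because replacement functions are called in the context of toMarkdown, these methods are reachable through this inside a converter. The sketch below imitates that arrangement without the library: fakeToMarkdown is a hypothetical stand-in that mimics only trim and isVoid (using the HTML void element list), and quoteConverter is an illustrative converter, not one shipped with the module.

```javascript
// HTML void elements, used by the stand-in isVoid below.
var VOID_TAGS = ['area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input',
  'link', 'meta', 'param', 'source', 'track', 'wbr'];

// Hypothetical stand-in for the toMarkdown object; the real methods may
// be implemented differently.
var fakeToMarkdown = {
  trim: function (string) {
    return string.replace(/^\s+|\s+$/g, '');
  },
  isVoid: function (node) {
    return VOID_TAGS.indexOf(node.nodeName.toLowerCase()) !== -1;
  }
};

// Illustrative converter whose replacement relies on this.trim.
var quoteConverter = {
  filter: 'blockquote',
  replacement: function (content) {
    return '\n> ' + this.trim(content) + '\n\n';
  }
};

// Invoke the replacement with the stand-in as its context.
var quoted = quoteConverter.replacement.call(
  fakeToMarkdown, '  Hello world!  ', { nodeName: 'BLOCKQUOTE' });

console.log(JSON.stringify(quoted));                    // "\n> Hello world!\n\n"
console.log(fakeToMarkdown.isVoid({ nodeName: 'BR' })); // true
console.log(fakeToMarkdown.isVoid({ nodeName: 'P' }));  // false
```

Note that the replacement has to be a regular function rather than an arrow function, since an arrow function would not pick up the calling context as this.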
Development
First make sure you have Node.js and npm installed, then:
$ npm install --dev
Automatically browserify the module when source files change by running:
$ npm start
Tests
To run the tests in the browser, open test/index.html.
To run them in Node.js:
$ npm test
Credits
This is a fork of the excellent to-markdown by domchristie.