tipograph
v0.7.4
Published
A little javascript library and command line tool that makes your written content more typographically correct.
Downloads
4,907
Maintainers
Readme
tipograph
A little javascript library and command line tool that makes your written content more typographically correct.
STATUS: The library is in passive maintenance. I don’t have any active use of this project personally. Nevertheless, all feature requests and bug reports will be addressed in a reasonable time manner.
“When you ignore typography, you’re ignoring an opportunity to improve the effectiveness of your writing.” – Matthew Butterick
Even if typography can be seen as a set of rules given by some freaks, it’s actually quite an important aspect of written content. Besides it brings an aesthetic value, it also helps a person to read the text more fluently and comfortably. And curly quotes just look great!
However, to be typographically correct one has to make some non-trivial effort, be it to learn the rules or to find out how to type all those special characters instead of these present on his keyboard. And therefore tipograph comes here to help. It tries its best to fix a text and apply the rules.
It’s impossible to manage all rules out there, because tipograph is just a set of simple transformation rules and it doesn’t understand wider linguistic context. And sometimes it will fail. But still, the help deserves to be appreciated. Especially when it costs nothing.
In version 0.4.0 there are API breaking changes as it’s a complete rewrite. However, the migration should not be difficult (see the guide). If you are interested, here is the documentation for the old API.
Tipograph is not in stable phase yet. Rules will be added and improved over time. Feel free to make suggestion or ask question if you have any.
Note that Tipograph is focused on character substitution text-wise. Therefore it has a different goal than Typeset library which focuses on nice typography regarding appearance (although there is a small overlap in some pattern substitution).
Demo
You can see what tipograph help you with here.
Installation
In node
# to use it as library
npm install --save tipograph
# to use it as command line utility
npm install --global tipograph
In browser
<script type="text/javascript" src="https://unpkg.com/tipograph"></script>
Usage
// in browser, tipograph is accessible as property of window
var tipograph = require('tipograph');
// initialize new instance
var typo1 = tipograph();
// initialize new instance with different configuration
var typo2 = tipograph({
format: 'html',
language: 'czech',
presets: ['quotes', 'language'],
post: 'latex',
options: {
dash: 'em',
},
});
typo2('"Ahoj <b style="color: red;">světe</b>!"') // „Ahoj <b style="color: red;">světe</b>!“
// stream support (only in node)
var fs = require('fs');
fs.createReadStream('input.txt')
.pipe(tipograph.createStream(/*{ options }*/))
.pipe(fs.createWriteStream('output.txt'));
CLI
Tipograph also provides command line interface. You just need to install the package globally.
Basic usage
tipograph -i input.txt -o output.txt
Help
tipograph --help
Note that writing the transformed content into the source file itself results in an empty file. Moreover, you should always check the output whether it’s correct and make a backup of a content if you want to write into the file back.
Presets
There is a number of predefined rules which are grouped into presets. By default, all these presets are used, although you can pick just those you want by passing an array into options object. If you want to apply your own custom rules, you can pass your preset into the array (see preset documentation for more details). Note that the order in presets array determines the order of rules application onto the input.
Rules mentioned here don’t cover all typography rules, just those which are handled by tipograph. Please, read some other resources in order to be able to make your content better.
Description here is quite a general overview. You can see a lot of examples how these presets behave here.
hyphens
Hyphens are present on our keyboards and are used mostly to separate multipart words (“cost-effective”) or multiword phrases which need to be together (“high-school grades”). Dashes come in two sizes: en dash and em dash. En dash is used instead of hyphen in number ranges (“1–5”), or when two consecutive hyphens are found. Em dash is use when three consecutive hyphens are found. Both can be used as a break in a sentence (“tipograph – even if it’s just a set of simple rules – can improve typography in your content”). Whether en dash or em dash will be used for this case depends on the setting of the language or it can be overridden by dash: 'en' | 'em'
in tipograph options.
language
This preset only applies language specific rules defined in language given at tipograph instance initialization.
math
Unfortunately, majority of nice mathematical symbols is not present on our keyboard. Where it make sense, tipograph tries to put them instead of their poor substitues. For example, minus sign (that’s right, even minus sign has its special character) instead of hyphen, multiplication sign instead of the letter “x”, etc. Imagine how you would write this formula just by hand: 2 × 3 ≠ 5.
quotes
Nice quotes are probably the most visible feature of correct typography. On our keyboards, we have just these straight one which are pretty ugly. However, tipograph tries to replace them with their correct counterparts – and it even takes language habits into account. Moreover, it attempts to handle apostrophes, inch and foot units symbols, or fix some writers' bad habbits (such as two consecutive commas in order to imitate bottom 99-shaped quotes).
spaces
Even that they are not visible, spaces play important role in typography. Only one word space should be used at a time. Also, in some cases, there should be non-breaking space instead of normal one (for example after some special symbols).
symbols
There are a lot of special symbols which we don’t know how to write and that makes us sad. Instead, we tend to use some substitues for them. And tipograph replaces these substitues with their actual characters, for example copyright or trademark symbols. It also changes “⁇”, “⁈” and “⁉” into ligature counterparts. Also, multiple question marks (more than two) or exclamation points (more than one) are squashed.
custom
If tipograph's rules are not enough for you, you can define your own. Please, consider whether your rule would make sense in tipograph core, and if so, I will gladly accept your contribution.
var custom = function (language) {
// set of rules
return [
// rule is a pair of search value and its replacement
[/-([a-z])/g, function (match, letter) {
return letter.toUpperCase();
}]
];
};
var typo1 = tipograph({ presets: [custom] }); // use only your custom preset
var typo2 = tipograph({ presets: tipograph.extend([custom]) }); // or extend the default presets
Formats
The input might be in a different format than just a plain text and it might be important to take it into account. For example, you don’t want to apply typography rules inside HTML tag. For that case, you can specify the format preprocessor. There are few already made, and again, you can define your own (see format documentation for more details).
html
HTML tags are kept as they are. Moreover, it also preserves whole contents of the following tags: pre, code, style, script.
plain
Input content is preserved as it is.
Postprocessing
Sometimes the special characters need to be replaced with their corresponding macros/entities in an output format, so that the file can be saved as ascii-encoded file or the compiler/interpreter of the format (and the human too) understands it.
html
Special characters are replaced with corresponding HTML entities (in form &entity;).
latex
Special characters are replaced with corresponding LaTeX macros, sometimes wrapped in inline math block.
Changes
It is possible to retrieve the information how the text was changed by tipograph. This can be useful for providing the user with these details or to implement more complex application above tipograph (e.g., WYSIWYG editor). This information is in the form of the collection of pairs where the first item of the pair represents an index slice in the source text and the second items an index slice in the output text. Probably more understandable from an example:
var typo = tipograph();
typo('"lorem --- ipsum"', function (converted, changes) {
// process the changes:
// [
// [[0, 1], [0, 1]], // '"' -> '\u201C'
// [[6, 11], [6, 9]], // '"' -> '\u200a\u2014\u200a'
// [[16, 17], [14, 15]] // '"' -> '\u201D'
// ]
// converted: '\u201Clorem\u200a\u2014\u200aipsum\u201D'
// return the converted text
// this return value becomes the return value of the whole `typo` function
// you can also return the changes
return converted;
});
// stream
fs.createReadStream('input.txt')
.pipe(tipograph.createStream(/*{ options }, */callback))
.pipe(fs.createWriteStream('output.txt'));
Languages
Different languages may have different rules. The most notable example are quotes. There are few predefined languages and you can define your own (see language documentation for more details). The language contains configuration for some presets (at the moment, only quotes) and moreover it contains rules specific for the language. Just don’t forget to include language preset into presets option.
chinese
quotes: 「primary」 | 『secondary』
czech
quotes: „primary“ | ‚secondary‘
After some one-letter prepositions and conjuctions there should be a non-breaking space.
danish
quotes: »primary« | „secondary“
english
quotes: “primary” | ‘secondary’
finnish
quotes: ”primary” | ’secondary’
french
quotes: « primary » | “secondary”
german
quotes: „primary“ | ‚secondary‘
italian
quotes: «primary» | “secondary”
japanese
quotes: 「primary」 | 『secondary』
japanese
quotes: 「primary」 | 『secondary』
polish
quotes: „primary” | «secondary»
portuguese
quotes: “primary” | ‘secondary’
russian
quotes: «primary» | „secondary“
spanish
quotes: «primary» | “secondary”
swedish
quotes: ”primary” | ’secondary’
swiss
quotes: «primary» | ‹secondary›
custom
If you need a language which is not included in tipograph core or you need to make specific changes to a built-in language, you can do so by passing the language object instead of a name. The same as in custom preset case, consider contributing your language to tipograph itself.
var typo = tipograph({
language: {
quotes: [
// french quotes (see src/quotes.js)
[tipograph.quotes.DOUBLE_LEFT_SPACE, tipograph.quotes.DOUBLE_RIGHT_SPACE],
[tipograph.quotes.SINGLE_LEFT_SPACE, tipograph.quotes.SINGLE_RIGHT_SPACE]
],
// same interface as in custom preset
rules: []
}
});
If you want to reuse either quotes or rules definition of an existing language, it is possible using exported languages
property, that is, using tipograph.languages.french.quotes
and tipograph.languages.french.rules
.
Resources
- Practical Typography for the most of the rules in tipograph
- Summary table on Wikipedia for quote symbols in various languages
Contributing
See contributing guide.
License
Tipograph is licensed under MIT. Feel free to use it, contribute or spread the word.