todotxt-parser

v1.0.2

A parser for Gina Trapani's todo.txt format with optional extended features

csauve

todo.txt parser

todotxt-parser

This is a Node.js module for parsing the todo.txt format created by Gina Trapani. A variety of configuration options allow it to parse a strict canonical todo.txt format, or a more relaxed version permitting more liberal whitespace, comments, user-defined metadata extensions, and even indented hierarchical tasks with metadata inheritance.

About the Format

The todo.txt format attempts to maintain all the benefits of portable, human-readable flat files but still provide structured metadata for tools built on the format. For example, your todo.txt might look like this:

(A) Thank Mom for the meatballs @phone
(B) Schedule Goodwill pickup +GarageSale @phone
Post signs around the neighborhood +GarageSale
@GroceryStore Eskimo pies
Submit expense report for work travel due:2015-01-25
x 2015-01-10 See the new exhibit at the museum

Each line in the file is one task, and tasks can have priority ((A)), projects (+GarageSale), contexts (@phone), dates, and other metadata attached to them. Priority, project, and context are the three main axes along which an effective todo list can be sliced. See the todo.txt-cli wiki for a full description of the format.

Installation

$ npm install todotxt-parser

API

The API consumes a multi-line string and returns an array of task objects in the order they appeared in the input:

var parser = require("todotxt-parser");
var tasks = parser.relaxed("x 2014-07-04 (A) 2014-06-19 Document YTD spending on +SocialEvents for @Alex due:2014-08-01");

tasks looks like this:

[
  {
    // the original untrimmed content of the line
    "raw": "x 2014-07-04 (A) 2014-06-19 Document YTD spending on +SocialEvents for @Alex due:2014-08-01",
    // the trimmed content of the line following the creation date
    "text": "Document YTD spending on +SocialEvents for @Alex due:2014-08-01",
    /* projects are found in the `text` field and begin with "+".
     * Empty when none present */
    "projects": ["SocialEvents"],
    /* contexts are found in the `text` field and begin with "@".
     * Empty when none present */
    "contexts": ["Alex"],
    // indicates if the task is marked as completed
    "complete": true,
    // ISO 8601 UTC datetime. Null if not present
    "dateCreated": "2014-06-19T00:00:00.000Z",
    // ISO 8601 UTC datetime. Null if not present
    "dateCompleted": "2014-07-04T00:00:00.000Z",
    /* The upper case A-Z priority. If priority was not
     * explicitly given, `metadata.pri` will be used if
     * it's present. Otherwise null
     */
    "priority": "A",
    // Stores data parsed by metadata extensions. Defaults to {}
    "metadata": {"due": "2014-08-01"},
    /* In hierarchical mode, contains any direct children
     * at a higher indentation level
     */
    "subtasks": [],
    /* Indentation level of the task in character columns.
     * See hierarchical mode for more details
     */
    "indentLevel": 2
  }
]

There are two parsing functions available: parser.relaxed(input, options) and parser.parse(input, options). Options can be omitted or partial, and the two functions differ only in their default options. Calling parser.relaxed(input) is equivalent to calling parser.parse(input, parser.options.RELAXED). The default options for parser.parse(input) are equal to parser.options.CANONICAL.
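
As a rough illustration of how partial options can fall back to defaults, the sketch below merges a caller's options over a default set. The DEFAULTS object and the resolveOptions helper are hypothetical and only mimic the behavior described above; the module's own merging logic may differ.

```javascript
// Hypothetical stand-in for a subset of parser.options.CANONICAL.
const DEFAULTS = {
  relaxedWhitespace: false,
  requireCompletionDate: true,
  ignorePriorityCase: false,
};

// Hypothetical helper: keys missing from `options` keep their default values.
function resolveOptions(options = {}) {
  return Object.assign({}, DEFAULTS, options);
}

const resolved = resolveOptions({ relaxedWhitespace: true });
console.log(resolved.relaxedWhitespace);     // true (overridden)
console.log(resolved.requireCompletionDate); // true (default kept)
```

Copying into a fresh object keeps the shared defaults untouched between calls.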

Options

(Examples in CoffeeScript)

The exposed default options are as follows:

options:
  # Gina Trapani's todo.txt-cli format & implementation
  CANONICAL:
    dateParser: (s) -> new Date(s).toJSON()
    dateRegex: /\d{4}-\d{2}-\d{2}/
    relaxedWhitespace: false
    requireCompletionDate: true
    ignorePriorityCase: false
    hierarchical: false
    inherit: false
    commentRegex: null
    projectRegex: /(?:\s|^)\+(\S+)/g
    contextRegex: /(?:\s|^)@(\S+)/g
    extensions: []
  RELAXED:
    dateParser: (s) -> new Date(s).toJSON()
    dateRegex: /\d{4}-\d{2}-\d{2}/
    relaxedWhitespace: true
    requireCompletionDate: false
    ignorePriorityCase: true
    hierarchical: false
    inherit: false
    commentRegex: /^\s*#.*$/
    projectRegex: /(?:\s+|^)\+(\S+)/g
    contextRegex: /(?:\s+|^)@(\S+)/g
    extensions: [
      (text) ->
        metadata = {}
        metadataRegex = /(?:\s+|^)(\S+):(\S+)/g
        while match = metadataRegex.exec text
          metadata[match[1].toLowerCase()] = match[2]
        metadata
    ]

dateParser

A function accepting a string and returning a string, used to convert captured dates for the dateCreated and dateCompleted fields. It is recommended to return an ISO 8601 UTC datetime for consistency with the default date parser:

(s) -> new Date(s).toJSON()
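
For readers working in plain JavaScript, here is the same default parser alongside one possible alternative. The dateOnlyParser name is hypothetical and not part of the module. Note that Date.prototype.toJSON returns null for an invalid date, so a custom parser may want to handle that case.

```javascript
// JavaScript form of the default parser shown above.
const defaultDateParser = (s) => new Date(s).toJSON();

// Hypothetical alternative that keeps only the calendar date.
const dateOnlyParser = (s) => {
  const json = new Date(s).toJSON(); // null when the date is invalid
  return json === null ? null : json.slice(0, 10);
};

console.log(defaultDateParser("2014-06-19")); // 2014-06-19T00:00:00.000Z
console.log(dateOnlyParser("2014-06-19"));    // 2014-06-19
console.log(defaultDateParser("not a date")); // null
```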

dateRegex

A RegExp used to match the creation and completion dates. It should not contain any capture groups, and any modifiers (like case insensitivity) will be ignored. Matches will be parsed by the dateParser function. This option defaults to capturing "YYYY-MM-DD" format:

/\d{4}-\d{2}-\d{2}/
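
A custom dateRegex usually needs a matching dateParser. The sketch below shows one hypothetical pairing for dates written as YYYY/MM/DD; the option values are illustrative assumptions, not something the module ships.

```javascript
// Hypothetical option pair for dates written as YYYY/MM/DD.
// The regex has no capture groups, as the option requires.
const options = {
  dateRegex: /\d{4}\/\d{2}\/\d{2}/,
  // Rewrite slashes to dashes so the value parses as an ISO 8601 date.
  dateParser: (s) => new Date(s.replace(/\//g, "-")).toJSON(),
};

// Simulate what happens to a matched date: the full match goes to dateParser.
const matched = "2014/06/19 Pay rent".match(options.dateRegex)[0];
console.log(options.dateParser(matched)); // 2014-06-19T00:00:00.000Z
```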

relaxedWhitespace

The todo.txt specification does not allow for more than 1 space between the completion mark, completion date, priority, creation date, and text. This ensures priorities and tasks line up so lines can be sorted consistently. When relaxedWhitespace is set to true, these restrictions are lifted.

# none of these longer whitespace gaps would have been valid
parser.parse "x   2013-11-11   (B)   2013-10-11   Clean up",
  relaxedWhitespace: true
# with `relaxedWhitespace`, this is allowed now
parser.parse "    Task B",
  relaxedWhitespace: true

requireCompletionDate

A task is marked completed by adding a lower case "x" marker to the start of the line, followed by a single space and then a completion date. Setting requireCompletionDate to false makes the date optional, allowing tasks like this:

parser.parse "x Walk the dog",
  requireCompletionDate: false

Note: With this option disabled, a date that would otherwise be read as a task's creation date can be read as its completion date instead:

# this date will become the creation date
parser.parse "2014-12-02 Task A",
  requireCompletionDate: false

# but now it is the completion date
parser.parse "x 2014-12-02 Task A",
  requireCompletionDate: false

# a priority clears the ambiguity; it's now the creation date
parser.parse "x (A) 2014-12-02 Task A",
  requireCompletionDate: false
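
The ambiguity is easier to see with a small model of how the start of a line might be read. The following is a minimal sketch, assuming single-space separators (relaxedWhitespace off), single-line input, and the default date format; parseHead is a hypothetical helper and not the module's implementation.

```javascript
const DATE = "\\d{4}-\\d{2}-\\d{2}";

// Hypothetical helper: splits one line into completion mark, dates, priority, and text.
function parseHead(line, requireCompletionDate) {
  const completion = requireCompletionDate
    ? `(?:(x) (${DATE}) )?`       // "x" only counts when a date follows it
    : `(?:(x) (?:(${DATE}) )?)?`; // the date after "x" is optional
  const pattern = new RegExp(
    `^${completion}(?:\\(([A-Z])\\) )?(?:(${DATE}) )?(.*)$`
  );
  const [, mark, completed, priority, created, text] = line.match(pattern);
  return {
    complete: mark === "x",
    dateCompleted: completed || null,
    priority: priority || null,
    dateCreated: created || null,
    text,
  };
}

console.log(parseHead("2014-12-02 Task A", false));       // date is the creation date
console.log(parseHead("x 2014-12-02 Task A", false));     // date is the completion date
console.log(parseHead("x (A) 2014-12-02 Task A", false)); // date is the creation date
```

In this sketch the first date after "x" is always taken as the completion date, which is why a priority between the mark and the date changes the outcome.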

ignorePriorityCase

When set to true, both A-Z and a-z will be allowed for priority. The priority is still always converted to upper case after capture.
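
A minimal sketch of that behavior, assuming the priority sits at the very start of the line; readPriority is a hypothetical helper used only for illustration.

```javascript
// Hypothetical helper: reads a leading "(A) "-style priority.
function readPriority(line, ignorePriorityCase) {
  const pattern = ignorePriorityCase ? /^\(([A-Za-z])\) / : /^\(([A-Z])\) /;
  const match = line.match(pattern);
  // The priority is always reported in upper case.
  return match ? match[1].toUpperCase() : null;
}

console.log(readPriority("(b) Call the bank", true));  // B
console.log(readPriority("(b) Call the bank", false)); // null
console.log(readPriority("(B) Call the bank", false)); // B
```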

hierarchical

Standard todo.txt has no notion of subtasks. Indentation is not allowed because the result is no longer sortable in a meaningful way. If you want to group a set of tasks under one project, each task needs to be annotated with the same +Project tag. This can clutter large projects, and it's difficult to see at a glance which tasks are associated by project. If the ability to sort lines alphabetically is not important to you, and you would rather be able to logically group tasks under other tasks, then there is hierarchical mode:

tasks = parser.relaxed """
  Task A
    Task B
    Task C
      Task D
      Task E
  Task F
""", hierarchical: true

Instead of all tasks being stored in a single array, as standard mode with relaxedWhitespace: true would return, the subtasks field of each task is now used to store child tasks:

(Fields other than text, indentLevel, and subtasks omitted for brevity)

# parse still returns an array, but it only contains the root level tasks
[
  { text: "Task A", indentLevel: 0, subtasks: [
    { text: "Task B", indentLevel: 2, subtasks: [] }
    { text: "Task C", indentLevel: 2, subtasks: [
      # a task is a leaf when `subtasks` is empty
      { text: "Task D", indentLevel: 4, subtasks: [] }
      { text: "Task E", indentLevel: 4, subtasks: [] }
    ]}
  ]}
  # tasks A and F are siblings
  { text: "Task F", indentLevel: 0, subtasks: [] }
]

Hierarchical mode implies relaxedWhitespace: true. A task is considered a subtask when its indentation level is greater than its parent's. A new parent is chosen when the indentation level is greater than the previous sibling's indentation level. For example, what is the output of this?

tasks = parser.relaxed """
   Task A
  Task B
       Task C
     Task D
       Task E
    Task F
""", hierarchical: true

Tasks A and B will be root level siblings even though they are not indented the same amount. Task B has three subtasks: C, D, and F. Task D has a single subtask, E. Even though tasks E and C are indented the same amount, it's their position relative to the previous task that matters. The best practice is to use consistent indentation.
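
One way to reproduce this grouping is with a stack of open ancestors. The sketch below is an assumption about how such nesting could be implemented, not the module's code; nest is a hypothetical helper, and the indent levels are the columns as written in the example above (only their relative order matters).

```javascript
// Hypothetical helper: nests a flat, ordered task list by `indentLevel`.
function nest(flatTasks) {
  const roots = [];
  const stack = []; // chain of open ancestors, outermost first
  for (const { text, indentLevel } of flatTasks) {
    const task = { text, indentLevel, subtasks: [] };
    // Close every open task indented as much as, or more than, this one.
    while (stack.length > 0 && stack[stack.length - 1].indentLevel >= indentLevel) {
      stack.pop();
    }
    // The nearest remaining ancestor, if any, becomes the parent.
    const parent = stack[stack.length - 1];
    (parent ? parent.subtasks : roots).push(task);
    stack.push(task);
  }
  return roots;
}

const roots = nest([
  { text: "Task A", indentLevel: 3 },
  { text: "Task B", indentLevel: 2 },
  { text: "Task C", indentLevel: 7 },
  { text: "Task D", indentLevel: 5 },
  { text: "Task E", indentLevel: 7 },
  { text: "Task F", indentLevel: 4 },
]);
const names = (tasks) => tasks.map((t) => t.text).join(", ");
console.log(names(roots));                      // Task A, Task B
console.log(names(roots[1].subtasks));          // Task C, Task D, Task F
console.log(names(roots[1].subtasks[1].subtasks)); // Task E
```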

How is indentLevel determined? There are two rules:

If the line immediately begins with the completion mark "x", then indentLevel counts it and contiguous whitespace characters following it
Otherwise, indentLevel is the number of leading whitespace characters

This means you can either place the completion mark in the first column, or after the indent:

  x Task B
  Task C
    x Task D
    x Task E
    Task F
  Task G

is equivalent to:

x Task B
  Task C
x   Task D
x   Task E
    Task F
  Task G

It's important to note that tasks B and C are siblings. If the intent was to have C be a subtask of B, then the first format should have been used (add at least 1 extra column of leading whitespace).
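
The two indentLevel rules can be sketched as a single function. indentLevel here is a hypothetical standalone helper written from the rules above; it assumes a completion mark is an "x" followed by at least one whitespace character.

```javascript
// Hypothetical helper implementing the two rules described above.
function indentLevel(line) {
  // Rule 1: a completion mark in the first column counts,
  // along with the whitespace that follows it.
  const marked = line.match(/^x\s+/);
  if (marked) return marked[0].length;
  // Rule 2: otherwise count the leading whitespace characters.
  return line.match(/^\s*/)[0].length;
}

console.log(indentLevel("x   Task D"));   // 4
console.log(indentLevel("    x Task D")); // 4
console.log(indentLevel("x Task B"));     // 2
console.log(indentLevel("  Task C"));     // 2
```

Both spellings of Task D give the same level, and "x Task B" lands on the same level as "  Task C", which is why B and C end up as siblings.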

inherit

The inherit option is only applicable to hierarchical mode, and is disabled by default. When enabled, subtasks will inherit the metadata of their ancestors. This includes projects, contexts, completeness, creation and completion dates, priority, and extension metadata. Subtasks can shadow ancestral metadata by explicitly defining it themselves.

tasks = parser.relaxed """
  (A) 2014-06-19 Task A +Project1 @context1 due:2014-09-13 t:2014-05-01
    Task B +Project2 due:2014-08-15
""", hierarchical: true, inherit: true

Task B will have inherited task A's metadata:

raw: "  Task B +Project2 due:2014-08-15"
text: "Task B +Project2 due:2014-08-15"
# `projects` and `contexts` are considered sets, so you won't
# get duplicates if they're also found in an ancestor
projects: ["Project1", "Project2"]
contexts: ["context1"]
complete: false
dateCreated: "2014-06-19T00:00:00.000Z"
dateCompleted: null
priority: "A"
# note that `due` is shadowing the parent's value
metadata: {due: "2014-08-15", t: "2014-05-01"}
subtasks: []
indentLevel: 2
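
The merge rules can be pictured with a small function. This is a sketch under the assumptions stated in the comments, not the module's implementation; inheritFrom is a hypothetical helper, and the task objects are trimmed to the fields that take part in inheritance.

```javascript
// Hypothetical helper: returns a copy of `child` with values inherited from `parent`.
function inheritFrom(parent, child) {
  return {
    ...child,
    // Treated as sets: parent entries first, duplicates dropped.
    projects: [...new Set([...parent.projects, ...child.projects])],
    contexts: [...new Set([...parent.contexts, ...child.contexts])],
    // Assumption: a child counts as complete if it or its parent is complete.
    complete: child.complete || parent.complete,
    // The child's own value wins; otherwise fall back to the parent's.
    dateCreated: child.dateCreated ?? parent.dateCreated,
    dateCompleted: child.dateCompleted ?? parent.dateCompleted,
    priority: child.priority ?? parent.priority,
    // Child keys shadow parent keys.
    metadata: { ...parent.metadata, ...child.metadata },
  };
}

const taskA = {
  projects: ["Project1"], contexts: ["context1"], complete: false,
  dateCreated: "2014-06-19T00:00:00.000Z", dateCompleted: null, priority: "A",
  metadata: { due: "2014-09-13", t: "2014-05-01" },
};
const taskB = {
  projects: ["Project2"], contexts: [], complete: false,
  dateCreated: null, dateCompleted: null, priority: null,
  metadata: { due: "2014-08-15" },
};

const inherited = inheritFrom(taskA, taskB);
console.log(inherited.projects.join(", "));      // Project1, Project2
console.log(JSON.stringify(inherited.metadata)); // {"due":"2014-08-15","t":"2014-05-01"}
```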

commentRegex

This RegExp tests if the line is a comment, and should therefore be ignored. Comments are not part of the todo.txt specification, so this is null by default.
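
To show what the relaxed default pattern does and does not treat as a comment, here is a standalone sketch that filters lines with it. The filtering step is an illustration of the idea, not the module's internal code.

```javascript
// The comment pattern used by the relaxed defaults listed under Options.
const commentRegex = /^\s*#.*$/;

const input = [
  "# Errands for the weekend",
  "(A) Thank Mom for the meatballs @phone",
  "  # an indented comment",
  "Email report #42 to the team",
].join("\n");

// Keep only lines that are not comments; a "#" later in a line is not a comment.
const taskLines = input.split("\n").filter((line) => !commentRegex.test(line));
console.log(taskLines.length); // 2
console.log(taskLines[1]);     // Email report #42 to the team
```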

projectRegex, contextRegex

These two RegExps are used to match projects and contexts only inside the task's text field, which is anything following the creation date. The defaults match +Project and @context.

projectRegex: /(?:\s|^)\+(\S+)/g
contextRegex: /(?:\s|^)@(\S+)/g

When supplying your own expressions, make sure to include a capture group for the context/project itself, and to enable global matching with the g modifier.
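
The role of the capture group and the g modifier can be seen in a short standalone sketch. The collect helper and the "#" project syntax are hypothetical; the first two expressions are the defaults shown above.

```javascript
// Hypothetical helper: collects the first capture group of every match.
// String.prototype.matchAll throws a TypeError if the regex lacks the g flag.
function collect(regex, text) {
  return Array.from(text.matchAll(regex), (match) => match[1]);
}

const text = "Schedule Goodwill pickup +GarageSale @phone";
console.log(collect(/(?:\s|^)\+(\S+)/g, text).join(", ")); // GarageSale
console.log(collect(/(?:\s|^)@(\S+)/g, text).join(", "));  // phone

// Hypothetical custom expression: "#" marks a project instead of "+".
const custom = collect(/(?:\s|^)#(\S+)/g, "Paint fence #Home #Spring");
console.log(custom.join(", ")); // Home, Spring
```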

extensions

Extensions are functions that are passed the text field of the task and return an object of key-value metadata. The results of all extensions are merged into a task's metadata field. The order of functions in extensions matters: later functions can overwrite values for a key. No extensions are used by default, but relaxed mode will find any "key:value" pairs with this function:

extensions: [
  (text) ->
    metadata = {}
    metadataRegex = /(?:\s+|^)(\S+):(\S+)/g
    while match = metadataRegex.exec text
      metadata[match[1].toLowerCase()] = match[2]
    metadata
]
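
For JavaScript users, the same extension can be written as below, together with a sketch of how results from several extensions might be merged in order. The shoutDue extension and the runExtensions helper are hypothetical and exist only to show later functions overwriting earlier keys.

```javascript
// JavaScript form of the relaxed-mode "key:value" extension shown above.
const keyValue = (text) => {
  const metadata = {};
  const metadataRegex = /(?:\s+|^)(\S+):(\S+)/g;
  let match;
  while ((match = metadataRegex.exec(text)) !== null) {
    metadata[match[1].toLowerCase()] = match[2];
  }
  return metadata;
};

// Hypothetical second extension: rewrites a "due" value in upper case.
const shoutDue = (text) => {
  const match = /(?:\s|^)due:(\S+)/.exec(text);
  return match ? { due: match[1].toUpperCase() } : {};
};

// Hypothetical merge step: later extensions overwrite earlier keys.
function runExtensions(extensions, text) {
  return Object.assign({}, ...extensions.map((extension) => extension(text)));
}

const merged = runExtensions([keyValue, shoutDue], "File taxes due:friday t:2015-04-01");
console.log(JSON.stringify(merged)); // {"due":"FRIDAY","t":"2015-04-01"}
```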

Future work

Add a formatter that turns a list or hierarchy of tasks back into a string.

Testing

Use npm to install dependencies and run the tests:

  $ npm install
  $ npm test

License

See the LICENSE file (MIT).
