This module implements a reStructuredText (RST) parser. A large subset is implemented with some limitations and Nim-specific features. A few extra features of the Markdown syntax are also supported.
Nim can output the result to HTML [1] or LaTeX [2].
If you are new to RST please consider reading the following:
- a short quick introduction
- an RST reference: a comprehensive cheatsheet for RST
- a more formal 50-page RST specification.
Features
Supported standard RST features:
- body elements
- sections
- transitions
- paragraphs
- bullet lists using +, *, -
- enumerated lists using arabic numerals or alphabet characters: 1. ... 2. ... or a. ... b. ... or A. ... B. ...
- footnotes (including manually numbered, auto-numbered, auto-numbered with label, and auto-symbol footnotes) and citations
- definition lists
- field lists
- option lists
- indented literal blocks
- simple tables
- directives (see official documentation in RST directives list):
- image, figure for including images and videos
- code
- contents (table of contents), container, raw
- include
- admonitions: "attention", "caution", "danger", "error", "hint", "important", "note", "tip", "warning", "admonition"
- substitution definitions: replace and image
- comments
- inline markup
- emphasis, strong emphasis, inline literals, hyperlink references (including embedded URI), substitution references, standalone hyperlinks, internal links (inline and outline)
- `interpreted text` with roles :literal:, :strong:, :emphasis:, :sub:/:subscript:, :sup:/:superscript: (see RST roles list for description).
- inline internal targets
Additional Nim-specific features:
- directives: code-block [cmp:Sphinx], title, index [cmp:Sphinx]
- predefined roles
- :nim: (default), :c: (C programming language), :python:, :yaml:, :java:, :cpp: (C++), :csharp: (C#). That is every language that highlite supports. They turn on appropriate syntax highlighting in inline code. Note: the default role for Nim files is :nim:; for *.rst it's currently :literal:.
- generic command line highlighting roles:
- :cmd: for commands and common shells syntax
- :console: the same for interactive sessions (commands should be prepended by $)
- :program: for executable names [cmp:Sphinx] (one can just use :cmd: on single word)
- :option: for command line options [cmp:Sphinx]
- :tok:, a role for highlighting of programming language tokens
- triple emphasis (bold and italic) using ***
- :idx: role for `interpreted text` to include the link to this text into an index (example: Nim index).
- double slash // in option lists serves as a prefix for any option that starts with a word (without any leading symbols like -, --, /):
    //compile   compile the project
    //doc       generate documentation
  Here the dummy // will disappear, while options compile and doc will be left in the final document.
Optional additional features, turned on by passing RstParseOption values in the options argument of the rstParse proc:
- emoji / smiley symbols
- Markdown tables
- Markdown code blocks
- Markdown links
- Markdown headlines
- using 1 as auto-enumerator in enumerated lists, like RST # (auto-enumerator 1 cannot be used with # in the same list)
Idiosyncrasies
Currently we do not aim at 100% Markdown or RST compatibility in inline markup recognition rules because that would provide very little user value. This parser has 2 modes for inline markup:
- Markdown-like mode which is enabled by roPreferMarkdown option (turned on by default). Note: RST features like directives are still turned on
- Compatibility mode which is RST rules.
Note: in both modes the parser interprets text between single backticks (code) identically: backslash does not escape. The only exception is \ followed by `, which does escape, so that a single backtick ` can always be input in inline code. However, that makes it impossible to input code with \ at the end in single backticks; one must use double backticks:
    `\`   -- WRONG
    ``\`` -- GOOD
    So single backticks can always be input: `\`` will turn to ` code
Limitations
- no Unicode support in character width calculations
- body elements
- no roman numerals in enumerated lists
- no quoted literal blocks
- no doctest blocks
- no grid tables
- some directives are missing (check official RST directives list): parsed-literal, sidebar, topic, math, rubric, epigraph, highlights, pull-quote, compound, table, csv-table, list-table, section-numbering, header, footer, meta, class
- no role directives and no custom interpreted text roles
- some standard roles are not supported (check RST roles list)
- no generic admonition support
- inline markup
- no simple-inline-markup
- no embedded aliases
Usage
See Nim DocGen Tools Guide for the details about nim doc, nim rst2html and nim rst2tex commands.
See the packages/docutils/rstgen module to learn how to generate HTML or LaTeX strings to embed into your documents.
Types
EParseError = object of ValueError
FindFileHandler = proc (filename: string): string {.closure, gcsafe.}
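A custom FindFileHandler might look like the minimal sketch below. It assumes that the handler receives the requested file name and returns the path to use, with an empty string meaning "not found"; that convention is an assumption and is not stated in this reference. The names makeFindFile and baseDir are hypothetical, and the import path assumes the module is available as packages/docutils/rst.

```nim
import std/os
import packages/docutils/rst

# Hypothetical helper: builds a FindFileHandler that resolves requested
# files relative to `baseDir`.
proc makeFindFile(baseDir: string): FindFileHandler =
  result = proc (filename: string): string {.closure, gcsafe.} =
    let candidate = baseDir / filename
    # Assumption: an empty string signals that the file was not found.
    if fileExists(candidate): candidate else: ""

let findInDocs = makeFindFile("docs")
echo findInDocs("missing-file.rst")
```

Such a handler could then be passed as the findFile argument of rstParse.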
MsgHandler = proc (filename: string; line, col: int; msgKind: MsgKind; arg: string) {.closure, gcsafe.}
- what to do in case of an error
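As an illustration of the MsgHandler signature, here is a minimal sketch of a handler that collects formatted messages instead of printing or raising. The helper name collectInto and the output format are hypothetical; the sketch relies on the string values of MsgKind being message templates with a $1 placeholder, as listed in the enum.

```nim
import std/strutils
import packages/docutils/rst

# Hypothetical helper: returns a MsgHandler that appends formatted
# messages to `sink` instead of printing or raising.
proc collectInto(sink: ref seq[string]): MsgHandler =
  result = proc (filename: string; line, col: int; msgKind: MsgKind;
                 arg: string) {.closure, gcsafe.} =
    # `$msgKind` yields the message template, e.g. "broken link '$1'".
    let text = ($msgKind) % arg
    sink[].add("$1($2, $3): $4" % [filename, $line, $col, text])

var messages: ref seq[string]
new(messages)
let handler = collectInto(messages)

# Call the handler directly to show the resulting message format.
handler("example.rst", 3, 7, mwBrokenLink, "intro")
echo messages[]
```

Note that this sketch never raises, so parsing would continue after error-class messages; whether that is desirable depends on the caller.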
MsgKind = enum
  meCannotOpenFile = "cannot open \'$1\'",
  meExpected = "\'$1\' expected",
  meGridTableNotImplemented = "grid table is not implemented",
  meMarkdownIllformedTable = "illformed delimiter row of a Markdown table",
  meNewSectionExpected = "new section expected $1",
  meGeneralParseError = "general parse error",
  meInvalidDirective = "invalid directive: \'$1\'",
  meInvalidField = "invalid field: $1",
  meFootnoteMismatch = "mismatch in number of footnotes and their refs: $1",
  mwRedefinitionOfLabel = "redefinition of label \'$1\'",
  mwUnknownSubstitution = "unknown substitution \'$1\'",
  mwBrokenLink = "broken link \'$1\'",
  mwUnsupportedLanguage = "language \'$1\' not supported",
  mwUnsupportedField = "field \'$1\' not supported",
  mwRstStyle = "RST style: $1",
  meSandboxedDirective = "disabled directive: \'$1\'"
- the possible messages
RstFileTable = object
  filenameToIdx*: Table[string, FileIndex]
  idxToFilename*: seq[string]
RstParseOption = enum
  roSupportSmilies,      ## make the RST parser support smilies like ``:)``
  roSupportRawDirective, ## support the ``raw`` directive (don't support
                         ## it for sandboxing)
  roSupportMarkdown,     ## support additional features of Markdown
  roPreferMarkdown,      ## parse as Markdown (keeping RST as "extension"
                         ## to Markdown) -- implies `roSupportMarkdown`
  roNimFile,             ## set for Nim files where default interpreted
                         ## text role should be :nim:
  roSandboxDisabled      ## this option enables certain options
                         ## (e.g. raw, include)
                         ## which are disabled by default as they can
                         ## enable users to read arbitrary data and
                         ## perform XSS if the parser is used in a web
                         ## app.
- options for the RST parser
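The options are combined into a set of type RstParseOptions, the type of the options parameter of rstParse. The following sketch shows one possible way to keep separate option sets for untrusted and trusted input; the variable names are hypothetical and the choice of options is only an illustration.

```nim
import packages/docutils/rst

# Options for untrusted input: Markdown extensions on, sandbox left enabled.
let untrusted: RstParseOptions = {roSupportMarkdown, roPreferMarkdown}

# Options for trusted input: additionally allow `raw` and lift the sandbox.
let trusted = untrusted + {roSupportRawDirective, roSandboxDisabled}

echo roSandboxDisabled in untrusted  # false
echo roSandboxDisabled in trusted    # true
```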
Consts
ColRstInit = 0
- Initial column number for standalone RST text (Nim global reporting adds ColOffset=1)
ColRstOffset = 1
- 1: a replica of ColOffset for internal use
LineRstInit = 1
- Initial line number for standalone RST text
Procs
proc defaultFindFile(filename: string): string {.raises: [], tags: [ReadDirEffect].}
proc defaultMsgHandler(filename: string; line, col: int; msgkind: MsgKind; arg: string) {.raises: [ValueError, EParseError, IOError], tags: [WriteIOEffect].}
proc getArgument(n: PRstNode): string {.raises: [], tags: [].}
proc getFieldValue(n: PRstNode): string {.raises: [], tags: [].}
- Returns the value of a specific rnField node.
  This proc will assert if the node is not of the expected type. At minimum, an empty string is returned. Any value in the RST will be stripped of leading/trailing whitespace.
proc rstMessage(filenames: RstFileTable; f: MsgHandler; info: TLineInfo; msgKind: MsgKind; arg: string) {.raises: [ValueError], tags: [].}
- Prints warnings using info, i.e. 2nd-pass warnings for footnotes/substitutions/references, or warnings issued from rstgen.nim.
proc rstnodeToRefname(n: PRstNode): string {.raises: [], tags: [].}
proc rstParse(text, filename: string; line, column: int; options: RstParseOptions; findFile: FindFileHandler = nil; msgHandler: MsgHandler = nil): tuple[node: PRstNode, filenames: RstFileTable, hasToc: bool] {.raises: [Exception, ValueError], tags: [RootEffect, ReadEnvEffect].}
- Parses the whole text. The result is ready for rstgen.renderRstToOut. Note that the 2nd tuple element should be passed to the filenames argument of initRstGenerator (it is filled here with at least filename and possibly with other files from RST .. include:: statements).
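A call to rstParse could look like the minimal sketch below. The sample text, the file name example.rst, and the chosen option set are assumptions made for illustration; findFile and msgHandler are left at their nil defaults.

```nim
import packages/docutils/rst

# Sample input; any RST text would do.
const source = """
Title
=====

Some *emphasized* text and ``inline code``.
"""

# Parse standalone text, starting at the initial line and column constants.
let parsed = rstParse(source, "example.rst", LineRstInit, ColRstInit,
                      {roSupportMarkdown, roPreferMarkdown})

echo parsed.node.isNil               # whether a tree was produced
echo parsed.hasToc                   # whether a table of contents was requested
echo parsed.filenames.idxToFilename  # files seen while parsing
```

The returned node and filenames are the values that would then be handed to the rstgen module for rendering.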
proc whichMsgClass(k: MsgKind): MsgClass {.raises: [], tags: [].}
- Returns which message class k belongs to.
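To see how message kinds map to classes, one could list them as in the short sketch below. It assumes that MsgClass values can be printed with echo and that MsgKind can be iterated as an ordinary enum; the me and mw prefixes in MsgKind suggest an error/warning split, but that reading is an assumption.

```nim
import packages/docutils/rst

# Print each message kind's class next to its message template.
for kind in MsgKind:
  echo whichMsgClass(kind), ": ", kind
```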