diff --git a/package.json b/package.json index bade0b9bba..b6cf230737 100644 --- a/package.json +++ b/package.json @@ -23,7 +23,7 @@ "jquery.autoellipsis": "https://github.com/pvdspek/jquery.autoellipsis", "jquery.cookie": "1.4.1", "magnific-popup": "1.1.0", - "markdown-it": "8.4.1", + "markdown-it": "10.0.0", "moment": "2.24.0", "moment-timezone": "0.5.25", "moment-timezone-names-translations": "https://github.com/discourse/moment-timezone-names-translations", diff --git a/spec/fixtures/md/spec.txt b/spec/fixtures/md/spec.txt index 08ce0a57ef..50a660fa7e 100644 --- a/spec/fixtures/md/spec.txt +++ b/spec/fixtures/md/spec.txt @@ -1,8 +1,8 @@ --- title: CommonMark Spec author: John MacFarlane -version: 0.28 -date: '2017-08-01' +version: 0.29 +date: '2019-04-06' license: '[CC-BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/)' ... @@ -248,7 +248,7 @@ satisfactory replacement for a spec. Because there is no unambiguous spec, implementations have diverged considerably. As a result, users are often surprised to find that -a document that renders one way on one system (say, a github wiki) +a document that renders one way on one system (say, a GitHub wiki) renders differently on another (say, converting to docbook using pandoc). To make matters worse, because nothing in Markdown counts as a "syntax error," the divergence often isn't discovered right away. @@ -326,10 +326,15 @@ A [space](@) is `U+0020`. A [non-whitespace character](@) is any character that is not a [whitespace character]. +An [ASCII control character](@) is a character between `U+0000–1F` (both +including) or `U+007F`. + An [ASCII punctuation character](@) is `!`, `"`, `#`, `$`, `%`, `&`, `'`, `(`, `)`, -`*`, `+`, `,`, `-`, `.`, `/`, `:`, `;`, `<`, `=`, `>`, `?`, `@`, -`[`, `\`, `]`, `^`, `_`, `` ` ``, `{`, `|`, `}`, or `~`. 
+`*`, `+`, `,`, `-`, `.`, `/` (U+0021–2F), +`:`, `;`, `<`, `=`, `>`, `?`, `@` (U+003A–0040), +`[`, `\`, `]`, `^`, `_`, `` ` `` (U+005B–0060), +`{`, `|`, `}`, or `~` (U+007B–007E). A [punctuation character](@) is an [ASCII punctuation character] or anything in @@ -476,6 +481,347 @@ bar For security reasons, the Unicode character `U+0000` must be replaced with the REPLACEMENT CHARACTER (`U+FFFD`). + +## Backslash escapes + +Any ASCII punctuation character may be backslash-escaped: + +```````````````````````````````` example +\!\"\#\$\%\&\'\(\)\*\+\,\-\.\/\:\;\<\=\>\?\@\[\\\]\^\_\`\{\|\}\~ +. +
!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
+```````````````````````````````` + + +Backslashes before other characters are treated as literal +backslashes: + +```````````````````````````````` example +\→\A\a\ \3\φ\« +. +\→\A\a\ \3\φ\«
+```````````````````````````````` + + +Escaped characters are treated as regular characters and do +not have their usual Markdown meanings: + +```````````````````````````````` example +\*not emphasized* +\*not emphasized* +<br/> not a tag +[not a link](/foo) +`not code` +1. not a list +* not a list +# not a heading +[foo]: /url "not a reference" +ö not a character entity
+```````````````````````````````` + + +If a backslash is itself escaped, the following character is not: + +```````````````````````````````` example +\\*emphasis* +. +\emphasis
+```````````````````````````````` + + +A backslash at the end of the line is a [hard line break]: + +```````````````````````````````` example +foo\ +bar +. +foo
+bar
\[\`
\[\]
+
+````````````````````````````````
+
+
+```````````````````````````````` example
+~~~
+\[\]
+~~~
+.
+<pre><code>\[\]
+</code></pre>
+````````````````````````````````
+
+
+```````````````````````````````` example
+foo
+
+````````````````````````````````
+
+
+## Entity and numeric character references
+
+Valid HTML entity references and numeric character references
+can be used in place of the corresponding Unicode character,
+with the following exceptions:
+
+- Entity and character references are not recognized in code
+ blocks and code spans.
+
+- Entity and character references cannot stand in place of
+ special characters that define structural elements in
+ CommonMark. For example, although `&#42;` can be used
+ in place of a literal `*` character, `&#42;` cannot replace
+ `*` in emphasis delimiters, bullet list markers, or thematic
+ breaks.
+
+Conforming CommonMark parsers need not store information about
+whether a particular character was represented in the source
+using a Unicode character or an entity reference.
+
+[Entity references](@) consist of `&` + any of the valid
+HTML5 entity names + `;`. The
+document & © Æ Ď +¾ ℋ ⅆ +∲ ≧̸
+````````````````````````````````
+
+
+[Decimal numeric character
+references](@)
+consist of `&#` + a string of 1--7 arabic digits + `;`. A
+numeric character reference is parsed as the corresponding
+Unicode character. Invalid Unicode code points will be replaced by
+the REPLACEMENT CHARACTER (`U+FFFD`). For security reasons,
+the code point `U+0000` will also be replaced by `U+FFFD`.
+
+```````````````````````````````` example
+&#35; &#1234; &#992; &#0;
+.
+<p># Ӓ Ϡ �</p>
+````````````````````````````````
+
+
+[Hexadecimal numeric character
+references](@) consist of `&#` +
+either `X` or `x` + a string of 1-6 hexadecimal digits + `;`.
+They too are parsed as the corresponding Unicode character (this
+time specified with a hexadecimal numeral instead of decimal).
+
+```````````````````````````````` example
+&#X22; &#XD06; &#xcab;
+.
+<p>&quot; ആ ಫ</p>
+```````````````````````````````` + + +Here are some nonentities: + +```````````````````````````````` example +  &x; + +abcdef0; +&ThisIsNotDefined; &hi?; +. +  &x; &#; &#x; +� +&#abcdef0; +&ThisIsNotDefined; &hi?;
+````````````````````````````````
+
+
+Although HTML5 does accept some entity references
+without a trailing semicolon (such as `&copy`), these are not
+recognized here, because it makes the grammar too ambiguous:
+
+```````````````````````````````` example
+&copy
+.
+<p>&amp;copy</p>
+````````````````````````````````
+
+
+Strings that are not on the list of HTML5 named entities are not
+recognized as entity references either:
+
+```````````````````````````````` example
+&MadeUpEntity;
+.
+<p>&amp;MadeUpEntity;</p>
+```````````````````````````````` + + +Entity and numeric character references are recognized in any +context besides code spans or code blocks, including +URLs, [link titles], and [fenced code block][] [info strings]: + +```````````````````````````````` example + +. + +```````````````````````````````` + + +```````````````````````````````` example +[foo](/föö "föö") +. + +```````````````````````````````` + + +```````````````````````````````` example +[foo] + +[foo]: /föö "föö" +. + +```````````````````````````````` + + +```````````````````````````````` example +``` föö +foo +``` +. +foo
+
+````````````````````````````````
+
+
+Entity and numeric character references are treated as literal
+text in code spans and code blocks:
+
+```````````````````````````````` example
+`f&ouml;&ouml;`
+.
+<p><code>f&amp;ouml;&amp;ouml;</code></p>
föfö
+
+````````````````````````````````
+
+
+Entity and numeric character references cannot be used
+in place of symbols indicating structure in CommonMark
+documents.
+
+```````````````````````````````` example
+&#42;foo&#42;
+*foo*
+.
+<p>*foo*
+<em>foo</em></p>
+```````````````````````````````` + +```````````````````````````````` example +* foo + +* foo +. +* foo
+foo + +bar
+```````````````````````````````` + +```````````````````````````````` example + foo +. +→foo
+```````````````````````````````` + + +```````````````````````````````` example +[a](url "tit") +. +[a](url "tit")
+```````````````````````````````` + + + # Blocks and inlines We can think of a document as a sequence of @@ -514,8 +860,8 @@ one block element does not affect the inline parsing of any other. ## Container blocks and leaf blocks We can divide blocks into two types: -[container block](@)s, -which can contain other blocks, and [leaf block](@)s, +[container blocks](@), +which can contain other blocks, and [leaf blocks](@), which cannot. # Leaf blocks @@ -825,7 +1171,7 @@ Contents are parsed as inlines: ```````````````````````````````` -Leading and trailing blanks are ignored in parsing inline content: +Leading and trailing [whitespace] is ignored in parsing inline content: ```````````````````````````````` example # foo @@ -1024,6 +1370,20 @@ baz* baz ```````````````````````````````` +The contents are the result of parsing the headings's raw +content as inlines. The heading's raw content is formed by +concatenating the lines and removing initial and final +[whitespace]. + +```````````````````````````````` example + Foo *bar +baz*→ +==== +. +
+
aaa
foo
+
+````````````````````````````````
+
+
Closing code fences cannot have [info strings]:
```````````````````````````````` example
@@ -1991,14 +2365,15 @@ Closing code fences cannot have [info strings]:
An [HTML block](@) is a group of lines that is treated
as raw HTML (and will not be escaped in HTML output).
-There are seven kinds of [HTML block], which can be defined
-by their start and end conditions. The block begins with a line that
-meets a [start condition](@) (after up to three spaces
-optional indentation). It ends with the first subsequent line that
-meets a matching [end condition](@), or the last line of
-the document or other [container block]), if no line is encountered that meets the
-[end condition]. If the first line meets both the [start condition]
-and the [end condition], the block will contain just that line.
+There are seven kinds of [HTML block], which can be defined by their
+start and end conditions. The block begins with a line that meets a
+[start condition](@) (after up to three spaces optional indentation).
+It ends with the first subsequent line that meets a matching [end
+condition](@), or the last line of the document, or the last line of
+the [container block](#container-blocks) containing the current HTML
+block, if no line is encountered that meets the [end condition]. If
+the first line meets both the [start condition] and the [end
+condition], the block will contain just that line.
1. **Start condition:** line begins with the string `