In September 2014, a group of well-known developers — Jeff Atwood of Stack Overflow, John MacFarlane of Pandoc, Vicent Marti of GitHub, and a small coalition of collaborators — announced a project called Standard Markdown. Their goal was modest: a rigorous specification for the syntax John Gruber had introduced ten years earlier, with reference implementations in C and JavaScript and a test suite that any parser could be measured against. They were not trying to take Markdown over. They were trying to make it precise enough that two implementations would agree on what a single markdown file meant.
Gruber’s response was to demand they shut down.
Within two weeks the project had been renamed. Standard Markdown became CommonMark. The site was scrubbed of references to the original name. The work continued. Gruber, twelve years later, still does not endorse it.
This is the story of how the most widely used lightweight markup format on the web ended up with no canonical specification, dozens of mutually incompatible flavors, seventeen separate rules just to handle emphasis edge cases, and — as of 2024 — a brand new layer of fragmentation introduced by AI chatbots that each emit markdown in their own house dialect. It is the story of an original sin committed by a brilliant designer who, twenty-two years ago, optimized his format for one thing and one thing only: readability for human writers. It is also the story of why SoloTrillion is shipping AIX Universal Markdown, a specification that does the one thing nobody else has been willing to do.
December 17, 2004
John Gruber published Markdown 1.0.1 on his blog Daring Fireball on December 17, 2004. The release notes were modest. The download was an 18-kilobyte Perl script. The syntax description was a single page of prose. It began:
Markdown is intended to be as easy-to-read and easy-to-write as is feasible. Readability, however, is emphasized above all else. A Markdown-formatted document should be publishable as-is, as plain text, without looking like it’s been marked up with tags or formatting instructions.
That phrase, “publishable as-is, as plain text,” is the foundational design constraint of the entire system. Markdown is not a markup language for machines. It is a writing format for humans who happen to want HTML output at the end. The syntax is borrowed from email conventions because email was, in 2004, the place where most people had encountered ad-hoc plain-text formatting. Asterisks for emphasis, because asterisks look like emphasis. Numbered lists for numbered lists, because that’s how you write a numbered list. Angle brackets at the start of a line for blockquotes, because that’s what email clients had been doing for decades.
Gruber’s co-author Aaron Swartz contributed substantially to the early implementation and design discussions. Together they shipped a Perl script and a syntax page and a license, and that was, for the most part, that.
What they did not ship was a specification.
What Gruber Didn’t Decide
The Daring Fireball syntax page is a description, not a specification. It describes the format the way you would describe a cooking technique to a friend: clearly, in prose, with examples, but without the precision required to write a deterministic implementation.
Several things were never normatively defined:
- Sublist indentation. How many spaces are required to indent a nested list item? Two? Three? Four? A tab? It depends on which parser you use.
- Blank lines around block elements. Does a blockquote require a blank line before it? Does a heading? Some parsers say yes; some say no.
- Emphasis precedence. What does `*foo_bar_baz*` mean? What about `**foo *bar* baz**`? What about `*foo**bar**baz*`? Every parser handles these differently.
- List continuation. When does a paragraph belong to the previous list item versus the surrounding document? The indentation rules are underspecified.
- HTML and markdown interaction. Markdown is supposed to pass through inline HTML. But what counts as inline? What about block-level tags inside a markdown list item?
In 2004 these ambiguities were minor. The original Markdown.pl implementation handled them however Gruber and Swartz had handled them in the script, and if you wanted Markdown you used the script and got those answers. The script was buggy in places, but it was the only Markdown that existed, so its bugs were also Markdown’s bugs.
Then other people started writing parsers.
By 2008 there were Markdown parsers in Ruby, Python, JavaScript, C, PHP, and a half dozen other languages. Each one handled the ambiguities slightly differently. Each one made judgment calls about edge cases. None of them agreed.
And because there was no specification anyone could appeal to, none of them was wrong.
September 2014: The Open Letter
Ten years after Markdown’s release, the situation had become untenable. GitHub had begun using Markdown for issue threads, pull request descriptions, and README files. Stack Overflow had begun using it for questions and answers. Discourse had built an entire forum platform on top of it. Hundreds of static site generators, content management systems, and chat applications had adopted it. Every one of these systems had its own slightly different parser, which meant the same markdown file produced different HTML depending on where you pasted it.
In September 2014, Jeff Atwood, who had co-founded Stack Overflow and was now building Discourse, published a blog post titled “Standard Flavored Markdown” announcing a project he had been working on with John MacFarlane, the author of Pandoc. Their goal was a rigorous specification of Markdown’s existing syntax — not a redesign, not an extension, just a precise written description of what Markdown.pl actually did, plus reference implementations in C (cmark) and JavaScript that would parse any markdown the same way.
The project would be called Standard Markdown. The implementations would be open source. The spec would be open for community contribution. The test suite would have several hundred examples covering every corner case Atwood, MacFarlane, and their collaborators (Vicent Marti, Neil Williams, Benjamin Dumke-von der Ehe, and David Greenspan) could think of.
Gruber’s response — sent privately, then reported publicly by Dave Winer and others — was hostile. He called the name infuriating. He objected to the use of the word “Markdown” in any standardization project he had not personally blessed. He demanded that Atwood and MacFarlane rename their project, take down their site, and apologize for the unauthorized use of his trademark.
Atwood and MacFarlane had not, as far as anyone could tell, broken any law. Markdown was released under a BSD-style license that explicitly permitted modification and redistribution. The name “Markdown” had never been formally trademarked. But Gruber was the author, and his moral claim — that the format was his creation and he was entitled to control what was published under its name — had weight. Within two weeks, Standard Markdown had become CommonMark.
The technical work continued unchanged. The reference implementations shipped. The test suite grew. The specification matured. By 2016 CommonMark was the de facto standard for new Markdown implementations, and in 2017 GitHub published a formal specification of GitHub Flavored Markdown (GFM) as a strict superset of CommonMark, adding tables, task lists, strikethrough, and autolinks.
Gruber did not endorse CommonMark. Gruber has not endorsed CommonMark since. As far as anyone can tell, Gruber considers CommonMark to be a project he tolerates because he cannot stop it.
The Compromise: CommonMark
CommonMark is the most disciplined attempt at standardizing Markdown that has ever been undertaken. The current specification is 80,000 words long. It has 649 test cases. It defines the parsing of every Markdown construct in terms of explicit, deterministic algorithms.
It is also a compromise.
CommonMark’s design principle was to remain “faithful to Gruber’s Markdown” wherever possible. When the original syntax page was ambiguous, CommonMark chose the interpretation that most closely matched what Markdown.pl actually did. When Markdown.pl itself was inconsistent, CommonMark chose the interpretation that broke the fewest existing documents.
This is the right design principle for a backwards-compatibility-first standard. It is also why CommonMark is more complex than it should be.
The most famous example is emphasis. Markdown allows `*foo*` for italics and `**foo**` for bold. But what about `*foo_bar*`? What about `**foo**bar**`? What about `*foo *bar* baz*`? What about emphasis that begins inside a word and ends outside it?
CommonMark’s answer is a section of the specification with seventeen separate rules governing emphasis precedence. They are necessary because the original Markdown left these cases undefined, and different parsers had picked different answers, and CommonMark had to pick an answer that matched the most common existing behavior across the widest range of inputs.
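To give a flavor of what those rules look like, the sketch below implements only the first step the CommonMark spec takes: classifying a run of `*` or `_` characters as left-flanking or right-flanking, and deciding from that whether the run may open or close emphasis. It is an illustration, not the reference implementation. The function names are hypothetical, and the whitespace and punctuation tests are approximations built on Python's `str.isspace` and `unicodedata`.

```python
import unicodedata


def is_whitespace(ch: str) -> bool:
    # The start and end of the line count as whitespace;
    # an empty string stands in for "no character here".
    return ch == "" or ch.isspace()


def is_punctuation(ch: str) -> bool:
    # Approximation: Unicode general categories P* (punctuation) and S* (symbols).
    return ch != "" and unicodedata.category(ch)[0] in "PS"


def neighbors(text: str, start: int, end: int) -> tuple[str, str]:
    """Characters immediately before and after the delimiter run text[start:end]."""
    before = text[start - 1] if start > 0 else ""
    after = text[end] if end < len(text) else ""
    return before, after


def flanking(text: str, start: int, end: int) -> tuple[bool, bool]:
    """Return (left_flanking, right_flanking) for the delimiter run."""
    before, after = neighbors(text, start, end)
    left = not is_whitespace(after) and (
        not is_punctuation(after) or is_whitespace(before) or is_punctuation(before)
    )
    right = not is_whitespace(before) and (
        not is_punctuation(before) or is_whitespace(after) or is_punctuation(after)
    )
    return left, right


def can_open_close(text: str, start: int, end: int) -> tuple[bool, bool]:
    """Return (can_open, can_close) for a run of '*' or '_' characters."""
    left, right = flanking(text, start, end)
    if text[start] == "*":
        return left, right
    # Underscores are stricter, which is what rules out intraword emphasis.
    before, after = neighbors(text, start, end)
    can_open = left and (not right or is_punctuation(before))
    can_close = right and (not left or is_punctuation(after))
    return can_open, can_close
```

Under these definitions the asterisk in `foo*bar` may both open and close emphasis, while the underscore in `foo_bar` may do neither. The remaining rules in the spec decide how candidate openers and closers pair up.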
John MacFarlane, CommonMark’s chief architect, wrote an essay in 2018 titled “Beyond Markdown” acknowledging this directly. CommonMark, he wrote, had become “more conservative and complex than I would have liked.” The fidelity-first design had locked the specification into a fixed point of historical accident. A clean-slate replacement, MacFarlane argued, could be simpler, but a clean-slate replacement would also not be Markdown anymore, which meant nobody would adopt it.
The seventeen emphasis rules are the visible scar tissue of an unresolvable design tension: a syntax that wanted to be informal, used at a scale that required formalism, with a creator who refused to participate in the formalization. The complexity is not anyone’s fault. It is the cost of the original sin.
The Flavor Explosion
CommonMark and GFM are the two best-known Markdown specifications, but they are nowhere near the only ones. The CommonMark wiki currently catalogs dozens of documented Markdown flavors, each with its own extensions, omissions, and interpretation rules.
A non-exhaustive list:
- Markdown (Gruber, 2004) — The original. Informal specification only. Reference implementation: `Markdown.pl`.
- MultiMarkdown (Fletcher Penney, 2005) — Adds tables, footnotes, citations, math support, and metadata blocks. Targeted at academic and technical writing.
- Markdown Extra (Michel Fortin, 2006) — A PHP-flavored extension adding tables, fenced code blocks, abbreviations, definition lists, and ID attributes on headings.
- Pandoc Markdown (John MacFarlane, 2006) — The most feature-rich flavor, designed for academic publishing. Heavy LaTeX integration, citation processing, cross-references, and a vast array of extensions toggleable per document.
- kramdown (Thomas Leitner, 2009) — A Ruby implementation used as the default Markdown processor for Jekyll, GitHub Pages, and many static site generators.
- CommonMark (Atwood, MacFarlane, et al., 2014) — The rigorous parser specification. Minimal feature set; explicit non-goal of feature creep.
- GitHub Flavored Markdown (GitHub, 2017) — Strict superset of CommonMark adding tables, task lists, strikethrough, and autolinks.
- GitLab Flavored Markdown — Similar to GFM but with subtle differences in handling of nested lists and inline HTML.
- R Markdown (RStudio, 2014) — Pandoc Markdown extended with executable code chunks for reproducible data science notebooks.
- Discount, Redcarpet, MMark, Goldmark, marked.js, markdown-it, Showdown, Markdig, blackfriday, mistune, Snudown — at least a dozen more parsers in active use, each making its own judgment calls on the underspecified parts of the syntax.
A project called Babelmark — first Babelmark 1, then Babelmark 2, now Babelmark 3 — exists for the express purpose of demonstrating how different Markdown parsers interpret the same input. You can paste a Markdown snippet into Babelmark and see twenty different rendered outputs side by side, one per parser. They are often dramatically different. The same input. Twenty answers.
This is what happens when a writing format becomes critical infrastructure without a binding specification. Every consumer makes their own decisions, every decision is defensible, and the format gradually becomes a Schelling point with no center — a coordination problem dressed as a syntax.
The AI Era: A New Kind of Chaos
For twenty years, the Markdown variance problem was a parser problem. The same source file produced different output depending on which parser read it. That was bad, but it was a finite badness. The set of parsers was countable. The differences were enumerable. Tools like Babelmark made the variance visible. Standards-conscious authors learned to write in the conservative intersection of CommonMark and GFM, and most of the time everything was fine.
In 2023, this changed.
ChatGPT, Claude, Gemini, and Perplexity began emitting markdown by the megabyte. Every paragraph an AI chatbot produces is, by default, formatted as markdown — bold for emphasis, headings for structure, bullet lists for enumerations, tables for comparisons, code blocks for code. The output is then rendered by the chat interface into styled HTML, and the user reads the styled rendering, and copies the markdown source if they want to take it elsewhere.
But each chatbot generates markdown differently.
A test we conducted in May 2026 — feeding all four major commercial chatbots an identical prompt asking for an overview of five AI agent platforms with paragraph descriptions, bullet lists, and a comparison table — produced four outputs that differed on at least seventeen separately measurable formatting dimensions:
- Item headings: Perplexity used `## LangGraph`. Gemini used `### 1. LangGraph`. ChatGPT used `## **1. OpenAI AgentKit**` with bold and a leading number. Claude used `**1\. LangChain**`: bold on its own line as a pseudo-heading, with a backslash-escaped period.
- Heading hierarchy: Perplexity went H1 → H2. Gemini skipped H2 entirely, going H1 → H3. ChatGPT went H1 → H2 → H2. Claude refused to use H2 at all and used bold as a heading substitute.
- Bullet markers: Perplexity used hyphens. The other three used asterisks.
- Bullet spacing: Gemini inserted a blank line between every adjacent list item. The other three did not.
- Section separators: Perplexity used `***`. ChatGPT used `---`. Gemini used nothing. Claude used `⠀` (the Unicode Braille Pattern Blank character, U+2800), which renders invisibly but occupies a line.
- Numbered lists in headings: Claude wrote `1\.` with a backslash-escaped period. Nobody else did.
- Hyphens in prose: Perplexity replaced ASCII hyphens with U+2011 non-breaking hyphens throughout. Grep and search-and-replace on `-` will miss them.
- Table cell values: Perplexity, Gemini, and ChatGPT used plain text: Yes, No, Partial. Claude’s interface rendered the table with green checkmark glyphs, red X glyphs, and yellow dot glyphs, and when the user copied the output to the clipboard, the entire table was dropped.
That last one is the most consequential finding. Claude’s table existed on the screen. It did not exist in the clipboard. The interface had rendered the table as a styled UI component rather than as markdown text, and the copy operation did not include the component. The markdown the user could share was incomplete — and the user had no way to know unless they ran the test the same way we did.
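Several of the character-level differences listed above are invisible on screen but easy to detect mechanically. The following is a minimal sketch, not the AIX Validator: the function name `scan` and the report keys are hypothetical, and the bullet detection is deliberately simple.

```python
import re
from collections import Counter

NON_BREAKING_HYPHEN = "\u2011"   # looks like "-" but is not matched by a search for "-"
BRAILLE_BLANK = "\u2800"         # renders as nothing, yet occupies a line

# A list marker followed by whitespace and some content.
# Limitation: a thematic break written as "* * *" would be miscounted as a bullet.
BULLET = re.compile(r"^[ \t]*([-*+])[ \t]+\S")
# A digit followed by a backslash-escaped period, as in "1\. LangChain".
ESCAPED_PERIOD = re.compile(r"\d\\\.")


def scan(markdown: str) -> dict:
    """Count a few formatting quirks that are hard to spot by eye."""
    bullets = Counter()
    for line in markdown.splitlines():
        match = BULLET.match(line)
        if match:
            bullets[match.group(1)] += 1
    return {
        "non_breaking_hyphens": markdown.count(NON_BREAKING_HYPHEN),
        "braille_blanks": markdown.count(BRAILLE_BLANK),
        "escaped_periods": len(ESCAPED_PERIOD.findall(markdown)),
        "bullet_markers": dict(bullets),
    }
```

A document that mixes `*` and `-` bullets shows up as two keys under `bullet_markers`, and any nonzero count for the two Unicode characters means a plain-text search will not behave the way the reader expects.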
Claude’s failure was an interface-level render-versus-copy mismatch. The next case is harder: a respected notes app, sitting between the user and the clipboard, breaking the markdown contract in two different ways depending on what it found.
Bear’s Two Mutations
In May 2026 we ran a controlled follow-up. For two chatbots we captured the same output twice — once piped directly from raw chatbot copy into the Validator, once routed through Bear (a respected notes app) before reaching the Validator.
| Chatbot | Path | Validator result | Issues |
|---|---|---|---|
| Gemini | Raw → Validator | 3 of 6 tests fail | 36 |
| Gemini | Raw → Bear → Validator | 1 of 6 tests fail | 1 |
| Perplexity | Raw → Validator | 2 of 6 tests fail | 65 |
| Perplexity | Raw → Bear → Validator | 2 of 6 tests fail | 50 |
The cleaner-looking Bear results are lies. Bear didn’t fix the markdown. It mutated it.
For Gemini, Bear’s mutation was structural. Asterisk bullets that should have been flagged as Bullet Marker violations came through as plain hyphen-prefixed lines, which conform to AIX by coincidence rather than by intent. The comparison table that should have been examined for cell padding and citation markers didn’t reach the clipboard at all. Thirty-five missing issues didn’t disappear because Bear resolved them; they disappeared because the structure carrying them was deleted in transit.
For Perplexity, Bear’s mutation was character-level. The bullets and the table survived intact, but a portion of the non-breaking hyphens (U+2011) got normalized to regular hyphens during the paste cycle. The failure count looks the same (2 of 6), but the depth changed: fifteen flagged hyphens vanished, not because Perplexity stopped emitting them but because Bear cleaned them up on the way through.
Two chatbots, two different mutation modes, same effect: the Validator received markdown that no longer faithfully represented what the chatbot emitted. A document with no asterisk bullets cannot fail the bullet marker test. A document with fifteen fewer non-breaking hyphens cannot report on those fifteen hyphens. The Validator was telling the truth about what arrived. What didn’t arrive is the spec’s actual concern.
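One way to catch this kind of silent mutation is to profile the raw capture and the relayed capture separately and compare the counts, rather than validating only what arrived. The sketch below assumes both captures are available as strings. The feature set and the names `profile` and `round_trip_losses` are hypothetical and are not part of the AIX tooling.

```python
import re

# Structural features worth counting before and after a round trip.
FEATURES = {
    "asterisk_bullets": re.compile(r"^[ \t]*\*[ \t]+\S", re.MULTILINE),
    "hyphen_bullets": re.compile(r"^[ \t]*-[ \t]+\S", re.MULTILINE),
    "table_rows": re.compile(r"^[ \t]*\|.*\|[ \t]*$", re.MULTILINE),
    "non_breaking_hyphens": re.compile("\u2011"),
}


def profile(markdown: str) -> dict:
    """Count each feature in a single capture."""
    return {name: len(rx.findall(markdown)) for name, rx in FEATURES.items()}


def round_trip_losses(raw: str, relayed: str) -> dict:
    """Map each feature whose count changed to a (raw_count, relayed_count) pair."""
    before, after = profile(raw), profile(relayed)
    return {name: (before[name], after[name])
            for name in before if before[name] != after[name]}
```

An empty result does not prove fidelity, since it only covers the features counted, but a non-empty one shows that something between the chatbot and the validator rewrote the document.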
By Design
Bear is a respected notes app, not a negligent one. It made a design choice years ago: when you copy text out, the clipboard receives normalized prose intended for use elsewhere — not a verbatim copy of Bear’s internal representation, not a roundtrippable markdown serialization. The choice serves Bear’s primary use case — moving readable text between apps — well. It does not serve AIX’s use case, which is structural fidelity, at all.
This isn’t a Bear-specific bug. It’s a class of behavior shared by most of the surface area users trust. Notion converts pasted markdown to its own block format and surfaces a different markdown on copy out. Google Docs flattens everything to its proprietary structure and exports a re-derivation. Obsidian preserves more than most but still applies its own normalization. Even macOS Pasteboard performs conversions when applications declare multiple representations of the same selection. Every tool in the round-trip chain can be a fidelity-failure surface, and the user has no visibility into which surface mutated what.
Bear is both a copy-layer failure example and a reason the AIX Markdown Cleaner exists. Not a contradiction — it’s the structural argument for the spec in a single sentence.
This is a new category of failure. The Markdown wars of the 2010s were about parsers disagreeing on inputs. The Markdown wars of the 2020s are about three new failure surfaces: generators disagreeing on outputs, interfaces transforming or dropping structural elements on the way to the clipboard, and tools further down the round-trip chain mutating what survives.
CommonMark cannot fix this. GFM cannot fix this. Both are parser specifications. They tell you how to interpret markdown. They are silent on what a generator should produce, and silent on what an interface must preserve.
The seat is empty.
The Empty Seat
Every Markdown standardization effort of the last twenty-two years has been a parser specification. Gruber’s original page. CommonMark. GFM. Pandoc’s extension list. Every one of them answers the question: given this markdown, what is the correct HTML?
None of them answers the question: what markdown should be emitted in the first place?
This is the gap AIX Universal Markdown fills. It is a generator specification, not a parser specification. It tells AI tools — and any other system producing markdown for downstream consumption — what to emit. It defines canonical forms for the seventeen dimensions where commercial chatbots currently disagree. It specifies what interfaces must preserve when they render. It specifies what copy operations must round-trip without loss.
It does not relitigate any parser ambiguity. CommonMark and GFM are the foundation; AIX Universal Markdown sits on top.
And, deliberately, it does not compete with Gruber’s Markdown. By being explicitly a generator-side specification rather than a parser-side one, AIX Universal Markdown sidesteps the entire 2014 controversy. It is not Markdown 2.0. It is not a fork. It is a complementary standard that says: if you are an AI tool generating markdown for a human to read and re-use, here is what you must emit.
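What a canonical form means for a generator can be sketched with a small normalizer. This is an illustration only, not the AIX Markdown Cleaner and not text from the specification. It assumes that hyphen bullets and ASCII hyphens are the canonical choices (the Bear results above suggest hyphen-prefixed bullets conform; the rest is an assumption), and the name `canonicalize` is hypothetical.

```python
import re

FENCE = "`" * 3  # delimiter that opens and closes a fenced code block
BULLET = re.compile(r"^([ \t]*)[*+]([ \t]+)(?=\S)")
# Three or more of the same marker, optionally spaced: a thematic break, not a list.
THEMATIC_BREAK = re.compile(r"^[ \t]*([-*_])([ \t]*\1){2,}[ \t]*$")


def canonicalize(markdown: str) -> str:
    """Rewrite a few generator quirks into one assumed canonical form."""
    out = []
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence      # leave fence lines untouched
            out.append(line)
            continue
        if not in_fence:
            line = line.replace("\u2011", "-")   # non-breaking hyphen -> ASCII hyphen
            line = line.replace("\u2800", "")    # drop Braille Pattern Blank
            if not THEMATIC_BREAK.match(line):
                line = BULLET.sub(r"\1-\2", line)  # "* item" or "+ item" -> "- item"
        out.append(line)
    result = "\n".join(out)
    # splitlines() discards a trailing newline; restore it if the input had one.
    return result + "\n" if markdown.endswith("\n") else result
```

Content inside fenced code blocks is passed through unchanged, since rewriting characters there would alter code rather than formatting.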
Where SoloTrillion Stands
SoloTrillion’s position is that the era of informal Markdown has ended, whether or not its creator wishes to admit it. AI-generated markdown is now the dominant source of new markdown content on the web. The variance between AI generators is documentable, harmful, and growing. The cost of doing nothing is that every consumer of AI output writes their own ad-hoc cleanup routines, and no two of those routines produce the same result.
We are not the first to identify this problem. We are the first to publish a specification that addresses it explicitly, name it, version it, license it under Apache 2.0, and ship a reference validator and cleaner alongside it.
The AIX Validator on this site will tell you whether a markdown document conforms to AIX Universal Markdown v0.1 — heading hierarchy, bullet markers, list spacing, table structure, special characters, and round-trip integrity. The Markdown Cleaner on this site will take whatever an AI chatbot put on your clipboard and rewrite it into AIX-conformant form.
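As an illustration of what one such check might involve, here is a sketch of a heading-hierarchy test. It is not the AIX Validator's implementation. It assumes that the rule forbids jumping more than one level down at a time, which the Gemini example (H1 → H3) suggests but the article does not state, and the function names are hypothetical.

```python
import re

FENCE = "`" * 3  # delimiter that opens and closes a fenced code block
# Up to three spaces of indentation, one to six "#", then whitespace or end of line.
ATX_HEADING = re.compile(r"^ {0,3}(#{1,6})(?:[ \t]+|$)")


def heading_levels(markdown: str) -> list[int]:
    """Collect ATX heading levels in document order, ignoring fenced code."""
    levels = []
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = ATX_HEADING.match(line)
        if match:
            levels.append(len(match.group(1)))
    return levels


def skipped_levels(markdown: str) -> list[tuple[int, int]]:
    """Return (previous, current) pairs where a heading skips a level."""
    levels = heading_levels(markdown)
    return [(prev, cur) for prev, cur in zip(levels, levels[1:]) if cur > prev + 1]
```

A bold line used as a pseudo-heading produces no entry at all here, so a fuller check would also need to flag documents whose structure lives outside heading syntax.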
Both tools work today. The specification is open. Adoption is voluntary. The chaos is optional.
Read the spec. Validate your output. Clean what the chatbots gave you.
The AIX Validator checks any markdown document against all six AIX Universal Markdown v0.1 conformance tests.
The AIX Universal Markdown specification is published as version 0.1, May 2026, under the Apache 2.0 license. Heritage: Markdown (Gruber, 2004), CommonMark (2014), GitHub Flavored Markdown (2017), AIX (Ubertrends LLC, 2025). The specification, the Validator, and the Markdown Cleaner are open source and free to use.