MDX Specification

Updated 11022022-072900


Specification

MDX language and abstract syntax tree definitions.

Why?

To ensure a vibrant ecosystem and community, tooling needs to exist for formatting, linting, and plugins. This requires a foundational specification and abstract syntax tree so that parsing is properly handled before transforming to JSX/Hyperscript/React/Vue/etc. and potentially leveraging existing plugin ecosystems.

MDX the format is a syntax that can be implemented and parsed in any number of ways. Here we use the remark/unified ecosystem.

mdx-js/mdx uses remark to parse Markdown into an MDAST which is transpiled to MDXAST. This allows for the rich unified plugin ecosystem to be utilized while also ensuring a more robust parsing implementation by sharing the remark parser library.

How does it work?

The MDX transpilation flow consists of six steps, ultimately resulting in JSX that can be used in React/Preact/etc.

  1. Parse: Text => MDAST
  2. Transpile: MDAST => MDXAST
  3. Transform: MDX/Remark plugins applied to AST
  4. Transpile: MDXAST => MDXHAST
  5. Transform: Hyperscript plugins applied to AST
  6. Transpile: MDXHAST => JSX
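
The six steps above can be pictured as a chain of functions, each consuming the previous stage's output. The sketch below is illustrative only: `stage`, `pipeline`, and `run` are hypothetical names, and every stage merely records the conversion it stands for rather than doing any real parsing.

```javascript
// Illustrative only: each stage just records the conversion it stands for.
// The real stages live in @mdx-js/mdx and the remark/unified libraries.
const stage = (name, output) => input => ({
  ...input,
  kind: output,
  history: [...input.history, `${name}: ${input.kind} => ${output}`]
})

// The six steps, in the order listed above.
const pipeline = [
  stage('Parse', 'MDAST'),
  stage('Transpile', 'MDXAST'),
  stage('Transform', 'MDXAST'), // MDX/remark plugins run here
  stage('Transpile', 'MDXHAST'),
  stage('Transform', 'MDXHAST'), // hyperscript plugins run here
  stage('Transpile', 'JSX')
]

// Feed the source text through every stage in turn.
const run = source =>
  pipeline.reduce((value, step) => step(value), {
    kind: 'Text',
    source,
    history: []
  })

console.log(run('# Hello, world!').history.join('\n'))
```

The point of the shape is that plugins only ever see a tree (steps 3 and 5), never raw text or generated JSX.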

MDX

MDX is a superset of the CommonMark specification that adds embedded JSX and import/export syntax.

The official media type to label MDX content is text/mdx, and the file extension is .mdx.

Imports

ES2015 import syntax is supported. This can be used to transclude other MDX files or to import React components to render.

MDX transclusion

Shared content can be transcluded by using import syntax and then rendering the component. Imported MDX is transpiled to a React/JSX component at compile time.

import License from './shared/license.md'

## Hello, world!

<License />

React component rendering

import Video from './video'

## Hello, world!

<Video />

Exports

ES2015 export syntax is supported. This can be used to export metadata like layout or authors. It's a mechanism for an imported MDX file to communicate with its parent.

import { fred, sue } from '../data/authors'
import Layout from '../components/with-blog-layout'

export const meta = {
  authors: [fred, sue],
  layout: Layout
}
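
A rough sketch of how a parent might consume such exports. A plain object stands in for the compiled MDX module, strings stand in for rendered output, and `renderWithLayout` is a hypothetical helper; how the module is actually loaded depends on the bundler and is not shown.

```javascript
// Stand-in for what a parent might get from importing an MDX file:
// a default export (the document) plus named exports such as `meta`.
// Strings are used instead of React elements to keep the sketch runnable.
const post = {
  default: props => `<article>${props.title}</article>`,
  meta: {
    authors: ['fred', 'sue'],
    layout: content => `<main>${content}</main>`
  }
}

// Hypothetical helper: render the document, then wrap it in the exported
// layout when one is provided.
function renderWithLayout(mod, props) {
  const content = mod.default(props)
  const layout = mod.meta && mod.meta.layout
  return layout ? layout(content) : content
}

console.log(renderWithLayout(post, { title: 'Hello, world!' }))
// prints "<main><article>Hello, world!</article></main>"
```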

JSX

In MDX, all embedded markup is interpreted as JSX. Components can be imported for rendering.

Block level

JSX can be written at the block level. This means that it is rendered as a sibling to other root-level elements like paragraphs and headings. It must be surrounded by blank lines.

import { Logo } from './ui'

# Hello, world!

<Logo />

And here's a paragraph

Inline JSX

JSX can be used inline, which is useful for adding labels.

import { Indicator } from './ui'

# Button <Indicator variant="experimental" />

Fragment syntax

JSX blocks can be opened using JSX fragments <>Hello!</>. This is how you can achieve interpolation from props passed in.

Accessing properties

Any JSX block will have access to props. This can be done inline or at the block level.

Inline

# Hello, from <>{props.from}</>!

Block

# Hello, world!

<pre>{JSON.stringify(props, null, 2)}</pre>

Element to component mapping

It's often desirable to map HTML elements to React components, so that projects that don't want plain HTML elements as output can substitute their own. This is useful for component-centric projects.

import React from 'react'
import * as ui from './ui'

import Doc from './readme.md'

export default () =>
  <Doc
    components={{
      h1: ui.Heading,
      p: ui.Text,
      code: ui.Code
    }}
  />
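
Conceptually, the mapping is a lookup keyed by element name with a fallback to the plain element. The sketch below assumes that behavior; `resolveComponent` is a hypothetical helper and not part of the @mdx-js/mdx API.

```javascript
// Hypothetical lookup: given the element name produced from the Markdown and
// the `components` map supplied by the caller, pick what to render.
function resolveComponent(tagName, components = {}) {
  return Object.prototype.hasOwnProperty.call(components, tagName)
    ? components[tagName]
    : tagName // no override: fall back to the plain HTML element
}

// Stub component standing in for something like ui.Heading.
const Heading = props => props.children
const components = { h1: Heading }

console.log(resolveComponent('h1', components) === Heading) // prints true
console.log(resolveComponent('blockquote', components)) // prints blockquote
```

Using `hasOwnProperty` rather than a bare property read avoids accidentally resolving names such as `constructor` that exist on every object.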

MDXAST

The majority of the MDXAST specification is defined by MDAST. MDXAST is a superset of MDAST, with three additional node types:

  • jsx (in place of html)
  • import
  • export

It's also important to note that an MDX document that contains no JSX or imports is a valid MDAST.
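
That property can be checked mechanically by walking the tree. A minimal sketch, assuming nodes follow the usual `type`/`children` shape; `isPlainMdast` is a hypothetical helper, and it also treats `export` nodes as MDX-only, which goes slightly beyond the sentence above.

```javascript
// Node types that exist in MDXAST but not in MDAST.
const MDX_ONLY_TYPES = new Set(['jsx', 'import', 'export'])

// Returns true when no node in the tree uses an MDX-only type.
function isPlainMdast(node) {
  if (MDX_ONLY_TYPES.has(node.type)) return false
  return (node.children || []).every(child => isPlainMdast(child))
}

const plain = {
  type: 'root',
  children: [
    { type: 'heading', depth: 1, children: [{ type: 'text', value: 'Hello, world!' }] }
  ]
}

const withJsx = {
  type: 'root',
  children: [{ type: 'jsx', value: '<Logo />' }]
}

console.log(isPlainMdast(plain)) // prints true
console.log(isPlainMdast(withJsx)) // prints false
```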

Differences to MDAST

The import type is used to provide the necessary block elements to the remark HTML block parser and for the execution context/implementation. For example, a webpack loader might want to transform an MDX import by appending those imports.

export is used to emit data from MDX, similarly to traditional markdown frontmatter.

The jsx node would most likely be passed to Babel to create functions.

This will also differ a bit in parsing because the remark parser is built to handle particular HTML element types, whereas JSX support requires the ability to parse any tag, including self-closing ones.

The jsx, import, and export node types are defined below.

AST

JSX

jsx (ElementNode) contains the embedded JSX as a string, along with its children (ElementNode).

interface JSX <: Element {
  type: "jsx";
  value: string;
  children: [ElementNode]
}

For example, the following MDX:

<Heading hi='there'>
  Hello, world!
</Heading>

Yields:

{
  "type": "jsx",
  "value": "<Heading hi='there'>\n  Hello, world!\n</Heading>"
}

Import

import (Textnode) contains the raw import as a string.

interface Import <: Text {
  type: "import";
}

For example, the following MDX:

import Video from '../components/Video'

Yields:

{
  "type": "import",
  "value": "import Video from '../components/Video'"
}

Export

export (Textnode) contains the raw export as a string.

interface Export <: Text {
  type: "export";
}

For example, the following MDX:

export const foo = 'bar'

Yields:

{
  "type": "export",
  "value": "export const foo = 'bar'"
}
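
To make the three node shapes concrete, the sketch below builds them from a block of source text. It is a deliberate simplification: `toMdxNode` is a hypothetical helper that only inspects the start of the block, whereas a real parser has to cope with multi-line statements, nesting, and ordinary prose that happens to begin with these words.

```javascript
// Hypothetical classifier: turn one block of MDX source into an MDXAST-style
// node, or return null when the block is ordinary Markdown.
function toMdxNode(value) {
  if (/^import\s/.test(value)) return { type: 'import', value }
  if (/^export\s/.test(value)) return { type: 'export', value }
  if (/^</.test(value)) return { type: 'jsx', value }
  return null
}

console.log(toMdxNode("import Video from '../components/Video'").type) // prints import
console.log(toMdxNode('export const meta = {}').type) // prints export
console.log(toMdxNode('<Logo />').type) // prints jsx
console.log(toMdxNode('# Hello, world!')) // prints null
```

In every case the raw source is preserved in `value`, so a later stage can still hand it to a JavaScript toolchain.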

MDXHAST

The majority of the MDXHAST specification is defined by HAST. MDXHAST is a superset of HAST, with four additional node types:

  • jsx
  • import
  • export
  • inlineCode

It's also important to note that an MDX document that contains no JSX or imports results in a valid HAST.

Plugins

The @mdx-js/mdx implementation is pluggable at multiple stages in the transformation. This allows not only for retext/remark/rehype transformations, but also access to the powerful utility libraries these ecosystems offer.

Remark/MDX processing is async, so any plugin that returns a promise will be awaited.
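
A minimal sketch of what "awaited" means in practice, assuming transformers run one after another and that returning nothing keeps the current tree. `runTransformers` is a hypothetical stand-in for the processor, not the unified implementation.

```javascript
// Run each transformer in order, waiting for any promise it returns.
async function runTransformers(tree, transformers) {
  let current = tree
  for (const transform of transformers) {
    // `await` accepts both plain return values and promises.
    const result = await transform(current)
    // Assumed convention: a transformer that returns nothing mutated in place.
    if (result !== undefined) current = result
  }
  return current
}

// One synchronous and one asynchronous transformer, for illustration.
const syncPlugin = tree => {
  tree.visited = ['sync']
}
const asyncPlugin = tree =>
  new Promise(resolve =>
    setTimeout(() => {
      tree.visited.push('async')
      resolve(tree)
    }, 10)
  )

runTransformers({ type: 'root', children: [] }, [syncPlugin, asyncPlugin])
  .then(tree => console.log(tree.visited))
// prints [ 'sync', 'async' ]
```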

MDAST plugins

When the MDX library receives text, it uses remark-parse to parse the raw content into an MDAST. MDX then uses a few formatting plugins to ensure the MDAST is cleaned up. At this stage, users have the ability to pass any (optional) plugins to manipulate the MDAST.

const jsx = mdx(myMDX, { mdPlugins: [myPlugin] })

Let's consider the following default remark-images plugin that MDX uses. This plugin automatically turns a paragraph that consists of only an image link into an image node.

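const isUrl = require('is-url')
const visit = require('unist-util-visit')

// Matches strings containing a common image file extension
const isImgUrl = str => /\.(svg|png|jpg|jpeg)/.test(str)

// Transformer: turn text nodes that are image URLs into image nodes
module.exports = () => (tree, file) =>
  visit(tree, 'text', node => {
    const text = node.value ? node.value.trim() : ''

    // Skip anything that isn't a URL pointing at an image
    if (!isUrl(text) || !isImgUrl(text)) {
      return
    }

    // Rewrite the node in place as an image node
    node.type = 'image'
    node.url = text

    delete node.value
  })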

Not bad. The unist-util-visit utility library makes it terse to select the nodes we care about: we check each node against a few conditions and manipulate it if it's what we're looking for.
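
For readers who want to see the traversal itself, here is a dependency-free sketch. The `visit` function below is a simplified stand-in written for this example, not the unist-util-visit implementation, and the built-in `URL` constructor replaces the is-url package.

```javascript
// Simplified stand-in for unist-util-visit: depth-first walk that calls
// `visitor` on every node whose type matches.
function visit(tree, type, visitor) {
  if (tree.type === type) visitor(tree)
  for (const child of tree.children || []) visit(child, type, visitor)
}

// Swapped-in URL check using the built-in URL constructor.
const isUrl = str => {
  try {
    new URL(str)
    return true
  } catch (err) {
    return false
  }
}

const tree = {
  type: 'root',
  children: [
    { type: 'paragraph', children: [{ type: 'text', value: 'https://example.com/cat.png' }] },
    { type: 'paragraph', children: [{ type: 'text', value: 'Just some prose.' }] }
  ]
}

// Same rewrite as the plugin: image URLs become image nodes.
visit(tree, 'text', node => {
  const text = node.value.trim()
  if (!isUrl(text) || !/\.(svg|png|jpg|jpeg)/.test(text)) return
  node.type = 'image'
  node.url = text
  delete node.value
})

console.log(tree.children[0].children[0])
// prints { type: 'image', url: 'https://example.com/cat.png' }
```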

HAST plugins

HAST plugins operate similarly to MDAST plugins, except that the tree they receive is defined by a different AST specification (HAST).

const jsx = mdx(myMDX, { hastPlugins: [myPlugin] })

Let's consider another example plugin which asynchronously requests an image's size and sets it as attributes on the node.

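// Assumed dependencies (the original snippet does not name them)
const { selectAll } = require('hast-util-select')
const requestImageSize = require('request-image-size')

await mdx(myMDX, {
  hastPlugins: [
    () => tree => {
      // Look up each image's dimensions and set them on its node
      const imgPx = selectAll('img', tree).map(async node => {
        const size = await requestImageSize(node.properties.src)
        node.properties.width = size.width
        node.properties.height = size.height
      })

      // Resolve with the tree once every image has been sized
      return Promise.all(imgPx).then(() => tree)
    }
  ]
})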

Note that this might want to also take an optional argument for the max width of its container for a truly solid layout, but this is a proof of concept.
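
One possible shape for that optional argument, sketched without external dependencies. Everything here is hypothetical: `fitToWidth`, `imageSize`, and its `maxWidth` and `getSize` options are names invented for this example, and `getSize` is injected so the sketch does not commit to a particular image-size library.

```javascript
// Scale a size down proportionally so it fits within `maxWidth`.
function fitToWidth(size, maxWidth) {
  if (!maxWidth || size.width <= maxWidth) return size
  const scale = maxWidth / size.width
  return { width: maxWidth, height: Math.round(size.height * scale) }
}

// Hypothetical plugin factory: options in, plugin out.
const imageSize = ({ maxWidth, getSize }) => () => async tree => {
  // Collect every <img> element in the HAST tree.
  const images = []
  const collect = node => {
    if (node.type === 'element' && node.tagName === 'img') images.push(node)
    for (const child of node.children || []) collect(child)
  }
  collect(tree)

  // Size all images concurrently, then clamp to the container width.
  await Promise.all(
    images.map(async node => {
      const size = fitToWidth(await getSize(node.properties.src), maxWidth)
      node.properties.width = size.width
      node.properties.height = size.height
    })
  )
  return tree
}

console.log(fitToWidth({ width: 1200, height: 800 }, 600))
// prints { width: 600, height: 400 }
```

The result of calling `imageSize({ maxWidth: 600, getSize })` has the same `() => tree => ...` shape as the plugin above, so it could be passed in the `hastPlugins` array.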

Related

This specification documents the original .mdx proposal by Guillermo Rauch (@rauchg).


Is your work missing?

If you have related work or prior art we've failed to reference, please open a PR!

