# latex2exp: a package to render LaTeX in R graphics

This month, I pushed a big update to
**latex2exp**, bumping it to
version 0.9.

latex2exp is an R package for rendering LaTeX-formatted text in any plotting context. This package lets the user add formatted text (e.g. bold, italic, underline, different font sizes…), mathematical symbols, and equations to plots. Although this capability exists in base R, latex2exp makes it more accessible by exposing it via LaTeX, which is the most common standard for typesetting mathematical formulas.

The 0.9.3 update massively increased the number of LaTeX commands recognized, reworked the parser to work correctly in circumstances where the old parser would fail, improved documentation, and expanded its test suite to include a large number of sample expressions, seen-in-the-wild uses of latex2exp, and edge cases.

In this post, I will briefly outline how latex2exp came to be, how it works, what changed, and what the roadmap for the package looks like.

## Typesetting formatted text and formulas in base R

The primary way to add formatted text and mathematical notation in base
R is to use **plotmath expressions**.

plotmath (described in `?plotmath`) is a DSL that comprises
expressions that obey the syntax of regular R code, but are specially
interpreted by a renderer. An expression such as

```
expression(lim(frac(f(x+h[i])-f(x), h[i]), h[i] %->% 0))
```

combines function calls, subscripts, and operators like `%->%` to
produce a valid, unevaluated R expression. Each operand and token is
then interpreted by the renderer at plotting time; for instance, the
expression above is rendered as the limit, as h_i approaches 0, of the
difference quotient (f(x + h_i) - f(x)) / h_i.

Plotmath expressions can be used as plot annotations, titles, axis and legend labels in base R, lattice, and ggplot2 graphics.
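
As a minimal sketch in base R (the variable name `limit_label` and the plotted data are only illustrative), the expression above can be stored unevaluated and passed wherever a label is accepted:

```r
# Build the plotmath expression; nothing is evaluated here, so
# f, x, and h do not need to exist as R objects.
limit_label <- expression(lim(frac(f(x + h[i]) - f(x), h[i]), h[i] %->% 0))

# It is an ordinary R object of type "expression" wrapping a single call.
is.expression(limit_label)   # TRUE
is.call(limit_label[[1]])    # TRUE

# The renderer interprets the expression only when the plot is drawn.
plot(1:10, main = limit_label)
```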

This system has a number of disadvantages:

1. **The syntax is unfamiliar to most R users.** LaTeX is the de facto standard for typesetting mathematical equations. Although the plotmath system is somewhat reminiscent of LaTeX, anecdotally, new and experienced R users find the plotmath syntax unnecessarily different and complicated. While it is an interesting application of R’s ability to do metaprogramming, in practice, this DSL does not bring tangible advantages over a string containing LaTeX notation.
2. **Equations have to be valid R syntax.** Because a plotmath expression has to be parsable, a number of equations need workarounds in order to be written as valid R expressions. These workarounds typically combine `{}` (braces are invisible groups), `phantom()` (an invisible token), and `*` (which juxtaposes the operands). For example, the equation `a < b < c` is not valid R code; the correct way to typeset the equation would be `expression({a < b} * {phantom() < c})`.
3. **It is not easily extensible.** As far as I can tell, there are no hooks for introducing new symbols and functions.
4. **The quality of the output is… not ideal.** Compared to a full LaTeX typesetter, the rendered output is merely passable. Limitations include the inability to italicize symbols and Greek letters; to typeset and align multi-line equations; and, in some cases, to properly resize symbols in the presence of equations containing tall elements, resulting in artifacts.
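
The parsing constraint described above can be checked directly in base R. The following is a small sketch (variable names are illustrative) showing that the chained comparison is rejected by the parser, while the workaround is accepted:

```r
# `a < b < c` is rejected by the R parser, so it can never reach plotmath.
# On failure we keep the error message instead of stopping.
chained <- tryCatch(
  parse(text = "a < b < c"),
  error = function(e) conditionMessage(e)
)
is.character(chained)  # TRUE: parsing failed and the message was captured

# The workaround parses: braces group, phantom() stands in for the missing
# left operand, and `*` juxtaposes the two halves.
workaround <- expression({a < b} * {phantom() < c})
is.expression(workaround)  # TRUE
```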

latex2exp tries to address (1)-(3) by providing an easier-to-use interface to plotmath’s capabilities. Because it is just a translational layer on plotmath, it cannot improve on the quality of the typesetting, although in many situations it can produce a higher-quality plotmath representation than a hand-written expression.

## How latex2exp works

latex2exp, via its main function `latex2exp::TeX()`, parses an input
LaTeX string and tries to convert it to a plotmath expression. It does
that by scanning the LaTeX string looking for various types of tokens,
and translating them into the plotmath representation that is visually
closest to the ideal rendering.

For some expressions, the translation is straightforward. For example,

```
TeX(r"(\alpha)")
# is equivalent to
expression(alpha)
```

For others, the translation is not so straightforward, and latex2exp translates the LaTeX string into the closest equivalent, which might be a relatively complicated plotmath expression:

```
TeX(r"($\frac{ih}{2\pi} \frac{d}{dt} \ket{\Psi(t)} = \hat{H}\ket{\Psi(t)}$)")
# is equivalent to
expression(
  frac(ih, 2*pi) * phantom(.) *
  frac(d, dt) * phantom(.) *
  group('|', Psi(t), rangle) ==
  hat(H) * group('|', Psi(t), rangle)
)
```

A common source of bugs for new users of latex2exp is that the backslash
character inside strings is used to start an escape sequence, such as
`\n` (newline), `\t` (tab), or `\unnnn` (a Unicode character). So, a
user attempting to write a TeX string such as `"\Psi"` will be greeted
by the error

```
Error: '\P' is an unrecognized escape in character string starting ""\P"
```

Prior to R 4.0, the only available option was to escape the backslash
character using a double-backslash, e.g. `"\\Psi"`, a surprising first
hurdle for new users of latex2exp.

As of R 4.0, it is possible to use unescaped backslashes in a string, if
the string is marked as a **raw string**, e.g. `r"(\Psi)"`. Raw strings
are written as `r"(...)"` with `...` being any character sequence (see
`?Quotes` for a description of raw strings). I recommend using raw
strings when using the package (unless using R 4.0 or later is not possible).
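
To make the equivalence concrete, here is a short base R check (it assumes R 4.0 or later; the variable names are illustrative) showing that the escaped and raw spellings produce the same string:

```r
# Requires R >= 4.0 for the r"(...)" raw string syntax.
escaped_form <- "\\Psi"    # pre-4.0 style: the backslash is doubled
raw_form     <- r"(\Psi)"  # raw string: the backslash is written once

identical(escaped_form, raw_form)  # TRUE
nchar(raw_form)                    # 4: one backslash followed by "Psi"
```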

At any point, it is possible to obtain a quick preview of the output of
the call to `TeX` by calling `plot()` on the returned value:

```
e <- TeX(r"($\sum_{i=1}^{N} x_i$)")
plot(e, cex=3)
```

Finally, the package website has a filterable table of supported LaTeX
commands and a preview of how they will be rendered. The same table can
also be displayed anytime from the R prompt using the command
`latex2exp_supported()`.

## What’s next?

As far as I can tell, the current version (0.9.3) of latex2exp supports all the LaTeX commands that can be feasibly rendered via plotmath (and a few more are “emulated” using some trickery).

The goal for the 1.x branch of latex2exp is to improve on workflows that are unnecessarily error prone, add new features that are hard or impossible to achieve with plotmath, and extend the range of examples provided in the documentation.

### Specialized annotation geoms for ggplot and ggrepel

I would like to support ggplot more directly. Currently, for ggplot
geoms like `geom_text`, `geom_label`, and `annotate` that take a `label`
aesthetic, the input vector for the label aesthetic is expected to be of
type character, rather than of type expression. In order to plot
formatted text and formulas, the user is expected to pass a character
representation of the plotmath expression and remember to specify the
parameter `parse=TRUE` to force parsing of the expression.

This means that in order to use `TeX` with these functions, it is
necessary to call `TeX(..., output="character")` to force
it to return the character representation of the expression:

```
library(ggplot2)
library(latex2exp)

ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point() +
  annotate("text", x = 4, y = 25,
           label = TeX(r"(\alpha+\beta)", output="character"),
           parse=TRUE)
```

This is unintuitive, error-prone, and unnecessarily verbose.

I propose to add the functions `geom_text_TeX`, `geom_label_TeX`, and
`annotate_TeX`. These functions will forward to the underlying ggplot
functions, and set the appropriate parameters for the user, such that:

```
ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point() +
annotate_TeX(x = 4, y = 25, label = r"(\alpha+\beta)")
```

will automatically parse the TeX input, turn it into an expression, and
forward it to `annotate` with `parse=TRUE`.
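
Until such functions exist, a thin wrapper along these lines may be enough for simple cases. This is only a sketch of the idea, not the package's implementation: the name `annotate_TeX` is borrowed from the proposal above, the body is my assumption of how the forwarding could work, and it requires the ggplot2 and latex2exp packages to be installed.

```r
library(ggplot2)
library(latex2exp)

# Hypothetical helper: convert the LaTeX label to the character form of a
# plotmath expression, then let annotate() parse it back at drawing time.
# All other arguments (x, y, colour, ...) are forwarded unchanged.
annotate_TeX <- function(label, ...) {
  annotate("text",
           label = TeX(label, output = "character"),
           parse = TRUE,
           ...)
}

p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point() +
  annotate_TeX(x = 4, y = 25, label = r"(\alpha+\beta)")
```

Printing `p` draws the scatter plot with the annotation. The wrapper only covers the text geom; label geoms would need their own variant.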

### Multi-line equations

There is currently no way to introduce a line break in an equation.

I can see two possibilities for achieving this in a future 1.x version:

- Break the expression into separate labels, each containing one line of the equation, and compute positioning coordinates that achieve the correct layout and alignment;
- Or, exploiting an undocumented (as far as I know) behavior of plotmath, it appears to be possible to make liberal use of the `atop()` plotmath function to stack multiple equations on top of each other, wrapped in a call to `displaystyle()` to ensure each line is displayed with the correct font size.

The latter produces a passable-in-a-pinch output where each line is center-aligned.
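
A minimal base R sketch of the `atop()` approach follows. The two equations are made up for illustration, and the size wrapper used is plotmath's `displaystyle()`; the assumption is that `atop()` alone would draw its arguments at a reduced size, as happens for the parts of a fraction.

```r
# Stack two equations with atop(); displaystyle() asks plotmath to draw
# each line at full size rather than at the reduced size used inside
# fraction-like constructs.
two_lines <- expression(
  atop(
    displaystyle(y == a * x + b),
    displaystyle(z == c * x^2 + d)
  )
)

# Draw the stacked equations in the middle of an empty plot.
plot.new()
text(0.5, 0.5, two_lines, cex = 1.5)
```

More than two lines would require nesting further `atop()` calls, which is where this trick starts to become unwieldy.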

### Gallery of examples

I am accumulating a few examples of usages of latex2exp – both from my own work, and in the wild – that can be used to showcase what the package can do for publication-quality plots.