Reproducible, dynamic, and elegant books with Quarto

Mine Çetinkaya-Rundel

Duke University + Posit, PBC

“Making” books

The books

Cover of the OpenIntro textbook Introduction to Modern Statistics, 2nd Edition.

Cover of the book R for Data Science, 2nd Edition.

Mockup of cover of the book Quarto - The Definitive Guide.

Cover of the OpenIntro textbook Introduction to Modern Statistics, 2nd Edition.

multiple outputs

accessibility checks

Two outputs

HTML

Screenshot of introduction to Chapter 1 of Introduction to Modern Statistics in HTML in light mode.

PDF

Screenshot of introduction to Chapter 1 of Introduction to Modern Statistics in PDF.

From one source

data-hello.qmd

::: {.chapterintro data-latex=""}
Scientists seek to answer questions using rigorous methods and careful observations.
These observations -- collected from the likes of field notes, surveys, and experiments -- form the backbone of a statistical investigation and are called **data**.
Statistics is the study of how best to collect, analyze, and draw conclusions from data.
In this first chapter, we focus on both the properties of data and on the collection of data.
:::
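
To make the one-source idea concrete, here is a minimal Python sketch of how a single fenced div could map onto the two targets. It is not how Pandoc or Quarto implement this; `render_div` is a hypothetical helper, and it assumes the div's class name is reused both as the CSS class and as the LaTeX environment name, which matches the `chapterintro` names used in the style files in this talk.

```python
def render_div(css_class: str, body: str, target: str) -> str:
    """Wrap `body` for one output target (illustrative only)."""
    if target == "html":
        # HTML: the class is the hook that an SCSS rule like `.chapterintro` styles.
        return f'<div class="{css_class}">\n{body}\n</div>'
    if target == "latex":
        # LaTeX: the same name selects an environment defined in a .tex file.
        return f"\\begin{{{css_class}}}\n{body}\n\\end{{{css_class}}}"
    raise ValueError(f"unknown target: {target!r}")


text = "Scientists seek to answer questions using rigorous methods."
print(render_div("chapterintro", text, "html"))
print(render_div("chapterintro", text, "latex"))
```

The point of the sketch is that the source only names the block; all visual decisions live in the per-format style files.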

With the help of meticulous styling

With SCSS for HTML:

ims-style.scss

.chapterintro {
  padding: 1em 1em 1em 4em;
  margin-bottom: 10px;
  background: #d5e6ef 5px center/3em no-repeat;
  border-top: 3px solid #569BBD;
  border-bottom: 3px solid #569BBD;
  background-image: url("images/_icons/chapterintro.png");
  background-position: 0.5em 1.5em;
}

With the help of meticulous styling

With TeX for PDF:

ims-style.tex

\newenvironment{mdframedwithfootChapterintro}
{   
    \savenotes
    \begin{mdframed}[%
    topline=true, bottomline=true, linecolor=oiB, linewidth=1.4pt,
    rightline=false, leftline=false,
    backgroundcolor=oiLB]
    \renewcommand{\thempfootnote}{\arabic{footnote}}
    }
{
    \end{mdframed}
    \spewnotes
}

\newenvironment{chapterintro}{
    \vspace{4mm}
    \begin{mdframedwithfootChapterintro}
    \begin{minipage}[t]{0.10\textwidth}
    {$\:$ \\ \setkeys{Gin}{width=2.5em,keepaspectratio}\includegraphics{images/_icons/chapterintro.png}}
    \end{minipage}
    \hfill
    \begin{minipage}[t]{0.90\textwidth}
    \setlength{\parskip}{1em}
    \large
    }{\end{minipage}
    \end{mdframedwithfootChapterintro}
    \vspace{4mm}
}

`_quarto.yml`

_quarto.yml

format:
  html:
    theme:
      light: [cosmo, scss/ims-style.scss]
      dark: [cosmo, scss/ims-style-dark.scss]
    code-link: true
    mainfont: Atkinson Hyperlegible
    monofont: Source Code Pro
    author-meta: "Mine Çetinkaya-Rundel and Johanna Hardin"
    lightbox: 
      match: auto
      loop: false
    fig-dpi: 300
    fig-show: hold
    fig-align: center
  pdf:
    include-in-header: latex/ims-style.tex
    include-after-body: latex/after-body.tex
    documentclass: book
    classoption: 
      - 10pt
      - openany
    pdf-engine: xelatex
    biblio-style: apalike
    keep-tex: true
    block-headings: false
    top-level-division: chapter
    fig-dpi: 300
    fig-show: hold
    fig-pos: H
    tbl-pos: H
    fig-align: center
    toc: true
    toc-depth: 2

Three outputs

HTML - Light

Screenshot of introduction to Chapter 1 of Introduction to Modern Statistics in HTML in light mode.

HTML - Dark

Screenshot of introduction to Chapter 1 of Introduction to Modern Statistics in HTML in dark mode.

PDF

Screenshot of introduction to Chapter 1 of Introduction to Modern Statistics in PDF.

With even more meticulous styling

Screenshot of introduction to Chapter 1 of Introduction to Modern Statistics in HTML in dark mode.

ims-style-dark.scss

$body-bg: #222;

.chapterintro {
  padding: 1em 1em 1em 4em;
  margin-bottom: 10px;
  background: lighten($body-bg, 10%) 5px center/3em no-repeat;
  border-top: 3px solid #569BBD;
  border-bottom: 3px solid #569BBD;
  background-image: url("images/_icons/chapterintro.png");
  background-position: 0.5em 1.5em;
}
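
For a feel of what `lighten($body-bg, 10%)` does to the dark background, here is a rough Python sketch. It assumes the usual Sass behavior of raising the HSL lightness by the given number of percentage points; the `lighten` function below is a hypothetical re-implementation for illustration, and rounding may differ slightly from a real Sass compiler.

```python
import colorsys


def lighten(hex_color: str, amount: float) -> str:
    """Raise the HSL lightness of `hex_color` by `amount` (0.10 means 10 points)."""
    h = hex_color.lstrip("#")
    if len(h) == 3:  # expand shorthand such as #222 to #222222
        h = "".join(ch * 2 for ch in h)
    r, g, b = (int(h[i:i + 2], 16) / 255 for i in (0, 2, 4))
    hue, light, sat = colorsys.rgb_to_hls(r, g, b)
    light = min(1.0, light + amount)  # clamp at pure white
    r, g, b = colorsys.hls_to_rgb(hue, light, sat)
    return "#{:02x}{:02x}{:02x}".format(*(round(c * 255) for c in (r, g, b)))


print(lighten("#222", 0.10))  # a slightly lighter gray than #222222
```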

Unfortunately, it’s not all magic…

Painstakingly add `\clearpage` commands that qmd \(\rightarrow\) LaTeX will process and qmd \(\rightarrow\) HTML will ignore:

data-hello.qmd

These two summary statistics are useful in looking for differences in the groups, and we are in for a surprise: an additional 8% of patients in the treatment group had a stroke!
This is important for two reasons.
First, it is contrary to what doctors expected, which was that stents would *reduce* the rate of strokes.
Second, it leads to a statistical question: do the data show a "real" difference between the groups?

\clearpage

This second question is subtle.
Suppose you flip a coin 100 times.
While the chance a coin lands heads in any given coin flip is 50%, we probably won't observe exactly 50 heads.
This type of variation is part of almost any type of data generating process.

Unfortunately, it’s not all magic…

and another…

data-hello.qmd

To answer these questions, data must be collected, such as the `county` dataset shown in @tbl-county-df.
Examining \index{summary statistic}**summary statistics** can provide numerical insights about the specifics of each of these questions.
Alternatively, graphs can be used to visually explore the data, potentially providing more insight than a summary statistic.

\clearpage

\index{scatterplot}**Scatterplots** are one type of graph used to study the relationship between two numerical variables.
@fig-county-multi-unit-homeownership displays the relationship between the variables `homeownership` and `multi_unit`, which is the percent of housing units that are in multi-unit structures (e.g., apartments, condos).
Each point on the plot represents a single county.

Unfortunately, it’s not all magic…

and another…

data-hello.qmd


\clearpage

## Exercises {#sec-chp1-exercises}

Answers to odd-numbered exercises can be found in [Appendix -@sec-exercise-solutions-01].

Bring back the magic

By building on things that qmd \(\rightarrow\) HTML will happily ignore and qmd \(\rightarrow\) LaTeX will process: `\index{}`

data-hello.qmd

We can compute summary statistics from the table to give us a 
better idea of how the impact of the stent treatment differed 
between the two groups.
A **summary statistic** is a single number summarizing data 
from a sample.\index{summary statistic}
For instance, the primary results of the study after 1 year 
could be described by two summary statistics: the proportion 
of people who had a stroke in the treatment and control groups.
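
Because the `\index{}` tags are invisible in HTML, it can help to preview which terms a chapter will contribute to the PDF index. The following is a minimal sketch in Python, not part of the book's tooling: `index_terms` is a hypothetical helper, and the regular expression assumes simple tags without nested braces or sub-entry syntax.

```python
import re

# Matches \index{term} where the term contains no braces.
INDEX_TAG = re.compile(r"\\index\{([^{}]*)\}")


def index_terms(qmd_text: str) -> list[str]:
    """Return the distinct \\index{...} terms in `qmd_text`, sorted alphabetically."""
    return sorted(set(INDEX_TAG.findall(qmd_text)), key=str.lower)


sample = (
    "A **summary statistic** is a single number summarizing data "
    "from a sample.\\index{summary statistic}\n"
    "\\index{scatterplot}**Scatterplots** are one type of graph."
)
print(index_terms(sample))  # ['scatterplot', 'summary statistic']
```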

In three components

\index{} tags:

data-hello.qmd

We can compute summary statistics from the table to give us a better idea of how the impact of the stent treatment differed between the two groups.
A **summary statistic** is a single number summarizing data from a sample.\index{summary statistic}
For instance, the primary results of the study after 1 year could be described by two summary statistics: the proportion of people who had a stroke in the treatment and control groups.

A .tex file to be appended to the end during render:

after-body.tex

\backmatter
\printindex

Including that file with _quarto.yml:

_quarto.yml

format:
  html:
    ...
  pdf:
    include-in-header: latex/ims-style.tex
    include-after-body: latex/after-body.tex
    ...

Looking forward to `typst` for styling

TODAY: one source + 2 style files \(\rightarrow\) 2 outputs

FUTURE: one source + 1 style file \(\rightarrow\) 2 outputs

Looking forward to `typst` for tables

TODAY

data-hello.qmd

county |>
  select(name, state, pop2017, pop_change, unemployment_rate, median_edu) |>
  slice_head(n = 6) |>
  kableExtra::kbl(
    linesep = "", 
    booktabs = TRUE,
    format.args = list(big.mark = ",")
  ) |>
  kableExtra::kable_styling(
    bootstrap_options = c("striped", "condensed"),
    latex_options = c("striped")
  )

FUTURE

data-hello.qmd

county |>
  select(name, state, pop2017, pop_change, unemployment_rate, median_edu) |>
  slice_head(n = 6) |>
  gt::gt()

Accessibility: `fig-alt`

data-hello.qmd

#| label: fig-county-multi-unit-homeownership
#| ...
#| fig-alt: A scatterplot of homeownership (on the y-axis) versus the percent of
#|   housing units that are in multi-unit structures (on the x-axis) for US
#|   counties. The observation from Chattahoochee County, Georgia
#|   is highlighted as having a multi-unit rate of 39.4% and a
#|   homeownership rate of 31.3%.
ggplot(county, aes(x = multi_unit, y = homeownership)) +
  geom_point(alpha = 0.3, fill = IMSCOL["black", "full"], shape = 21) +
  ...

A scatterplot of homeownership (on the y-axis) versus the percent of housing units that are in multi-unit structures (on the x-axis) for US counties. The observation from Chattahoochee County, Georgia is highlighted as having a multi-unit rate of 39.4% and a homeownership rate of 31.3%.

Do all my figures have `fig-alt`s?

Results for searching for ggplot keyword in the GitHub interface in the repo for Introduction to Modern Statistics. Search finds 46 files contain this text.

Do all my figures have `fig-alt`s?

Results for searching for ggplot keyword in Positron in the folder for Introduction to Modern Statistics. Search finds 44 files contain this text and there are over 400 mentions of it across these files.

Checking for missing `fig-alt`s

Load packages:

# pak::pak("rundel/parsermd") # need dev version
library(parsermd)
library(here)

Find cells that have ggplot() but not fig-alt:

missing_fig_alt <- here::here("fig-alt-check/data-hello.qmd") |>
  parse_qmd() |> 
  rmd_select(
    has_type("rmd_chunk") & 
    has_code("ggplot\\(") & 
    !has_option("fig-alt")
  )

Get labels of cells without fig-alt:

rmd_node_label(missing_fig_alt)

[1] "fig-county-multi-unit-homeownership"

Get contents of cells without fig-alt:

as_document(missing_fig_alt)

 [1] "```{r}"                                                                                     
 [2] "#| label: fig-county-multi-unit-homeownership"                                              
 [3] "#| fig-cap: A scatterplot of homeownership versus the percent of housing units that are"    
 [4] "#|   in multi-unit structures for US counties. The highlighted dot represents Chattahoochee"
 [5] "#|   County, Georgia, which has a multi-unit rate of 39.4% and a homeownership rate of"     
 [6] "#|   31.3%."                                                                                
 [7] "ggplot(county, aes(x = multi_unit, y = homeownership)) +"                                   
 [8] "  geom_point(alpha = 0.3, fill = IMSCOL[\"black\", \"full\"], shape = 21) +"                
 [9] "  labs("                                                                                    
[10] "    x = \"Percent of housing units in that are multi-unit structures\","                    
[11] "    y = \"Homeownership rate\""                                                             
[12] "  ) +"                                                                                      
[13] "  geom_point("                                                                              
[14] "    data = county |> filter(name == \"Chattahoochee County\"),"                             
[15] "    size = 3, stroke = 2, color = IMSCOL[\"red\", \"full\"], shape = 1"                     
[16] "  ) +"                                                                                      
[17] "  geom_text("                                                                               
[18] "    data = county |> filter(name == \"Chattahoochee County\"),"                             
[19] "    label = \"Chattahoochee County\", fontface = \"italic\","                               
[20] "    nudge_x = 21, nudge_y = -5, color = IMSCOL[\"red\", \"full\"]"                          
[21] "  ) +"                                                                                      
[22] "  guides(color = FALSE) +"                                                                  
[23] "  geom_segment("                                                                            
[24] "    data = county |> filter(name == \"Chattahoochee County\"),"                             
[25] "    aes("                                                                                   
[26] "      x = 0, y = homeownership, xend = multi_unit, yend = homeownership,"                   
[27] "      color = IMSCOL[\"red\", \"full\"]"                                                    
[28] "    ), linetype = \"dashed\""                                                               
[29] "  ) +"                                                                                      
[30] "  geom_segment("                                                                            
[31] "    data = county |> filter(name == \"Chattahoochee County\"),"                             
[32] "    aes("                                                                                   
[33] "      x = multi_unit, y = 0, xend = multi_unit, yend = homeownership,"                      
[34] "      color = IMSCOL[\"red\", \"full\"]"                                                    
[35] "    ), linetype = \"dashed\""                                                               
[36] "  ) +"                                                                                      
[37] "  scale_x_continuous(labels = percent_format(scale = 1)) +"                                 
[38] "  scale_y_continuous(labels = percent_format(scale = 1))"                                   
[39] "```"                                                                                        
[40] ""
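
Where the development version of parsermd is not available, a rougher version of the same check can be sketched with the Python standard library. This is an approximation, not a replacement for a real parser: `cells_missing_fig_alt` is a hypothetical helper, and it assumes cells are fenced with three backticks and that options are written as `#|` comments inside the cell.

```python
import re

# An executable cell: an opening fence with the engine in braces, then the body.
CELL = re.compile(r"^```\{(\w+)[^}]*\}\n(.*?)^```\s*$", re.DOTALL | re.MULTILINE)
LABEL = re.compile(r"^#\|\s*label:\s*(\S+)", re.MULTILINE)
FIG_ALT = re.compile(r"^#\|\s*fig-alt:", re.MULTILINE)


def cells_missing_fig_alt(qmd_text: str) -> list[str]:
    """Return labels of cells that call ggplot() but set no fig-alt option."""
    missing = []
    for match in CELL.finditer(qmd_text):
        code = match.group(2)
        if "ggplot(" in code and not FIG_ALT.search(code):
            label = LABEL.search(code)
            missing.append(label.group(1) if label else "<unlabeled>")
    return missing


FENCE = "`" * 3  # built indirectly so this example nests cleanly in a fenced block
sample = "\n".join([
    FENCE + "{r}",
    "#| label: fig-with-alt",
    "#| fig-alt: A scatterplot.",
    "ggplot(county, aes(x = multi_unit, y = homeownership))",
    FENCE,
    "",
    FENCE + "{r}",
    "#| label: fig-without-alt",
    "ggplot(county, aes(x = multi_unit, y = homeownership))",
    FENCE,
    "",
])
print(cells_missing_fig_alt(sample))  # ['fig-without-alt']
```

A check like this could run over every `.qmd` file in the book, for example in a scheduled job, so that a missing `fig-alt` is caught before publishing.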

Cover of the book R for Data Science, 2nd Edition.

leveraging R

GitHub actions

Set global options with `_common.R`

Leverage your R knowledge to achieve consistent output:

_common.R

set.seed(1014)

knitr::opts_chunk$set(
  comment = "#>",
  collapse = TRUE,
  fig.retina = 2,
  fig.width = 6,
  fig.asp = 2/3,
  fig.show = "hold"
)

options(
  dplyr.print_min = 6,
  dplyr.print_max = 6,
  pillar.max_footer_lines = 2,
  pillar.min_chars = 15,
  stringr.view_n = 6,
  cli.num_colors = 0,
  cli.hyperlink = FALSE,
  pillar.bold = TRUE,
  width = 77 # 80 - 3 for #> comment
)

ggplot2::theme_set(ggplot2::theme_gray(12))

Set status with `_common.R`

Use your R function writing skills to avoid duplication:

_common.R

# use results: "asis" when setting a status for a chapter
status <- function(type) {
  status <- switch(type,
    polishing = "should be readable but is currently undergoing final polishing",
    restructuring = "is undergoing heavy restructuring and may be confusing or incomplete",
    drafting = "is currently a dumping ground for ideas, and we don't recommend reading it",
    complete = "is largely complete and just needs final proof reading",
    stop("Invalid `type`", call. = FALSE)
  )

  class <- switch(type,
    polishing = "note",
    restructuring = "important",
    drafting = "important",
    complete = "note"
  )

  cat(paste0(
    "\n",
    ":::: status\n",
    "::: callout-", class, " \n",
    "You are reading the work-in-progress second edition of R for Data Science. ",
    "This chapter ", status, ". ",
    "You can find the complete first edition at <https://r4ds.had.co.nz>.\n",
    ":::\n",
    "::::\n"
  ))
}

Then call it from each chapter:

EDA.qmd

#| results: "asis"
#| echo: false
source("_common.R")
status("complete")
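
The same pattern carries over to chapters that run on a Python engine. Below is a minimal sketch of an equivalent helper; it is a port written for illustration, not code from the R for Data Science repository, and it assumes the cell that prints the result sets `#| output: asis` so the Markdown is passed through unescaped.

```python
STATUS_TEXT = {
    "polishing": "should be readable but is currently undergoing final polishing",
    "restructuring": "is undergoing heavy restructuring and may be confusing or incomplete",
    "drafting": "is currently a dumping ground for ideas, and we don't recommend reading it",
    "complete": "is largely complete and just needs final proof reading",
}
CALLOUT_CLASS = {
    "polishing": "note",
    "restructuring": "important",
    "drafting": "important",
    "complete": "note",
}


def status(kind: str) -> str:
    """Return the Markdown for a chapter-status callout."""
    if kind not in STATUS_TEXT:
        raise ValueError(f"Invalid status type: {kind!r}")
    return (
        "\n:::: status\n"
        f"::: callout-{CALLOUT_CLASS[kind]}\n"
        "You are reading the work-in-progress second edition of R for Data Science. "
        f"This chapter {STATUS_TEXT[kind]}. "
        "You can find the complete first edition at <https://r4ds.had.co.nz>.\n"
        ":::\n"
        "::::\n"
    )


print(status("complete"))
```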

Today’s solution: `announcement`

🔗 quarto.org/docs/websites/website-tools.html#announcement-bar

_quarto.yml

website:
  announcement: 
    icon: cone-striped
    dismissable: true
    content: |
      You are reading the work-in-progress second edition of 
      R for Data Science. This chapter **is currently a dumping 
      ground for ideas, and we don't recommend reading it**. 
      You can find the complete first edition at 
      <https://r4ds.had.co.nz>.
    type: primary
    position: below-navbar

Keeping things in check daily

Screenshot of GitHub Action failure email.

Leveraging GitHub actions

Avoid freeze

Set daily checks

.github/workflows/build_book.yaml

on:
  push:
    branches: main
  pull_request:
    branches: main
  schedule:
    # run every day at 11 PM UTC (GitHub Actions schedules use UTC)
    - cron: '0 23 * * *'
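
The cron string is easy to misread, so here is a small Python sketch that names its five fields. `parse_cron` is a hypothetical helper written for this illustration; it only splits and labels the fields and does not validate ranges or special syntax.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def parse_cron(expr: str) -> dict[str, str]:
    """Label the five whitespace-separated fields of a cron expression."""
    parts = expr.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))


print(parse_cron("0 23 * * *"))
# {'minute': '0', 'hour': '23', 'day_of_month': '*', 'month': '*', 'day_of_week': '*'}
```

Minute 0 of hour 23, on every day of every month: once a day at 23:00. GitHub Actions evaluates schedules in UTC, so the local time of the run depends on your time zone.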

Whenever faced with a problem, some people say “Let’s use regular expressions.” Now, they have two problems.

Whenever faced with a problem, some people say “Let’s use ~~regular expressions~~ GitHub actions.” Now, they have ~~two~~ so many more problems.

Don’t reinvent the wheel!

hadley/r4ds > .github/workflows/build-book.yml

quarto-dev/quarto-actions

Screenshot of GitHub action for rendering and deploying R for Data Science book from its GitHub repo.

Screenshot of GitHub action for rendering a Quarto document from the quarto-actions repo.

Mockup of cover of the book Quarto - The Definitive Guide.

multiple languages

multiple environments

Two languages in one `.qmd`

Each being executed with their own engine:

authoring.qmd

## Code cells

::: panel-tabset
### R

{{< embed notebooks/authoring-r.qmd#plot echo=true >}}

### Python

{{< embed notebooks/authoring-python.qmd#plot echo=true >}}
:::

From two source notebooks

notebooks/authoring-r.qmd

---
title: "Authoring - R"
---

## Markdown text

Hello.

## Code cells

```{r}
#| label: add
1 + 1
```

```{r}
#| label: plot
df <- data.frame(x = 1:8, y = 3:10)
m <- lm(y ~ x, data = df)
plot(df$x, df$y)
abline(m)
```

notebooks/authoring-python.qmd

---
title: Authoring - Python
---

## Markdown text

Hello.

## Code cells

```{python}
#| label: add
1 + 1
```

```{python}
#| label: plot
import matplotlib.pyplot as plt
import numpy as np

xpoints = np.array([1, 8])
ypoints = np.array([3, 10])

plt.plot(xpoints, ypoints)
plt.show()
```
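
Since each embed points at a notebook path and a cell label, a typo in either one will break the embed. The sketch below pulls both pieces out of the shortcodes so they can be checked against the files on disk. `embed_targets` is a hypothetical helper, and the regular expression assumes the `path#label` form shown in this talk.

```python
import re

# Captures the notebook path and the cell label from an embed shortcode.
EMBED = re.compile(r"\{\{<\s*embed\s+([^\s#]+)#([\w-]+)")


def embed_targets(qmd_text: str) -> list[tuple[str, str]]:
    """Return (notebook path, cell label) pairs for each embed shortcode."""
    return EMBED.findall(qmd_text)


sample = (
    "{{< embed notebooks/authoring-r.qmd#plot >}}\n"
    "{{< embed notebooks/authoring-python.qmd#plot >}}\n"
)
for path, label in embed_targets(sample):
    print(path, label)
```

From here, `pathlib.Path(path).exists()` would be one way to confirm that every embedded notebook is actually present in the project.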

Two recognizable outputs on a single page

GIF of going between tabs of output that is the result of the code in the previous slide. One tab contains a plot made with R and the other with Python.

Productivity with `freeze`

_quarto.yml

execute:
  freeze: auto

~~Productivity~~ Safeguarding your sanity with `freeze`

“Making” books, that are not just pretty, but also functional…

`r-wasm/quarto-live`

thank you!

🔗 bit.ly/books-conf24

mine-cetinkaya-rundel/quarto-books-conf24
