Bilingual Materials with Quarto

Dr. Kelly Bodwin

California Polytechnic State University

2023-08-08

How does Quarto make python easier?

Engine Setup

In R Markdown, to use python, you had to set everything up in R using the reticulate package:

library(reticulate)

library(reticulate)
py_config()
use_python("/usr/local/bin/python3")
py_discover_config()

library(reticulate)
py_config()
use_python("/usr/local/bin/python3")
py_discover_config()

## You may need the code below if you are working on your home Windows computer:
py_run_string("import os as os")
py_run_string("os.environ['QT_QPA_PLATFORM_PLUGIN_PATH'] = 'C:/Users/kbodwin/Anaconda3/Library/plugins/platforms'")

library(reticulate)
py_config()
use_python("/usr/local/bin/python3")
py_discover_config()

## You may need the code below if you are working on your home Windows computer:
py_run_string("import os as os")
py_run_string("os.environ['QT_QPA_PLATFORM_PLUGIN_PATH'] = 'C:/Users/kbodwin/Anaconda3/Library/plugins/platforms'")

reticulate::py_install("pandas")
reticulate::py_install("scikit-learn")
reticulate::py_install("scipy")

Engine Setup

Now, it’s all “under the hood”:

---
title: "My document"
jupyter: python3
---

Tricks for interweaving R and python

Choose your install in Global Options

Object passing

  • Automatic variables named py and r

  • Access python objects from R with py$

  • Access R objects from python with r.

Object passing


```{r}
# read and filter data with R
library(palmerpenguins)

adelie <-
  penguins %>%
  filter(species == "Adelie")
```

```{python}
# model with python
LR = LinearRegression()

X = r.adelie[['bill_length_mm', 'bill_depth_mm']]
y = r.adelie['mass_g']

mod = LR.fit(X, y)
coef = mod.coef_
```

```{r}
# summarize with R
py$coef
```

Jupyter

  • Do you like Jupyter notebooks? Me too!

  • Do you like Colab and/or Posit Cloud? Me too!

  • Do you want to make your materials in Quarto but distribute to students in Jupyter form? Me too!

Code
quarto convert class_notes.qmd
quarto convert class_notes.ipynb 

quarto render class_notes.ipynb

Remaining challenges of multilingual docs

  1. Most undergrad programs don’t require multiple languages.

  2. Detecting python installs and packages isn’t perfect.

  3. R and python do not always have the same default implementations for models.

  4. What if what you want is not a multilingual document, but duplicated translated documents?

versions() functions in templar mini-package

What it does

Upon rendering…

  1. Identifies the R chunks from the python chunks

  2. Identifies the labeled text sections for R or python.

  3. Writes new unilingual source files (qmds) in separate folders

  4. (Optionally) Renders those source files

  5. (Optionally) Converts the python one to ipynb.

Example

First chunk

```{r, version = "none"}
templar::versions_quarto_multilingual(global_eval = FALSE,
                                      to_jupyter = TRUE, 
                                      warn_edit = FALSE)
```
  • {version = "none"}: this code itself should not appear in generated files

  • global_eval: do you want to

  • to_jupyter: do you want to convert the python version?

  • warn_edit: put a warning at the top of generated files to not edit directly?

Code chunk versions


The first thing we need to do with any dataset is check for
missing data, and make sure the variables are the right type:

```{r, version = "R"}
df %>% summary()
```

```{python, version = "python"}
df.info()
```

Versioned text


The next step is to clean the data.

%%% version: R

The "Color" variable is being treated as a "character" variable,
i.e., just an ordinary word.

But we know it should be *categorical*, aka a `factor` variable.
Let's `mutate` the dataset to fix this:

```{r, version = "R"}
df <- df %>%
  mutate(
    Color = factor(Color)
  )

df %>% summary()
```

%%%

%%% version: python

The "Color" variable is being treated as an "object" variable,
i.e., just an ordinary word.

But we know it should be *categorical*, aka a `category` variable.
Let's *retype* the variable to fix this:

%%%

```{python, version = "python"}
kelly_df['Gender'] = kelly_df['Gender'].astype('category')
kelly_df.info()
```

Why it’s useful

  • Keep the content creation all in one place - more reproducible!

  • Elements of the document can be shared - e.g. data descriptions, question prompts, css chunks

  • Give you source files to disseminate to students, not just rendered files.

  • (Also lets you do student and solution version, or Exam A and Exam B, etc.)

What’s next for templar and Quarto?

Lua translation

  • Parse the source file text more elegantly

  • Match syntax to Quarto’s

  • (::: instead of %%%)

  • (#| version: R instad of {version = "R})

  • Do you know Lua and knitr? Let’s be friends.

Quarto’s multi-output trick

What it can do now:

---
title: "My Versioned Doc"
format: 
  html:
    output-file: my_versioned_doc.html
  pdf:
    output-file: my_versioned_doc.pdf
---

::: {.content-visible when-format="html"}
Click here to expand!
:::

Quarto’s multi-output trick

What I want:

---
title: "My Versioned Doc"
format: 
  html+R:
    output-file: handout-r.html
  html+python:
    output-file: handout-python.html
---

::: {.content-visible when-variant="R"}
The function `mutate()` will add a new variable.
:::

Resources

Install the templar package:

remotes::install_github("r-for-educators/templar")

Co-author: Ian Flores Siaca

Contribute at www.github.com/r-for-educators/templar

Examples of course materials generated with templar:

https://github.com/Statistical-Learning-with-R

Thank You!

Find me:

Email: kbodwin@calpoly.edu

Website: www.kelly-bodwin.com

GitHub: kbodwin

Mastodon: kellybodwin@mastodon.social

Threads: @kellbod