Manuscripts

Reproducible publishing with Quarto

Dr. Mine Çetinkaya-Rundel

Duke University

Posit, PBC

Dr. Charlotte Wickham

Posit, PBC

2024-08-04

Full complexity spectrum of reproducible scientific projects

Simplest

Can run all code in a single file, and don’t mind running it over and over again with each edit.



e.g. Data Science 101 - HW 1, Stat 101 - Final project, a blog post, a tutorial, a not-too-extensive consulting report, etc.

Simple

Can run all code in a single file, and don’t mind running it over and over again with each edit, and need an output that conforms to journal style.



or

formatted with journal style

e.g., a not-too-computational journal article.

but science is rarely simple…

  • multiple collaborators, each with their favorite computing language and code editor
  • multiple stages of a project, each with their own level of feasibility of what can be re-run with each edit and what needs to be cached
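One way Quarto addresses the "what can be re-run and what needs to be cached" question is through its execution options. The fragment below is a minimal sketch, assuming a project-level `_quarto.yml`; whether `freeze`, `cache`, or both suit a given project depends on the engine and on how expensive each stage is.

```yaml
# _quarto.yml (sketch; adjust to your project)
execute:
  freeze: auto   # during a project render, re-execute a document only if its source changed
  cache: true    # cache results of individual cells (knitr cache or Jupyter Cache)
```

With `freeze: auto`, collaborators can render the whole project without re-running stages they have not touched.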

More complex



Even more complex



Leveraging Quarto for fully reproducible scientific manuscripts

Aside: What is in a notebook?

A notebook is a document that contains both code and narrative:

  • Jupyter notebooks (.ipynb)
  • Quarto documents (.qmd) – a potential mindshift

Current state of affairs

Most computational science is born in notebooks

  • Peer-review and publication workflows don’t support notebooks as research outputs
  • The more complex scenarios involve a lot of manual finagling to bring the project to journal submission stage
  • Reproducibility is often lost during this process, or takes a back seat to the formatting requirements
  • Final submission rarely captures all computations, which are, at best, relegated to supplementary materials

and ends in PDF or Word documents

Roadmap to fully reproducible scientific manuscripts

that are not just PDFs that are the outputs of a single qmd file

An end-to-end scholarly publishing workflow that treats Jupyter and Quarto notebooks as a primary element of the scientific record.

A publication process that elevates transparent and reproducible work by authors, where data and software, together with narrative, are documented, shared, and archived.

New forms of credit to the wider research community, including research software engineers.

Journal articles

aka “Yes, you can write a JASA article with Quarto”

quarto use template quarto-journals/jasa
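After running the command above, the generated article selects the journal format in its YAML front matter. The fragment below is a sketch from memory of what the JASA template sets up; the exact option names (in particular `journal: blinded`) are an assumption and should be checked against the `template.qmd` that the command creates.

```yaml
# Front matter of the article (sketch; the title is a placeholder)
title: "Title of the article"
format:
  jasa-pdf:
    keep-tex: true      # keep the intermediate .tex file, often needed for submission
    journal:
      blinded: false    # assumed option: toggles author blinding for review
  jasa-html: default
bibliography: bibliography.bib   # hypothetical file name
```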

A tour of journal articles with Quarto

Sit back and enjoy the first part,
your turn activity coming soon!

  • Create a journal article with JASA template
  • Add executable cells
  • Review cross referencing
  • Add a citation from a DOI

Warning

The project will be available to you to continue at the end of the tour.

Your turn

Option 1: Start the project 4-articles.

Option 2: Launch the project in 4-articles, then go to Terminal and run quarto use template quarto-journals/jasa.


  • Add another cross-referenceable code cell and cross-reference it.
  • Add another citation from a DOI.

Learn more

Journal articles with type: manuscript

Quarto manuscript

Quarto manuscripts, in addition to doing everything you can do with journal articles, can

  • produce manuscripts in multiple formats (including LaTeX or MS Word formats required by journals), and give readers easy access to all of the formats through a website

  • publish computations from one or more notebooks alongside the manuscript, allowing readers to dive into your code and view it or interact with it in a virtual environment
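The features listed above are switched on by declaring the project as a manuscript. A minimal sketch of the project file, assuming the main article lives in `index.qmd`:

```yaml
# _quarto.yml (minimal sketch)
project:
  type: manuscript

manuscript:
  article: index.qmd   # the document that becomes the manuscript

format:
  html: default
  docx: default
```

Rendering the project then produces a website for the HTML version with links to the other formats.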

Getting started

Manuscripts ♥️ Git + GitHub

Track your project with Git and host on GitHub for easy publishing.
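Publishing from GitHub can be automated with a GitHub Actions workflow. The sketch below assumes the manuscript is published to GitHub Pages from the `gh-pages` branch, that the default branch is `main`, and that the workflow file is named `.github/workflows/publish.yml` (a hypothetical name). If the manuscript executes code, the workflow would also need steps that install R or Python and the required packages, unless frozen results are committed to the repository.

```yaml
# .github/workflows/publish.yml (sketch)
on:
  workflow_dispatch:
  push:
    branches: main

name: Quarto Publish

jobs:
  build-deploy:
    runs-on: ubuntu-latest
    permissions:
      contents: write
    steps:
      - name: Check out repository
        uses: actions/checkout@v4

      - name: Set up Quarto
        uses: quarto-dev/quarto-actions/setup@v2

      - name: Render and publish
        uses: quarto-dev/quarto-actions/publish@v2
        with:
          target: gh-pages
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```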

A finished product

Multiple formats from one source

In _quarto.yml of the project:

---
format:
  html:
    theme: cosmo
    toc-location: left
    comments: 
      hypothesis: true
    citations-hover: true
    crossrefs-hover: true
  agu-pdf: default
  docx: default
  jats: default
---

Rich front matter

In index.qmd of the project:

---
title: La Palma Earthquakes
author:
  - name: Steve Purves
    orcid: 0000-0002-0760-5497
    corresponding: true
    email: steve@curvenote.com
    roles:
      - Investigation
      - Project administration
      - Software
      - Visualization
    affiliations:
      - Curvenote
  - name: Rowan Cockett
    orcid: 0000-0002-7859-8394
    corresponding: false
    roles: []
    affiliations:
      - Curvenote
license: CC BY-SA 4.0
keywords:
  - La Palma
  - Earthquakes
date: '2022-05-11'
abstract: |
  In September 2021, a significant jump in seismic activity on the island of La Palma (Canary Islands, Spain) signaled the start of a volcanic crisis that still continues at the time of writing. Earthquake data is continually collected and published by the Instituto Geográfico Nacional (IGN). We have created an accessible dataset from this and completed preliminary data analysis which shows seismicity originating at two distinct depths, consistent with the model of a two reservoir system feeding the currently very active volcano.
keypoints:
  - You may specify 1 to 3 keypoints for this PDF template
  - These keypoints are complete sentences and less than or equal to 140 characters
  - 'They are specific to this PDF template, so they will not appear in other exports'
citation:
  container-title: Notebooks Now!
draft: false
bibliography: references.bib
echo: false
---

Rich front matter

from source → only relevant / required metadata in manuscript:

Embedded computations

  • Perform computation in a labelled code cell in a notebook, in any language

  • Embed results of the computation with a link to the notebook with

{{< embed name-of-notebook.qmd#fig-cell-label >}}
{{< embed name-of-notebook.ipynb#tbl-cell-label >}}


See example at https://github.com/quarto-ext/manuscript-template-vscode/blob/main/index.qmd.
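Notebooks referenced by an embed are picked up and published alongside the manuscript. To control which notebooks are listed and how they are titled, they can also be declared in the project file. A sketch, with hypothetical notebook paths and titles:

```yaml
# _quarto.yml (sketch; paths and titles are placeholders)
manuscript:
  article: index.qmd
  notebooks:
    - notebook: notebooks/data-screening.ipynb
      title: "Data screening"
    - notebook: notebooks/seismic-plots.qmd
      title: "Seismic activity plots"
```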

What’s next?

Actually dive into the code

  • We’ve seen that you can peruse the code underlying the figures and tables in the manuscript

  • What if you wanted to interact with the code – in a computational environment that’s just a click away and that has all the software and packages needed to reproduce the manuscript?

Binder with Quarto

with quarto use binder:

Learn more

Questions

Any questions / anything you’d like to review before we wrap up this module?

Parting remarks

Learning more

https://quarto.org

Follow up with…

the Quarto Blog: https://quarto.org/docs/blog

Thank you!

🐘 https://fosstodon.org/@minecr

☁️ @minecr.bsky.social

Parting remarks

Quarto CLI…

orchestrates each step of rendering

A schematic representing rendering of Quarto documents from .qmd, to knitr or jupyter, to plain text markdown, then converted by pandoc into any number of output types including html, PDF, or Word document.

Artwork from “Hello, Quarto” keynote by Julia Lowndes and Mine Çetinkaya-Rundel, presented at RStudio Conference 2022. Illustrated by Allison Horst.

🐦 @minebocek