Reproducible data reports with Quarto

Rhian Davies & Nicola Rennie

Getting set up

We’ll get started when the countdown finishes.

05:00

Introduction to Quarto

Quarto lets us write documents that combine code with text. It works with multiple languages and multiple output formats. Let’s explore the Quarto gallery.

What about R Markdown?

R Markdown isn’t going anywhere but…

  • Quarto has better multi-language support
  • More user-friendly
  • Better control of the output layouts

Quarto is not an R package, but you can render from R with the {quarto} package:

library(quarto)
quarto_render("document.qmd")
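A slightly fuller sketch of rendering from a script. It assumes the {quarto} package and the Quarto command line tool are both installed, and it skips the render step if they are not. The temporary directory and the one-line body text are only there to keep the example self-contained.

```r
# Write a minimal .qmd file to a temporary directory so the sketch is self-contained
qmd_path <- file.path(tempdir(), "document.qmd")
writeLines(
  c("---",
    "title: \"A very cool title\"",
    "format: html",
    "---",
    "",
    "Some text about lemurs."),
  qmd_path
)

# Render only if the {quarto} package and the Quarto CLI can both be found
can_render <- requireNamespace("quarto", quietly = TRUE) &&
  !is.null(quarto::quarto_path())

if (can_render) {
  # output_format selects the format to render to
  quarto::quarto_render(qmd_path, output_format = "html")
}
```

The rendered file is written next to the .qmd file, here inside the temporary directory.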

Creating a new Quarto document

  • File > New File > Quarto Document

  • Set title and author

  • Click Create

  • Save and click Render

YAML header

---
title: "A very cool title"
format: html
---

Content

  • Text
  • Links
  • Images
  • Code
  • Embedded tables and plots
  • Equations
  • References

Then click the Render button.

Including images

There are different ways to add images to a Quarto document. The easiest way is to use the Visual editor: click Insert > Figure/Image.

[Image: Brown-collared lemur looking directly at the camera. Image from Duke Lemur Center.]
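Another way, from the Source editor, is an R chunk that calls knitr::include_graphics(), with the alt text and caption given as chunk options. A minimal sketch, assuming the image has been saved at the hypothetical path images/lemur.jpg; in a .qmd file the lines below would sit inside a `{r}` chunk.

```r
#| label: lemur-image
#| echo: false
#| fig-alt: "Brown-collared lemur looking directly at the camera"
#| fig-cap: "Image from Duke Lemur Center"

# "images/lemur.jpg" is a hypothetical path; point it at your own image file.
# error = FALSE avoids an error if the file is missing while you are drafting.
knitr::include_graphics("images/lemur.jpg", error = FALSE)
```

Setting fig-alt here keeps the description with the image, as the plotting chunk later in these slides does.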

Task 1: Your first Quarto document

  • File > New File > Quarto Document

  • Set title and author

  • Click Create

  • Save and click Render

  • Add the text from task1.txt

  • Add a link to the Duke Lemur Center https://lemur.duke.edu/

  • Add an image of the Mongoose Lemur

Time: 15 minutes

Code Chunks

… and lemurs…

Loading data

You can load in your data:

```{r}
#| label: load_data
#| message: false
#| cache: true
lemurs <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-08-24/lemur_data.csv')
```

Chunk Options

  • Hide the code
#| echo: false
  • Show the code
#| echo: true
  • Show the code, including the chunk fence and options
#| echo: fenced
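Several options can be combined at the top of one chunk. A small sketch, using the built-in `cars` data set as a stand-in (it is not part of the lemur data): inside a `{r}` chunk, the `#|` lines below would hide the code and any messages or warnings while still showing the output.

```r
#| label: hidden-summary
#| echo: false
#| message: false
#| warning: false

# Only the printed summary appears in the rendered document, not this code
speed_summary <- summary(cars$speed)
speed_summary
```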

Code collapsing

We can also load some R packages, and perhaps collapse the code…

```{r}
#| label: pkgs
#| message: false
#| code-fold: true
library(dplyr)
library(tidyr)
library(stringr)
library(ggplot2)
library(gghighlight)
```

… do some data wrangling, and print the output as a table…

```{r}
#| label: wrangling
#| output-location: slide
df = lemurs %>% 
  filter(taxon == "ECOL", 
         age_category == "young_adult") %>% 
  select(name, weight_g, sex, age_at_wt_mo) %>% 
  filter(sex %in% c("M", "F")) %>% 
  drop_na()

df %>% 
  head() %>% 
  knitr::kable()
```
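| name   | weight_g | sex | age_at_wt_mo |
|:-------|---------:|:----|-------------:|
| NICOLE |   2170.0 | F   |        31.33 |
| ALAIN  |   2502.0 | M   |        22.98 |
| ALAIN  |   1637.5 | M   |        30.58 |
| ALAIN  |   2243.0 | M   |        37.22 |
| GASTON |   1916.0 | M   |        37.87 |
| GASTON |   2147.5 | M   |        38.30 |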

…or include some exploratory plots!

```{r}
#| label: plots
#| message: false
#| output-location: slide
#| code-overflow: wrap
#| fig-cap: "Age vs weight of Young Adult Collared Brown Lemurs"
#| fig-alt: "Plot of age vs weight of young adult collared brown lemurs, split by male and female showing positive relationship for both"
ggplot(data = df,
       mapping = aes(x = age_at_wt_mo,
                     y = weight_g,
                     colour = sex)) +
  geom_point() +
  geom_smooth(method = "lm") +
  labs(title = "Young Adult Collared Brown Lemurs", 
       x = "Age (months)", 
       y = "Weight (g)") +
  theme_minimal() 
```

[Figure: Age vs weight of Young Adult Collared Brown Lemurs]

Inline code

```{r}
#| label: biggest_lemur
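big_lemur = lemurs %>% 
  # with_ties = FALSE keeps a single row even if several records share the top weight
  slice_max(weight_g, n = 1, with_ties = FALSE)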
```

We can also include code inline, rather than as a separate chunk.

The largest lemur recorded is called `r str_to_title(big_lemur$name)`, who weighed `r format(big_lemur$weight_g, scientific = FALSE)` grams.

When rendered, this reads: The largest lemur recorded is called Sabina, who weighed 10337 grams.
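Inline code is easier to read when the formatting is done in a chunk first. A minimal sketch: the one-row data frame below is a stand-in for `big_lemur`, built from the values in the rendered sentence (the upper-case name is an assumption based on the table shown earlier), and `big_lemur_name` and `big_lemur_weight` are hypothetical helper names.

```r
# Stand-in for the big_lemur object, using the values from the rendered sentence
big_lemur <- data.frame(name = "SABINA", weight_g = 10337)

# Title-case the name with base R (str_to_title() from stringr does the same job)
big_lemur_name <- paste0(
  toupper(substr(big_lemur$name, 1, 1)),
  tolower(substring(big_lemur$name, 2))
)

# Add a thousands separator and avoid scientific notation
big_lemur_weight <- format(big_lemur$weight_g, big.mark = ",", scientific = FALSE)
```

The sentence can then use the shorter `r big_lemur_name` and `r big_lemur_weight`, which here give "Sabina" and "10,337".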

Task 2: Adding code

  • Try to render the document. What are the problems?

  • Add a new code chunk to load the libraries {ggplot2}, {dplyr} and {tidyr}

  • Set the code chunk options for your new chunk to hide the code and messages

  • Add a caption to the plot with the fig-cap code chunk option

  • Add a code chunk to fit a linear model: fit = lm(weight_g ~ age_at_wt_mo, data = df)

  • Use inline code chunks to give the model fit coefficients in the sentence. Hint: fit$coefficients
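As a hint for the last two steps, here is the same pattern on the built-in `cars` data set rather than the lemur data. The names `cars_fit`, `intercept` and `slope` are illustrative and not part of the task files.

```r
# Fit a simple linear model on a built-in data set
cars_fit <- lm(dist ~ speed, data = cars)

# coef() returns a named numeric vector; cars_fit$coefficients holds the same values
intercept <- round(coef(cars_fit)[["(Intercept)"]], 2)
slope <- round(coef(cars_fit)[["speed"]], 2)
```

A sentence such as "Stopping distance increases by `r slope` feet for each extra mph" would then pick up the value when the document is rendered.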

Time: 15 minutes

Equations, journals, referencing and more…

Equations

We can use LaTeX to add equations.

\begin{equation}
\hat{e}_i = Y_i - \hat{Y}_i
\end{equation}

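The residual definition can also be checked numerically in R. A minimal sketch, using the built-in `cars` data as a stand-in for the lemur data (the object names are illustrative): the residuals stored by lm() should match the observed values minus the fitted values.

```r
# Fit a simple linear model on a built-in data set (a stand-in for the lemur data)
model <- lm(dist ~ speed, data = cars)

# Residuals by hand: observed Y_i minus fitted Y-hat_i
manual_resid <- cars$dist - fitted(model)

# Compare with the residuals returned by resid(); TRUE if they agree
residuals_match <- isTRUE(all.equal(unname(manual_resid), unname(resid(model))))
```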

Styling

Correctly formatted journal articles in 30 seconds?

Try the quarto-journals templates, e.g. for a Journal of Statistical Software article:

---
title: "My Document"
format:
  pdf: default
  jss-pdf:
    keep-tex: true   
---

Referencing

Create a reference file:

@article{Andriambeloson2020,
  title={Prolonged torpor in Goodman’s mouse lemurs (Microcebus lehilahytsara) from the high-altitude forest of Tsinjoarivo, central-eastern Madagascar.},
  author={Andriambeloson, J. B. and Greene, L. K. and Blanco, M. B.},
  journal={Folia Primatologica},
  volume={91},
  pages={697–710},
  year={2020}
}

Add a link to your reference file in your .qmd document YAML:

bibliography: references.bib

Then cite the reference using @ followed by its key.

To cite a paper about lemurs [@Andriambeloson2020]

Task 3: Getting paper ready

  • Open task-3.qmd

  • Add in the equation for linear regression, y = mx + c

  • Link to the reference.bib file in your .qmd document YAML header

  • Fix the reference link using an @.

Time: 15 minutes