Automated reporting with Quarto

Parisa Gregg & Myles Mitchell

Welcome

Wifi

  • TusPark Guest Wifi

Training Environment details

Slides

Us

Jumping Rivers

  • Data science consultancy
    • Python / R, machine learning, dashboards, APIs
  • Data engineering
    • Data pipelines, server health and security, managed Posit (RStudio) services
  • Training
    • Python, R, Git, Tableau + many more
  • Community
    • Conferences/meetups, blogs, open-source

The Plan


  • What is Quarto?
  • Creating a document in VSCode
  • Markdown basics
  • Executable code chunks
  • Graphs & tables

Takeaways

  • At the end you’ll have an example of a reproducible HTML document created with Quarto

  • Slides are available on GitHub

What is Quarto?

What is Quarto?

  • Tool created by RStudio (Posit)

  • Create documents that combine code with text

  • Multiple languages and output formats

  • Evolution of R Markdown

  • File extension .qmd

What is Quarto?

  • Publishing system built on Pandoc

Where to find out more

Why use Quarto

  • Create dynamic content
  • Publish beautifully formatted articles, reports, presentations, websites, blogs and books
  • Author with scientific markdown
  • Generate reports directly from notebooks

Why use Quarto

  • Multi-language support
    • Python, Julia, R and Observable JS
  • Can be used in:
    • VS Code
    • Jupyter notebooks
    • RStudio IDE
    • Text editors / terminal

Creating a Quarto document

Quarto extension - VS Code

  • Settings > Extensions

  • Search “quarto” in extensions search bar

  • Click the Quarto extension

  • Click “Install in neds-quarto…”

Python extension

  • Settings > Extensions

  • Search “python”

  • Click “Python” extension (make sure it is the ms-python one)

Set Python interpreter

  • Settings > Command Palette

  • Search “Python: Select interpreter”

  • Select “Enter interpreter path…” and type “/opt/python/3.11.3/bin/python”

Install dependencies

python3 -m pip install jupyter pandas plotly tabulate
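
To confirm the install worked, you could check that each package is visible to the selected interpreter. This is a minimal sketch: `check_dependencies` is a hypothetical helper (not part of Quarto), and it assumes each package's import name matches its pip name.

```python
from importlib.util import find_spec

# Packages installed by the pip command above
# (assumption: import names match the pip names)
REQUIRED = ["jupyter", "pandas", "plotly", "tabulate"]


def check_dependencies(names):
    """Return a dict mapping each package name to True if it can be imported."""
    return {name: find_spec(name) is not None for name in names}


for name, found in check_dependencies(REQUIRED).items():
    print(f"{name}: {'ok' if found else 'MISSING'}")
```

Run it with the same interpreter you selected in VS Code, otherwise the result may not reflect the environment Quarto uses.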

Creating a new document - VS Code

  • File > New File… > Quarto Document (qmd)

  • Set title and output format

  • Click Preview (or type Ctrl+Shift+K)

Creating a new document - VS Code

VS Code screenshot

Rendering with the command line

  • Terminal > New Terminal

  • Preview your document:

quarto preview my_doc.qmd

  • Render your document:

quarto render my_doc.qmd
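
The same commands can be driven from a Python script, which is handy when a report needs to be rebuilt on a schedule. A sketch only: `quarto_command` is a hypothetical helper, and the script assumes the `quarto` CLI is on your PATH and that `my_doc.qmd` sits in the working directory.

```python
import os
import shutil
import subprocess


def quarto_command(action, path, to=None):
    """Build the argument list for `quarto preview` or `quarto render`."""
    if action not in ("preview", "render"):
        raise ValueError(f"unsupported action: {action}")
    cmd = ["quarto", action, path]
    if to is not None:
        cmd += ["--to", to]  # output format, e.g. html, pdf or docx
    return cmd


cmd = quarto_command("render", "my_doc.qmd", to="html")
print(" ".join(cmd))

# Only run the command if the quarto CLI and the document are both present
if shutil.which("quarto") and os.path.exists("my_doc.qmd"):
    subprocess.run(cmd, check=True)
```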

YAML header

---
title: "A very cool title"
author: "Me"
date: 2023-09-28
format: html
jupyter: python3
---
  • Output formats include html, pdf and docx
  • Use the jupyter option to select the Jupyter kernel
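
If you generate many reports, the header can be written from Python instead of by hand. A minimal sketch, assuming simple string values with no embedded quotes; `make_qmd` is a hypothetical helper, and the file is written to a temporary directory purely for illustration.

```python
import tempfile
from pathlib import Path


def make_qmd(title, author, fmt="html", kernel="python3", body=""):
    """Return the text of a .qmd file: a YAML header followed by the body."""
    header = "\n".join([
        "---",
        f'title: "{title}"',
        f'author: "{author}"',
        f"format: {fmt}",
        f"jupyter: {kernel}",
        "---",
    ])
    return f"{header}\n\n{body}\n"


text = make_qmd("A very cool title", "Me", body="Hello, Quarto!")

# Write the document somewhere disposable
out_path = Path(tempfile.mkdtemp()) / "my_doc.qmd"
out_path.write_text(text, encoding="utf-8")
print(text)
```

For anything beyond simple strings, a YAML library would be a safer way to build the header than string formatting.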

Note

YAML: originally “Yet Another Markup Language”, now officially “YAML Ain’t Markup Language”

Task 1: Your first Quarto document! (5 mins)

  1. File > New File… > Quarto Document (qmd)

  2. Add a YAML declaring a title, author and HTML output

---
title: "A very cool title"
author: "Me"
format: html
---
  3. Click Render (or type Ctrl+Shift+K)

Markdown basics

Including content

  • Text
    • Font
    • Lists
    • Headings
  • Links
  • Images
  • Code
  • Embedded tables and plots
  • Equations
  • References

Quarto documentation

Font

| Markdown          | Output        |
|-------------------|---------------|
| **bold**          | bold          |
| __bold__          | bold          |
| *italic*          | italic        |
| _italic_          | italic        |
| ~~strikethrough~~ | strikethrough |
| ^superscript^     | superscript   |
| ~subscript~       | subscript     |

Bullet points (use -, + or *)

- Banana
- Apple
    - Pink lady
    + Royal gala
    * Granny Smith
- Pear

Numbered lists

1. Banana
1. Apple
    - Pink lady
    - Royal gala
    - Granny Smith
1. Pear

Headings

# Heading level 1
## Heading level 2
### Heading level 3
#### Heading level 4
##### Heading level 5
###### Heading level 6

Task 2: Adelie Penguins 🐧 (10 mins)

  1. Add the text from task02.txt to your Quarto doc

  2. Match the formatting (italics, bold, links) of the first sentence in the Adelie Penguin wiki

  3. Add an image of the Adelie Penguin (there’s one in the exercises folder)

  4. Can you add the penguin emoji to your text?

Including code

Code chunks

  • You can evaluate code!
  • Not limited in what code you can run
  • We could load data…
```{python}
import pandas as pd

mario_kart = pd.read_csv(
    "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-05-25/records.csv"
)
```

Code chunks

  • Can have as many chunks as you like
```{python}
mario_kart.head(2)
```
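|   | track         | type      | shortcut | player | system_played | date       | time_period | time   | record_duration |
|---|---------------|-----------|----------|--------|---------------|------------|-------------|--------|-----------------|
| 0 | Luigi Raceway | Three Lap | No       | Salam  | NTSC          | 1997-02-15 | 2M 12.99S   | 132.99 | 1               |
| 1 | Luigi Raceway | Three Lap | No       | Booth  | NTSC          | 1997-02-16 | 2M 9.99S    | 129.99 | 0               |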

Code chunks

  • Order matters!
x
NameError: name 'x' is not defined
x = 5
x
5
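
Chunks share a single session and run from top to bottom, which is why the first `x` fails and the second succeeds. The sketch below is a toy model of that behaviour, not how Quarto or Jupyter is implemented: a plain dict stands in for the kernel's state, and `run_chunk` is a hypothetical helper.

```python
session = {}  # stands in for the state shared by all chunks


def run_chunk(code, session):
    """Evaluate one expression in the shared session; report NameError as text."""
    try:
        return eval(code, session)
    except NameError as err:
        return f"NameError: {err}"


first = run_chunk("x", session)   # x has not been defined yet
exec("x = 5", session)            # a later chunk defines x
second = run_chunk("x", session)  # x is now available to every later chunk

print(first)
print(second)
```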

Chunk options

  • Control properties of the code within the chunk and its outputs

  • Controlled using YAML within the code chunks

  • Loads of options

Chunk options

| Option  | Purpose                                                       | Default value |
|---------|---------------------------------------------------------------|---------------|
| echo    | Show/hide code chunks in the output                           | true          |
| eval    | Whether to evaluate code within the chunk                     | true          |
| warning | Show/hide messages/warnings produced by code in the output    | true          |
| error   | Allow the code to error but show the error in the output      | false         |
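
Several options can be stacked at the top of one chunk. The sketch below shows the body of a hypothetical chunk. Outside Quarto the `#|` lines are ordinary Python comments, so the code also runs as a plain script. The two lap times are taken from the Mario Kart rows shown earlier.

```python
#| echo: false
#| warning: false
import warnings

# With `warning: false`, Quarto would leave this message out of the rendered output
warnings.warn("times are in seconds")

lap_times = [132.99, 129.99]
best = min(lap_times)
print(best)
```

Options must come first in the chunk, before any code, for Quarto to pick them up.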

Chunk options: echo

Use to show / hide code

#| echo: false
mario_kart.head(2)

produces:

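|   | track         | type      | shortcut | player | system_played | date       | time_period | time   | record_duration |
|---|---------------|-----------|----------|--------|---------------|------------|-------------|--------|-----------------|
| 0 | Luigi Raceway | Three Lap | No       | Salam  | NTSC          | 1997-02-15 | 2M 12.99S   | 132.99 | 1               |
| 1 | Luigi Raceway | Three Lap | No       | Booth  | NTSC          | 1997-02-16 | 2M 9.99S    | 129.99 | 0               |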

Code collapsing

#| echo: fenced
#| code-fold: true
Code
```{python}
#| code-fold: true
import numpy as np
import pandas as pd
```

Task 3: Running some code (10 mins)

  1. Open the document task03.qmd

  2. Under the analysis subheading, add a code chunk to import Pandas

  3. Hide the code chunk with #| echo: false

  4. Add another code chunk to read in the penguins data (link in doc) and display it i.e.

    penguins = pd.read_csv("link_to_penguin_data")
  5. Make the data loading chunk a collapsed code chunk with #| echo: fenced and #| code-fold: true

Graphs & tables

Graphs

```{python}
#| eval: false
import plotly.express as px

# Filter the data
rainbow_road = mario_kart.loc[
        (mario_kart["track"] == "Rainbow Road") &
        (mario_kart["type"] == "Three Lap")
].reset_index()

# Plot the data
px.line(
    rainbow_road,
    x="date",
    y="time",
    color="shortcut",
    title="Progress of Rainbow Road N64 World Records",
    line_shape="hv",
    markers=True
)
```

Graphs

[Figure: line plot “Progress of Rainbow Road N64 World Records”, showing time against date, coloured by shortcut]

Tables

  • Markdown syntax
| fruit  | count  | color  |
|--------|--------|--------|
| banana | 5      | yellow |
| apple  | 6      | red    |
| pear   | 2      | green  |
fruit    count   color
banana   5       yellow
apple    6       red
pear     2       green

Tables

  • Convert Python data to Markdown
```{python}
#| eval: false
#| tbl-cap: "Table of fruits"
from IPython.display import Markdown
from tabulate import tabulate

table = [
    ["banana", 5, "yellow"],
    ["apple", 6, "red"],
    ["pear", 2, "green"],
]
Markdown(
    tabulate(table, headers=["fruit", "count", "color"])
)
```
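
To see what kind of text ends up in the document, here is a hand-rolled sketch that builds a Markdown pipe table from the same rows using only the standard library. It is not how tabulate is implemented, and `to_pipe_table` is a hypothetical helper; in practice tabulate is the more robust choice.

```python
def to_pipe_table(rows, headers):
    """Format rows as a Markdown pipe table with padded columns."""
    cells = [list(map(str, headers))] + [list(map(str, row)) for row in rows]
    # Each column is as wide as its longest cell
    widths = [max(len(r[i]) for r in cells) for i in range(len(headers))]

    def fmt(row):
        return "| " + " | ".join(c.ljust(w) for c, w in zip(row, widths)) + " |"

    separator = "|" + "|".join("-" * (w + 2) for w in widths) + "|"
    return "\n".join([fmt(cells[0]), separator] + [fmt(r) for r in cells[1:]])


table = [
    ["banana", 5, "yellow"],
    ["apple", 6, "red"],
    ["pear", 2, "green"],
]
markdown_table = to_pipe_table(table, headers=["fruit", "count", "color"])
print(markdown_table)
```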

Tables

Table of fruits
fruit    count   color
banana   5       yellow
apple    6       red
pear     2       green

Task 4: Graphs and tables (10 mins)

  1. Open the document task04.qmd

  2. Add a plot showing the distribution of bill length for each sex and species. Use the plotting code below:

px.histogram(
    penguins,
    x="bill_length_mm",
    color="sex",
    facet_row="species",
)
  3. Add a caption to the plot using the fig-cap code chunk option.