class: center, middle, inverse, title-slide

# Graphics with ggplot2 📊
## NHS R Conference 2020
### John McIntyre

---
layout: true
<div class="jr-header">
<img class="logo" src="assets/white_logo_full.png"/>
<span class="social"><table><tr><td><img src="assets/twitter.gif"/></td><td> @jumping_uk</td></tr></table></span>
</div>
<div class="jr-footer"><span>© 2021 Jumping Rivers (jumpingrivers.com)</span><div>http://bit.ly/nhs-ggplot2</div></div>

---

# Introduction to {ggplot2}

* http://bit.ly/nhs-ggplot2

.center[
<img src="assets/graphics.jpg" width = 400></img>
]

---
background-image: url(assets/white_logo.png)
class: center, middle, inverse

# Who am I?
---

# Jumping Rivers

.pull-left[
<img src="assets/robot.jpg"></img>
]

.pull-right[
* On-site training
* R and Python consultancy
* Dashboard creation
* Code review
* Questionnaire design
* R package development
* Predictive analytics
* Grant applications
]

---

# Our clients

<div id="clients">
<img src="assets/shell.png"></img><img src="assets/sustrans.png"></img>
<img src="assets/yorkshire.png"></img><img src="assets/hastings.png"></img>
<img src="assets/Pragmatic.png"></img><img src="assets/nhs.png"></img>
<img src="assets/royal_statistical_society.jpg"></img><img src="assets/Francis_Crick_Institute.png"></img>
<img src="assets/Ministry_of_Defence.png"></img><img src="assets/University_of_Manchester.png"></img>
</div>

---

# Introduction

* There are _many_ different ways to make graphs in R
* {ggplot2} started in 2005 and follows the "Grammar of Graphics"
* Many companies have adopted {ggplot2} for graphics, including the [BBC](https://bbc.github.io/rcookbook/) and the [FT](https://twitter.com/jburnmurdoch)
* Think about graphics in terms of [layers](https://evamaerey.github.io/ggplot_flipbook/ggplot_flipbook_xaringan.html)

.center[
<img src="assets/canvas.jpg" width=400></img>
]

---

# The basic plot object

* Load the package with

```r
library("ggplot2")
```

* Create an initial ggplot object, using `ggplot()`
* This function has two main arguments:
    * __data__: this must be a data frame (or tibble)
    * an aesthetic __mapping__: this tells {ggplot2} how to map data to the graphical elements

---

# Setting up the plot

```r
movies = readRDS("data/movies.rds")
```

* The function `aes()` maps our data to the graph
* Here, duration is mapped to the x-axis and rating to the y-axis

```r
g = ggplot(data = movies, mapping = aes(x = duration, y = rating))
```

Notice we can store graphs in variables.

---

# Scatter plots

* To draw the data we need to add a layer, called a `geom`.
* A graph can have multiple `geoms`

```r
h = ggplot(movies, aes(x = duration, y = rating))
h + geom_point()
```

<img src="slides_files/figure-html/unnamed-chunk-4-1.svg" width="70%" style="display: block; margin: auto;" />

---

# Bar charts

* Great for displaying qualitative data.
* The length of each bar represents the frequency

```r
ggplot(movies, aes(x = classification)) +
  geom_bar()
```

<img src="slides_files/figure-html/unnamed-chunk-5-1.svg" width="70%" style="display: block; margin: auto;" />

---
class: center, middle, inverse
background-image: url(assets/white_logo.png)
background-size: cover

# Quick Quiz!

---

# Spot the mistakes

Sarah is trying to recreate the following plot, but her code doesn't work.

<img src="slides_files/figure-html/unnamed-chunk-6-1.svg" width="50%" style="display: block; margin: auto;" />

Spot three errors in her code and think about how you would fix them.

```r
ggplot(data = movies, x = votes, y = rating) %>%
  geom_scatter()
```

---

# Histograms

* Good for plotting continuous variables.
* Data is split into intervals called `classes` (or bins).
* The area of each column represents the frequency in that class.
```r
ggplot(movies, aes(x = duration)) +
  geom_histogram(binwidth = 10)
```

<img src="slides_files/figure-html/unnamed-chunk-8-1.svg" width="70%" style="display: block; margin: auto;" />

---

# Box and whisker plots

* The central bar represents the median
* The top and bottom of the box represent the upper and lower quartiles

```r
boxplot = ggplot(movies, aes(x = classification, y = rating)) +
  geom_boxplot()
```

<img src="slides_files/figure-html/unnamed-chunk-10-1.svg" width="70%" style="display: block; margin: auto;" />

---

# Titles and legends

```r
boxplot +
  labs(x = "Movie classification",
       y = "User rating (1-10)",
       title = "Movie ratings conditional on classification",
       subtitle = "Data collected from the IMDB",
       caption = "These box plots represent etc etc")
```

<img src="slides_files/figure-html/unnamed-chunk-11-1.svg" width="70%" style="display: block; margin: auto;" />

---

# Saving graphs to files

* So how do you save all your fantastic plots?
* RStudio has the *export* button
* It's also possible to export using code (better for reproducibility)

---
layout: true
<div class="jr-header-inverse">
<img class="logo" src="assets/white_logo_full.png"/>
<span class="social"><table><tr><td><img src="assets/twitter.gif"/></td><td> @jumping_uk</td></tr></table></span>
</div>
<div class="jr-footer-inverse"><span>© 2021 Jumping Rivers (jumpingrivers.com)</span><div>http://bit.ly/nhs-ggplot2</div></div>

---
class: center, middle, inverse
background-image: url(assets/white_logo.png)
background-size: cover

# Practical

---
layout: true
<div class="jr-header">
<img class="logo" src="assets/white_logo_full.png"/>
<span class="social"><table><tr><td><img src="assets/twitter.gif"/></td><td> @jumping_uk</td></tr></table></span>
</div>
<div class="jr-footer"><span>© 2021 Jumping Rivers (jumpingrivers.com)</span><div>http://bit.ly/nhs-ggplot2</div></div>

---

# Aesthetics

There are many different types of aesthetics

* Colour/fill 🎨
* Shape 🔺
* Size 🐘
* Linetype ✏️
* Alpha 👻

---

# Colours

```r
ggplot(movies, aes(x = duration, y = rating)) +
* geom_point(aes(colour = duration))
```

<img src="slides_files/figure-html/unnamed-chunk-12-1.svg" width="70%" style="display: block; margin: auto;" />

---

# Colours

```r
ggplot(movies, aes(x = duration, y = rating)) +
* geom_point(aes(colour = classification))
```

<img src="slides_files/figure-html/unnamed-chunk-13-1.svg" width="70%" style="display: block; margin: auto;" />

---

# Colours

```r
ggplot(movies, aes(x = duration, y = rating)) +
* geom_point(aes(colour = "Blue"))
```

<img src="slides_files/figure-html/unnamed-chunk-14-1.svg" width="70%" style="display: block; margin: auto;" />

---

# Colours

```r
ggplot(movies, aes(x = duration, y = rating)) +
* geom_point(colour = "Blue")
```

<img src="slides_files/figure-html/unnamed-chunk-15-1.svg" width="70%" style="display: block; margin: auto;" />

---

# Remember! ⚠️

## Inside aes():

Using columns of the data to style. E.g. `aes(colour = classification)`

## Outside aes():

Fixed styling. E.g. `colour = "blue"`

---

# Shapes

<img src="slides_files/figure-html/shapes-1.svg" width="70%" style="display: block; margin: auto;" />

---

# Change shape for *all* points

```r
ggplot(movies, aes(x = duration, y = rating)) +
* geom_point(shape = 9, size = 3)
```

<img src="slides_files/figure-html/unnamed-chunk-16-1.svg" width="70%" style="display: block; margin: auto;" />

---

# Change shape based on the data

```r
ggplot(movies, aes(x = duration, y = rating)) +
* geom_point(aes(shape = classification), size = 3)
```

<img src="slides_files/figure-html/unnamed-chunk-17-1.svg" width="70%" style="display: block; margin: auto;" />

---
class: center, middle, inverse
background-image: url(assets/white_logo.png)
background-size: cover

# Quick Quiz!
---

# Fill in the blanks

```r
ggplot(movies, aes(x = year, y = rating)) +
  geom_point(aes(___ = classification, size = ___),
             ___ = 0.3)
```

<img src="slides_files/figure-html/unnamed-chunk-19-1.svg" width="70%" style="display: block; margin: auto;" />

---

# Facets - mini plots!

```r
ggplot(movies, aes(x = duration)) +
  geom_histogram(binwidth = 10) +
  facet_wrap(~classification)
```

<img src="slides_files/figure-html/unnamed-chunk-20-1.svg" width="90%" style="display: block; margin: auto;" />

---

# Themes

* Quick and easy way to add style to your plots.
* Eight built-in themes in {ggplot2}

```r
ggplot(bond, aes(Kills, Alcohol_Units)) +
  geom_point(aes(colour = Actor)) +
  theme_bw()
```

<img src="slides_files/figure-html/unnamed-chunk-22-1.svg" width="70%" style="display: block; margin: auto;" />

---

# Other themes: {ggthemes}

```r
library("ggthemes")
g + theme_excel() + scale_colour_excel()
g + theme_minimal() + scale_color_ptol()
g + theme_hc() + scale_colour_hc()
```

<img src="slides_files/figure-html/unnamed-chunk-24-1.svg" width="70%" style="display: block; margin: auto;" />

---

# Other themes: {hrbrthemes}

See https://github.com/hrbrmstr/hrbrthemes

<img src="slides_files/figure-html/unnamed-chunk-26-1.svg" width="100%" style="display: block; margin: auto;" />

---

# Other themes: {tvthemes}

You can even style your plot based on your favourite TV show with [{tvthemes}](https://ryo-n7.github.io/2019-05-16-introducing-tvthemes-package/)

<center>
<img src="assets/simpsons.png" width = 600></img>
</center>

---
class: center, middle, inverse
background-image: url(assets/white_logo.png)
background-size: cover

# Practical

---

# Top troubleshooting tips 💁

* {ggplot2} is the 📦, `ggplot()` is the function
* `colour = "blue"` vs `aes(colour = gender)`
* Round brackets! Geoms are functions. E.g. `geom_point()`
* Remember the `+` between layers
* Are you plotting something sensible (is your data categorical?)
* Try manipulating your data **before** plotting it
* 📊 Just `x`: `geom_bar()`; both `x` and `y`: `geom_col()`

---

# Links & resources 📚

* [R4DS book chapter 3](https://r4ds.had.co.nz/data-visualisation.html)
* [R graphics cookbook](http://www.cookbook-r.com/Graphs/)
* [RStudio {ggplot2} cheatsheet](https://rstudio.com/resources/cheatsheets/)
* [#TidyTuesday](https://github.com/rfordatascience/tidytuesday)
* [Our training courses](https://www.jumpingrivers.com/training/all-courses/)

Share any cool plots you make on Twitter! @jumping_uk #NHSR #NHSRconf2021

<center>
<img src="assets/graphics.jpg" width = 300></img>
</center>
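
---

# Appendix: saving a plot with code

The "Saving graphs to files" slide notes that plots can also be exported with code. One way to do this is `ggsave()` from {ggplot2}; the sketch below is only an illustration. The `scores` data frame, the file name and the plot size are made up for the example (they are not part of the course material), so swap in `movies` and your own path.

```r
library("ggplot2")

# Hypothetical stand-in data so the example runs without data/movies.rds
scores = data.frame(
  duration = c(90, 105, 120, 135, 150),
  rating = c(5.1, 6.3, 6.8, 7.4, 7.0)
)

# Store the plot in a variable, as on the "Setting up the plot" slide
p = ggplot(scores, aes(x = duration, y = rating)) +
  geom_point()

# Write to a temporary directory; ggsave() picks the graphics device
# from the file extension (here PDF). Width and height are in inches.
out_file = file.path(tempdir(), "scatter.pdf")
ggsave(out_file, plot = p, width = 6, height = 4)
```

Passing `plot = p` explicitly avoids relying on the default, which saves the last plot that was displayed. Changing the extension (for example to `.png`) should switch the output format, provided a suitable graphics device is available on your machine.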