class: inverse, center, middle
background-image: url("img/ggplot2.png")
background-position: 95% 95%
background-size: 25%

# Data visualization using _ggplot2_

### Maximilian H.K. Hesselbarth

#### University of Michigan (EEB) 2022/10/24

---
background-image: url("img/masterpiece.png")
background-position: 95% 5%
background-size: 25%

# Grammar of Graphics

--

- Based on the Grammar of Graphics

.ref[Wilkinson, L., 2012. The grammar of graphics. In Handbook of computational statistics (pp. 375-414). Springer, Berlin, Heidelberg.]

--

- Elements of all _ggplots_:

  1. **data** Information to visualize

  2. **aesthetic** Mappings of data to plot coordinates (position, color, size, ...)

  3. **geometry** Shapes to represent the data (points, lines, polygons, ...)

.ref[Wickham, H., 2016. ggplot2: Elegant graphics for data analysis. Springer, New York, USA.]

.ref[Illustration from the Openscapes blog Tidy Data for reproducibility, efficiency, and collaboration by Julia Lowndes and Allison Horst]

---

# Data and mapping layer

--

.pull-left[

- Map body mass to the **x** axis and bill length to the **y** axis

```r
ggplot(data = penguins, mapping = aes(x = body_mass_g, y = bill_length_mm))
```
]

--

.pull-right[
<img src="ggplot2_files/figure-html/unnamed-chunk-2-1.png" width="100%" style="display: block; margin: auto;" />
]

---

# Geometry layer

--

.pull-left[

- Adding a geometric **point** layer

```r
ggplot(data = penguins, mapping = aes(x = body_mass_g, y = bill_length_mm)) +
* geom_point()
```
]

--

.pull-right[
<img src="ggplot2_files/figure-html/unnamed-chunk-4-1.png" width="100%" style="display: block; margin: auto;" />
]

---

# Advanced mapping layer

--

.pull-left[

- Add *color* mapping and change **size** and **shape**

```r
ggplot(data = penguins, mapping = aes(x = body_mass_g, y = bill_length_mm)) +
* geom_point(aes(col = species),
*            size = 2.5, shape = 1)
```
]

--

.pull-right[
<img src="ggplot2_files/figure-html/unnamed-chunk-6-1.png" width="100%" style="display: block; margin: auto;" />
]
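
---

# Mapped vs. constant aesthetics

- A minimal sketch of the difference between mapping `col` inside `aes()` and setting `size` and `shape` outside it, as on the previous slide. The `toy` data frame is a hypothetical stand-in with made-up values (not the real penguins data), so the snippet runs on its own. `layer_data()` returns the aesthetic values that _ggplot2_ computed for a layer.

```r
library(ggplot2)

# Hypothetical stand-in for the penguins data: made-up values,
# only the column names match the slides
toy <- data.frame(
  species        = c("Adelie", "Adelie", "Chinstrap", "Gentoo"),
  body_mass_g    = c(3500, 3800, 3700, 5000),
  bill_length_mm = c(38, 40, 49, 47)
)

p <- ggplot(data = toy, mapping = aes(x = body_mass_g, y = bill_length_mm)) +
  # col is mapped (varies with the data); size and shape are set (constant)
  geom_point(aes(col = species), size = 2.5, shape = 1)

# Inspect the aesthetic values computed for the point layer
ld <- layer_data(p, 1)
ld[, c("colour", "size", "shape")]
```

- Mapped aesthetics get a scale and a legend; constants set outside `aes()` apply to every point and get neither.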
---

# Scaling layer

--

.pull-left[

- **Scaling** maps values in data space to values in aesthetic space

```r
*vc <- c("#F8C759", "#5965AC", "#619961")

ggplot(data = penguins, mapping = aes(x = body_mass_g, y = bill_length_mm)) +
  geom_point(aes(col = species), size = 2.5, shape = 1) +
* scale_color_manual(values = vc) +
  scale_x_continuous(
*   limits = function(x) c(min(x), max(x)),
*   breaks = function(x) seq(quantile(x, 0.1), quantile(x, 0.9), length.out = 5))
```
]

--

.pull-right[
<img src="ggplot2_files/figure-html/unnamed-chunk-8-1.png" width="100%" style="display: block; margin: auto;" />
]

---

# Using different themes

--

.pull-left[

- Customize layout using **themes**

```r
ggplot(data = penguins, mapping = aes(x = body_mass_g, y = bill_length_mm)) +
  geom_point(aes(col = species), size = 2.5, shape = 1) +
  scale_color_manual(values = vc) +
* labs(x = "Body mass [g]", y = "Bill length [mm]",
       title = "Body - Bill Relationship") +
* theme_classic() +
* theme(legend.position = "bottom",
        plot.title = element_text(
          face = "bold", size = 16))
```
]

--

.pull-right[
<img src="ggplot2_files/figure-html/unnamed-chunk-10-1.png" width="100%" style="display: block; margin: auto;" />
]

---

# Adding geoms

--

.pull-left[

- Adding further **geometries**

```r
# df_text is assumed to hold one row per species with the
# columns species, mass_mean, and bill_mean (label positions)
ggplot(data = penguins, mapping = aes(x = body_mass_g, y = bill_length_mm)) +
  geom_point(aes(col = species), size = 2.5, shape = 1) +
* geom_smooth(aes(col = species),
*             method = "lm", se = FALSE) +
* geom_label(data = df_text,
*            aes(x = mass_mean, y = bill_mean,
*                label = species)) +
  scale_color_manual(values = vc) +
  labs(x = "Body mass [g]", y = "Bill length [mm]",
       title = "Body - Bill Relationship") +
  theme_classic() +
  theme(legend.position = "none",
        plot.title = element_text(
          face = "bold", size = 16))
```
]

--

.pull-right[
<img src="ggplot2_files/figure-html/unnamed-chunk-12-1.png" width="100%" style="display: block; margin: auto;" />
]

---

# Facet wrapping

--

.pull-left[

- Split into **panels** using grouping

```r
ggplot(data = penguins, mapping = aes(x = body_mass_g, y = bill_length_mm)) +
  geom_point(aes(col = species), size = 2.5, shape = 1) +
  geom_smooth(aes(col = species), method = "lm", se = FALSE) +
* facet_wrap(. ~ island) +
  scale_color_manual(values = vc) +
  labs(x = "Body mass [g]", y = "Bill length [mm]",
       title = "Body - Bill Relationship") +
  theme_classic() +
  theme(legend.position = "bottom",
        plot.title = element_text(
          face = "bold", size = 16))
```
]

--

.pull-right[
<img src="ggplot2_files/figure-html/unnamed-chunk-14-1.png" width="100%" style="display: block; margin: auto;" />
]

---

# Combining figures

--

.pull-left[

- Figures can be stored in **variables**

```r
library(cowplot)

p1 <- ggplot(data = penguins, aes(x = species, y = body_mass_g)) +
* geom_jitter()

p2 <- ggplot(data = penguins, aes(x = species, y = body_mass_g)) +
* geom_boxplot()

p3 <- ggplot(data = penguins, aes(x = species, y = body_mass_g)) +
* geom_violin()

*cowplot::plot_grid(p1, p2, p3, nrow = 3)
```
]

--

.pull-right[
<img src="ggplot2_files/figure-html/unnamed-chunk-16-1.png" width="85%" style="display: block; margin: auto;" />
]

---

# More figures

--

- Prepare a data set with the min, max, and median body mass grouped by species and sex (tidy!)

--

```r
penguins_range <- dplyr::group_by(penguins, species, sex) |>
  dplyr::summarise(min = min(body_mass_g), max = max(body_mass_g),
                   med = median(body_mass_g), .groups = "drop") |>
  tidyr::pivot_longer(-c(species, sex)) |>
  tidyr::drop_na(value)

head(penguins_range)
```

```
## # A tibble: 6 × 4
##   species sex    name  value
##   <fct>   <fct>  <chr> <dbl>
## 1 Adelie  female min    2850
## 2 Adelie  female max    3900
## 3 Adelie  female med    3400
## 4 Adelie  male   min    3325
## 5 Adelie  male   max    4775
## 6 Adelie  male   med    4000
```

---

# Grouping variables

--

.pull-left[

- Use **group** to define groups from the interaction of discrete variables

```r
ggplot(data = penguins_range,
       aes(x = species, y = value, colour = sex)
  ) +
  geom_line(
*   aes(group = interaction(sex, name)),
    alpha = 0.15, size = 2.5
  ) +
  geom_point(
    aes(shape = name, colour = sex),
    size = 5) +
  guides(
*   colour = guide_legend(title = "Sex", order = 2),
    shape = guide_legend(
      title = "Measurement", order = 1)
  )
```
]

--

.pull-right[
<img src="ggplot2_files/figure-html/unnamed-chunk-19-1.png" width="85%" style="display: block; margin: auto;" />
]

---

# Segments

--

.pull-left[

- `geom_segment()` draws straight **line** segments (`geom_curve()` draws curves)

```r
# df_sampled and n are assumed to be defined beforehand
# (n matching the number of rows in df_sampled)
ggplot(data = df_sampled, aes(x = 1:n)) +
  geom_point(aes(
    y = bill_length_mm, col = species)) +
* geom_segment(
    aes(xend = 1:n, y = 0,
        yend = bill_length_mm, col = species)) +
  geom_line(
    aes(y = bill_depth_mm), col = "#EEA236") +
  geom_hline(yintercept = 0) +
* annotate(geom = "rect",
    xmin = 10.5, xmax = 14.5, ymin = 10.5, ymax = 13.5,
    fill = "black", alpha = 0.9) +
* annotate(geom = "text", x = 12.5, y = 12,
    label = "Bill depth", color = "#EEA236")
```
]

--

.pull-right[
<img src="ggplot2_files/figure-html/unnamed-chunk-21-1.png" width="85%" style="display: block; margin: auto;" />
]

---

# Densities

--

.pull-left[

<!-- - `geom_histogram`/`geom_density` to draw distributions -->

```r
ggplot(data = penguins) +
  geom_density(
*   aes(x = bill_depth_mm, y = ..density..),
    fill = "#69b3a2") +
  geom_histogram(
    aes(x =
        bill_depth_mm, y = ..density..),
    col = "black", fill = NA) +
  annotate(
    geom = "text", x = 35, y = 0.05,
    label = "bill_depth_mm", color = "#69b3a2") +
  geom_density(
    aes(x = bill_length_mm, y = -..density..),
    fill = "#404080") +
  geom_histogram(
*   aes(x = bill_length_mm, y = -..density..),
    col = "black", fill = NA) +
  annotate(geom = "text", x = 25, y = -0.05,
    label = "bill_length_mm", color = "#404080") +
  # scale_x_continuous(breaks = seq(10, 60, 5), limits = c(10, 60)) +
  xlab("bill dimensions") +
  theme_classic()
```
]

--

.pull-right[
<img src="ggplot2_files/figure-html/unnamed-chunk-23-1.png" width="85%" style="display: block; margin: auto;" />
]

---
class: inverse, middle

<img src="img/gganimate.png" width="100%" style="display: block; margin: auto;" />

.ref[Illustration from the Openscapes blog Tidy Data for reproducibility, efficiency, and collaboration by Julia Lowndes and Allison Horst]

---

# Animated figures

--

.pull-left[

- `gganimate` allows **dynamic** figures

```r
library(gganimate)

p <- ggplot(data = penguins, aes(x = body_mass_g, y = bill_length_mm)) +
  geom_point(aes(color = species)) +
  geom_label(data = df_text,
             aes(x = mass_mean, y = bill_mean, label = species)) +
  scale_color_viridis_d(option = "C") +
* transition_states(states = species,
*                   transition_length = 2,
*                   state_length = 1) +
* enter_fade() +
* exit_fade() +
  theme_classic() +
  theme(legend.position = "none")

gganimate::animate(p)
```
]

--

.pull-right[
<img src="ggplot2_files/figure-html/unnamed-chunk-26-1.gif" width="85%" style="display: block; margin: auto;" />
]

---
class: inverse

## Thank you for your attention

### Questions?

.pull-left[
Further resources: [https://mhesselbarth.github.io/advanced-r-workshop/resources](https://mhesselbarth.github.io/advanced-r-workshop/resources)

Exercise: [https://mhesselbarth.github.io/advanced-r-workshop/exercise-ggplot](https://mhesselbarth.github.io/advanced-r-workshop/exercise-ggplot)
]

.pull-right[
<svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M464 64H48C21.49 64 0 85.49 0 112v288c0 26.51 21.49 48 48 48h416c26.51 0 48-21.49 48-48V112c0-26.51-21.49-48-48-48zm0 48v40.805c-22.422 18.259-58.168 46.651-134.587 106.49-16.841 13.247-50.201 45.072-73.413 44.701-23.208.375-56.579-31.459-73.413-44.701C106.18 199.465 70.425 171.067 48 152.805V112h416zM48 400V214.398c22.914 18.251 55.409 43.862 104.938 82.646 21.857 17.205 60.134 55.186 103.062 54.955 42.717.231 80.509-37.199 103.053-54.947 49.528-38.783 82.032-64.401 104.947-82.653V400H48z"></path></svg> [mhessel@umich.edu](mailto:mhessel@umich.edu)

<svg viewBox="0 0 496 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M336.5 160C322 70.7 287.8 8 248 8s-74 62.7-88.5 152h177zM152 256c0 22.2 1.2 43.5 3.3 64h185.3c2.1-20.5 3.3-41.8 3.3-64s-1.2-43.5-3.3-64H155.3c-2.1 20.5-3.3 41.8-3.3 64zm324.7-96c-28.6-67.9-86.5-120.4-158-141.6 24.4 33.8 41.2 84.7 50 141.6h108zM177.2 18.4C105.8 39.6 47.8 92.1 19.3 160h108c8.7-56.9 25.5-107.8 49.9-141.6zM487.4 192H372.7c2.1 21 3.3 42.5 3.3 64s-1.2 43-3.3 64h114.6c5.5-20.5 8.6-41.8 8.6-64s-3.1-43.5-8.5-64zM120 256c0-21.5 1.2-43 3.3-64H8.6C3.2 212.5 0 233.8 0 256s3.2 43.5 8.6 64h114.6c-2-21-3.2-42.5-3.2-64zm39.5 96c14.5 89.3 48.7 152 88.5 152s74-62.7 88.5-152h-177zm159.3 141.6c71.4-21.2 129.4-73.7 158-141.6h-108c-8.8 56.9-25.6 107.8-50 141.6zM19.3 352c28.6 67.9 86.5 120.4 158 141.6-24.4-33.8-41.2-84.7-50-141.6h-108z"></path></svg> [https://www.maxhesselbarth.com](https://www.maxhesselbarth.com)

<svg viewBox="0 0 512 512"
style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M459.37 151.716c.325 4.548.325 9.097.325 13.645 0 138.72-105.583 298.558-298.558 298.558-59.452 0-114.68-17.219-161.137-47.106 8.447.974 16.568 1.299 25.34 1.299 49.055 0 94.213-16.568 130.274-44.832-46.132-.975-84.792-31.188-98.112-72.772 6.498.974 12.995 1.624 19.818 1.624 9.421 0 18.843-1.3 27.614-3.573-48.081-9.747-84.143-51.98-84.143-102.985v-1.299c13.969 7.797 30.214 12.67 47.431 13.319-28.264-18.843-46.781-51.005-46.781-87.391 0-19.492 5.197-37.36 14.294-52.954 51.655 63.675 129.3 105.258 216.365 109.807-1.624-7.797-2.599-15.918-2.599-24.04 0-57.828 46.782-104.934 104.934-104.934 30.213 0 57.502 12.67 76.67 33.137 23.715-4.548 46.456-13.32 66.599-25.34-7.798 24.366-24.366 44.833-46.132 57.827 21.117-2.273 41.584-8.122 60.426-16.243-14.292 20.791-32.161 39.308-52.628 54.253z"></path></svg> [@MHKHesselbarth](https://twitter.com/MHKHesselbarth) ]