TSE:ISSS608 - Hands-on Exercise 4 (2024)

Building Funnel Plot with R

Author

Teo Suan Ern

Published

January 13, 2024

Modified

February 24, 2024

Note: Last modified to include author’s details.

Hands-on Exercise 4 is split into four segments:

  1. Visualising Distribution

  2. Visual Statistical Analysis

  3. Visualising Uncertainty

  4. Building Funnel Plot with R

1.1 Install and launch R packages

This exercise uses four R packages (tidyverse, FunnelPlotR, plotly and knitr), installed and loaded with pacman:

pacman::p_load(tidyverse, FunnelPlotR, plotly, knitr)

1.2 Importing the data

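The COVID-19_DKI_Jakarta data set is used in this exercise. The code below imports the CSV file and converts all character columns to factors.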

covid19 <- read_csv("data/COVID-19_DKI_Jakarta.csv") %>%
  mutate_if(is.character, as.factor)

1.3 Overview of the data

summary(covid19)

 Sub-district ID                      City              District
 Min.   :3.101e+09   JAKARTA BARAT     :56   TAMBORA       : 11
 1st Qu.:3.172e+09   JAKARTA PUSAT     :44   KEBAYORAN BARU: 10
 Median :3.173e+09   JAKARTA SELATAN   :65   CIPAYUNG      :  8
 Mean   :3.172e+09   JAKARTA TIMUR     :65   JATINEGARA    :  8
 3rd Qu.:3.174e+09   JAKARTA UTARA     :31   KEMAYORAN     :  8
 Max.   :3.175e+09   KAB.ADM.KEP.SERIBU: 6   SETIA BUDI    :  8
                                             (Other)       :214
       Sub-district    Positive      Recovered        Death
 ANCOL       :  1   Min.   :  72   Min.   :  69   Min.   :  0.00
 ANGKE       :  1   1st Qu.:1644   1st Qu.:1578   1st Qu.: 24.50
 BALE KAMBANG:  1   Median :2420   Median :2329   Median : 39.00
 BALI MESTER :  1   Mean   :2572   Mean   :2477   Mean   : 40.99
 BAMBU APUS  :  1   3rd Qu.:3372   3rd Qu.:3242   3rd Qu.: 55.00
 BANGKA      :  1   Max.   :6231   Max.   :5970   Max.   :158.00
 (Other)     :261

The FunnelPlotR package uses ggplot2 to generate funnel plots. It requires a numerator (events of interest), a denominator (population to be considered) and a group. The key arguments for customisation are:

  • limit: plot limits (95 or 99).

  • label_outliers: to label outliers (true or false).

  • Poisson_limits: to add Poisson limits to the plot.

  • OD_adjust: to add overdispersed limits to the plot.

  • xrange and yrange: to specify the range to display for axes, acts like a zoom function.

  • Other aesthetic components such as graph title, axis labels etc.

funnel_plot(
  numerator = covid19$Positive,
  denominator = covid19$Death,
  group = covid19$`Sub-district`
)

A funnel plot object with 267 points of which 0 are outliers. Plot is adjusted for overdispersion. 

Learning Point

  • group in this function is different from the scatterplot. Here, it defines the level of the points to be plotted, i.e. Sub-district, District or City. If City is chosen, there are only six data points.

  • By default, the data_type argument is “SR” (standardised ratio).

  • limit: plot limits, accepted values are 95 or 99, corresponding to 95% or 99.8% quantiles of the distribution.
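The 95% and 99.8% figures in the last bullet can be related to z-multipliers under a normal approximation. The sketch below is illustrative only: the normal approximation and the helper name `z_for_limit` are assumptions made here, not a description of how FunnelPlotR computes its limits internally.

```r
# Hypothetical helper: two-sided z-multiplier for a given coverage level,
# assuming a normal approximation.
z_for_limit <- function(coverage) {
  qnorm(1 - (1 - coverage) / 2)
}

z95  <- z_for_limit(0.95)   # about 1.96
z998 <- z_for_limit(0.998)  # about 3.09

round(c(z95 = z95, z998 = z998), 2)
```

A wider coverage level gives a larger multiplier, so the 99.8% band always lies outside the 95% band.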

funnel_plot(
  numerator = covid19$Death,
  denominator = covid19$Positive,
  group = covid19$`Sub-district`,
  data_type = "PR",
  xrange = c(0, 6500),
  yrange = c(0, 0.05)
)

A funnel plot object with 267 points of which 7 are outliers. Plot is adjusted for overdispersion. 

Learning Point

  • The data_type argument is used to change from the default “SR” to “PR” (i.e. proportions).

  • xrange and yrange are used to set the range of the x-axis and y-axis.
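For proportion data, each plotted point can be thought of as a group's total numerator divided by its total denominator. The sketch below illustrates this idea with base R on a small made-up data frame. The values, the `toy` object and the city names are hypothetical and are not taken from the Jakarta data set.

```r
# Hypothetical toy data: two cities with two sub-districts each.
toy <- data.frame(
  City     = c("A", "A", "B", "B"),
  Positive = c(1000, 3000, 500, 1500),
  Death    = c(10, 50, 5, 45)
)

# Sum the numerator and denominator within each group, then take the ratio.
by_city <- aggregate(cbind(Death, Positive) ~ City, data = toy, FUN = sum)
by_city$rate <- by_city$Death / by_city$Positive

by_city
```

Grouping at a coarser level gives fewer points with larger denominators, which is why choosing City in this exercise would leave only six points on the plot.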

funnel_plot(
  numerator = covid19$Death,
  denominator = covid19$Positive,
  group = covid19$`Sub-district`,
  data_type = "PR",
  xrange = c(0, 6500),
  yrange = c(0, 0.05),
  label = NA,
  title = "Cumulative COVID-19 Fatality Rate by Cumulative Total Number of COVID-19 Positive Cases",
  x_label = "Cumulative COVID-19 Positive Cases",
  y_label = "Cumulative Fatality Rate"
)

A funnel plot object with 267 points of which 7 are outliers. Plot is adjusted for overdispersion.

funnel_plot(
  numerator = covid19$Death,
  denominator = covid19$Positive,
  group = covid19$`Sub-district`,
  data_type = "PR",
  xrange = c(0, 6500),
  yrange = c(0, 0.05),
  label = NA
)

A funnel plot object with 267 points of which 7 are outliers. Plot is adjusted for overdispersion. 

3.1 Basic derived fields

df <- covid19 %>%
  mutate(rate = Death / Positive) %>%
  mutate(rate.se = sqrt((rate * (1 - rate)) / (Positive))) %>%
  filter(rate > 0)
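To make the derived fields concrete, the sketch below applies the same two formulas in base R to made-up counts. The numbers and the `toy` object are hypothetical; only the formulas for the rate and its standard error come from the exercise code.

```r
# Hypothetical counts for three sub-districts.
toy <- data.frame(
  Positive = c(100, 400, 1600),
  Death    = c(2, 8, 32)
)

# Fatality rate and its binomial standard error.
toy$rate    <- toy$Death / toy$Positive
toy$rate.se <- sqrt(toy$rate * (1 - toy$rate) / toy$Positive)

toy
```

All three rows share the same rate, but the estimate becomes more precise as the number of positive cases grows: quadrupling the denominator halves the standard error.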

fit.mean, the mean fatality rate weighted by the inverse of each rate's squared standard error, is computed using the code chunk below.

fit.mean <- weighted.mean(df$rate, 1/df$rate.se^2)
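Written out, the weighted mean is an inverse-variance weighted average. The symbols below are notation introduced here for illustration: $p_i$ is the fatality rate of sub-district $i$, $n_i$ its number of positive cases and $SE_i$ its standard error.

```latex
\bar{p} = \frac{\sum_i w_i \, p_i}{\sum_i w_i},
\qquad w_i = \frac{1}{SE_i^{2}},
\qquad SE_i = \sqrt{\frac{p_i \,(1 - p_i)}{n_i}}
```

Sub-districts with more positive cases have a smaller standard error and therefore a larger weight. A rate of exactly zero would give a standard error of zero and an infinite weight, which is presumably why such rows are filtered out before this step.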

3.2 Lower/Upper Limits for 95% and 99.9% CI

number.seq <- seq(1, max(df$Positive), 1)
number.ll95 <- fit.mean - 1.96 * sqrt((fit.mean * (1 - fit.mean)) / (number.seq))
number.ul95 <- fit.mean + 1.96 * sqrt((fit.mean * (1 - fit.mean)) / (number.seq))
number.ll999 <- fit.mean - 3.29 * sqrt((fit.mean * (1 - fit.mean)) / (number.seq))
number.ul999 <- fit.mean + 3.29 * sqrt((fit.mean * (1 - fit.mean)) / (number.seq))
dfCI <- data.frame(number.ll95, number.ul95, number.ll999,
                   number.ul999, number.seq, fit.mean)
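Each limit follows the same pattern: the mean plus or minus a z-multiplier times sqrt(mean × (1 − mean) / n). The minimal base R sketch below shows how the band narrows as n grows. The overall rate of 0.02 and the helper name `funnel_limits` are assumptions chosen for illustration, not values from the exercise.

```r
# Hypothetical helper: lower and upper funnel limits for a proportion,
# using the normal approximation mean +/- z * sqrt(p * (1 - p) / n).
funnel_limits <- function(p, n, z = 1.96) {
  half_width <- z * sqrt(p * (1 - p) / n)
  data.frame(n = n, lower = p - half_width, upper = p + half_width)
}

# Assumed overall rate of 2%, evaluated at three denominators.
lims <- funnel_limits(p = 0.02, n = c(100, 400, 1600))
lims$width <- lims$upper - lims$lower

lims
```

With small denominators the lower limit can fall below zero, which is not a valid proportion. The static plot in this exercise sidesteps that region by restricting the y-axis to the range 0 to 0.05.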

3.3 Static funnel plot

p <- ggplot(df, aes(x = Positive, y = rate)) +
  # label is not drawn by geom_point; it is kept for the interactive tooltip
  geom_point(aes(label = `Sub-district`), alpha = 0.4) +
  # 95% limits (dashed)
  geom_line(data = dfCI, aes(x = number.seq, y = number.ll95),
            size = 0.4, colour = "grey40", linetype = "dashed") +
  geom_line(data = dfCI, aes(x = number.seq, y = number.ul95),
            size = 0.4, colour = "grey40", linetype = "dashed") +
  # 99.9% limits (solid)
  geom_line(data = dfCI, aes(x = number.seq, y = number.ll999),
            size = 0.4, colour = "grey40") +
  geom_line(data = dfCI, aes(x = number.seq, y = number.ul999),
            size = 0.4, colour = "grey40") +
  # overall weighted mean rate
  geom_hline(data = dfCI, aes(yintercept = fit.mean),
             size = 0.4, colour = "grey40") +
  coord_cartesian(ylim = c(0, 0.05)) +
  # note: these labels sit at negative y, outside the visible range set above
  annotate("text", x = 1, y = -0.13, label = "95%", size = 3, colour = "grey40") +
  annotate("text", x = 4.5, y = -0.18, label = "99.9%", size = 3, colour = "grey40") +
  ggtitle("Cumulative Fatality Rate by Cumulative Number of COVID-19 Cases") +
  xlab("Cumulative Number of COVID-19 Cases") +
  ylab("Cumulative Fatality Rate") +
  theme_light() +
  theme(plot.title = element_text(size = 12),
        legend.position = c(0.91, 0.85),
        legend.title = element_text(size = 7),
        legend.text = element_text(size = 7),
        legend.background = element_rect(colour = "grey60", linetype = "dotted"),
        legend.key.height = unit(0.3, "cm"))
p

3.4 Interactive Funnel Plot: plotly + ggplot2

A funnel plot created with ggplot2 functions can be made interactive with ggplotly() of the plotly R package.

fp_ggplotly <- ggplotly(p, tooltip = c("label", "x", "y"))
fp_ggplotly

12 Funnel Plots for Fair Comparisons
