Displaying Likert Style Responses

Surveys often record responses to a question in a Likert-style format, where the available responses are something like “Strongly Agree”, “Agree”, “Neutral”, “Disagree”, or “Strongly Disagree.” The following is a “cheat-sheet” for using {ggplot2} to display non-weighted Likert-style survey responses as bar charts.

Packages

library(tidyverse)
library(gtsummary)
library(bstfun)

Bar charts for factored survey responses

Table 1: Example survey responses.
sample_question_1.factor	sample_question_2.factor	sample_question_3.factor
Moderately important	Somewhat effective	Somewhat effective
Extremely important	Somewhat effective	Very effective
Extremely important	Very effective	Very effective
Extremely important	Very effective	Very effective
Extremely important	Very effective	Very effective
Extremely important	Somewhat effective	Somewhat effective
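
The charts in this section assume each response column is a factor whose levels are already in the intended display order. As a minimal sketch, assuming made-up level labels (the full set of levels for sample_question_1.factor is not shown in this post), the order can be fixed with factor():

```r
# Hypothetical level labels, listed from lowest to highest; adjust to your survey.
importance_levels <- c("Not important", "Slightly important",
                       "Moderately important", "Extremely important")

# Hypothetical raw responses as they might arrive from a survey export.
raw_responses <- c("Extremely important", "Moderately important", NA,
                   "Extremely important")

# factor() fixes the level order, which is the order the bars are drawn in.
responses <- factor(raw_responses, levels = importance_levels)

# table() lists every level (including those with zero responses) and skips NA.
table(responses)
```

Without explicit levels, factor() sorts the labels alphabetically, which rarely matches the order of a Likert scale.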

Bar chart with percentage of factored responses relative to the whole sample

This version calculates and displays the percentage of responses out of the entire survey sample. The key to this type of chart is to set group = 1 in the aes() call, and to set clip = "off" in the coord_flip() layer together with limits in the scale_y_continuous() layer so that the percent labels are not clipped. The legend and the x label (which is displayed on the y axis because of coord_flip()) have been removed for simplicity. The theme() axis.text.x setting is in place in case the x-axis tick labels need to be rotated to prevent overplotting.

Requires:

One survey question with factored and ordered responses

data %>%
  drop_na(sample_question_1.factor) %>%
  ggplot(., aes(sample_question_1.factor, group = 1)) +
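  # after_stat() replaces the deprecated ..prop.. / ..x.. notation (ggplot2 >= 3.3.0)
  geom_bar(aes(y = after_stat(prop), fill = after_stat(factor(x))), position = position_dodge()) +
  geom_text(aes(label = after_stat(scales::percent(prop)), y = after_stat(prop)), stat = "count", size = 3,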
              hjust = -.15, colour = "black") +
  coord_flip(clip = "off") +
  scale_y_continuous(labels = scales::percent, limits = c(0, 1)) + # limits extends the chart to prevent clipping 
  labs(y = "Percent", x = "") +
  theme_minimal() +
  theme(legend.position = "top", axis.text.x = element_text(angle = -0, hjust = 0)) +
  guides(fill = "none")
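
To see what group = 1 does, the sketch below compares proportions computed by hand with the values stat_count() computes. It uses a hypothetical toy data frame (not the survey data in this post) and after_stat(prop), which is the current spelling of ..prop.. in {ggplot2}.

```r
library(ggplot2)

# Hypothetical toy responses (not the survey data used in this post).
toy <- data.frame(
  response = factor(c("Agree", "Agree", "Neutral", "Disagree"),
                    levels = c("Disagree", "Neutral", "Agree"))
)

# Whole-sample proportions computed by hand.
by_hand <- as.numeric(prop.table(table(toy$response)))

# With group = 1, all bars share one group, so prop = count / total count.
p_grouped <- ggplot(toy, aes(response, group = 1)) +
  geom_bar(aes(y = after_stat(prop)))
with_group <- layer_data(p_grouped)$prop

# Without group = 1, each bar is its own group, so every prop equals 1.
p_ungrouped <- ggplot(toy, aes(response)) +
  geom_bar(aes(y = after_stat(prop)))
without_group <- layer_data(p_ungrouped)$prop
```

Here with_group matches by_hand, while without_group is 1 for every bar, which is why the charts in this post always set group = 1 when plotting prop.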

Faceted bar chart faceted by a grouping variable

This version extends the basic bar chart by adding a facet_wrap() layer to display responses by a grouping variable. In this example, the survey responses were collected from various “sites” that can be displayed separately. This option is useful when the grouping variable has only a few levels. The two geom_bar() layers (one of them commented out) control which variable is used for the fill color: the response or the grouping variable (site).

Requires:

One survey question with factored and ordered responses
One categorical grouping variable (e.g., gender, site, or an age group such as young/old)

data %>%
  drop_na(sample_question_1.factor, site) %>%
  ggplot(., aes(sample_question_1.factor, group = 1)) +
  #geom_bar(aes(y = after_stat(prop), fill = site), position = position_dodge()) + # Applies fill to site
  geom_bar(aes(y = after_stat(prop), fill = after_stat(factor(x))), position = position_dodge()) + # Applies fill to response
  geom_text(aes(label = after_stat(scales::percent(round(prop, 2))), y = after_stat(prop)), stat = "count", size = 3,
              hjust = -.15, colour = "black") +
  coord_flip(clip = "off") +
  scale_y_continuous(labels = scales::percent) +
  labs(y = "Percent", x = "") +
  theme_minimal() +
  theme(legend.position = "top", axis.text.x = element_text(angle = -0, hjust = 0)) +
  guides(fill = "none") +
  facet_wrap(~ site, ncol = 2)

Bar chart with percentage of factored responses relative to a grouping variable

This style displays the same information as the faceted bar chart above, but places all of the bars in one panel. Again, this style works best when the grouping variable has only a few levels, to prevent overcrowding at each x-axis tick.

Requires:

One survey question with factored and ordered responses
One categorical grouping variable

data %>%
  drop_na(sample_question_1.factor, site) %>%
  ggplot(., aes(sample_question_1.factor, fill = site)) +
  # Each bar's share of its own site: the bar's count divided by that site's total count
  geom_bar(aes(y = after_stat(count / tapply(count, fill, sum)[fill])), position = "dodge2") +
  geom_text(aes(y = after_stat(count / tapply(count, fill, sum)[fill]),
                label = after_stat(scales::percent(round(count / tapply(count, fill, sum)[fill], 2)))),
            stat = "count", position = position_dodge(1), hjust = -0.15, size = 3) +
  coord_flip(clip = "off") +
  scale_y_continuous(labels = scales::percent) +
  theme_minimal() +
  theme(legend.position = "top") +
  guides(fill = guide_legend(title = "Site")) +
  labs(y = "Percent", x = "")
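
The tapply() expression in the chart above is dense, so here is a small base R sketch of the same arithmetic. The data frame is hypothetical and only imitates the kind of per-bar counts the count stat produces (one row per bar, with the fill group attached).

```r
# Hypothetical per-bar counts: one row per bar, as the count stat would return.
bars <- data.frame(
  fill  = c("A", "A", "B", "B"),                        # grouping variable (e.g. site)
  x     = c("Agree", "Disagree", "Agree", "Disagree"),  # response
  count = c(3, 1, 2, 6)
)

# Total count within each fill group, named by group: A = 4, B = 8.
group_totals <- tapply(bars$count, bars$fill, sum)

# Indexing the totals by each row's group lines them up with the bars,
# so the division gives each bar's share of its own group.
bars$prop <- bars$count / as.numeric(group_totals[bars$fill])
bars
```

Within each group the proportions sum to 1, so the bars for one site always add up to 100%, regardless of how many respondents that site has.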

Bar chart faceted by related survey items

Rather than faceting by a grouping variable, this style of chart creates one panel per survey item. The key to this approach is to select the columns to be displayed and then convert the data to long format. Once the data are in long format, the percentage calculations can be done within the geom_bar() layer and displayed by the geom_text() layer. This style is useful for displaying related survey items that a reader may want to compare.

Requires:

At least two survey questions with the same factored and ordered responses
One categorical grouping variable
Data arranged in long format

Table 2: Example long format data.
participant	question	response
1	sample_question_2.factor	Somewhat effective
1	sample_question_3.factor	Somewhat effective
2	sample_question_2.factor	Somewhat effective
2	sample_question_3.factor	Very effective
3	sample_question_2.factor	Very effective
3	sample_question_3.factor	Very effective

data %>% 
  select(sample_question_2.factor:sample_question_3.factor) %>%
  pivot_longer(cols = everything(),
               names_to = "question",
               values_to = "response") %>%
  drop_na() %>%
  ggplot(., aes(response, group = 1)) +
  geom_bar(aes(y = after_stat(prop), fill = after_stat(factor(x))), position = position_dodge()) +
  geom_text(aes(label = after_stat(scales::percent(prop, accuracy = 0.1)), y = after_stat(prop)), stat = "count", size = 3,
              hjust = -.15, colour = "black") +
  scale_y_continuous(labels = scales::percent, limits = c(0, 1)) +
  coord_flip() +
  theme_minimal() +
  guides(fill = "none") +
  labs(y = "Percent", x = "") +
  facet_wrap(~question)
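
The wide-to-long step can be tried on its own. This is a minimal sketch with a hypothetical two-column data frame (the column names q2 and q3 are made up) showing what pivot_longer() and drop_na() return before the data reach ggplot().

```r
library(tidyr)

# Hypothetical wide data: one row per participant, one column per survey item.
wide <- data.frame(
  q2 = c("Somewhat effective", "Very effective"),
  q3 = c("Very effective", NA)
)

# One row per participant-item pair; the old column names go into `question`.
long <- pivot_longer(wide,
                     cols = everything(),
                     names_to = "question",
                     values_to = "response")

# Drop item-level missing responses so they are not counted in the percentages.
long <- drop_na(long)
long
```

Because drop_na() runs after pivoting, a participant who skipped one item is removed only from that item's panel and still counts toward the others.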

Responses in this type of chart can be double-checked with a call to tbl_likert() from the {bstfun} package. Note, however, that the percentages from tbl_likert() and the ggplot call may differ slightly because of rounding.

data %>% 
  select(sample_question_2.factor:sample_question_3.factor) %>%
  tbl_likert(digits = list(everything() ~ 1))

Characteristic	Not effective¹	Somewhat effective¹	Very effective¹	Not sure¹
sample_question_2.factor	2.0 (1.4%)	58.0 (40.8%)	74.0 (52.1%)	8.0 (5.6%)
sample_question_3.factor	2.0 (1.4%)	54.0 (37.8%)	83.0 (58.0%)	4.0 (2.8%)
¹ n (%)
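
The percentages can also be reproduced by hand from the counts. The sketch below copies the counts for sample_question_2.factor from the table above and uses base R only; it is a quick sanity check, not a replacement for tbl_likert().

```r
# Counts for sample_question_2.factor, copied from the tbl_likert() output above.
counts <- c("Not effective" = 2, "Somewhat effective" = 58,
            "Very effective" = 74, "Not sure" = 8)

# Percent of all non-missing responses, rounded to one decimal place.
pct <- round(100 * counts / sum(counts), 1)
pct

# Rounded percentages need not add up to exactly 100.
sum(pct)
```

The rounded values match the table (1.4, 40.8, 52.1, and 5.6), and they sum to 99.9 rather than 100, which is the kind of small rounding discrepancy mentioned above.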

Bar chart with grouping variable and faceted by related survey item

Building on the previous chart, this style facets two or more survey items and shows the percentage of responses relative to the grouping variable. As with the preceding chart, a key to this approach is to convert the columns of interest into long format. From there, group by all variables to count the number of responses, then drop the NAs and group again by the item (question) and the grouping variable (site). Finally, two mutate() calls calculate the numeric percentage to plot and create the label displayed at the end of each bar.

Requires:

At least two survey questions with the same factored and ordered responses
One categorical grouping variable
Data arranged in long format

data %>% 
  select(sample_question_2.factor:sample_question_3.factor, site) %>%
  pivot_longer(cols = sample_question_2.factor:sample_question_3.factor,
               names_to = "question",
               values_to = "response") %>%
  group_by(response, site, question) %>%
  summarise(freq = n(), .groups = "drop") %>% # .groups = "drop" already returns ungrouped data
  drop_na() %>%
  group_by(question, site) %>%
  mutate(prop = round(freq / sum(freq, na.rm = TRUE), 3) * 100) %>% # Get the % to plot
  mutate(prop_label = scales::percent(freq / sum(freq, na.rm = TRUE), accuracy = 0.1)) %>% # Get the % label
  ggplot(., aes(x = response, y = prop, fill = site, label = prop_label)) +
  geom_col(position = "dodge2") +
  geom_text(position = position_dodge(.9), size = 3, hjust = -.1) +
  coord_flip(clip = "off") +
  scale_y_continuous(labels = scales::percent_format(scale = 1), limits = c(0, 100)) +
  theme_minimal() +
  guides(fill = guide_legend(title = "Site")) +
  theme(legend.position = "top", axis.text.x = element_text(angle = -0, hjust = 0)) +
  labs(y = "Percent", x = "") +
  facet_wrap(~question)
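
The count-then-percent steps can be checked on a small example before plotting. This sketch assumes hypothetical long-format data with made-up site and question labels; count() is used as a shortcut for the group_by() plus summarise(n()) pair in the chart code.

```r
library(dplyr)

# Hypothetical long-format data: one row per response.
long <- data.frame(
  site     = c("A", "A", "A", "B", "B"),
  question = "q2",
  response = c("Very effective", "Very effective", "Somewhat effective",
               "Very effective", "Somewhat effective")
)

pct <- long %>%
  count(site, question, response, name = "freq") %>%  # responses per combination
  group_by(question, site) %>%                        # denominator: item within site
  mutate(prop = freq / sum(freq) * 100) %>%
  ungroup()
pct
```

Grouping by question and site before mutate() is what makes each site's bars sum to 100% within a panel; grouping by question alone would instead give percentages of the whole sample for that item.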

To double-check our work, we can use the tbl_summary() function from the {gtsummary} package. As noted before, the values may differ slightly because of rounding.

data %>% 
  drop_na(site) %>%
  select(sample_question_2.factor:sample_question_3.factor, site) %>%
  tbl_summary(by = "site",
              digits = all_categorical() ~ 1)

Characteristic	Albuquerque, N = 52¹	San Diego, N = 23¹	Denver, N = 35¹	El Paso, N = 17¹	Los Angeles, N = 15¹
sample_question_2.factor
Not effective	0.0 (0.0%)	1.0 (4.3%)	1.0 (3.0%)	0.0 (0.0%)	0.0 (0.0%)
Somewhat effective	21.0 (40.4%)	13.0 (56.5%)	11.0 (33.3%)	8.0 (47.1%)	4.0 (26.7%)
Very effective	29.0 (55.8%)	6.0 (26.1%)	20.0 (60.6%)	7.0 (41.2%)	11.0 (73.3%)
Not sure	2.0 (3.8%)	3.0 (13.0%)	1.0 (3.0%)	2.0 (11.8%)	0.0 (0.0%)
Unknown	0	0	2	0	0
sample_question_3.factor
Not effective	0.0 (0.0%)	1.0 (4.3%)	1.0 (2.9%)	0.0 (0.0%)	0.0 (0.0%)
Somewhat effective	25.0 (48.1%)	12.0 (52.2%)	9.0 (26.5%)	6.0 (35.3%)	2.0 (13.3%)
Very effective	27.0 (51.9%)	9.0 (39.1%)	22.0 (64.7%)	10.0 (58.8%)	13.0 (86.7%)
Not sure	0.0 (0.0%)	1.0 (4.3%)	2.0 (5.9%)	1.0 (5.9%)	0.0 (0.0%)
Unknown	0	0	1	0	0
¹ n (%)

Bar charts for numeric survey responses

Bar chart with means of numeric responses

In some cases, the responses to survey items may be represented by integers (e.g., 1, 2, 3, 4, 5), and it may be useful to plot the means of these responses. The key to this style of plot is to summarize each question into its respective mean before arranging the data into long format (if there is more than one question).

Requires:

One or more survey questions with numerical responses
Each question summarized to a mean
Data arranged in long format if more than one question

Table 3: Example survey responses.
sample_question_1.integer	sample_question_2.integer	sample_question_3.integer	sample_question_4.integer
3	4	4	3
4	5	5	5
3	4	4	4
3	4	4	3
3	3	4	3
4	3	3	3

data %>% 
  select(sample_question_1.integer:sample_question_4.integer) %>%
  summarise(across(everything(), ~ mean(.x, na.rm = TRUE))) %>%
  pivot_longer(cols = sample_question_1.integer:sample_question_4.integer,
               names_to = "question",
               values_to = "mean") %>%
  ggplot(., aes(x = factor(question, levels = rev(question)), y = mean, fill = question)) +
  geom_col() +
  geom_text(aes(label = round(mean,2)), hjust = -.3, size = 3) +
  geom_hline(aes(yintercept = mean(mean)), color = "black", linetype = "dotted") +
  coord_flip() +
  scale_y_continuous(limits = c(0,5), breaks = scales::breaks_pretty(11)) +
  theme_minimal() +
  guides(fill = "none") +
  labs(y = "Survey item mean", x = "")

data %>% 
  select(sample_question_1.integer:sample_question_4.integer) %>%
  summarise(across(everything(), ~ mean(.x, na.rm = TRUE))) %>%
  pivot_longer(cols = sample_question_1.integer:sample_question_4.integer,
               names_to = "question",
               values_to = "mean")

Table 4: Example summarized responses.
question	mean
sample_question_1.integer	3.152778
sample_question_2.integer	3.326389
sample_question_3.integer	3.496552
sample_question_4.integer	2.439716
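
The summarize-then-pivot step, and the reversed factor levels used in the plot, can be sketched on hypothetical data (the columns q1 and q2 below are made up). With coord_flip(), the first factor level is drawn at the bottom, so reversing the levels puts the first question at the top.

```r
library(dplyr)
library(tidyr)

# Hypothetical integer responses (not the survey data used in this post).
toy <- data.frame(q1 = c(3, 4, NA), q2 = c(4, 5, 3))

item_means <- toy %>%
  summarise(across(everything(), ~ mean(.x, na.rm = TRUE))) %>%  # one mean per item
  pivot_longer(cols = everything(),
               names_to = "question",
               values_to = "mean") %>%
  # Reverse the level order so the first question is on top after coord_flip().
  mutate(question = factor(question, levels = rev(question)))
item_means
```

Setting na.rm = TRUE means each item's mean is based only on the participants who answered that item, so items can have different denominators.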

Last updated on Oct 8, 2023
