Gantt Chart with ggplot2

How to hack a Gantt chart with ggplot2

October 17, 2023

A Gantt Chart, Really?

I’m constantly pushing the limits of ggplot2 in unconventional ways, but every now and then a practical use case comes along that also calls for some ggplot bending.

This week I was working on a slide deck for a client’s business plan, and like any good business plan, it needed a timeline. Cue every project manager’s favorite visual: the Gantt chart. It’s familiar and easy to read, outlining deliverables, milestones, and overlapping time spans.

After 10 minutes of Googling, I wasn’t impressed with the freemium tools I came across. That’s when I thought, “well wait a second, couldn’t I build one with ggplot2?” And that’s how this entire hacking experiment took off.

Initial Set Up

Here is the list of libraries I used.

  • tidyverse: this includes ggplot2 and dplyr libraries
  • ggtext: ggplot extension package for customizing text
  • ggchicklet: a ggplot2 extension package for curved rectangles
  • lubridate: for date wrangling
  • kableExtra: preview data tables in this post
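
Before loading anything, it can help to check which of these packages are actually installed. The snippet below is a minimal sketch of that check; the variable names (pkgs, installed, missing_pkgs) are my own and not part of the original setup.

```r
# Packages listed above
pkgs <- c("tidyverse", "ggtext", "ggchicklet", "lubridate", "kableExtra")

# requireNamespace() returns TRUE/FALSE without attaching the package
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Names of any packages that still need installing
missing_pkgs <- pkgs[!installed]

if (length(missing_pkgs) > 0) {
  message("Not installed yet: ", paste(missing_pkgs, collapse = ", "))
}
```

Anything reported as missing needs to be installed before the corresponding library() call will work.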

All About The Data

If you’re a veteran in the data viz world, you’re already well aware that data pre-processing is usually the most time-consuming step. This assignment was no exception…

Before we get into reshaping the data, let’s first take a look at the example data:

library(tidyverse)

#fake data
df <- read.csv("example_gantt.csv") |>
  #date formatting: parse month/day/year strings into Date objects
  mutate(start = as.Date(start, format = '%m/%d/%y'),
         end = as.Date(end, format = '%m/%d/%y'))

stage    item                    start       end
Stage 1  Here's an example       2024-01-01  2024-02-01
Stage 1  Of a gantt chart        2024-01-15  2024-04-01
Stage 1  Made with ggplot2       2024-03-01  2024-04-15
Stage 2  I also used ggchicklet  2024-05-01  2024-06-15
Stage 2  For the rounded edges   2024-05-15  2024-07-01

Pretty straightforward: we have a stage name to break up the different phases, an item, which is the equivalent of a deliverable, and start and end dates to describe the duration.
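
If you don’t have the CSV on hand, the same five rows can be rebuilt directly in R. This is a sketch that assumes the file stores dates as month/day/two-digit-year strings (which is what the '%m/%d/%y' format implies); the duration_days column is a hypothetical extra added only as a sanity check.

```r
# Recreate the example rows by hand (date strings assumed to look like "01/15/24")
df <- data.frame(
  stage = c("Stage 1", "Stage 1", "Stage 1", "Stage 2", "Stage 2"),
  item  = c("Here's an example", "Of a gantt chart", "Made with ggplot2",
            "I also used ggchicklet", "For the rounded edges"),
  start = c("01/01/24", "01/15/24", "03/01/24", "05/01/24", "05/15/24"),
  end   = c("02/01/24", "04/01/24", "04/15/24", "06/15/24", "07/01/24")
)

# Same date parsing as the read.csv() step
df$start <- as.Date(df$start, format = "%m/%d/%y")
df$end   <- as.Date(df$end, format = "%m/%d/%y")

# Hypothetical helper column: length of each item in days
df$duration_days <- as.numeric(df$end - df$start)

# Every item should end after it starts
stopifnot(all(df$end > df$start))
```

Checking that every end date falls after its start date is a cheap way to catch a swapped column or a mis-parsed date before plotting.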

Starting Off Simple

In theory, we could keep this simple and use geom_segment to show durations for all items like this:

  geom_segment(mapping=aes(y=item, yend=item, x=start, xend=end))
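
For context, here is how that layer might sit inside a complete plot call. It is a minimal sketch using a trimmed-down copy of the example data (three rows, typed in by hand) and the object name p, both chosen here for illustration.

```r
library(ggplot2)

# Small hand-typed subset of the example data
df <- data.frame(
  item  = c("Here's an example", "Of a gantt chart", "Made with ggplot2"),
  start = as.Date(c("2024-01-01", "2024-01-15", "2024-03-01")),
  end   = as.Date(c("2024-02-01", "2024-04-01", "2024-04-15"))
)

# One horizontal segment per item, running from its start date to its end date
p <- ggplot(data = df) +
  geom_segment(mapping = aes(y = item, yend = item, x = start, xend = end))
```

Calling print(p) draws the chart. At this point the items on the y-axis are still sorted alphabetically, since item is a plain character column.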

We can even reorder the items on the y-axis so they run in start-date order, earliest at the top, by converting item to a factor.

df |>
  dplyr::mutate(item = fct_reorder(item, desc(start))) |>
  ggplot() +
  geom_segment(mapping = aes(y = item, yend = item, x = start, xend = end))
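
To see what the reordering does, it helps to look at the factor levels on their own. The sketch below rebuilds only the item and start columns by hand and stores the result in df_ordered, a name used here purely for illustration.

```r
library(dplyr)
library(forcats)

# Item names and start dates from the example data
df <- data.frame(
  item  = c("Here's an example", "Of a gantt chart", "Made with ggplot2",
            "I also used ggchicklet", "For the rounded edges"),
  start = as.Date(c("2024-01-01", "2024-01-15", "2024-03-01",
                    "2024-05-01", "2024-05-15"))
)

# Levels are sorted by descending start date, so the latest item comes first
df_ordered <- df |>
  mutate(item = fct_reorder(item, desc(start)))

levels(df_ordered$item)
```

ggplot2 draws the first level of a discrete y-axis at the bottom, so putting the latest start date first is what places the earliest item at the top of the chart.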


Let’s keep building. I noticed Gantt charts often have the timeline labeled at the top. Here we’ll modify the axis, theme, and scales to make it look more “ganttish”:

df |>
  dplyr::mutate(item = fct_reorder(item, desc(start))) |>
  ggplot() +
  geom_segment(mapping = aes(y = item, yend = item, x = start, xend = end),
               linewidth = 8, color = '#DB504A') +
  #position x axis top, reformat break labels
  scale_x_date(position = 'top', date_breaks = '1 month', date_labels = "%b ' %y") +
  #reformat plot labels
  labs(x = '', y = '') +
  #theme adjustments
  theme(
    panel.grid.major.y = element_blank(),
    axis.line.x = element_line(color='black'),
    axis.text.y = ggtext::element_markdown(hjust=0),
    plot.caption = element_markdown(),
    axis.ticks = element_blank(),
    text = element_text(family='Roboto Condensed')