library(tidyverse)
library(ggchicklet)
library(ggtext)
library(lubridate)
library(kableExtra)
A Gantt Chart, Really?
I’m constantly pushing the limits of ggplot2 in unconventional ways, but every now and then there’s a practical use case that also calls for some level of ggplot bending.
This week I was working on a slide deck for a client’s business plan, and like any good business plan, it needed a timeline. Cue every project manager’s favorite visual: the Gantt chart. It’s familiar and easy to read, outlining deliverables, milestones, and overlapping time spans.
After 10 minutes of Googling, I wasn’t impressed with the freemium tools I came across. That’s when I thought “well wait a second, couldn’t I build one with ggplot2?” And that’s how this entire hacking experiment took off.
Initial Set Up
Here is the list of libraries I used.
tidyverse: this includes the ggplot2 and dplyr libraries
ggtext: a ggplot2 extension package for customizing text
ggchicklet: a ggplot2 extension package for curved rectangles
lubridate: for date wrangling
kableExtra: to preview data tables in this post
All About The Data
If you’re a veteran in the data viz world, you already know that data pre-processing is usually the most time-consuming step. This assignment was no exception…
Before we get into reshaping the data, let’s first take a look at the example data:
#fake data
df <- read.csv("example_gantt.csv")|>
  #date formatting
  mutate(start = as.Date(start, format='%m/%d/%y'),
         end = as.Date(end, format = '%m/%d/%y'))

kableExtra::kable(df)
stage | item | start | end |
---|---|---|---|
Stage 1 | Here's an example | 2024-01-01 | 2024-02-01 |
Stage 1 | Of a gantt chart | 2024-01-15 | 2024-04-01 |
Stage 1 | Made with ggplot2 | 2024-03-01 | 2024-04-15 |
Stage 2 | I also used ggchicklet | 2024-05-01 | 2024-06-15 |
Stage 2 | For the rounded edges | 2024-05-15 | 2024-07-01 |
Pretty straightforward: we have a stage name to separate the different phases, an item, which is the equivalent of a deliverable, and start and end dates that define the duration.
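If you don't have example_gantt.csv on hand, here's a minimal sketch that rebuilds the same five rows in base R. The values come from the table above, but the raw date strings (like "01/01/24") are my assumption about how the CSV stores dates, chosen to match the '%m/%d/%y' format used in the parsing step.

```r
# Stand-in for example_gantt.csv.
# Assumption: the file stores dates as month/day/2-digit-year strings.
df <- data.frame(
  stage = c("Stage 1", "Stage 1", "Stage 1", "Stage 2", "Stage 2"),
  item  = c("Here's an example", "Of a gantt chart", "Made with ggplot2",
            "I also used ggchicklet", "For the rounded edges"),
  start = c("01/01/24", "01/15/24", "03/01/24", "05/01/24", "05/15/24"),
  end   = c("02/01/24", "04/01/24", "04/15/24", "06/15/24", "07/01/24")
)

# Same date parsing as in the post, written in base R so no packages are needed
df$start <- as.Date(df$start, format = "%m/%d/%y")
df$end   <- as.Date(df$end,   format = "%m/%d/%y")

# Quick check that both date columns are now Date class
str(df)
```

Naming the result df means the plotting code in the rest of the post should run against it unchanged.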
Starting off simple
In theory, we could keep this simple and use geom_segment to show durations for all items like this:
df|>
  ggplot()+
  geom_segment(mapping=aes(y=item, yend=item, x=start, xend=end))
We can even reorder the items on the y axis so they appear in ascending order of start date by converting item to a factor variable.
df|>
  dplyr::mutate(item = fct_reorder(item, desc(start)))|>
  ggplot()+
  geom_segment(mapping=aes(y=item, yend=item, x=start, xend=end))
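To see why desc(start) is the right ordering here, it helps to look at the factor levels directly. This is a small sketch on three of the example rows; the object names items and reordered are just placeholders for illustration. ggplot2 draws the first factor level at the bottom of a discrete y axis, so putting the latest start date first leaves the earliest item at the top.

```r
library(dplyr)
library(forcats)

# Three rows from the example data, dates already parsed
items <- data.frame(
  item  = c("Here's an example", "Of a gantt chart", "Made with ggplot2"),
  start = as.Date(c("2024-01-01", "2024-01-15", "2024-03-01"))
)

# Order the factor levels by descending start date
reordered <- items |>
  mutate(item = fct_reorder(item, desc(start)))

# The first level is the latest start, which ggplot2 places at the bottom
levels(reordered$item)
```

Swapping desc(start) for start would flip the chart so the earliest item sits at the bottom instead.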
Styling
Let’s keep building. I noticed Gantt charts often have the timeline labeled at the top. Here we’ll modify the axis, theme, and scales to make it look more “ganttish”:
df|>
  dplyr::mutate(item = fct_reorder(item, desc(start)))|>
  ggplot()+
  #timelines
  geom_segment(mapping=aes(y=item, yend=item, x=start, xend=end),
               linewidth=8, color='#DB504A')+
  #position x axis top, reformat break labels
  scale_x_date(position='top', date_breaks='1 month', date_labels="%b ' %y")+
  #reformat plot labels
  labs(x='', y='')+
  #theme
  theme_minimal()+
  theme(
    panel.grid.major.y = element_blank(),
    axis.line.x = element_line(color='black'),
    axis.text.y = ggtext::element_markdown(hjust=0),
    plot.caption = element_markdown(),
    axis.ticks = element_blank(),
    text = element_text(family='Roboto Condensed')
  )
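If you want to preview what the date_labels pattern produces before plotting, a rough sketch in base R is to format a monthly sequence yourself. This only approximates what scale_x_date does, since the exact break positions are chosen by ggplot2, and the date range below is my assumption based on the example data. Month abbreviations from %b also depend on your locale.

```r
# Monthly dates spanning the example data (range assumed from the table)
month_breaks <- seq(as.Date("2024-01-01"), as.Date("2024-07-01"), by = "1 month")

# Same format string passed to date_labels:
# abbreviated month, an apostrophe, then a 2-digit year
month_labels <- format(month_breaks, "%b ' %y")

month_labels
```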