Show counts and percentages for bar plots

In [1]:
import pandas as pd
from plotnine import *
from plotnine.data import mtcars

%matplotlib inline

We can plot a bar graph and easily label each bar with its count.

In [8]:
(ggplot(mtcars, aes('factor(cyl)', fill='factor(cyl)'))
 + geom_bar()
 + geom_text(
     aes(label='stat(count)'),
     stat='count',
     nudge_y=0.125,
     va='bottom'
 )
)
../_images/tutorials_miscellaneous-show-counts-and-percentages-for-bar-plots_3_0.png
Out[8]:
<ggplot: (97654321012345679)>

stat_count also calculates proportions (as prop), and multiplying a proportion by 100 converts it to a percentage.

In [10]:
(ggplot(mtcars, aes('factor(cyl)', fill='factor(cyl)'))
 + geom_bar()
 + geom_text(
     aes(label='stat(prop)*100'),
     stat='count',
     nudge_y=0.125,
     va='bottom',
     format_string='{:.1f}%'
 )
)
../_images/tutorials_miscellaneous-show-counts-and-percentages-for-bar-plots_5_0.png
Out[10]:
<ggplot: (97654321012345679)>

These percentages are clearly wrong: every bar is labelled 100%. By default, each bar is placed in its own group, so each proportion is computed relative to that bar alone. We need to put all the bars in a panel into a single group so that the percentages are what we expect.

In [4]:
(ggplot(mtcars, aes('factor(cyl)', fill='factor(cyl)'))
 + geom_bar()
 + geom_text(
     aes(label='stat(prop)*100', group=1),
     stat='count',
     nudge_y=0.125,
     va='bottom',
     format_string='{:.1f}%'
 )
)
../_images/tutorials_miscellaneous-show-counts-and-percentages-for-bar-plots_7_0.png
Out[4]:
<ggplot: (97654321012345679)>
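
To see why the grouping matters, here is a minimal plain-Python sketch of the calculation. It assumes that prop is a bar's count divided by the total count of the group the bar belongs to. The helper `proportions` and the hand-typed `cyl` list are illustrative only and are not part of plotnine.

```python
from collections import Counter


def proportions(values, groups):
    """Share of each value within its group (hypothetical helper, not plotnine API).

    Assumes every distinct value belongs to exactly one group.
    """
    pair_counts = Counter(zip(groups, values))
    group_sizes = Counter(groups)
    return {
        value: n / group_sizes[group]
        for (group, value), n in pair_counts.items()
    }


# Cylinder counts in mtcars: 11 four-, 7 six- and 14 eight-cylinder cars.
cyl = [4] * 11 + [6] * 7 + [8] * 14

# One group per bar: each bar is 100% of its own group.
per_bar = proportions(cyl, groups=cyl)

# A single group (the effect of group=1): the proportions sum to 1.
single = proportions(cyl, groups=[1] * len(cyl))

print({k: '{:.1f}%'.format(100 * v) for k, v in per_bar.items()})
# {4: '100.0%', 6: '100.0%', 8: '100.0%'}
print({k: '{:.1f}%'.format(100 * v) for k, v in single.items()})
# {4: '34.4%', 6: '21.9%', 8: '43.8%'}
```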

For more on why automatic grouping may not work the way you want, see this tutorial.

We can get the counts and we can get the percentages, but we need to print both. We can do that in two ways:

  1. Using two geom_text layers.
In [5]:
(ggplot(mtcars, aes('factor(cyl)', fill='factor(cyl)'))
 + geom_bar()
 + geom_text(
     aes(label='stat(count)'),
     stat='count',
     nudge_x=-0.14,
     nudge_y=0.125,
     va='bottom'
 )
 + geom_text(
     aes(label='stat(prop)*100', group=1),
     stat='count',
     nudge_x=0.14,
     nudge_y=0.125,
     va='bottom',
     format_string='({:.1f}%)'
 )
)
../_images/tutorials_miscellaneous-show-counts-and-percentages-for-bar-plots_9_0.png
Out[5]:
<ggplot: (97654321012345679)>
  2. Using a function to combine the counts and percentages.
In [6]:
def combine(counts, percentages):
    fmt = '{} ({:.1f}%)'.format
    return [fmt(c, p) for c, p in zip(counts, percentages)]


(ggplot(mtcars, aes('factor(cyl)', fill='factor(cyl)'))
 + geom_bar()
 + geom_text(
     aes(label='stat(combine(count, 100*prop))', group=1),
     stat='count',
     nudge_y=0.125,
     va='bottom'
 )
)
../_images/tutorials_miscellaneous-show-counts-and-percentages-for-bar-plots_11_0.png
Out[6]:
<ggplot: (97654321012345679)>
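
As a quick check of what the labels look like, `combine` can be called directly on plain lists, outside of any plot. The definition is repeated so the snippet runs on its own. The counts are the mtcars cylinder counts (11, 7, 14), and the percentages are computed by hand from them rather than taken from plotnine.

```python
def combine(counts, percentages):
    # Same formatter as in the cell above: "count (percentage%)".
    fmt = '{} ({:.1f}%)'.format
    return [fmt(c, p) for c, p in zip(counts, percentages)]


# Hand-typed counts of 4-, 6- and 8-cylinder cars in mtcars.
counts = [11, 7, 14]
total = sum(counts)
percentages = [100 * c / total for c in counts]

labels = combine(counts, percentages)
print(labels)
# ['11 (34.4%)', '7 (21.9%)', '14 (43.8%)']
```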

It works with faceting, where the percentages are computed within each panel.

In [7]:
(ggplot(mtcars, aes('factor(cyl)', fill='factor(cyl)'))
 + geom_bar()
 + geom_text(
     aes(label='stat(combine(count, 100*prop))', group=1),
     stat='count',
     nudge_y=0.125,
     va='bottom',
     size=9
 )
 + facet_wrap('am')
)
../_images/tutorials_miscellaneous-show-counts-and-percentages-for-bar-plots_13_0.png
Out[7]:
<ggplot: (97654321012345679)>
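
The per-panel labels can be reproduced by hand as a sanity check. This is a sketch that assumes proportions are taken within each panel, with all bars of a panel in one group. The `panel_counts` table is typed in from the mtcars data (am by cyl), and `panel_labels` is a hypothetical helper, not a plotnine function.

```python
# Number of cars per (am, cyl) combination in mtcars, typed in by hand.
panel_counts = {
    0: {4: 3, 6: 4, 8: 12},  # am = 0, 19 cars
    1: {4: 8, 6: 3, 8: 2},   # am = 1, 13 cars
}


def panel_labels(counts):
    """Format 'count (percent%)' labels using the panel total as denominator."""
    total = sum(counts.values())
    return {
        cyl: '{} ({:.1f}%)'.format(n, 100 * n / total)
        for cyl, n in counts.items()
    }


labels = {am: panel_labels(counts) for am, counts in panel_counts.items()}

print(labels[0])
# {4: '3 (15.8%)', 6: '4 (21.1%)', 8: '12 (63.2%)'}
print(labels[1])
# {4: '8 (61.5%)', 6: '3 (23.1%)', 8: '2 (15.4%)'}
```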

Credit: This example was motivated by the GitHub user Fandekasp (Adrien Lemaire) and the difficulty he faced in displaying percentages on bar plots.