Show counts and percentages for bar plots

[1]:
import pandas as pd
from plotnine import *
from plotnine.data import mtcars

%matplotlib inline

We can plot a bar graph and easily show the count for each bar.

[8]:
(ggplot(mtcars, aes('factor(cyl)', fill='factor(cyl)'))
 + geom_bar()
 + geom_text(
     aes(label='stat(count)'),
     stat='count',
     nudge_y=0.125,
     va='bottom'
 )
)
../_images/tutorials_miscellaneous-show-counts-and-percentages-for-bar-plots_3_0.png
[8]:
<ggplot: (97654321012345679)>

stat_count also calculates proportions (as prop) and a proportion can be converted to a percentage.

[10]:
(ggplot(mtcars, aes('factor(cyl)', fill='factor(cyl)'))
 + geom_bar()
 + geom_text(
     aes(label='stat(prop)*100'),
     stat='count',
     nudge_y=0.125,
     va='bottom',
     format_string='{:.1f}% '
 )
)
../_images/tutorials_miscellaneous-show-counts-and-percentages-for-bar-plots_5_0.png
[10]:
<ggplot: (97654321012345679)>

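These percentages are clearly wrong: every bar is labelled 100%. By default, the system puts each bar in a separate group, so each proportion is computed relative to that bar alone. We need to tell it to put all the bars in a panel into a single group, so that the percentages are what we expect.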

[4]:
(ggplot(mtcars, aes('factor(cyl)', fill='factor(cyl)'))
 + geom_bar()
 + geom_text(
     aes(label='stat(prop)*100', group=1),
     stat='count',
     nudge_y=0.125,
     va='bottom',
     format_string='{:.1f}%'
 )
)
../_images/tutorials_miscellaneous-show-counts-and-percentages-for-bar-plots_7_0.png
[4]:
<ggplot: (97654321012345679)>

For more on why automatic grouping may not work the way you want, see this tutorial.
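The effect of grouping on prop can be illustrated without any plotting. The following is a minimal sketch in plain Python, not plotnine's actual implementation: the helper `proportions` is hypothetical, and the counts (11, 7 and 14 cars with 4, 6 and 8 cylinders) are the cylinder counts of the 32 rows in mtcars. It assumes only that a proportion is a count divided by the total of the group it belongs to.

```python
# Cylinder counts in mtcars (32 rows in total).
counts = {"4": 11, "6": 7, "8": 14}


def proportions(groups):
    """Hypothetical helper: divide each count by the total of its group."""
    result = {}
    for members in groups:
        total = sum(counts[x] for x in members)
        for x in members:
            result[x] = counts[x] / total
    return result


# Default grouping: every bar is its own group, so every proportion is 1.0.
separate = proportions([["4"], ["6"], ["8"]])

# group=1: all bars share one group, so proportions are shares of the total.
single = proportions([["4", "6", "8"]])

print({x: "{:.1f}%".format(100 * p) for x, p in separate.items()})
print({x: "{:.1f}%".format(100 * p) for x, p in single.items()})
```

With separate groups every label comes out as 100.0%, which matches the wrong plot; with a single group the labels are shares of all 32 cars.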

We can get the counts and we can get the percentages, but we need to print both. We can do that in two ways:

  1. Using two geom_text layers.

[5]:
(ggplot(mtcars, aes('factor(cyl)', fill='factor(cyl)'))
 + geom_bar()
 + geom_text(
     aes(label='stat(count)'),
     stat='count',
     nudge_x=-0.14,
     nudge_y=0.125,
     va='bottom'
 )
 + geom_text(
     aes(label='stat(prop)*100', group=1),
     stat='count',
     nudge_x=0.14,
     nudge_y=0.125,
     va='bottom',
     format_string='({:.1f}%)'
 )
)
../_images/tutorials_miscellaneous-show-counts-and-percentages-for-bar-plots_9_0.png
[5]:
<ggplot: (97654321012345679)>

  2. Using a function to combine the counts and percentages.

[6]:
def combine(counts, percentages):
    fmt = '{} ({:.1f}%)'.format
    return [fmt(c, p) for c, p in zip(counts, percentages)]


(ggplot(mtcars, aes('factor(cyl)', fill='factor(cyl)'))
 + geom_bar()
 + geom_text(
     aes(label='stat(combine(count, 100*prop))', group=1),
     stat='count',
     nudge_y=0.125,
     va='bottom'
 )
)
../_images/tutorials_miscellaneous-show-counts-and-percentages-for-bar-plots_11_0.png
[6]:
<ggplot: (97654321012345679)>
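Since combine is an ordinary Python function, it can be checked on its own before it is used inside an aesthetic. A small sketch, using the mtcars cylinder counts (11, 7, 14) and their shares of the 32 rows as input; the variable names other than combine are illustrative only.

```python
def combine(counts, percentages):
    # Same function as in the cell above: one "count (percent%)" label per bar.
    fmt = '{} ({:.1f}%)'.format
    return [fmt(c, p) for c, p in zip(counts, percentages)]


counts = [11, 7, 14]                        # cars with 4, 6 and 8 cylinders
total = sum(counts)                         # 32 rows in mtcars
percentages = [100 * c / total for c in counts]

labels = combine(counts, percentages)
print(labels)
```

The function receives whole columns and returns a list with one label per bar, which is why it can be called inside the label expression.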

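This also works with faceting. Because group=1 puts all the bars of a panel in one group, the percentages are computed within each panel.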

[7]:
(ggplot(mtcars, aes('factor(cyl)', fill='factor(cyl)'))
 + geom_bar()
 + geom_text(
     aes(label='stat(combine(count, 100*prop))', group=1),
     stat='count',
     nudge_y=0.125,
     va='bottom',
     size=9
 )
 + facet_wrap('am')
)
../_images/tutorials_miscellaneous-show-counts-and-percentages-for-bar-plots_13_0.png
[7]:
<ggplot: (97654321012345679)>
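The per-panel behaviour can be sketched in plain Python as well. This is only an illustration of the arithmetic, assuming that proportions are computed separately for each panel; `panel_counts` and `panel_labels` are hypothetical names, and the counts are the cross-tabulation of am and cyl in mtcars.

```python
def combine(counts, percentages):
    # Same label function as above.
    fmt = '{} ({:.1f}%)'.format
    return [fmt(c, p) for c, p in zip(counts, percentages)]


# Counts of cars by transmission (am) and number of cylinders (cyl) in mtcars.
panel_counts = {
    0: {4: 3, 6: 4, 8: 12},   # am = 0, 19 cars
    1: {4: 8, 6: 3, 8: 2},    # am = 1, 13 cars
}

panel_labels = {}
for am, by_cyl in panel_counts.items():
    counts = list(by_cyl.values())
    panel_total = sum(counts)  # the denominator is the panel, not all 32 rows
    percentages = [100 * c / panel_total for c in counts]
    panel_labels[am] = combine(counts, percentages)

for am, labels in panel_labels.items():
    print(am, labels)
```

Each panel's percentages add up to 100 on their own, so a bar's label describes its share of the cars in that facet rather than of the whole dataset.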

Credit: This example was motivated by the GitHub user Fandekasp (Adrien Lemaire) and the difficulty he faced in displaying percentages on bar plots.