plotnine.geoms.geom_boxplot
- class plotnine.geoms.geom_boxplot(mapping: Aes | None = None, data: DataLike | None = None, **kwargs: Any)
Box and whiskers plot
Usage
geom_boxplot(mapping=None, data=None, stat='boxplot', position='dodge2', na_rm=False, inherit_aes=True, show_legend=None, raster=False, outlier_alpha=1, varwidth=False, outlier_size=1.5, outlier_color=None, outlier_shape='o', notch=False, fatten=2, outlier_stroke=0.5, width=None, notchwidth=0.5, **kwargs)
Only the data and mapping can be positional, the rest must be keyword arguments. **kwargs can be aesthetics (or parameters) used by the stat.
- Parameters:
- mapping (aes, optional)
  Aesthetic mappings created with aes(). If specified and inherit_aes=True, it is combined with the default mapping for the plot. You must supply mapping if there is no plot mapping.

Aesthetic | Default value
---|---
lower |
middle |
upper |
x |
ymax |
ymin |
alpha | 1
color | '#333333'
fill | 'white'
group |
linetype | 'solid'
shape | 'o'
size | 0.5
weight | 1

The aesthetics lower, middle, upper, x, ymax and ymin are required.
- data (dataframe, optional)
  The data to be displayed in this layer. If None, the data from the ggplot() call is used. If specified, it overrides the data from the ggplot() call.
- stat (str or stat, optional; default: stat_boxplot)
  The statistical transformation to use on the data for this layer. If it is a string, it must be registered and known to plotnine.
- position (str or position, optional; default: position_dodge2)
  Position adjustment. If it is a string, it must be registered and known to plotnine.
- na_rm (bool, optional; default: False)
  If False, removes missing values with a warning. If True, silently removes missing values.
- inherit_aes (bool, optional; default: True)
  If False, this layer's mapping overrides the plot's default aesthetics instead of being combined with them.
- show_legend (bool or dict, optional; default: None)
  Whether this layer should be included in the legends. None, the default, includes any aesthetics that are mapped. If a bool, False never includes and True always includes. A dict can be used to exclude specific aesthetics of the layer from showing in the legend, e.g. show_legend={'color': False}; any other aesthetics are included by default.
- raster (bool, optional; default: False)
  If True, draw this layer as a raster (bitmap) object even if the final image is in a vector format.
- width (float, optional; default: None)
  Box width. If None, the width is set to 90% of the resolution of the data. Note that if the stat has a width parameter, that takes precedence over this one.
- outlier_alpha (float, optional; default: 1)
  Transparency of the outlier points.
- outlier_color (str or tuple, optional; default: None)
  Color of the outlier points.
- outlier_shape (str, optional; default: 'o')
  Shape of the outlier points. An empty string hides the outliers.
- outlier_size (float, optional; default: 1.5)
  Size of the outlier points.
- outlier_stroke (float, optional; default: 0.5)
  Stroke size of the outlier points.
- notch (bool, optional; default: False)
  Whether the boxes should have a notch.
- varwidth (bool, optional; default: False)
  If True, boxes are drawn with widths proportional to the square roots of the number of observations in the groups.
- notchwidth (float, optional; default: 0.5)
  Width of the notch relative to the body width.
- fatten (float, optional; default: 2)
  A multiplicative factor used to increase the size of the middle bar across the box.
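The default width and the varwidth scaling described above can be sketched in plain Python. This illustrates the arithmetic only and is not plotnine's implementation: the helper names `resolution` and `box_widths` are hypothetical, and normalising by the largest group (so that it keeps the full width) is an assumption about how the proportionality is applied.

```python
from math import sqrt


def resolution(values):
    """Smallest gap between adjacent distinct values (1 if there is only one)."""
    distinct = sorted(set(values))
    if len(distinct) < 2:
        return 1
    return min(b - a for a, b in zip(distinct, distinct[1:]))


def box_widths(x, counts, width=None, varwidth=False):
    """Return one box width per group.

    x        : numeric position of each group on the x-axis
    counts   : number of observations in each group
    width    : explicit box width, or None for 90% of the resolution of x
    varwidth : scale widths by sqrt(count), relative to the largest group
    """
    if width is None:
        width = 0.9 * resolution(x)
    if not varwidth:
        return [width for _ in counts]
    largest = sqrt(max(counts))
    return [width * sqrt(n) / largest for n in counts]


# Three groups at x = 1, 2, 3 with hypothetical sizes 100, 25 and 4.
fixed = box_widths([1, 2, 3], [100, 25, 4])
scaled = box_widths([1, 2, 3], [100, 25, 4], varwidth=True)
```

Under these assumptions every box in `fixed` is 0.9 wide, while the boxes in `scaled` shrink to roughly 0.9, 0.45 and 0.18, following the square roots 10, 5 and 2 of the group sizes.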
Examples
[1]:
import pandas as pd
import numpy as np
from plotnine import (
    ggplot,
    aes,
    geom_boxplot,
    geom_jitter,
    scale_x_discrete,
    coord_flip,
)
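A box and whiskers plot
A boxplot compactly displays the distribution of a continuous variable: the box spans the lower and upper quartiles, a bar marks the median, whiskers extend toward the extremes, and outliers are drawn as individual points.
Read more: wikipedia, ggplot2 docs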
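To make the summary concrete, the statistics a box represents can be sketched with the standard library. This only approximates what a boxplot stat computes, under stated assumptions: linear-interpolation quartiles, whiskers reaching the most extreme points within 1.5 times the interquartile range of the box (the conventional Tukey rule), and a made-up sample. The function name `box_stats` is hypothetical; the returned keys mirror the aesthetics `ymin`, `lower`, `middle`, `upper` and `ymax` listed in the parameter documentation.

```python
from statistics import quantiles


def box_stats(values, coef=1.5):
    """Five-number summary plus outliers for one group of observations."""
    data = sorted(values)
    # Quartiles by linear interpolation between order statistics.
    lower, middle, upper = quantiles(data, n=4, method="inclusive")
    iqr = upper - lower
    low_fence = lower - coef * iqr
    high_fence = upper + coef * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "ymin": min(inside),    # end of the lower whisker
        "lower": lower,         # bottom of the box
        "middle": middle,       # bar across the box
        "upper": upper,         # top of the box
        "ymax": max(inside),    # end of the upper whisker
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Made-up sample with one extreme value.
summary = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
```

For this sample the box spans 3.25 to 7.75 with the middle bar at 5.5, the whiskers end at 1 and 9, and 30 is reported as an outlier.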
[2]:
flights = pd.read_csv('data/flights.csv')
flights.head()
[2]:
 | year | month | passengers
---|---|---|---
0 | 1949 | January | 112 |
1 | 1949 | February | 118 |
2 | 1949 | March | 132 |
3 | 1949 | April | 129 |
4 | 1949 | May | 121 |
Basic boxplot
[3]:
months = [month[:3] for month in flights.month[:12]]
print(months)
['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
[4]:
(
    ggplot(flights)
    + geom_boxplot(aes(x='factor(month)', y='passengers'))
    + scale_x_discrete(labels=months, name='month')  # change the tick labels on the x-axis
)

[4]:
<Figure Size: (640 x 480)>
Horizontal boxplot
[5]:
(
    ggplot(flights)
    + geom_boxplot(aes(x='factor(month)', y='passengers'))
    + coord_flip()
    + scale_x_discrete(
        labels=months[::-1],
        # 12 limits (December down to January) to match the 12 labels;
        # a slice starting at index 12 would yield 13 entries and fail
        limits=flights.month[11::-1],
        name='month',
    )
)
Boxplot with jittered points:
[6]:
(
    ggplot(flights, aes(x='factor(month)', y='passengers'))
    + geom_boxplot()
    + geom_jitter()
    + scale_x_discrete(labels=months, name='month')  # change the tick labels on the x-axis
)

[6]:
<Figure Size: (640 x 480)>