plotnine.geoms.geom_boxplot

class plotnine.geoms.geom_boxplot(mapping=None, data=None, **kwargs)

Box and whiskers plot

Usage

geom_boxplot(mapping=None, data=None, stat='boxplot', position='dodge2',
             na_rm=False, inherit_aes=True, show_legend=None, varwidth=False,
             notchwidth=0.5, width=None, outlier_size=1.5, outlier_alpha=1,
             fatten=2, outlier_stroke=0.5, outlier_color=None,
             outlier_shape='o', notch=False, **kwargs)

Only mapping and data can be positional; the rest must be keyword arguments. **kwargs can be aesthetics (or parameters) used by the stat.

Parameters
mapping : aes, optional

Aesthetic mappings created with aes(). If specified and inherit_aes=True, it is combined with the default mapping for the plot. You must supply mapping if there is no plot mapping.

Aesthetic    Default value
lower *
middle *
upper *
x *
ymax *
ymin *
alpha        1
color        '#333333'
fill         'white'
group
linetype     'solid'
shape        'o'
size         0.5
weight       1

Aesthetics marked with * are required.
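To make the required aesthetics more concrete, here is a minimal pure-Python sketch of the five values a boxplot needs. It assumes the conventional rule: quartiles for the box, and whiskers reaching the most extreme observations within 1.5 × IQR of the box. The function name boxplot_summary and the coef argument are illustrative, and the exact quantile method used by stat_boxplot may differ.

```python
import statistics


def boxplot_summary(values, coef=1.5):
    """Return box and whisker positions for one group (illustrative sketch)."""
    data = sorted(values)
    # Quartiles with linear interpolation over the observed range
    lower, middle, upper = statistics.quantiles(data, n=4, method="inclusive")
    iqr = upper - lower
    # Points beyond these fences are treated as outliers
    lo_fence = lower - coef * iqr
    hi_fence = upper + coef * iqr
    inside = [v for v in data if lo_fence <= v <= hi_fence]
    outliers = [v for v in data if v < lo_fence or v > hi_fence]
    return {
        "ymin": min(inside),   # end of the lower whisker
        "lower": lower,        # bottom of the box (first quartile)
        "middle": middle,      # median bar
        "upper": upper,        # top of the box (third quartile)
        "ymax": max(inside),   # end of the upper whisker
        "outliers": outliers,  # drawn as individual points
    }


summary = boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(summary)
```

With the default stat these values are computed for you; supplying them directly is only needed when a different stat is used.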

data : dataframe, optional

The data to be displayed in this layer. If None, the data from the ggplot() call is used. If specified, it overrides the data from the ggplot() call.

stat : str or stat, optional (default: stat_boxplot)

The statistical transformation to use on the data for this layer. If it is a string, it must be registered and known to Plotnine.

position : str or position, optional (default: position_dodge2)

Position adjustment. If it is a string, it must be registered and known to Plotnine.

na_rm : bool, optional (default: False)

If False, removes missing values with a warning. If True, silently removes missing values.

inherit_aes : bool, optional (default: True)

If False, overrides the default aesthetics.

show_legend : bool or dict, optional (default: None)

Whether this layer should be included in the legends. None, the default, includes any aesthetics that are mapped. If a bool, False never includes the layer and True always includes it. A dict can be used to exclude specific aesthetics of the layer from the legend, e.g. show_legend={'color': False}; all other aesthetics are included by default.

width : float, optional (default: None)

Box width. If None, the width is set to 90% of the resolution of the data. Note that if the stat has a width parameter, that takes precedence over this one.
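The default width rule can be sketched as follows. This assumes "resolution" means the smallest gap between distinct x positions; the helper names resolution and default_box_width are illustrative, and the library's own helper may handle special cases (such as integer data) differently.

```python
def resolution(values):
    """Smallest gap between distinct values (1 if fewer than two distinct values)."""
    unique = sorted(set(values))
    if len(unique) < 2:
        return 1
    return min(b - a for a, b in zip(unique, unique[1:]))


def default_box_width(x_positions, fraction=0.9):
    """Box width used when width=None: a fraction of the data resolution."""
    return fraction * resolution(x_positions)


# Discrete categories sit at 1, 2, 3, ... so each box is 0.9 units wide
print(default_box_width([1, 2, 3, 4]))
# Unevenly spaced positions: the closest pair (0 and 0.5) sets the width
print(default_box_width([0, 0.5, 2]))
```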

outlier_alpha : float, optional (default: 1)

Transparency of the outlier points.

outlier_color : str or tuple, optional (default: None)

Color of the outlier points.

outlier_shape : str, optional (default: 'o')

Shape of the outlier points. An empty string hides the outliers.

outlier_size : float, optional (default: 1.5)

Size of the outlier points.

outlier_stroke : float, optional (default: 0.5)

Stroke-size of the outlier points.

notch : bool, optional (default: False)

Whether the boxes should have a notch.

varwidth : bool, optional (default: False)

If True, boxes are drawn with widths proportional to the square roots of the number of observations in the groups.
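The square-root scaling can be sketched in a few lines. This assumes the widths are normalized so that the largest group gets the full box width; the function name relative_box_widths and the max_width argument are illustrative, not part of the library.

```python
import math


def relative_box_widths(group_sizes, max_width=0.9):
    """Scale each group's box width by sqrt(n), relative to the largest group."""
    roots = {group: math.sqrt(n) for group, n in group_sizes.items()}
    largest = max(roots.values())
    return {group: max_width * root / largest for group, root in roots.items()}


# A group with 4x fewer observations gets a box half as wide
widths = relative_box_widths({"a": 100, "b": 25, "c": 4})
print(widths)
```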

notchwidth : float, optional (default: 0.5)

Width of notch relative to the body width.

fatten : float, optional (default: 2)

A multiplicative factor used to increase the size of the middle bar across the box.

Examples

[1]:
import pandas as pd
import numpy as np

from plotnine import *

%matplotlib inline

A box and whiskers plot

The boxplot compactly displays the distribution of a continuous variable.

Read more: wikipedia, ggplot2 docs

[2]:
flights = pd.read_csv('data/flights.csv')
flights.head()
[2]:
year month passengers
0 1949 January 112
1 1949 February 118
2 1949 March 132
3 1949 April 129
4 1949 May 121

Basic boxplot

[3]:
months = [month[:3] for month in flights.month[:12]]
print(months)
['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
[4]:
(
    ggplot(flights)
    + geom_boxplot(aes(x='factor(month)', y='passengers'))
    + scale_x_discrete(
        limits=list(flights.month[:12]),  # keep calendar order on the x-axis
        labels=months,  # abbreviated tick labels
        name='month',
    )
)
../_images/geom_boxplot_5_0.png
[4]:
<ggplot: (-9223363299599545919)>

Horizontal boxplot

[5]:
(
    ggplot(flights)
    + geom_boxplot(aes(x='factor(month)', y='passengers'))
    + coord_flip()
    + scale_x_discrete(
        labels=months[::-1],
        limits=flights.month[11::-1],  # December down to January
        name='month',
    )
)
../_images/geom_boxplot_7_0.png
[5]:
<ggplot: (-9223363299601758660)>

Boxplot with jittered points:

[6]:
(
    ggplot(flights, aes(x='factor(month)', y='passengers'))
    + geom_boxplot()
    + geom_jitter()
    + scale_x_discrete(
        limits=list(flights.month[:12]),  # keep calendar order on the x-axis
        labels=months,  # abbreviated tick labels
        name='month',
    )
)
../_images/geom_boxplot_9_0.png
[6]:
<ggplot: (8737252966969)>