plotnine.stats.stat_smooth

class plotnine.stats.stat_smooth(*args, **kwargs)

Calculate a smoothed conditional mean

Usage

stat_smooth(mapping=None, data=None, geom='smooth', position='identity',
            na_rm=False, fullrange=False, n=80, span=0.75, method='auto',
            method_args={}, level=0.95, se=True, **kwargs)

Only mapping and data can be positional; the rest must be keyword arguments. **kwargs can be aesthetics (or parameters) used by the geom.

Parameters:
mapping : aes, optional

Aesthetic mappings created with aes(). If specified and inherit_aes=True, it is combined with the default mapping for the plot. You must supply mapping if there is no plot mapping.

Aesthetic    Default value
x            (none)
y            (none)

Both aesthetics, x and y, are required.

Options for computed aesthetics

'se'    # Standard error of points in bin
'ymin'  # Lower confidence limit
'ymax'  # Upper confidence limit

Calculated aesthetics are accessed using the stat function, e.g. 'stat(se)'.

data : dataframe, optional

The data to be displayed in this layer. If None, the data from the ggplot() call is used. If specified, it overrides the data from the ggplot() call.

geom : str or geom, optional (default: smooth)

The geom used to display the results of the statistical transformation for this layer. If it is a string, it must be registered and known to Plotnine.

position : str or position, optional (default: identity)

Position adjustment. If it is a string, it must be registered and known to Plotnine.

na_rm : bool, optional (default: False)

If False, removes missing values with a warning. If True, silently removes missing values.

method : str or callable, optional (default: 'auto')

The available methods are:

'auto'       # Use loess if (n<1000), glm otherwise
'lm', 'ols'  # Linear Model
'wls'        # Weighted Linear Model
'rlm'        # Robust Linear Model
'glm'        # Generalized linear Model
'gls'        # Generalized Least Squares
'lowess'     # Locally Weighted Regression (simple)
'loess'      # Locally Weighted Regression
'mavg'       # Moving Average
'gpr'        # Gaussian Process Regressor
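
The 'auto' rule in the list above can be sketched as a one-line decision. The helper name resolve_auto_method is hypothetical, and the sketch assumes that n in the rule refers to the number of observations, not to the n parameter of the stat.

```python
def resolve_auto_method(n_rows):
    """Mirror the documented 'auto' rule (hypothetical helper, not plotnine API)."""
    # Fewer than 1000 observations: loess; otherwise: glm
    return 'loess' if n_rows < 1000 else 'glm'
```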

If a callable is passed, it must have the signature:

import pandas as pd

def my_smoother(data, xseq, **params):
    # * data - has the x and y values for the model
    # * xseq - x values to be predicted
    # * params - stat parameters
    #
    # It must return a new dataframe. Below is the
    # template used internally by Plotnine

    # Input data into the model
    x, y = data['x'], data['y']

    # Create and fit a model
    model = Model(x, y)
    results = model.fit()

    # Create output data by getting predictions on
    # the xseq values
    data = pd.DataFrame({
        'x': xseq,
        'y': results.predict(xseq)})

    # Compute confidence intervals, this depends on
    # the model. However, given standard errors and the
    # degrees of freedom we can compute the confidence
    # intervals using the t-distribution.
    #
    # For an alternative, implement confidence intervals by
    # the bootstrap method
    if params['se']:
        from plotnine.stats.smoothers import tdist_ci
        y = data['y']            # The predicted value
        df = 123                 # Degrees of freedom
        stderr = results.stderr  # Standard error
        level = params['level']  # The parameter value
        low, high = tdist_ci(y, df, stderr, level)
        data['se'] = stderr
        data['ymin'] = low
        data['ymax'] = high

    return data
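
For a concrete instance of the template, the sketch below fits a straight line with NumPy and returns the predictions in the expected dataframe shape. The name linear_smoother is hypothetical, np.polyfit stands in for the generic Model, and the sketch deliberately skips the confidence-interval branch, so it returns only the x and y columns.

```python
import numpy as np
import pandas as pd


def linear_smoother(data, xseq, **params):
    """Straight-line smoother following the documented callable signature.

    Assumption: confidence intervals are not computed here, so the
    'se', 'ymin' and 'ymax' columns are omitted.
    """
    # Fit y = slope * x + intercept by least squares
    slope, intercept = np.polyfit(data['x'], data['y'], 1)

    # Predict at the requested x values
    xseq = np.asarray(xseq, dtype=float)
    return pd.DataFrame({'x': xseq, 'y': slope * xseq + intercept})
```

Because no interval is computed, such a smoother would presumably be used as stat_smooth(method=linear_smoother, se=False).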

se : bool (default: True)

If True, draw a confidence interval around the smooth line.

n : int (default: 80)

Number of points to evaluate the smoother at. Some smoothers like mavg do not support this.

fullrange : bool (default: False)

If True, the fit will span the full range of the plot.
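
To illustrate how n and fullrange interact, here is a rough sketch of how the evaluation points might be chosen. It assumes the points are evenly spaced, and both the helper evaluation_points and its plot_range argument are hypothetical names used only for this illustration.

```python
def evaluation_points(x, n=80, fullrange=False, plot_range=None):
    """Return n evenly spaced x values at which to evaluate a smoother.

    Assumption: with fullrange=True the points cover plot_range (the
    x limits of the plot); otherwise they cover only the data range.
    """
    if fullrange and plot_range is not None:
        lo, hi = plot_range
    else:
        lo, hi = min(x), max(x)
    if n == 1:
        return [lo]
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]
```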

level : float (default: 0.95)

Level of confidence to use if se=True.
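
The effect of level can be sketched with a confidence interval built from a predicted value and its standard error. Note the substitution: this sketch uses a normal approximation from the Python standard library, not the t-distribution mentioned in the smoother template, so it is only close to the t-based interval when the degrees of freedom are large. The helper name normal_ci is hypothetical.

```python
from statistics import NormalDist


def normal_ci(y, stderr, level=0.95):
    """Two-sided interval around y using a normal approximation."""
    # Quantile leaving (1 - level) / 2 probability in each tail
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return y - z * stderr, y + z * stderr
```

A higher level gives a larger quantile and therefore a wider band around the smooth line.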

span : float (default: 0.75)

Controls the amount of smoothing for the loess smoother. A larger value means more smoothing. It should be in the (0, 1) range.
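
To see why a larger span smooths more, consider the simplified sketch below. It is not the loess implementation used by Plotnine: it assumes span is the fraction of points in each local neighbourhood and replaces the local regression with a tricube-weighted average. The function names are hypothetical.

```python
import math


def local_average(xs, ys, x0, span=0.75):
    """Tricube-weighted mean of the ceil(span * n) points nearest to x0."""
    k = max(2, math.ceil(span * len(xs)))  # neighbourhood size
    nearest = sorted(zip(xs, ys), key=lambda p: abs(p[0] - x0))[:k]
    h = abs(nearest[-1][0] - x0)  # distance to the farthest neighbour
    if h == 0:
        return sum(y for _, y in nearest) / k
    weights = [(1 - (abs(x - x0) / h) ** 3) ** 3 for x, _ in nearest]
    total = sum(weights)
    if total == 0:  # every neighbour sits exactly at distance h
        return sum(y for _, y in nearest) / k
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / total


def smooth(xs, ys, span):
    """Evaluate the local average at every observed x."""
    return [local_average(xs, ys, x0, span) for x0 in xs]
```

With a zigzag input, a small span reproduces the zigzag almost exactly, while a large span averages over most of the data and flattens it.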

method_args : dict (default: {})

Additional arguments passed on to the modelling method.

Notes

geom_smooth and stat_smooth are effectively aliases; they both use the same arguments. Use geom_smooth unless you want to display the results with a non-standard geom.