plotnine.stats.stat_smooth
class plotnine.stats.stat_smooth(mapping: Aes | None = None, data: DataLike | None = None, **kwargs: Any)
Calculate a smoothed conditional mean
Usage
stat_smooth(mapping=None, data=None, geom='smooth', position='identity', na_rm=False, se=True, formula=None, n=80, method='auto', method_args={}, level=0.95, span=0.75, fullrange=False, **kwargs)
Only mapping and data can be positional; the rest must be keyword arguments. **kwargs can be aesthetics (or parameters) used by the geom.
Parameters:
- mapping
  aes, optional
  Aesthetic mappings created with aes(). If specified and inherit_aes=True, it is combined with the default mapping for the plot. You must supply mapping if there is no plot mapping.
  Aesthetic    Default value
  x            (none)
  y            (none)
  Both x and y are required.
  Options for computed aesthetics:
    'se'    # Standard error of points in bin
    'ymin'  # Lower confidence limit
    'ymax'  # Upper confidence limit
  Calculated aesthetics are accessed using the after_stat function, e.g. after_stat('se').
- data
  dataframe, optional
  The data to be displayed in this layer. If None, the data from the ggplot() call is used. If specified, it overrides the data from the ggplot() call.
- geom
  str or geom, optional (default: geom_smooth)
  The geometric object used to display the data for this layer. If it is a string, it must be registered and known to Plotnine.
- position
  str or position, optional (default: position_identity)
  Position adjustment. If it is a string, it must be registered and known to Plotnine.
- na_rm
  bool, optional (default: False)
  If False, removes missing values with a warning. If True, silently removes missing values.
- method
  str or callable, optional (default: 'auto')
  The available methods are:
    'auto'       # Use loess if (n<1000), glm otherwise
    'lm', 'ols'  # Linear Model
    'wls'        # Weighted Linear Model
    'rlm'        # Robust Linear Model
    'glm'        # Generalized linear Model
    'gls'        # Generalized Least Squares
    'lowess'     # Locally Weighted Regression (simple)
    'loess'      # Locally Weighted Regression
    'mavg'       # Moving Average
    'gpr'        # Gaussian Process Regressor
  If a callable is passed, it must have the signature:
    import pandas as pd

    def my_smoother(data, xseq, **params):
        # * data - has the x and y values for the model
        # * xseq - x values to be predicted
        # * params - stat parameters
        #
        # It must return a new dataframe. Below is the
        # template used internally by Plotnine

        # Input data into the model
        x, y = data['x'], data['y']

        # Create and fit a model (Model is a placeholder
        # for the modelling class you use)
        model = Model(x, y)
        results = model.fit()

        # Create output data by getting predictions on
        # the xseq values
        data = pd.DataFrame({
            'x': xseq,
            'y': results.predict(xseq)})

        # Compute confidence intervals, this depends on
        # the model. However, given standard errors and the
        # degrees of freedom we can compute the confidence
        # intervals using the t-distribution.
        #
        # For an alternative, implement confidence intervals by
        # the bootstrap method
        if params['se']:
            from plotnine.utils.smoothers import tdist_ci
            y = data['y']            # The predicted value
            df = 123                 # Degrees of freedom
            stderr = results.stderr  # Standard error
            level = params['level']  # The parameter value
            low, high = tdist_ci(y, df, stderr, level)
            data['se'] = stderr
            data['ymin'] = low
            data['ymax'] = high

        return data
  For loess smoothing you must install the scikit-misc package. You can install it with pip install scikit-misc or pip install plotnine[all].
- formula
  formula_like
  An object that can be used to construct a patsy design matrix. This is usually a string. You can only use a formula if method is one of lm, ols, wls, glm, rlm or gls, and in the formula you may refer to the x and y aesthetic variables.
- se
  bool (default: True)
  If True, draw a confidence interval around the smooth line.
- n
  int (default: 80)
  Number of points at which to evaluate the smoother. Some smoothers, such as mavg, do not support this.
- fullrange
  bool (default: False)
  If True, the fit will span the full range of the plot.
- level
  float (default: 0.95)
  Level of confidence to use if se=True.
- span
  float (default: 0.75)
  Controls the amount of smoothing for the loess smoother. A larger number means more smoothing. It should be in the (0, 1) range.
- method_args
  dict (default: {})
  Additional arguments passed on to the modelling method.
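The callable signature described under method can be exercised with a small polynomial smoother. This is a minimal sketch, not Plotnine's internal code: poly_smoother and the degree option are hypothetical names, reading degree from method_args assumes the stat parameters reach the callable as keyword arguments, and no confidence interval is computed, so it is only meant for se=False.

```python
import numpy as np
import pandas as pd


def poly_smoother(data, xseq, **params):
    """Fit a polynomial to data['x'], data['y'] and predict at xseq."""
    # "degree" is a hypothetical option for this sketch; it is looked up
    # in method_args and falls back to a quadratic fit.
    degree = params.get("method_args", {}).get("degree", 2)
    coefs = np.polyfit(data["x"], data["y"], deg=degree)
    # Return a new dataframe with the predictions, as the signature requires
    return pd.DataFrame({"x": xseq, "y": np.polyval(coefs, xseq)})


# Call the smoother directly, outside of any plot, on exactly linear data
df = pd.DataFrame({"x": np.arange(10.0)})
df["y"] = 2 * df["x"] + 1
xseq = np.linspace(0, 9, 5)
smoothed = poly_smoother(df, xseq, se=False, method_args={"degree": 1})
```

If those assumptions hold, the function would be supplied to a layer as stat_smooth(method=poly_smoother, se=False, method_args={'degree': 1}).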
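The comment in the callable template about t-distribution intervals can be written out. Assuming tdist_ci follows the standard two-sided t interval (an assumption, not something stated in the text above), with predicted value \hat{y}, standard error s, degrees of freedom \nu and confidence level \ell:

```latex
\begin{aligned}
q        &= \frac{1 + \ell}{2}, \\
y_{\min} &= \hat{y} - t_{q,\nu}\, s, \\
y_{\max} &= \hat{y} + t_{q,\nu}\, s,
\end{aligned}
```

where t_{q,\nu} is the q-quantile of Student's t distribution with \nu degrees of freedom. For level=0.95 this gives q = 0.975.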
See also
statsmodels.regression.linear_model.OLS
statsmodels.regression.linear_model.WLS
statsmodels.robust.robust_linear_model.RLM
statsmodels.genmod.generalized_linear_model.GLM
statsmodels.regression.linear_model.GLS
statsmodels.nonparametric.smoothers_lowess.lowess
skmisc.loess.loess
pandas.DataFrame.rolling
sklearn.gaussian_process.GaussianProcessRegressor
Notes
geom_smooth and stat_smooth are effectively aliases; they both use the same arguments. Use geom_smooth unless you want to display the results with a non-standard geom.