Scatter stats

fleur.scatterstats.ScatterStats

Statistical correlation and plotting class for numerical variables.

Attributes:

Name Type Description
n_obs int

Total number of observations.

alpha float

Probability of rejecting a true null hypothesis (the significance level).

dof int

Degrees of freedom for t-test.

pvalue float

P-value of the t-test.

intercept float

The intercept (estimation of beta2) in the model.

slope float

The slope (estimation of beta1) in the model.

stderr_slope float

Standard error of the slope.

ci_lower float

Lower bound of the confidence interval.

ci_upper float

Upper bound of the confidence interval.

ax Axes

The main matplotlib axes.

fig Figure

The matplotlib figure.
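
How several of these attributes relate to one another can be sketched with a plain least-squares fit. The snippet below is only an illustration, assuming a simple linear regression with `dof = n_obs - 2`; the helper `simple_ols` is hypothetical and not part of fleur, and the library's own computation may differ.

```python
import math
from statistics import fmean


def simple_ols(x, y):
    """Hypothetical helper: fit y = intercept + slope * x by least squares."""
    n_obs = len(x)
    if n_obs != len(y) or n_obs < 3:
        raise ValueError("x and y must have the same length (at least 3)")

    x_mean, y_mean = fmean(x), fmean(y)
    # Sum of squares of x and cross-product of x and y around their means.
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = y_mean - slope * x_mean

    # Assumption: two estimated parameters, hence n_obs - 2 degrees of freedom.
    dof = n_obs - 2
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    stderr_slope = math.sqrt(rss / dof / sxx)

    return {
        "n_obs": n_obs,
        "dof": dof,
        "slope": slope,
        "intercept": intercept,
        "stderr_slope": stderr_slope,
    }


result = simple_ols([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
print(result)
```

The t-test p-value and the confidence bounds additionally need quantiles of the Student t distribution, which the sketch leaves out to stay within the standard library.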

__init__(x, y, data=None, alternative='two-sided', correlation_measure='pearson', ci=95)

Initialize a ScatterStats() instance.

Parameters:

Name Type Description Default
x Union[str, SeriesT, Iterable]

Column name in data, or a Series or array-like of values.

required
y Union[str, SeriesT, Iterable]

Column name in data, or a Series or array-like of values.

required
data Optional[Frame]

An optional dataframe.

None
alternative str

Defines the alternative hypothesis. Default is 'two-sided'. Must be one of 'two-sided', 'less', or 'greater'.

'two-sided'
correlation_measure str

The correlation measure to use. Default is 'pearson'. Must be one of 'pearson', 'kendall', or 'spearman'.

'pearson'
ci Number

Confidence level for the top label and the regression plot. The default value is 95 (for a 95% confidence level).

95
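
To give a feel for what the choice of `correlation_measure` changes, here is a small standard-library sketch comparing the Pearson and Spearman coefficients on a monotonic but non-linear relationship. The functions `pearson`, `average_ranks`, and `spearman` are hypothetical helpers written for this illustration; they are not fleur's implementation.

```python
import math


def pearson(x, y):
    """Pearson correlation: strength of the linear relationship."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic, but not linear

pearson_r = pearson(x, y)
spearman_rho = spearman(x, y)
print(f"pearson={pearson_r:.3f}, spearman={spearman_rho:.3f}")
```

Because `y` always increases with `x`, the rank-based coefficient reaches 1, while the Pearson coefficient stays below 1 since the relationship is not a straight line.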

plot(*, marginal=True, bins=None, scatter_kws=None, line_kws=None, area_kws=None, hist_kws=None, subplot_mosaic_kwargs=None, show_stats=True)

Draw a scatter plot of two variables with a linear regression line, annotated with the main statistical results.

Parameters:

Name Type Description Default
bins Union[int, List[int]]

Number of bins for the marginal distributions. This can be an integer or a list of two integers (the first for the top distribution and the second for the other marginal distribution).

None
marginal bool

Whether to include marginal histograms. Default is True.

True
line_kws Union[dict, None]

Additional parameters which will be passed to the plot() function in matplotlib.

None
scatter_kws Union[dict, None]

Additional parameters which will be passed to the scatter() function in matplotlib.

None
area_kws Union[dict, None]

Additional parameters which will be passed to the fill_between() function in matplotlib.

None
hist_kws Union[dict, None]

Additional parameters which will be passed to the hist() function in matplotlib.

None
subplot_mosaic_kwargs Union[dict, None]

Additional keyword arguments to pass to plt.subplot_mosaic(). Default is None.

None
show_stats bool

If True, display statistics on the plot.

True

summary()

Print a text summary of the statistical test performed.

Displays the formatted test statistic together with the p-value, the confidence interval, and related results.

Raises:

Type Description
RuntimeError

If plot() was not called before summary().


Examples

  • Minimalist example
# mkdocs: render
from fleur import ScatterStats
from fleur import datasets

df = datasets.load_iris()

ScatterStats(x=df["sepal_length"], y=df["sepal_width"]).plot()