Scatter stats
fleur.scatterstats.ScatterStats
Statistical correlation and plotting class for numerical variables.
Attributes:
Name | Type | Description |
---|---|---|
`n_obs` | `int` | Total number of observations. |
`alpha` | `float` | Probability of rejecting a true null hypothesis. |
`dof` | `int` | Degrees of freedom for t-test. |
`pvalue` | `float` | P-value of the t-test. |
`intercept` | `float` | The intercept (estimation of beta2) in the model. |
`slope` | `float` | The slope (estimation of beta1) in the model. |
`stderr_slope` | `float` | Standard error of the slope. |
`ci_lower` | `float` | Lower bound of the confidence interval. |
`ci_upper` | `float` | Upper bound of the confidence interval. |
`ax` | `Axes` | The main matplotlib axes. |
`fig` | `Figure` | The matplotlib figure. |
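
To make the regression-related attributes more concrete, here is a minimal pure-Python sketch of the textbook ordinary least squares formulas for a simple linear model. It is not taken from fleur's source: the function name `simple_linear_fit` is hypothetical, and the choice of `dof = n_obs - 2` is the usual textbook convention, assumed here rather than confirmed by the library.

```python
import math
from statistics import mean


def simple_linear_fit(x, y):
    """Illustrative OLS fit of y = intercept + slope * x (not fleur's code)."""
    n_obs = len(x)
    if n_obs != len(y) or n_obs < 3:
        raise ValueError("x and y need the same length and at least 3 points")

    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x must not be constant")

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Assumption: two estimated parameters, hence n_obs - 2 degrees of freedom.
    dof = n_obs - 2
    # Residual sum of squares around the fitted line.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    stderr_slope = math.sqrt(sse / dof / sxx)

    return {
        "n_obs": n_obs,
        "dof": dof,
        "slope": slope,
        "intercept": intercept,
        "stderr_slope": stderr_slope,
    }


fit = simple_linear_fit([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
print(fit)
```

The p-value and confidence bounds additionally require the Student t distribution, which is left out here to keep the sketch free of third-party dependencies.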
__init__(x, y, data=None, alternative='two-sided', correlation_measure='pearson', ci=95)
Initialize a `ScatterStats()` instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`x` | `Union[str, SeriesT, Iterable]` | Colname of | required |
`y` | `Union[str, SeriesT, Iterable]` | Colname of | required |
`data` | `Optional[Frame]` | An optional dataframe. | `None` |
`alternative` | `str` | Defines the alternative hypothesis. Default is 'two-sided'. Must be one of 'two-sided', 'less' and 'greater'. | `'two-sided'` |
`correlation_measure` | `str` | The correlation measure to use. Default is 'pearson'. Must be one of 'pearson', 'kendall', 'spearman'. | `'pearson'` |
`ci` | `Number` | Confidence level for the top label and the regression plot. The default value is 95 (for a 95% confidence level). | `95` |
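
The three values accepted by `correlation_measure` name standard statistics. As a rough illustration of how they differ, the sketch below computes each one in pure Python on a monotonic but nonlinear dataset. The helper names (`pearson`, `ranks`, `spearman`, `kendall`) are hypothetical, the rank-based versions assume there are no ties, and none of this is meant to mirror how fleur computes them internally.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    x_bar, y_bar = mean(x), mean(y)
    num = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    den = (
        sum((a - x_bar) ** 2 for a in x) * sum((b - y_bar) ** 2 for b in y)
    ) ** 0.5
    return num / den


def ranks(values):
    """1-based ranks; assumes no tied values for simplicity."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs, no ties."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            score += (s > 0) - (s < 0)
    return score / (n * (n - 1) / 2)


x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # y = x ** 3: monotonic but not linear

print("pearson :", pearson(x, y))
print("spearman:", spearman(x, y))
print("kendall :", kendall(x, y))
```

On data like this, the rank-based measures report a perfect monotonic association while Pearson stays below 1, which is one reason to switch measures when the relationship is not expected to be linear.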
plot(*, marginal=True, bins=None, scatter_kws=None, line_kws=None, area_kws=None, hist_kws=None, subplot_mosaic_kwargs=None, show_stats=True)
Draw a scatter plot of two variables with a linear regression line, annotated with the main statistical results.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`bins` | `Union[int, List[int]]` | Number of bins for the marginal distributions. This can be an integer or a list of two integers (the first for the top distribution and the second for the other). | `None` |
`marginal` | `bool` | Whether to include marginal histograms. Default is | `True` |
`line_kws` | `Union[dict, None]` | Additional parameters which will be passed to the | `None` |
`scatter_kws` | `Union[dict, None]` | Additional parameters which will be passed to the | `None` |
`area_kws` | `Union[dict, None]` | Additional parameters which will be passed to the | `None` |
`hist_kws` | `Union[dict, None]` | Additional parameters which will be passed to the | `None` |
`subplot_mosaic_kwargs` | `Union[dict, None]` | Additional keyword arguments to pass to | `None` |
`show_stats` | `bool` | If True, display statistics on the plot. | `True` |
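
The `bins` parameter accepts either a single integer or a list of two integers. One way to picture that contract is the small normalizer below. This is a sketch under assumptions: `split_bins` is a hypothetical helper, not part of fleur, and both the "single integer applies to both histograms" and the "`None` means let the plotting backend decide" behaviours are guesses made for illustration.

```python
from typing import List, Optional, Tuple, Union


def split_bins(
    bins: Union[int, List[int], None],
) -> Tuple[Optional[int], Optional[int]]:
    """Hypothetical helper: return (top_bins, other_bins)."""
    if bins is None:
        # Assumption: no explicit bin count for either marginal histogram.
        return (None, None)
    if isinstance(bins, int):
        # Assumption: a single integer is used for both marginal histograms.
        return (bins, bins)
    if len(bins) != 2:
        raise ValueError("bins must be an int or a list of two ints")
    top, other = bins
    return (int(top), int(other))


print(split_bins(20))
print(split_bins([10, 30]))
print(split_bins(None))
```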
summary()
Print a text summary of the statistical test performed.
Displays the formatted test statistic with p-value, CI, etc.
Raises:
Type | Description |
---|---|
`RuntimeError` | If |
Examples
- Minimalist example

```python
# mkdocs: render
from fleur import ScatterStats
from fleur import datasets

df = datasets.load_iris()
ScatterStats(x=df["sepal_length"], y=df["sepal_width"]).plot()
```