Bar stats
fleur.barstats.BarStats
Statistical comparison and plotting class for categorical data analysis.
This class visualizes and statistically compares categorical data across groups. It supports chi-square tests of independence, uses Fisher's exact test for small samples, and draws stacked or grouped bar charts.
Attributes:
Name | Type | Description |
---|---|---|
statistic | `float` | The computed test statistic (chi-square or Fisher's exact). |
pvalue | `float` | The p-value of the statistical test. |
main_stat | `str` | The formatted test statistic string for display. |
expression | `str` | Full LaTeX-style annotation string. |
test_name | `str` | Name of the statistical test used. |
n_obs | `int` | Total number of observations. |
n_cat | `int` | Number of unique categories in the first variable. |
n_levels | `int` | Number of unique levels in the second variable. |
contingency_table | `ndarray` | The contingency table. |
expected_frequencies | `ndarray` | Expected frequencies for chi-square test. |
cramers_v | `float` | Cramér's V effect size measure. |
ax | `Axes` | The matplotlib axes used for plotting. |
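
To make the relationship between `contingency_table`, `expected_frequencies`, `statistic` and `cramers_v` concrete, here is a minimal pure-Python sketch. It is not fleur's implementation. The helper functions are hypothetical, the chi-square statistic is computed without a continuity correction, and Cramér's V uses the uncorrected textbook formula. Whether `BarStats` applies either correction is not stated on this page.

```python
from collections import Counter
from math import sqrt


def contingency_table(x, y):
    """Cross-tabulate two equal-length sequences (hypothetical helper)."""
    counts = Counter(zip(x, y))
    rows = sorted(set(x))
    cols = sorted(set(y))
    return [[counts[(r, c)] for c in cols] for r in rows]


def expected_frequencies(table):
    """Expected count per cell under independence: row_total * col_total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(table):
    """Pearson chi-square statistic, with no continuity correction."""
    expected = expected_frequencies(table)
    return sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )


def cramers_v(table):
    """Uncorrected Cramér's V: sqrt(chi2 / (n * (min(n_rows, n_cols) - 1)))."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0])) - 1
    return sqrt(chi_square(table) / (n * k))


# Toy data: rows are the sorted categories of x, columns the sorted levels of y.
x = ["a", "a", "a", "b", "b", "b", "b", "b"]
y = ["yes", "yes", "no", "no", "no", "no", "yes", "no"]
table = contingency_table(x, y)
print(table)  # [[1, 2], [4, 1]]
print(expected_frequencies(table))
print(chi_square(table), cramers_v(table))
```

In practice the same quantities are usually obtained from `scipy.stats.chi2_contingency`, which also returns the p-value and the expected frequencies.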
__init__(x, y, data=None, approach='freq', paired=False, thres_fisher=5, **kwargs)
Initialize a BarStats() instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | `str \| SeriesT \| Iterable` | Colname of | required |
y | `str \| SeriesT \| Iterable` | Colname of | required |
data | `Frame \| None` | An optional dataframe. | `None` |
approach | `str` | A string specifying the type of statistical approach: "freq" (default) or "bayes". | `'freq'` |
paired | `bool` | Whether the groups being compared contain the same observations (paired data). | `False` |
thres_fisher | `int` | The expected-frequency threshold below which the chi-square assumptions are considered violated. By default, if expected frequencies are below 5, Fisher's exact test is run instead. Set to 0 to force a chi-square test. | `5` |
kwargs | `Any` | Additional arguments passed to the scipy test function. | `{}` |
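
The `thres_fisher` rule described above can be sketched as follows. This is an illustration under assumptions, not the library's code: `choose_test` and `expected_frequencies` are hypothetical helpers, and the sketch assumes the switch to Fisher's exact test happens as soon as any single expected cell count falls below the threshold. The page does not say whether fleur checks the minimum, a proportion of cells, or something else.

```python
def expected_frequencies(table):
    """Expected count per cell under independence: row_total * col_total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def choose_test(table, thres_fisher=5):
    """Pick a test name from the smallest expected frequency (assumed rule)."""
    smallest = min(min(row) for row in expected_frequencies(table))
    # Expected counts are never negative, so thres_fisher=0 always yields "chi2".
    return "fisher" if smallest < thres_fisher else "chi2"


small = [[1, 2], [4, 1]]      # smallest expected count is 1.125
large = [[20, 30], [30, 20]]  # every expected count is 25.0

print(choose_test(small))                  # fisher
print(choose_test(small, thres_fisher=0))  # chi2
print(choose_test(large))                  # chi2
```

Setting the threshold to 0 disables the fallback because no expected frequency can be below zero, which matches the documented behaviour of forcing a chi-square test.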
plot(*, orientation='horizontal', colors=None, show_stats=True, show_counts=True, plot_type='stacked', ax=None, bar_kws=None)
Plot a statistical comparison bar chart for categorical data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
orientation | `str` | "vertical" or "horizontal" orientation of plots. | `'horizontal'` |
colors | `list \| None` | List of colors for each group. | `None` |
show_stats | `bool` | If True, adds statistics on the plot. | `True` |
show_counts | `bool` | If True, shows sample counts in axis labels. | `True` |
plot_type | `str` | Type of bar chart ("stacked" or "grouped"). | `'stacked'` |
ax | `Axes \| None` | Existing Axes to plot on. If None, uses current Axes. | `None` |
bar_kws | `dict \| None` | Keyword args for bar plot customization. | `None` |
Returns:
Type | Description |
---|---|
`Figure` | A matplotlib Figure. |
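
The difference between the two `plot_type` values can be illustrated with a small geometry sketch. It is an assumption-laden illustration rather than fleur's drawing code: `bar_layout` is a hypothetical helper, raw counts are used as bar lengths (the library may plot proportions instead), and the 0.8 group width is an arbitrary choice.

```python
def bar_layout(table, plot_type="stacked", width=0.8):
    """Return one (position, start, length) tuple per cell of `table`.

    Each row of `table` is one category on the axis; each column is a level
    drawn as its own bar segment.
    """
    if plot_type not in ("stacked", "grouped"):
        raise ValueError("plot_type must be 'stacked' or 'grouped'")
    bars = []
    for i, row in enumerate(table):
        offset = 0  # running total used only by stacked bars
        slot = width / len(row)  # width of one bar inside a group
        for j, count in enumerate(row):
            if plot_type == "stacked":
                # Same position, each segment starts where the previous ended.
                bars.append((float(i), offset, count))
                offset += count
            else:
                # Side by side, all segments start at zero.
                position = i - width / 2 + slot * (j + 0.5)
                bars.append((position, 0, count))
    return bars


table = [[1, 2], [4, 1]]
print(bar_layout(table, "stacked"))
print(bar_layout(table, "grouped"))
```

With matplotlib, each tuple would map onto `ax.bar(position, length, bottom=start)` for a vertical chart or `ax.barh(position, length, left=start)` for a horizontal one.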
Examples
- Minimalist example
# mkdocs: render
from fleur import BarStats
from fleur import data
df = data.load_mtcars("pandas")
BarStats(x="cyl", y="vs", data=df).plot()