chowclassifier package

Submodules

chowclassifier.chow module

Implementation of a Chow Classifier

class chowclassifier.chow.Chow(df, name: str = '', initial_breakpoint: float = None, timecol: str = 'year', ycol: str = 'value', groupcol: str = 'g', alpha=0.01, margin: int = 2)

Bases: object

Class handling the Chow analysis. Call run() to run the analysis (find the breakpoint and classify the trend). Afterwards, call params() to get the parameters of the regressions (y = k * x + b) as [[model1_k, model1_b, model1_Rsquared], [model2_k, model2_b, model2_Rsquared]]; if the breakpoint is not significant, model1 = model2. The significance level of the breakpoint is adjusted with a Bonferroni correction, i.e. alpha is divided by the number of tests. The confidence intervals of the slope and intercept are at significance level alpha (not corrected).

Parameters:

df – data on which to perform the Chow categorisation
name – name of the dataset (optional)
initial_breakpoint – breakpoint (if None, midpoint of X series will be used)
timecol – name of time column (X)
ycol – name of the variable column (y)
groupcol – name of the group column (g); ignored if the column is not present in df
alpha – significance level (without Bonferroni correction), can be changed with Chow.set_alpha(0.01)
margin – number of index positions on either side of the breakpoint (or midpoint) within which the best breakpoint is searched. E.g. if X = [0,1,1.5,2,3,4,5], breakpoint = 2 and margin = 1, the set of breakpoints [1.5,2,3] will be tested. Set margin = 0 to test only the given breakpoint (or the midpoint).
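The margin example above can be reproduced with a short sketch. This is not the library's code: the helper name candidate_breakpoints is hypothetical, and the handling of a missing breakpoint (midpoint of the X range, snapped to the nearest observed value) is an assumption.

```python
def candidate_breakpoints(x, breakpoint=None, margin=2):
    """Return the observed X values within `margin` index positions of the breakpoint.

    Assumptions (not confirmed by the docs): a missing breakpoint defaults to
    the midpoint of the X range, and a breakpoint that is not an observed
    value snaps to the nearest observed one.
    """
    xs = sorted(set(x))
    if breakpoint is None:
        breakpoint = (xs[0] + xs[-1]) / 2
    # Index of the observed value closest to the requested breakpoint.
    centre = min(range(len(xs)), key=lambda i: abs(xs[i] - breakpoint))
    lo = max(0, centre - margin)
    hi = min(len(xs), centre + margin + 1)
    return xs[lo:hi]


x = [0, 1, 1.5, 2, 3, 4, 5]
print(candidate_breakpoints(x, breakpoint=2, margin=1))  # [1.5, 2, 3]
print(candidate_breakpoints(x, breakpoint=2, margin=0))  # [2]
```

The window is clipped at the ends of the series, so a breakpoint near either edge yields fewer than 2 * margin + 1 candidates.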

classify(**_)

Classify the dataset based on results of Chow test at significance level alpha.

Possible classification values:

No significant breakpoint:

N: non-significant overall trend
I: significant increasing overall trend
D: significant decreasing overall trend

Significant breakpoint (set1 and set2 indicate points before/after breakpoint):

NN: non-significant trend on set1 and non-significant trend on set2
NI: non-significant trend on set1 and significant increasing trend on set2
ND: non-significant trend on set1 and significant decreasing trend on set2
IN: significant increasing trend on set1 and non-significant trend on set2
ID: significant increasing trend on set1 and significant decreasing trend on set2
iI: significant increasing trend on both set1 and set2 with greater increase in set2
Ii: significant increasing trend on both set1 and set2 with greater increase in set1
DN: significant decreasing trend on set1 and non-significant trend on set2
DI: significant decreasing trend on set1 and significant increasing trend on set2
dD: significant decreasing trend on both set1 and set2 with greater decrease in set2
Dd: significant decreasing trend on both set1 and set2 with greater decrease in set1

Parameters: kwargs – Additional keyword arguments (currently unused)
Returns: Dictionary with model parameters
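The naming scheme of the classification codes can be illustrated with a small sketch. The functions trend_letter and classify_trend are hypothetical and only mirror the table above; they take slopes and significance flags as plain inputs and do not reflect how the library computes them. Treating equal slopes as "greater in set1" is an arbitrary assumption.

```python
def trend_letter(slope, significant):
    """Map one regression slope to N, I, or D."""
    if not significant:
        return "N"
    return "I" if slope > 0 else "D"


def classify_trend(slope1, sig1, slope2=None, sig2=None):
    """Build a classification code from slope signs and significance flags.

    Pass only slope1 and sig1 when the breakpoint is not significant.
    """
    first = trend_letter(slope1, sig1)
    if slope2 is None:
        return first
    second = trend_letter(slope2, sig2)
    if first == second and first != "N":
        # Same significant direction on both sides: the steeper segment
        # keeps the upper-case letter and the other one is lower-cased.
        if abs(slope2) > abs(slope1):
            return first.lower() + second
        return first + second.lower()
    return first + second


print(classify_trend(0.5, True))               # I
print(classify_trend(1.0, True, 2.0, True))    # iI
print(classify_trend(-3.0, True, -1.0, True))  # Dd
print(classify_trend(0.1, False, -1.0, True))  # ND
```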

find_best_bkp(breakpoint_indices=None, **_)

Find the best breakpoint and run Chow test to find OLS parameters.

Tests multiple breakpoints and selects the one with the best score. Updates self.initial_breakpoint with the optimal value.

Parameters:

breakpoint_indices – Optional list of indices to test. If None, uses get_breakpoint_indices()
kwargs – Additional keyword arguments (currently unused)

get_breakpoint_indices()

Calculate and set the breakpoint indices to search.

Returns: List of breakpoint indices to test

params()

Return parameters of the fitted model(s).

Returns: Dictionary containing model0, model1, and model2 parameters. If the breakpoint is significant, model0 is empty and model1/model2 are populated. Otherwise, model0 is populated and model1/model2 are empty.

params_names()

Get list of parameter names for model output.

Returns: List of parameter name strings

plot(filename: str = None, ax=None, figsize=(16, 8), ylog: bool = False, fill: bool = True, scatter: bool = True, scatterparams: dict = None, fillparams: dict = None, fitparams: dict = None, linestyles: list = None, show_legend: bool = True, **params)

Plot the regression model(s).

Parameters:

filename – Output filename, including the extension that determines the format; if None, plt.show() is called instead
ax – Matplotlib axis on which to plot; if None, a new figure is created
figsize – Tuple with figure size
ylog – Make the y-axis log-scale
fill – Plot the confidence interval
scatter – Plot individual points
scatterparams – Parameters for the scatter plot
fillparams – Parameters for the confidence intervals
fitparams – Parameters for the fitted line
linestyles – Select the linestyle of each trend (list of size 3), set to None if you want to define it in fitparams
show_legend – Legend options passed to ax.legend()
params – Additional parameters including color, xlabel, ylabel, title, xlim, ylim

plot_by_group(filename: str = None, ax=None, figsize=(16, 8), cmap: str = 'Set1', show_legend: bool = True, plot_overall: bool = True, plot_individual_fill: bool = True, scatterparams: dict = None, fillparams: dict = None, fitparams: dict = None, groups_order: list = None, **params)

Plot the regression model(s) grouped by the group column.

Parameters:

filename – Output filename, including the extension that determines the format; if None, plt.show() is called instead
ax – Matplotlib axis on which to plot; if None, a new figure is created
figsize – Tuple with figure size
cmap – Colormap name (from matplotlib) or custom color mapping
show_legend – Include the legend
plot_overall – Plot overall trend and confidence interval
plot_individual_fill – Plot confidence interval for each individual group
scatterparams – Parameters for the scatter plot
fillparams – Parameters for the confidence intervals
fitparams – Parameters for the fitted line
groups_order – List giving the order in which the groups are plotted
params – Additional parameters including xlabel, ylabel, title, ylog, scatter

run(**kwargs)

Run the complete Chow classification analysis.

Finds best breakpoint, runs Chow test, and classifies the trend.

Parameters: kwargs – Keyword arguments forwarded to find_best_bkp and classify. May include ‘normalise’ (for run_chow) and ‘alpha’ (for classify).
Returns: Classification code (string), or None if the analysis fails

run_chow(bkp, normalise=False)

Run Chow test at specified breakpoint.

Fits OLS models to the full dataset and two subsets (before and after breakpoint), then calculates the F-statistic to test for structural break.

Parameters:

bkp – Breakpoint value at which to split the data
normalise – Whether to normalise the y values before fitting

Returns:

Dictionary containing score, F-statistic, and residual sum of squares
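For orientation, the textbook Chow F-statistic for a straight-line model can be sketched in pure Python. This is a sketch under stated assumptions, not the library's implementation: ols_rss and chow_f are hypothetical helpers, the split convention (x < bkp goes to the first subset) is assumed, and normalisation, the score, and the p-value are left out.

```python
def ols_rss(x, y):
    """Fit y = k * x + b by least squares; return (k, b, residual sum of squares)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    k = sxy / sxx
    b = mean_y - k * mean_x
    rss = sum((yi - (k * xi + b)) ** 2 for xi, yi in zip(x, y))
    return k, b, rss


def chow_f(x, y, bkp, n_params=2):
    """Chow F-statistic for a structural break at `bkp`.

    Assumption: points with x < bkp form the first subset. Each subset
    needs at least three points with distinct x values and a non-perfect fit.
    """
    before = [(xi, yi) for xi, yi in zip(x, y) if xi < bkp]
    after = [(xi, yi) for xi, yi in zip(x, y) if xi >= bkp]
    _, _, rss_pooled = ols_rss(x, y)
    _, _, rss1 = ols_rss(*zip(*before))
    _, _, rss2 = ols_rss(*zip(*after))
    dof = len(x) - 2 * n_params
    return ((rss_pooled - (rss1 + rss2)) / n_params) / ((rss1 + rss2) / dof)


# Illustrative data: a rising segment followed by a falling one.
x = list(range(10))
y = [0.1, 0.9, 2.1, 2.9, 4.1, 4.9, 4.1, 2.9, 2.1, 0.9]
print(chow_f(x, y, bkp=5))
```

Under the null hypothesis of no break, this statistic follows an F distribution with n_params and n - 2 * n_params degrees of freedom, so a large value relative to the critical value at the chosen significance level indicates a structural break.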

set_alpha(alpha: float)

Change the current statistical significance level.

The Bonferroni correction will be applied based on the number of tests.

Parameters: alpha – Significance level, a float in the open interval (0, 1)
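The Bonferroni adjustment itself is a single division, sketched below. The helper bonferroni_alpha is hypothetical, and the assumption that one test is counted per candidate breakpoint is not confirmed by this documentation.

```python
def bonferroni_alpha(alpha, n_tests):
    """Divide the nominal significance level by the number of tests."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie in the open interval (0, 1)")
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Assumption: one test per candidate breakpoint, so margin = 2 gives
# 2 * 2 + 1 = 5 tests.
corrected = bonferroni_alpha(0.01, n_tests=5)
print(corrected)
```

Each individual breakpoint test is then compared against the corrected level rather than against alpha, which keeps the family-wise error rate at or below alpha.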

set_color()

Set the color attribute based on the trend type (tt).

Uses predefined color scheme from TREND_COLORS.

summary()

Print the statistical summary of the fitted model(s).

If the breakpoint is significant, prints summaries for both model1 and model2. Otherwise, prints the summary for model0 (whole dataset).

Module contents

Chow classifier package for time-series trend classification.