Chow test – Wikipedia


The Chow test is a statistical test of whether the coefficients of two linear regressions are equal. It is named after its inventor, the economist Gregory Chow.


The Chow test is used in econometrics to test time series for structural breaks. Another area of application is program evaluation, where two different subgroups (programs), such as two school types, are compared. In contrast to time-series analysis, the two subgroups are then not assigned to consecutive time intervals; instead, the split is made according to a qualitative criterion, such as the school type.

Given is a data set $(Y_i, X_i)$ with $X_i = (x_{i1}, \ldots, x_{ik})$ for $i = 1, \ldots, N$, whose relationship is described by a linear function with a normally distributed error $\epsilon$ with expected value $E(\epsilon) = 0$ (multiple regression analysis), i.e. one has

$Y_i = a_0 + a_1 x_{i1} + \cdots + a_k x_{ik} + \epsilon_i.$
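Such a model can be fitted by ordinary least squares. As a minimal sketch (with made-up illustrative data and $k = 2$ regressors; the names and values are not from the article), using NumPy's least-squares solver:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up illustrative data: N = 50 observations, k = 2 regressors.
N, k = 50, 2
X = rng.normal(size=(N, k))
eps = rng.normal(scale=0.1, size=N)     # normally distributed error, E(eps) = 0
a = np.array([1.0, 2.0, -0.5])          # true coefficients (a_0, a_1, a_2)
Y = a[0] + X @ a[1:] + eps

# OLS fit: prepend a column of ones so the intercept a_0 is estimated too.
D = np.column_stack([np.ones(N), X])
coef, *_ = np.linalg.lstsq(D, Y, rcond=None)
print(coef)   # estimates close to (1.0, 2.0, -0.5)
```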


However, one suspects that the data set splits into two groups of sizes $N_a$ and $N_b$, with $N = N_a + N_b$, which are better described by two different linear functions.

The hypothesis

$H_0\colon (a_0, a_1, \ldots, a_k) = (b_0, b_1, \ldots, b_k)$

is tested against

$H_1\colon (a_0, a_1, \ldots, a_k) \neq (b_0, b_1, \ldots, b_k).$

If $S$ denotes the residual sum of squares of the regression over the entire data set, and $S_a$ and $S_b$ those of the regressions over the two subgroups, then the test statistic

$T = \dfrac{\bigl(S - (S_a + S_b)\bigr)/(k+1)}{(S_a + S_b)/\bigl(N_a + N_b - 2(k+1)\bigr)}$

follows, under $H_0$, an F-distribution with $k+1$ and $N_a + N_b - 2(k+1)$ degrees of freedom.
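Under $H_0$, $T$ is compared with the corresponding quantile of the F-distribution. As a sketch (assuming SciPy is available; the subgroup sizes here are illustrative), the critical value can be computed instead of read from a table:

```python
from scipy.stats import f

k = 1               # one regressor: a simple line Y = c0 + c1 * X
Na, Nb = 7, 5       # illustrative subgroup sizes
alpha = 0.05        # significance level

df1 = k + 1
df2 = Na + Nb - 2 * (k + 1)
crit = f.ppf(1 - alpha, df1, df2)   # 95% quantile of F(2, 8)
print(round(crit, 3))
```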

The following data set is given, whose relationship is to be modeled by the linear function $Y = c_0 + c_1 X$:

x: 0.5    1.0    1.5    2.0    2.5    3.0    3.5    4.0    4.5    5.0    5.5    6.0
y: −0.043 0.435  0.149  0.252  0.571  0.555  0.678  3.119  2.715  3.671  3.928  3.962

A data plot suggests a structural break at $x = 4$, so the data set is split into the two intervals $[0.5;\ 3.5]$ and $[4.0;\ 6.0]$, over which separate regressions are run in addition to the regression over the entire data set. One then tests whether the two sub-regressions yield the same linear function, i.e.

$H_0\colon (a_0, a_1) = (b_0, b_1)$

against

$H_1\colon (a_0, a_1) \neq (b_0, b_1).$

[Figure: data plot with regression line for the regression over the entire data set]

[Figure: data plot with regression line for the regression over $[0.5;\ 3.5]$]

[Figure: data plot with regression line for the regression over $[4.0;\ 6.0]$]

Calculation of the test statistic:
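The calculation can be reproduced with a short script. This is a sketch using plain NumPy; the helper `rss` and the split at $x = 4$ follow the example, but are not part of the original article:

```python
import numpy as np

def rss(x, y):
    """Residual sum of squares of the OLS line y = c0 + c1 * x."""
    D = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(D, y, rcond=None)
    resid = y - D @ coef
    return float(resid @ resid)

# Data set from the example above.
x = np.arange(0.5, 6.01, 0.5)
y = np.array([-0.043, 0.435, 0.149, 0.252, 0.571, 0.555,
              0.678, 3.119, 2.715, 3.671, 3.928, 3.962])

# Split at the suspected break: group a = [0.5, 3.5], group b = [4.0, 6.0].
a, b = x < 4.0, x >= 4.0
k = 1
S, Sa, Sb = rss(x, y), rss(x[a], y[a]), rss(x[b], y[b])

# Chow test statistic with (k+1, Na+Nb-2(k+1)) = (2, 8) degrees of freedom.
T = ((S - (Sa + Sb)) / (k + 1)) / ((Sa + Sb) / (len(x) - 2 * (k + 1)))
print(T)   # compare with the critical value F_{2;8;0.95} = 4.459
```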

Since $F_{2;8;0.95} = 4.459$ (significance level $\alpha = 0.05$), we have $T \geq F_{2;8;0.95}$, so the null hypothesis $H_0$ can be rejected. This means that the two regression lines on the subintervals are not identical: there is a structural break, and the sub-regressions provide a better model than the regression over the entire data set.
