Subgroup Regression Methods

This post collects common and less common Stata commands for building groups and running subgroup regressions.

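* Requires runby
ssc install runby
* runby later turned out to be optional, but my Stata needed egenmore for egen's xtile() function
ssc install egenmore
* Every regression below uses reghdfe, which depends on ftools
ssc install ftools
ssc install reghdfe

* Test dataset
webuse nlswork, clear
xtset idcode year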

1 Full-sample median split

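* Split at the full-sample median of age (1 = at or above the median)
egen med = median(age)
gen gvar = (age >= med) if !mi(age)

* The if qualifier must come before the comma that starts the options
reghdfe ln_wage tenure ttl_exp if gvar == 0, a(idcode year) vce(cl idcode)
reghdfe ln_wage tenure ttl_exp if gvar == 1, a(idcode year) vce(cl idcode)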
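
The two subsample regressions can also be run in a loop and shown side by side, which saves retyping the model for each group. This is a minimal sketch, assuming the nlswork setup above and that reghdfe is installed; the stored names g0 and g1 are illustrative labels only.

```stata
* Sketch: run the same model on each group and compare the results side by side.
* g0 and g1 are illustrative names for the stored estimates.
webuse nlswork, clear
egen med = median(age)
gen gvar = (age >= med) if !mi(age)

forvalues g = 0/1 {
    reghdfe ln_wage tenure ttl_exp if gvar == `g', a(idcode year) vce(cl idcode)
    estimates store g`g'
}

* Coefficients, standard errors, sample size and R-squared for both groups
estimates table g0 g1, b(%9.4f) se(%9.4f) stats(N r2)
```

The same loop works for any of the grouping schemes below, as long as gvar is coded 0/1.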

2 Median split within each year

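* Every section reuses the names med and gvar, so drop any earlier versions
cap drop med
cap drop gvar
* Split at the median of age within each year
bys year: egen med = median(age)
gen gvar = (age >= med) if !mi(age)

reghdfe ln_wage tenure ttl_exp if gvar == 0, a(idcode year) vce(cl idcode)
reghdfe ln_wage tenure ttl_exp if gvar == 1, a(idcode year) vce(cl idcode)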

3 Median split within each industry-year

cap drop med
cap drop gvar
* Split at the median of age within each industry-year cell
bys ind_code year: egen med = median(age)
gen gvar = (age >= med) if !mi(age)

reghdfe ln_wage tenure ttl_exp if gvar == 0, a(idcode year) vce(cl idcode)
reghdfe ln_wage tenure ttl_exp if gvar == 1, a(idcode year) vce(cl idcode)

4 Median split based on a specified year

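cap drop gvar
* nlswork codes year with two digits (68-88); 78 is used here as the reference year
gen dc = age
gen temp = dc if year == 78
* Spread the reference-year value to every observation of the same individual
bys idcode: egen gvar_in_year = min(temp)
egen gvar_in_year_median = median(gvar_in_year)
gen gvar = (gvar_in_year > gvar_in_year_median) if !mi(gvar_in_year)

* Recommended: reduce the source file to the reference year first, merge it in, then split at the full-sample median
* Not recommended: forming the groups in the source file before merging, which can leave the two groups very unbalanced

reghdfe ln_wage tenure ttl_exp if gvar == 0, a(idcode year) vce(cl idcode)
reghdfe ln_wage tenure ttl_exp if gvar == 1, a(idcode year) vce(cl idcode)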
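
The recommended order (restrict the source file to the reference year, merge, then take the full-sample median) might look roughly like the sketch below. In practice the grouping variable usually comes from a separate source file; here nlswork itself stands in for that file, and the names base, age_base, med_base and gvar_base are hypothetical.

```stata
* Sketch of "restrict to the reference year, merge, then split".
* base, age_base, med_base and gvar_base are hypothetical names.
webuse nlswork, clear

* Step 1: build a one-row-per-individual file with the reference-year value
preserve
keep if year == 78
keep idcode age
rename age age_base
tempfile base
save `base'
restore

* Step 2: merge it onto the panel (many panel rows per individual)
merge m:1 idcode using `base', keep(master match) nogenerate

* Step 3: split at the full-sample median of the merged variable
egen med_base = median(age_base)
gen gvar_base = (age_base >= med_base) if !mi(age_base)
```

Individuals not observed in the reference year get a missing gvar_base and drop out of both subsample regressions.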

5 Unbalanced panel: split on each individual's first observed year

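cap drop temp
cap drop med
cap drop gvar
gen temp = age
* Sort by year within individual so that _n == 1 really is the first observed year
bys idcode (year): gen order_num = _n
gen temp_in_1 = temp if order_num == 1
* Spread the first-year value to every observation of the same individual
bys idcode: egen tempall = min(temp_in_1)
egen med = median(tempall)
gen gvar = (tempall >= med) if !mi(tempall)

reghdfe ln_wage tenure ttl_exp if gvar == 0, a(idcode year) vce(cl idcode)
reghdfe ln_wage tenure ttl_exp if gvar == 1, a(idcode year) vce(cl idcode)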

6 Full-sample terciles (or more groups), comparing the top and bottom groups

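cap drop gvar
* Terciles of age over the full sample: 1 = low, 2 = middle, 3 = high
xtile gvar = age, nq(3)
* drop removes the middle group from memory; reload the data before trying another split
drop if gvar == 2
replace gvar = 0 if gvar == 1
replace gvar = 1 if gvar == 3

reghdfe ln_wage tenure ttl_exp if gvar == 0, a(idcode year) vce(cl idcode)
reghdfe ln_wage tenure ttl_exp if gvar == 1, a(idcode year) vce(cl idcode)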

7 Terciles within each year, comparing the top and bottom groups

* Method 1: runby
cap drop gvar
cap program drop myxtile
program define myxtile
    xtile gvar = age, nq(3)
end
runby myxtile, by(year) verbose

* Method 2: egenmore. Both methods create gvar, so drop the Method 1 result before running this one
cap drop gvar
bys year: egen gvar = xtile(age), nq(3)

* Keep only the bottom (0) and top (1) terciles
drop if gvar == 2
replace gvar = 0 if gvar == 1
replace gvar = 1 if gvar == 3

reghdfe ln_wage tenure ttl_exp if gvar == 0, a(idcode year) vce(cl idcode)
reghdfe ln_wage tenure ttl_exp if gvar == 1, a(idcode year) vce(cl idcode)

8 Terciles within each industry-year, comparing the top and bottom groups

* Method 1: runby
cap drop gvar
cap program drop myxtile
program define myxtile
    xtile gvar = age, nq(3)
end
runby myxtile, by(ind_code year) verbose

* Method 2: egenmore. Both methods create gvar, so drop the Method 1 result before running this one
cap drop gvar
bys ind_code year: egen gvar = xtile(age), nq(3)

* Keep only the bottom (0) and top (1) terciles
drop if gvar == 2
replace gvar = 0 if gvar == 1
replace gvar = 1 if gvar == 3

reghdfe ln_wage tenure ttl_exp if gvar == 0, a(idcode year) vce(cl idcode)
reghdfe ln_wage tenure ttl_exp if gvar == 1, a(idcode year) vce(cl idcode)

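9 Testing coefficient differences

Once the groups are defined, how to test whether the coefficients differ between them is up to you. Two candidate commands: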

chowtest

bdiff
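
As a point of comparison that needs no extra packages beyond reghdfe, a Chow-style test can be sketched with a pooled, fully interacted model followed by Stata's built-in test command. This is only an illustration under stated assumptions, not what chowtest or bdiff do internally: it uses the median split from section 1, and it interacts both fixed effects with gvar so that the pooled point estimates should line up with the two split-sample regressions.

```stata
* Sketch: interaction-based test of equal slopes across the two groups.
* Assumes the section 1 median split; any 0/1 gvar would work the same way.
webuse nlswork, clear
egen med = median(age)
gen gvar = (age >= med) if !mi(age)

* Pooled model: slopes and both sets of fixed effects are allowed to differ by group.
* The 1.gvar#c.x terms are the group-1 minus group-0 differences in slopes.
reghdfe ln_wage tenure ttl_exp 1.gvar#c.tenure 1.gvar#c.ttl_exp, ///
    a(idcode#gvar year#gvar) vce(cl idcode)

* Joint test that both slopes are equal across groups
test 1.gvar#c.tenure 1.gvar#c.ttl_exp

* Single-coefficient version for tenure only
test 1.gvar#c.tenure
```

The main effect of gvar is left out on purpose, since the idcode#gvar fixed effects already absorb it.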

Author

CodeFox

Published

2024-01-08

Updated

2024-06-05