Lab 12: Causal Forests

## Research question Does receiving financial aid reduces the probability of a convict being re-arrested after a release? This data set is originally from Rossi et al. (1980, Money, Work, and Crime: Some Experimental Results.). Observations are 432 convicts who were released from Maryland state prisons in the 1970s and who were followed up for one year after release. Half the released convicts were assigned at random to an experimental treatment in which they were given financial aid. It is important to assess if financial aid helps with re-integration into the society. It is also important to assess the magnitude of the effect as financial aid programs can be costly to tax-payers. If the effect of the program can be heterogeneous, it is important to identify sub-populations where the program is more effective. This would allow policy makers to target released convicts for whom the effect is expected to be more substantial. ## Data preparation The required package is "carData" for the Rossi dataset. ```{r, warning=FALSE, message=FALSE} library(carData) ``` Let's construct `Wks`: percentage of time unemployed before re-arrest or until the end of the period: ```{r} N=nrow(Rossi) Rossi$Weeks_worked=0 for (i in 1:N){ last_week_col=Rossi$week[i]+10 #last week before arrest column number for (j in 11:last_week_col){ Rossi$Weeks_worked[i]=Rossi$Weeks_worked[i]+(Rossi[i,j]=="yes") } } Rossi$Wks=(Rossi$week-Rossi$Weeks_worked)/Rossi$week ``` Creat a 0/1 treatment variable: ```{r} Rossi$D=(Rossi$fin=="yes")+0 ``` Education is a factor: ```{r, message=F} Rossi$Grades=" " Rossi$Grades[Rossi$educ==2]<-"_6" Rossi$Grades[Rossi$educ==3]<-"7_9" Rossi$Grades[Rossi$educ==4]<-"10_11" Rossi$Grades[Rossi$educ==5]<-"12" Rossi$Grades[Rossi$educ==6]<-"College" ``` The data set: ```{r} RData=subset(Rossi,select=c(arrest,D,age,race,wexp,mar,paro,prio,Wks)) RData$educ<-as.factor(Rossi$Grades) ``` ## Estimate the ATE ### Compute the simple ATE estimator The treatment was randomly assigned. We should be able to estimate ATE by just regressing `arrest` against `D`. The estimated ATE of financial aid is 8.3% reduction in probability of being re-arrested. We calculate the heteroskedasticity robust standard errors and find that the result is significant. ```{r, warning=FALSE} library(AER) library(sandwich) ATEsimple=lm(arrest~D,data=RData) coeftest(ATEsimple,vcov=vcovHC(ATEsimple, type = "HC")) ``` ### Compute the IPW estimator Since the treatment is randomized, there are not supposed to be controls explaining the treatment assignment. Let's check using the logit model: ```{r, warning=F} logit=glm(D~(age+race+wexp+mar+paro+prio+educ+Wks)^2,data=RData,family = binomial(link = "logit")) coeftest(logit,vcov=vcovHC(logit,type="HC0")) ``` We find that some of the individual p-values are significant, possibly due to the multiple hypotheses testing problem, even if the treatment is truely randomized. However, it is possible that treatment assignment is not completely random. For example, convicts with some college seems to be more likely to be in the program. We also compute the IPW estimator of ATE using estimated propensity scores. Let's construct a function that computes the IPW ATE estimator. We will estimate the propensity score by logit as above. ```{r} ipw.ate<-function(data,indices){ #indices are needed later for bootstrap samples construction DATA=data[indices,] # Logit for propensity score logit=glm(D~(age+race+wexp+mar+paro+prio+educ+Wks)^2,data=DATA,family = binomial(link = "logit")) # predicted probabilities = estimated propensity score p=predict(logit,type="response") #Trim predictions close to zero or one ind=which(p>0.05 & p<0.95) p1=p[ind] DATA1=DATA[ind,] # The IPW ATE estimator Diff=(DATA1$arrest*DATA1$D)/p1 - (DATA1$arrest*(1-DATA1$D)/(1-p1)) IPW_ATE=mean(Diff) return(IPW_ATE) } ``` Let's apply the function to our data set: ```{r, warning=FALSE} ind=seq(1:nrow(RData)) IPW_ATE=ipw.ate(RData,indices=ind) IPW_ATE ``` We use the bootstrap for inference. We use 399 bootstrap samples: ```{r,warning=F} library(boot) set.seed(6064,sample.kind='Rejection') IPW_ATE_boot=boot(data=RData,statistic=ipw.ate,R=399) IPW_ATE_boot ``` Similar results using the "causalweight" package: ```{r, warning=FALSE} library(causalweight) X=model.matrix(D~(age + race + wexp + mar + paro + prio + educ + Wks)^2,data=RData)[,-1] Y=RData$arrest D=RData$D output=treatweight(d=D,y=Y,x=X,logit=TRUE,trim=0.05,boot=399) output ``` ### Estimate CATE by causal forest ```{r} library(grf) ?causal_forest ``` Generate the output: ```{r} X=model.matrix(D~age+race+mar+paro+prio+educ+Wks,data=RData)[,-1] Y=RData$arrest W=RData$D CF=causal_forest(X,Y,W,seed = 6064) ``` Compute variables' importance. We find that variables that have strongest explanatory power are `Wks`, `age` and `prio`. * `Wks`: percentage of time unemployed before re-arrest or until the end of the period; * `age`: age at time of release; * `prio`: number of convictions prior to current incarceration. ```{r} CF_var_imp<-variable_importance(CF) selected.vars <- which(CF_var_imp > mean(CF_var_imp)) Importance=round(CF_var_imp[selected.vars],digits=3) names(Importance)<-colnames(X)[selected.vars] sort(Importance, decreasing=TRUE) ``` Compute CATE for `age>=27` and `age<27`. We find that the program is effective for older convicts. ```{r} average_treatment_effect(CF,target.sample = "all",subset=(RData$age>=27)) average_treatment_effect(CF,target.sample = "all",subset=(RData$age<27)) ``` Compute CATE for `Wks>=0.85` and `Wks$<0.85`. We find a stronger effect for those who are less employed. ```{r} average_treatment_effect(CF,target.sample = "all",subset=(RData$Wks>=0.85)) average_treatment_effect(CF,target.sample = "all",subset=(RData$Wks<0.85)) ``` ```{r} CATE_Wks=average_treatment_effect(CF,target.sample = "all",subset=(RData$Wks>=0.85)) Estimate=CATE_Wks[1] Std.err=CATE_Wks[2] T=abs(Estimate/Std.err) pvalue=(2*(1-pnorm(T))) cat("CATE for Wks>0.85: ",round(Estimate,digits=3), " Std.err: ",round(Std.err,digits=3), " p-value: ",round(pvalue,digits=3)) ``` Compute CATE by prior convictions. There seems to be a stronger effect for those with multiple prior convictions ```{r} average_treatment_effect(CF,target.sample = "all",subset=(RData$prio>1)) average_treatment_effect(CF,target.sample = "all",subset=(RData$prio<=1)) ``` ```{r} CATE_prio=average_treatment_effect(CF,target.sample = "all",subset=(RData$prio>1)) Estimate=CATE_prio[1] Std.err=CATE_prio[2] T=abs(Estimate/Std.err) pvalue=(2*(1-pnorm(T))) cat("CATE for prio>1 : ",round(Estimate,digits=3), " Std.err: ",round(Std.err,digits=3), " p-value: ",round(pvalue,digits=3)) ``` Compute CATE by marital status. The program might be more effective for single convicts. ```{r} average_treatment_effect(CF,target.sample = "all",subset=(RData$mar=="married")) average_treatment_effect(CF,target.sample = "all",subset=(RData$mar=="not married")) ``` We can combine several conditions. We can find that the program is more effective for older single convicts with multiple prior convictions. They are less likely to have family support. ```{r} CATE_mult=average_treatment_effect(CF,target.sample = "all",subset= (RData$age>=27 & RData$mar=="not married" & RData$prio>1)) Estimate=CATE_mult[1] Std.err=CATE_mult[2] T=abs(Estimate/Std.err) pvalue=(2*(1-pnorm(T))) cat("CATE for age>=27, not married, with prio>1 : ",round(Estimate,digits=3)," Std.err: ",round(Std.err,digits=3), " p-value: ",round(pvalue,digits=3)) ``` Compute CATE by the three most important variables. We find that the program is more effective for employed at least 60% of the time older convicts with multiple prior convictions. ```{r} CATE_import=average_treatment_effect(CF,target.sample = "all",subset= (RData$Wks<0.40 & RData$age>=27 & RData$prio>1)) Estimate=CATE_import[1] Std.err=CATE_import[2] T=abs(Estimate/Std.err) pvalue=(2*(1-pnorm(T))) cat("CATE for Wks<0.4, age>=27, with prio>1 : ",round(Estimate,digits=3)," Std.err: ",round(Std.err,digits=3), " p-value: ",round(pvalue,digits=3)) ``` Compute the confidence interval. We find that the program can be highly effective: up to 65% reduction in the probability of re-arrest for properly chosen subjects who keep employment, are older, and have multiple prior convictions. ```{r} c(Estimate-1.96*Std.err,Estimate+1.96*Std.err) ```