Testing for the appropriate level of clustering in linear regression models

QED Working Paper Number 1428

Reliable inference with clustered data has received a great deal of attention in recent years. The overwhelming majority of this research assumes that the cluster structure is known. This assumption is very strong, because there are often several possible ways in which a dataset could be clustered. We propose two tests for the correct level of clustering. One test focuses on inference about a single coefficient, and the other on inference about two or more coefficients. We also prove the asymptotic validity of a wild bootstrap implementation. The proposed tests work for a null hypothesis of either no clustering or "fine" clustering against alternatives of "coarser" clustering. We also propose a sequential testing procedure to determine the appropriate level of clustering. Simulations suggest that the bootstrap tests perform very well under the null hypothesis and can have excellent power. An empirical example suggests that using our tests leads to sensible inferences.

Keywords

CRVE
grouped data
clustered data
cluster-robust variance estimator
robust inference
wild bootstrap
wild cluster bootstrap
