Cluster-robust inference is widely used in modern empirical work in economics and many other disciplines. The key unit of observation is the cluster. We propose measures of "high-leverage" clusters and "influential" clusters for linear regression models. The measures of leverage and partial leverage, and functions of them, can be used as diagnostic tools to identify datasets and regression designs in which cluster-robust inference is likely to be challenging. The measures of influence can provide valuable information about how the results depend on the data in the various clusters. We also show how to calculate two jackknife variance matrix estimators, CV3 and CV3J, as a byproduct of our other computations. All these quantities, including the jackknife variance estimators, are computed in a new Stata package called summclust that summarizes the cluster structure of a dataset.
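To make the quantities named above concrete, here is a minimal Python sketch. It is not the summclust implementation, and the abstract does not state the formulas, so the definitions are assumptions: cluster leverage is taken as the trace of X_g'X_g (X'X)^{-1}, influence is read off the delete-one-cluster OLS estimates, CV3 is the cluster jackknife centred at the full-sample estimate, and CV3J is the same sum centred at the mean of the delete-one-cluster estimates. The function name `cluster_diagnostics` and its arguments are hypothetical.

```python
import numpy as np

def cluster_diagnostics(X, y, cluster_ids):
    """Leverage, delete-one-cluster estimates, and jackknife variances (assumed definitions)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    cluster_ids = np.asarray(cluster_ids)

    XtX = X.T @ X
    Xty = X.T @ y
    XtX_inv = np.linalg.inv(XtX)
    beta_hat = XtX_inv @ Xty                  # full-sample OLS estimate

    labels = np.unique(cluster_ids)
    G, k = len(labels), X.shape[1]
    leverage = np.empty(G)
    beta_drop = np.empty((G, k))

    for i, g in enumerate(labels):
        mask = cluster_ids == g
        Xg, yg = X[mask], y[mask]
        XgtXg = Xg.T @ Xg
        # Assumed leverage of cluster g: trace of X_g'X_g (X'X)^{-1}
        leverage[i] = np.trace(XgtXg @ XtX_inv)
        # OLS estimate with cluster g omitted, reusing the cross-products
        beta_drop[i] = np.linalg.solve(XtX - XgtXg, Xty - Xg.T @ yg)

    scale = (G - 1) / G
    dev = beta_drop - beta_hat                # centred at the full-sample estimate
    dev_j = beta_drop - beta_drop.mean(axis=0)  # centred at the jackknife mean
    return {
        "beta_hat": beta_hat,
        "leverage": leverage,
        "beta_drop": beta_drop,
        "cv3": scale * dev.T @ dev,
        "cv3j": scale * dev_j.T @ dev_j,
    }

# Illustrative synthetic data: 6 clusters of unequal size, 2 regressors.
rng = np.random.default_rng(0)
sizes = [5, 8, 10, 12, 15, 30]
ids = np.repeat(np.arange(len(sizes)), sizes)
X_demo = np.column_stack([np.ones(ids.size), rng.normal(size=ids.size)])
y_demo = X_demo @ np.array([1.0, 0.5]) + rng.normal(size=ids.size)
res = cluster_diagnostics(X_demo, y_demo, ids)
```

Under these assumed definitions the leverages sum to the number of coefficients, so a cluster whose leverage is far above the average k/G is a candidate high-leverage cluster, and a row of `beta_drop` that sits far from `beta_hat` flags an influential cluster. Each cluster needs only its own cross-products, which is why the jackknife estimators come almost for free once leverage and influence have been computed.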
QED Working Paper Number: 1483
clustered data
cluster-robust variance estimator
grouped data
high-leverage clusters
influential clusters
jackknife
partial leverage
robust inference