Missing values are a major problem in all econometric applications based on survey data. A standard approach assumes that data are missing at random and relies on imputation methods or even listwise deletion. This approach is justified if item nonresponse does not depend on the realization of the potentially missing variables. However, the missing-at-random assumption may introduce bias if nonresponse is, in fact, selective. Relevant applications range from financial or strategic firm-level data to individual-level data on income or privacy-sensitive behaviors.
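
The bias from selective nonresponse can be illustrated with a small simulation. The following is a minimal sketch under assumptions that are not taken from the paper: a standard normal outcome, a logistic response probability that increases in the outcome, and a hypothetical helper `simulate`. It compares the mean over all units with the mean over respondents only, which is what listwise deletion would report.

```python
import math
import random


def simulate(n=100_000, seed=0):
    """Draw outcomes and delete them selectively (illustrative assumptions only)."""
    rng = random.Random(seed)
    full, observed = [], []
    for _ in range(n):
        y = rng.gauss(0.0, 1.0)  # latent outcome with true mean 0
        # Assumed response rule: units with higher y are more likely to respond.
        p_respond = 1.0 / (1.0 + math.exp(-y))
        full.append(y)
        if rng.random() < p_respond:
            observed.append(y)
    true_mean = sum(full) / len(full)
    complete_case_mean = sum(observed) / len(observed)
    response_rate = len(observed) / n
    return true_mean, complete_case_mean, response_rate


true_mean, complete_case_mean, response_rate = simulate()
print(f"mean over all units:        {true_mean:.3f}")
print(f"mean over respondents only: {complete_case_mean:.3f}")
print(f"response rate:              {response_rate:.3f}")
```

Because response depends on the outcome itself, the respondents-only mean is shifted upward relative to the full-sample mean, and no adjustment based solely on observed covariates would remove this shift.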

In this paper, we propose a novel approach to handling selective item nonresponse in the model’s dependent variable. Our approach is based on instrumental variables that affect selection only through potential outcomes. In addition, we allow for endogenous regressors. We establish identification of the structural parameter and propose a simple two-step procedure to estimate it. Our estimator is consistent and robust against the biases that would arise under a missing-at-random assumption. We implement the estimation procedure using firm-level survey data and a binary instrumental variable to estimate the effect of outsourcing on productivity.


Keywords: endogenous selection, IV-estimation, inverse probability weighting, missing data, productivity, outsourcing, semiparametric estimation