Statistics 203: Introduction to Regression and Analysis of Variance. Robust methods. Jonathan Taylor. Today’s class: Heteroskedasticity; MLE for one sample problem; Weighted least squares; Estimating $\sigma^2$; Weighted regression example; Weighted regression; Robust methods.
Monitoring robust regression
Marco Riani, Andrea Cerioli, Anthony C. Atkinson, and Domenico Perrotta
Abstract
Robust methods are little applied (although much studied by statisticians). We monitor very robust regression by looking at the behaviour of residuals and test statistics as we smoothly change the robustness of parameter estimation from a breakdown point of 50% to non-robust least squares. The resulting procedure provides insight into the structure of the data including outliers and the presence of more than one population. Monitoring overcomes the hindrances to the routine adoption of robust methods, being informative about the choice between the various robust procedures. Methods tuned to give nominal high efficiency fail with our most complicated example. We find that the most informative analyses come from S estimates combined with Tukey’s biweight or with the optimal $\rho$ functions.
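As a rough illustration of how the robustness of an S estimate with Tukey’s biweight can be varied, the sketch below solves for the biweight tuning constant $c$ that gives a chosen breakdown point. It relies on the standard relation that, with $\rho$ scaled to have supremum 1, the breakdown point (for values up to 50%) equals $E[\rho(Z)]$ for standard normal $Z$. The function names `tukey_rho`, `expected_rho` and `tuning_constant`, and the grid of breakdown points, are hypothetical choices made here for illustration; this is not the authors’ code.

```python
import math


def phi(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def Phi(u):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def tukey_rho(u, c):
    """Tukey's biweight rho, scaled so that its supremum is 1."""
    if abs(u) >= c:
        return 1.0
    t = (u / c) ** 2
    return 1.0 - (1.0 - t) ** 3


def expected_rho(c):
    """E[tukey_rho(Z, c)] for Z ~ N(0, 1), in closed form.

    Uses the truncated moments m_k = integral of u^k * phi(u) over [-c, c],
    obtained by integration by parts.
    """
    m0 = 2.0 * Phi(c) - 1.0
    m2 = m0 - 2.0 * c * phi(c)
    m4 = 3.0 * m2 - 2.0 * c ** 3 * phi(c)
    m6 = 5.0 * m4 - 2.0 * c ** 5 * phi(c)
    # On [-c, c]: rho = 3t - 3t^2 + t^3 with t = (u/c)^2; outside, rho = 1.
    inside = 3.0 * m2 / c ** 2 - 3.0 * m4 / c ** 4 + m6 / c ** 6
    return inside + (1.0 - m0)


def tuning_constant(bdp, lo=0.5, hi=20.0, tol=1e-10):
    """Bisection for c such that expected_rho(c) == bdp (decreasing in c)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_rho(mid) > bdp:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# An assumed monitoring grid: from 50% breakdown towards least squares.
grid = [0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
constants = [tuning_constant(b) for b in grid]
for b, c in zip(grid, constants):
    print(f"bdp = {b:.2f}  ->  c = {c:.3f}")
```

As the breakdown point decreases, $c$ grows and the biweight downweights fewer residuals, so the fit moves towards least squares. Refitting the S estimate at each value of $c$ and plotting the residuals against the breakdown point is one way to picture the kind of monitoring the abstract describes.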
For our major example with 1,949 observations and 13 explanatory variables, we combine robust S estimation with regression using the forward search, so obtaining an understanding of the importance of individual observations, which is missing from standard robust procedures. We discover that the data come from two different populations. They also contain six outliers.
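The forward search mentioned above can be sketched on a toy problem. The example below is a minimal version under stated assumptions: a single explanatory variable rather than 13, a starting pair chosen by least median of squares over all pairs, and invented data with two planted outliers. The names `fit_line` and `forward_search` are hypothetical, and the code does not reproduce the authors’ implementation or their combination with S estimation.

```python
from itertools import combinations
from statistics import median


def fit_line(points):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    b = sxy / sxx
    return my - b * mx, b


def forward_search(data):
    """Return observation indices in the order they first join the subset."""

    def sq_res(coef):
        a, b = coef
        return [(y - a - b * x) ** 2 for x, y in data]

    # Start (an assumption of this sketch): the pair of points whose
    # exact-fit line has the smallest median squared residual.
    pairs = [p for p in combinations(range(len(data)), 2)
             if data[p[0]][0] != data[p[1]][0]]
    start = min(pairs,
                key=lambda p: median(sq_res(fit_line([data[i] for i in p]))))
    subset = list(start)
    order = list(start)
    while len(subset) < len(data):
        # Fit to the current subset, then keep the m + 1 observations
        # with the smallest squared residuals from that fit.
        res = sq_res(fit_line([data[i] for i in subset]))
        ranked = sorted(range(len(data)), key=lambda i: res[i])
        subset = ranked[: len(subset) + 1]
        order += [i for i in subset if i not in order]
    return order


# Invented data: y = 2 + 0.5 x plus small noise, with outliers at 3 and 8.
noise = [0.05, -0.08, 0.02, 0.07, -0.03, -0.06,
         0.04, 0.01, -0.05, 0.03, -0.02, 0.06]
data = [(float(x), 2.0 + 0.5 * x + e) for x, e in enumerate(noise)]
for i in (3, 8):
    data[i] = (data[i][0], data[i][1] + 8.0)

order = forward_search(data)
print(order)
```

Because each fit uses only the observations currently in the subset, the planted outliers join last in this toy example. The point at which they enter is the kind of information about individual observations that the abstract says is missing from standard robust procedures.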
Our analyses are accompanied by numerous graphs. Algebraic results are contained in two appendices, the second of which provides useful new results on the absolute odd moments of elliptically truncated multivariate normal random variables.
Article information
Source
Electron. J. Statist., Volume 8, Number 1 (2014), 646-677.
Dates
First available in Project Euclid: 20 May 2014
Permanent link to this document
https://projecteuclid.org/euclid.ejs/1400592267
Digital Object Identifier
doi:10.1214/14-EJS897
Mathematical Reviews number (MathSciNet)
MR3211027
Zentralblatt MATH identifier
1348.62200
Subjects
Primary: 62J05: Linear regression; 62J20: Diagnostics; 62G35: Robustness
Secondary: 62P20: Applications to economics [See also 91Bxx]
Keywords
Forward search; graphical methods; least trimmed squares; outliers; regression diagnostics; rho function; S estimation; truncated normal distribution
Citation
Riani, Marco; Cerioli, Andrea; Atkinson, Anthony C.; Perrotta, Domenico. Monitoring robust regression. Electron. J. Statist. 8 (2014), no. 1, 646--677. doi:10.1214/14-EJS897. https://projecteuclid.org/euclid.ejs/1400592267
Supplemental materials
- Supplementary material: Bank Data. The supplement provides an Excel file of the Bank Data described in Appendix C and Table 3 of our paper. Digital Object Identifier: doi:10.1214/14-EJS897SUPP