Quantifying R² bias in the presence of measurement error
Journal of Applied Statistics
Measurement error (ME) is the difference between the true, unknown value of a variable and the value assigned to that variable during the measuring process. The multiple correlation coefficient quantifies the strength of the relationship between the dependent variable and the independent variable(s) in regression modeling. In this paper, we show that ME in the dependent variable results in a negative bias in the multiple correlation coefficient, making the relationship appear weaker than it is. The adjusted R² provides regression modelers with an unbiased estimate of the multiple correlation coefficient. However, because of the ME-induced bias in the multiple correlation coefficient, the otherwise unbiased adjusted R² underestimates the variance explained by a regression model. This paper proposes two statistics for estimating the multiple correlation coefficient, both of which account for the ME in the dependent variable. The first statistic is built entirely from unbiased estimators but may produce values outside the [0,1] interval. The second statistic requires modeling a single data set, created by including descriptive variables on the subjects used in a gage study. Based on sums of squares, this statistic has the properties of an R²: it measures the proportion of variance explained; its values are restricted to the [0,1] interval; and its endpoints indicate no variance explained and all variance explained, respectively. We demonstrate the methodology using data from a study of cervical spine range of motion in children.
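The negative bias described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's method and does not implement either of its two proposed statistics; it only assumes a simple linear model with additive, independent ME in the dependent variable. All names and parameter values (`simulate`, `sigma_me`, the slope of 2.0) are hypothetical choices made for illustration. Under these assumptions, the population R² with the error-free response is 4/5 = 0.8, and with ME of unit variance added it drops to 4/6, or about 0.667.

```python
import numpy as np


def r_squared(x, y):
    """Ordinary least squares R^2 for a model with an intercept."""
    design = np.column_stack([np.ones(len(y)), x])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def simulate(n=20000, slope=2.0, sigma_model=1.0, sigma_me=1.0, seed=0):
    """Return (R^2 with true y, R^2 with y observed under ME).

    Assumed data-generating process (illustrative only):
      y_true = slope * x + model noise
      y_obs  = y_true + measurement error, independent of x and y_true
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y_true = slope * x + rng.normal(scale=sigma_model, size=n)
    y_obs = y_true + rng.normal(scale=sigma_me, size=n)
    return r_squared(x, y_true), r_squared(x, y_obs)


r2_true, r2_obs = simulate()
print(f"R^2 using error-free response: {r2_true:.3f}")
print(f"R^2 using response with ME:    {r2_obs:.3f}")
```

In this setup the ME adds variance to the dependent variable that no predictor can explain, so the observed R² shrinks by roughly the factor Var(y_true) / (Var(y_true) + Var(ME)), here 5/6.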
Print ISSN: 0266-4763 Online ISSN: 1360-0532
Copyright © 2018 Informa UK Limited
Majeske, Karl D.; Lynch-Caris, Terri; and Brelin-Fornari, Janet, "Quantifying R² bias in the presence of measurement error" (2010). Crash Safety Center Publications. 10.