Quantifying R² bias in the presence of measurement error

Document Type

Article

Publication Date

4-1-2010

Publication Title

Journal of Applied Statistics

Abstract

Measurement error (ME) is the difference between the true unknown value of a variable and the data assigned to that variable during the measuring process. The multiple correlation coefficient quantifies the strength of the relationship between the dependent and independent variable(s) in regression modeling. In this paper, we show that ME in the dependent variable results in a negative bias in the multiple correlation coefficient, making the relationship appear weaker than it should. The adjusted R² provides regression modelers an unbiased estimate of the multiple correlation coefficient. However, due to the ME-induced bias in the multiple correlation coefficient, the otherwise unbiased adjusted R² underestimates the variance explained by a regression model. This paper proposes two statistics for estimating the multiple correlation coefficient, both of which take into account the ME in the dependent variable. The first statistic uses all unbiased estimators, but may produce values outside the [0,1] interval. The second statistic requires modeling a single data set, created by including descriptive variables on the subjects used in a gage study. Based on sums of squares, the statistic has the properties of an R²: it measures the proportion of variance explained; has values restricted to the [0,1] interval; and the endpoints indicate no variance explained and all variance explained, respectively. We demonstrate the methodology using data from a study of cervical spine range of motion in children.
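
The negative bias described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's method and does not implement either of its two proposed statistics. It assumes a simple linear model with one predictor, normally distributed noise, and additive ME in the dependent variable that is independent of everything else; the function names (r_squared, simulate) and all parameter values are hypothetical choices made for illustration. Under those assumptions, the population R² computed from the error-laden response equals the error-free R² multiplied by the ratio var(y) / (var(y) + var(ME)), which is always below 1 when ME is present.

```python
import numpy as np


def r_squared(x, y):
    """R^2 of an ordinary least-squares fit of y on x, with an intercept."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def simulate(n=20000, slope=1.0, sd_noise=1.0, sd_me=1.0, seed=0):
    """Compare R^2 with and without ME in the dependent variable.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)                               # predictor, variance 1
    y_true = slope * x + rng.normal(scale=sd_noise, size=n)  # error-free response
    y_obs = y_true + rng.normal(scale=sd_me, size=n)     # response measured with ME

    r2_true = r_squared(x, y_true)
    r2_obs = r_squared(x, y_obs)

    # Population ratio var(y) / (var(y) + var(ME)) under the assumed model.
    var_y = slope**2 + sd_noise**2
    ratio = var_y / (var_y + sd_me**2)
    return r2_true, r2_obs, ratio


r2_true, r2_obs, ratio = simulate()
print(f"R^2 without ME:          {r2_true:.3f}")
print(f"R^2 with ME in y:        {r2_obs:.3f}")
print(f"R^2 without ME x ratio:  {r2_true * ratio:.3f}")
```

Increasing sd_me in this sketch pushes the observed R² further below the error-free value, which is the direction of bias the abstract describes. Estimating the ME variance in practice requires additional data, such as the gage study mentioned in the abstract.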

Volume

37

Issue

4

First Page

667

Last Page

677

DOI

doi: 10.1080/02664760902814542

ISSN

Print ISSN: 0266-4763 Online ISSN: 1360-0532

Rights

Copyright © 2018 Informa UK Limited
