| Title: | ROC-Based Inference for Diagnostic Accuracy Under Verification Bias |
|---|---|
| Description: | Provides point estimates and confidence intervals for receiver operating characteristic (ROC)–based diagnostic accuracy metrics for tests and biomarkers subject to verification bias. Supported metrics include the Area Under the ROC Curve (AUC), the Youden index, and the sensitivity at a user‑specified specificity level for two‑class continuous tests under missing‑at‑random (MAR) disease verification. Point estimation follows Alonzo and Pepe (2005) <doi:10.1111/j.1467-9876.2005.00477.x>. Multiple types of confidence intervals are implemented and compared, including bootstrap‑based, Method of Variance Estimates Recovery (MOVER)–based, and empirical likelihood (EL)–based intervals; see Wang et al. (2025) <doi:10.1177/09622802251322989> and <https://github.com/swang1021/rocvb>. |
| Authors: | Shirui Wang [aut, cre] |
| Maintainer: | Shirui Wang <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.0 |
| Built: | 2026-06-05 08:49:20 UTC |
| Source: | https://github.com/swang1021/rocvb |
Computes point estimates and confidence intervals for the AUC of a continuous test when disease verification is missing at random (MAR). The function returns four estimates simultaneously, obtained using the bias-corrected estimators FI, MSI, IPW, and SPE proposed by Alonzo and Pepe (2005).
auc.ci.mar(Test, D, A, alpha = 0.05, search_step = 0.01, tol = 1e-05, precision = 1e-04, n.boot = 1000, plot = TRUE)
| Argument | Description |
|---|---|
| Test | Test results; a positive numeric vector. |
| D | Verified disease status; a logical vector with possible missing values. |
| A | Covariate; a positive numeric vector. Only one covariate is allowed. |
| alpha | Significance level for the confidence interval. Default is 0.05. |
| search_step | Step size used in root searching. Default is 0.01. |
| tol | Tolerance used in root searching. Default is 1e-5. |
| precision | Precision parameter used in the regression model. Default is 1e-4. |
| n.boot | Number of bootstrap replicates. Default is 1000. |
| plot | Logical; if TRUE, a density plot of the test measurements is produced. Default is TRUE. |
Bootstrap and hybrid empirical likelihood confidence intervals for AUC under verification bias are computed.
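The disease model is estimated using a probit regression model linear in $T$ and $A$ and based on verified subjects, given by

$$P(D = 1 \mid T, A, V = 1) = \Phi(\beta_0 + \beta_1 T + \beta_2 A),$$

where $\Phi$ denotes the standard normal cumulative distribution function, $T$ is the test result (Test), $A$ is the covariate, and $V$ is the verification indicator ($V = 1$ when D is observed).

The verification model is estimated using a logit regression model linear in $T$ and $A$ and based on all subjects, given by

$$\operatorname{logit} P(V = 1 \mid T, A) = \gamma_0 + \gamma_1 T + \gamma_2 A,$$

where $\operatorname{logit}(x) = \log\{x / (1 - x)\}$.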
The function may also produce a density plot of the test measurements when plot = TRUE.
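The two working models and the four bias-corrected estimators can be illustrated with base R. The sketch below is not the package's code: the simulated data, the helper `weighted.auc()`, and the choice of a weighted Mann-Whitney form for the AUC are assumptions made for illustration. The FI, MSI, IPW, and SPE weights follow the general form described by Alonzo and Pepe (2005), and rocvb's internal computation may differ.

```r
# Illustrative sketch only: simulated data and hypothetical helper names,
# not the rocvb implementation.
set.seed(1)
n    <- 200
Test <- abs(rnorm(n))
A    <- abs(rnorm(n))
D    <- rbinom(n, 1, pnorm(-1.5 + Test + 0.5 * A))  # true disease status
V    <- rbinom(n, 1, plogis(-0.5 + Test + A))       # MAR: depends on Test and A only
dat  <- data.frame(Test, A, D = ifelse(V == 1, D, NA), V)

# Disease model: probit regression fit on verified subjects only
fit.d <- glm(D ~ Test + A, family = binomial(link = "probit"),
             data = dat, subset = V == 1)
rho   <- predict(fit.d, newdata = dat, type = "response")  # P(D = 1 | Test, A) for everyone

# Verification model: logit regression fit on all subjects
fit.v  <- glm(V ~ Test + A, family = binomial(link = "logit"), data = dat)
pi.hat <- fitted(fit.v)

# "Diseased" weights for the four estimators; V * D is set to 0 when D is missing
VD  <- ifelse(dat$V == 1, dat$D, 0)
w.d <- list(
  FI  = rho,
  MSI = VD + (1 - dat$V) * rho,
  IPW = VD / pi.hat,
  SPE = (VD - (dat$V - pi.hat) * rho) / pi.hat
)
# Matching "non-diseased" weights
VND  <- ifelse(dat$V == 1, 1 - dat$D, 0)
w.nd <- list(
  FI  = 1 - rho,
  MSI = VND + (1 - dat$V) * (1 - rho),
  IPW = VND / pi.hat,
  SPE = (VND - (dat$V - pi.hat) * (1 - rho)) / pi.hat
)

# Weighted Mann-Whitney statistic over all pairs of distinct subjects
weighted.auc <- function(test, wd, wnd) {
  K <- outer(test, test, ">") + 0.5 * outer(test, test, "==")
  W <- outer(wd, wnd)
  diag(W) <- 0          # a subject is never compared with itself
  sum(W * K) / sum(W)
}

auc.est <- mapply(function(wd, wnd) weighted.auc(dat$Test, wd, wnd), w.d, w.nd)
print(round(auc.est, 3))
```

The disease status is simulated from a probit model rather than by thresholding, so that the probit fit does not suffer from perfect separation.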
A list with elements:

| Element | Description |
|---|---|
| n.total | Total number of subjects. |
| n.case | Number of verified diseased subjects. |
| n.control | Number of verified non-diseased subjects. |
| p.missing | Proportion of missing verification. |
| pt.est | Point estimates of AUC. |
| BC.intervals | Bootstrap classic (BC) confidence intervals. |
| BP.intervals | Bootstrap percentile (BP) confidence intervals. |
| HEL1.intervals | Hybrid empirical likelihood confidence intervals, type I. |
| HEL2.intervals | Hybrid empirical likelihood confidence intervals, type II. |
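As a rough illustration of the two bootstrap interval types, the sketch below resamples subjects with replacement and recomputes an IPW-type AUC on each replicate. It assumes that the "classic" interval means a normal-approximation interval using the bootstrap standard error, which the help text does not spell out. The helper `ipw.auc()` and the simulated data are hypothetical, and none of this is the package's code.

```r
# Illustrative sketch only: simulated data, hypothetical helper ipw.auc().
set.seed(2)
n    <- 150
Test <- abs(rnorm(n))
A    <- abs(rnorm(n))
D    <- rbinom(n, 1, pnorm(-1.5 + Test + 0.5 * A))
V    <- rbinom(n, 1, plogis(-0.5 + Test + A))
dat  <- data.frame(Test, A, D = ifelse(V == 1, D, NA), V)

# IPW-type AUC: verified subjects weighted by 1 / estimated verification probability
ipw.auc <- function(d) {
  pi.hat <- fitted(glm(V ~ Test + A, family = binomial, data = d))
  w.d  <- ifelse(d$V == 1 & d$D == 1, 1 / pi.hat, 0)
  w.nd <- ifelse(d$V == 1 & d$D == 0, 1 / pi.hat, 0)
  K <- outer(d$Test, d$Test, ">") + 0.5 * outer(d$Test, d$Test, "==")
  W <- outer(w.d, w.nd)
  sum(W * K) / sum(W)
}

alpha <- 0.05
B     <- 200
est   <- ipw.auc(dat)
# Refit the verification model inside every bootstrap replicate
boot  <- replicate(B, ipw.auc(dat[sample(n, replace = TRUE), ]))

# Assumed "classic" form: estimate +/- z * bootstrap standard error
z             <- qnorm(1 - alpha / 2)
ci.classic    <- est + c(-1, 1) * z * sd(boot)
# Percentile form: empirical quantiles of the bootstrap replicates
ci.percentile <- quantile(boot, c(alpha / 2, 1 - alpha / 2), names = FALSE)

print(rbind(classic = ci.classic, percentile = ci.percentile))
```

Neither interval is forced to stay inside [0, 1] in the classic form, which is one reason percentile and likelihood-based alternatives are often reported alongside it.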
Alonzo, T. A. and Pepe, M. S. (2005). Assessing accuracy of a continuous screening test in the presence of verification bias. Journal of the Royal Statistical Society: Series C (Applied Statistics).
Wang, S., Shi, S., and Qin, G. (2026). Empirical likelihood inference for the area under the ROC curve with verification-biased data. Manuscript under peer review.
set.seed(123)
Test <- abs(rnorm(100))   # simulated test results
A <- abs(rnorm(100))      # simulated covariate
D <- as.logical(Test + A > stats::quantile(Test + A, 0.8))
D[sample(100, 30)] <- NA  # 30 subjects are left unverified
auc.ci.mar(Test, D, A, n.boot = 20, plot = FALSE)  # small n.boot keeps the example fast
Computes point estimates and confidence intervals for the sensitivity of a continuous test at a fixed level of specificity when disease verification is missing at random (MAR). The function returns four estimates simultaneously, obtained using the bias-corrected estimators FI, MSI, IPW, and SPE proposed by Alonzo and Pepe (2005).
sen.ci.mar(Test, D, A, p, alpha = 0.05, search_step = 0.01, tol = 1e-05, precision = 1e-04, n.boot = 1000, plot = TRUE)
| Argument | Description |
|---|---|
| Test | Test results; a positive numeric vector. |
| D | Verified disease status; a logical vector with possible missing values. |
| A | Covariate; a positive numeric vector. Only one covariate is allowed. |
| p | Target specificity level; a number between 0 and 1. |
| alpha | Significance level for the confidence interval. Default is 0.05. |
| search_step | Step size used in root searching. Default is 0.01. |
| tol | Tolerance used in root searching. Default is 1e-5. |
| precision | Precision parameter used in the regression model. Default is 1e-4. |
| n.boot | Number of bootstrap replicates. Default is 1000. |
| plot | Logical; if TRUE, a density plot of the test measurements is produced. Default is TRUE. |
The function targets the sensitivity at the threshold that achieves specificity p. Bootstrap, hybrid empirical likelihood, and influence function-based empirical likelihood confidence intervals are computed and returned in the output list.
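The disease model is estimated using a probit regression model linear in $T$ and $A$ and based on verified subjects, given by

$$P(D = 1 \mid T, A, V = 1) = \Phi(\beta_0 + \beta_1 T + \beta_2 A),$$

where $\Phi$ denotes the standard normal cumulative distribution function, $T$ is the test result (Test), $A$ is the covariate, and $V$ is the verification indicator ($V = 1$ when D is observed).

The verification model is estimated using a logit regression model linear in $T$ and $A$ and based on all subjects, given by

$$\operatorname{logit} P(V = 1 \mid T, A) = \gamma_0 + \gamma_1 T + \gamma_2 A,$$

where $\operatorname{logit}(x) = \log\{x / (1 - x)\}$.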
The function may also produce a density plot of the test measurements when plot = TRUE.
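To make the target quantity concrete, the sketch below picks the smallest observed cutoff whose weighted specificity reaches p and then evaluates the weighted sensitivity at that cutoff, using IPW-type weights. The helper `sens.at.spec()`, the simulated data, and the convention that a result is positive when Test exceeds the cutoff are assumptions for illustration. The package's own root search (controlled by search_step and tol) is not reproduced here.

```r
# Illustrative sketch only: simulated data, hypothetical helper sens.at.spec().
set.seed(3)
n    <- 200
Test <- abs(rnorm(n))
A    <- abs(rnorm(n))
D    <- rbinom(n, 1, pnorm(-1.5 + Test + 0.5 * A))
V    <- rbinom(n, 1, plogis(-0.5 + Test + A))
dat  <- data.frame(Test, A, D = ifelse(V == 1, D, NA), V)

# IPW weights from a logit verification model fit on all subjects
pi.hat <- fitted(glm(V ~ Test + A, family = binomial, data = dat))
w.d    <- ifelse(dat$V == 1 & dat$D == 1, 1 / pi.hat, 0)
w.nd   <- ifelse(dat$V == 1 & dat$D == 0, 1 / pi.hat, 0)

sens.at.spec <- function(test, w.d, w.nd, p) {
  o      <- order(test)
  # Weighted specificity P(Test <= c | D = 0) at each sorted test value
  cum.nd <- cumsum(w.nd[o]) / sum(w.nd)
  # Smallest observed cutoff with weighted specificity of at least p
  cutoff <- test[o][which(cum.nd >= p)[1]]
  # Weighted sensitivity P(Test > cutoff | D = 1)
  sens   <- sum(w.d[test > cutoff]) / sum(w.d)
  c(cutoff = cutoff, sensitivity = sens)
}

p   <- 0.8
res <- sens.at.spec(dat$Test, w.d, w.nd, p)
achieved.spec <- sum(w.nd[dat$Test <= res[["cutoff"]]]) / sum(w.nd)
print(res)
```

Because the weighted specificity is a step function of the cutoff, the achieved specificity is generally slightly above p rather than exactly equal to it.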
A list with elements:

| Element | Description |
|---|---|
| n.total | Total number of subjects. |
| n.case | Number of verified diseased subjects. |
| n.control | Number of verified non-diseased subjects. |
| p.missing | Proportion of missing verification. |
| pt.est | Point estimates of sensitivity at specificity p. |
| pt.est.ac | Point estimates of sensitivity at specificity p using the Agresti–Coull method. |
| AC.intervals | Agresti–Coull-based confidence intervals. |
| WS.intervals | Wilson score-based confidence intervals. |
| BTI.intervals | Bootstrap confidence intervals, type I. |
| BTII.intervals | Bootstrap confidence intervals, type II. |
| HEL1.intervals | Hybrid empirical likelihood confidence intervals, type I. |
| HEL2.intervals | Hybrid empirical likelihood confidence intervals, type II. |
| IFEL1.intervals | Influence function-based empirical likelihood confidence intervals, type I. |
| IFEL2.intervals | Influence function-based empirical likelihood confidence intervals, type II. |
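For readers unfamiliar with the Agresti–Coull and Wilson score constructions, the sketch below shows their textbook forms for an ordinary binomial proportion with fully observed data. How the package adapts them to bias-corrected estimates is not described in this help page, so the helpers `wilson.ci()` and `agresti.coull.ci()` and the counts used are purely illustrative.

```r
# Illustrative sketch only: textbook binomial intervals, hypothetical helper names.
wilson.ci <- function(x, n, alpha = 0.05) {
  z      <- qnorm(1 - alpha / 2)
  p.hat  <- x / n
  centre <- (p.hat + z^2 / (2 * n)) / (1 + z^2 / n)
  half   <- z / (1 + z^2 / n) * sqrt(p.hat * (1 - p.hat) / n + z^2 / (4 * n^2))
  c(lower = centre - half, upper = centre + half)
}

agresti.coull.ci <- function(x, n, alpha = 0.05) {
  z       <- qnorm(1 - alpha / 2)
  n.tilde <- n + z^2                    # adjusted sample size
  p.tilde <- (x + z^2 / 2) / n.tilde    # adjusted point estimate
  half    <- z * sqrt(p.tilde * (1 - p.tilde) / n.tilde)
  c(estimate = p.tilde, lower = p.tilde - half, upper = p.tilde + half)
}

# Example: 45 true positives among 50 diseased subjects
ws <- wilson.ci(45, 50)
ac <- agresti.coull.ci(45, 50)
print(round(ws, 4))
print(round(ac, 4))
```

Both intervals share the same midpoint, the adjusted estimate, and the Agresti–Coull interval is never narrower than the Wilson interval.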
Alonzo, T. A. and Pepe, M. S. (2005). Assessing accuracy of a continuous screening test in the presence of verification bias. Journal of the Royal Statistical Society: Series C (Applied Statistics).
Wang, S., Shi, S., and Qin, G. (2026). Empirical likelihood-based confidence intervals for sensitivity of a continuous test at a fixed level of specificity with verification bias. Manuscript under peer review.
set.seed(123)
Test <- abs(rnorm(100))   # simulated test results
A <- abs(rnorm(100))      # simulated covariate
D <- as.logical(Test + A > stats::quantile(Test + A, 0.8))
D[sample(100, 30)] <- NA  # 30 subjects are left unverified
sen.ci.mar(Test, D, A, p = 0.8, n.boot = 20, plot = FALSE)  # small n.boot keeps the example fast
Computes point estimates and confidence intervals for the maximum Youden index of a continuous test when disease verification is missing at random (MAR). The function returns four estimates simultaneously, obtained using the bias-corrected estimators FI, MSI, IPW, and SPE proposed by Alonzo and Pepe (2005).
yi.ci.mar(Test, D, A, alpha = 0.05, precision = 1e-04, n.boot = 1000, plot = TRUE)
| Argument | Description |
|---|---|
| Test | Test results; a positive numeric vector. |
| D | Verified disease status; a logical vector with possible missing values. |
| A | Covariate; a positive numeric vector. Only one covariate is allowed. |
| alpha | Significance level for the confidence interval. Default is 0.05. |
| precision | Precision parameter used in the regression model. Default is 1e-4. |
| n.boot | Number of bootstrap replicates. Default is 1000. |
| plot | Logical; if TRUE, a density plot of the test measurements is produced. Default is TRUE. |
Bootstrap and MOVER-based confidence intervals are computed for the maximum Youden index.
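The disease model is estimated using a probit regression model linear in $T$ and $A$ and based on verified subjects, given by

$$P(D = 1 \mid T, A, V = 1) = \Phi(\beta_0 + \beta_1 T + \beta_2 A),$$

where $\Phi$ denotes the standard normal cumulative distribution function, $T$ is the test result (Test), $A$ is the covariate, and $V$ is the verification indicator ($V = 1$ when D is observed).

The verification model is estimated using a logit regression model linear in $T$ and $A$ and based on all subjects, given by

$$\operatorname{logit} P(V = 1 \mid T, A) = \gamma_0 + \gamma_1 T + \gamma_2 A,$$

where $\operatorname{logit}(x) = \log\{x / (1 - x)\}$.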
The function may also produce a density plot of the test measurements when plot = TRUE.
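The maximum Youden index is the largest value of sensitivity + specificity - 1 over all cutoffs. The sketch below searches the observed test values using FI-type weights from a probit disease model. The simulated data, the variable names, and the convention that a result is positive when Test exceeds the cutoff are assumptions for illustration rather than the package's code.

```r
# Illustrative sketch only: simulated data and hypothetical variable names.
set.seed(4)
n    <- 200
Test <- abs(rnorm(n))
A    <- abs(rnorm(n))
D    <- rbinom(n, 1, pnorm(-1.5 + Test + 0.5 * A))
V    <- rbinom(n, 1, plogis(-0.5 + Test + A))
dat  <- data.frame(Test, A, D = ifelse(V == 1, D, NA), V)

# FI weights: fitted disease probabilities from a probit model on verified subjects
fit.d <- glm(D ~ Test + A, family = binomial(link = "probit"),
             data = dat, subset = V == 1)
rho   <- predict(fit.d, newdata = dat, type = "response")
w.d   <- rho
w.nd  <- 1 - rho

# Evaluate weighted sensitivity and specificity at every observed cutoff
cuts <- sort(unique(dat$Test))
se   <- sapply(cuts, function(cut.pt) sum(w.d[dat$Test > cut.pt]) / sum(w.d))
sp   <- sapply(cuts, function(cut.pt) sum(w.nd[dat$Test <= cut.pt]) / sum(w.nd))
J    <- se + sp - 1

best           <- which.max(J)
youden         <- J[best]
optimal.cutoff <- cuts[best]
print(c(youden = youden, cutoff = optimal.cutoff,
        sensitivity = se[best], specificity = sp[best]))
```

At the largest observed cutoff the sensitivity is 0 and the specificity is 1, so the maximized index in this sketch is never negative.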
A list with elements:

| Element | Description |
|---|---|
| n.total | Total number of subjects. |
| n.case | Number of verified diseased subjects. |
| n.control | Number of verified non-diseased subjects. |
| p.missing | Proportion of missing verification. |
| pt.est | Point estimates of the maximum Youden index. |
| pt.est.ac | Point estimates of the maximum Youden index using the Agresti–Coull method. |
| optimal.cutoff | Optimal cutoff point of test results that maximizes the Youden index. |
| Wald.intervals | Wald confidence intervals. |
| BCI.intervals | Bootstrap classic confidence intervals, type I. |
| BCII.intervals | Bootstrap classic confidence intervals, type II. |
| BPac.intervals | Bootstrap percentile confidence intervals. |
| MOVERac.intervals | MOVER confidence intervals using the Agresti–Coull method. |
| MOVERws.intervals | MOVER confidence intervals using the Wilson score method. |
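The MOVER idea combines separate confidence limits for sensitivity and specificity into limits for their sum. The sketch below applies the general MOVER formula for a sum of two independent estimates to the Youden index, using textbook Wilson score limits with fully observed counts. The helpers `wilson.ci()` and `mover.youden()` and the counts are hypothetical, and the package's verification-bias-corrected version will differ in how the component intervals are obtained.

```r
# Illustrative sketch only: complete-data MOVER, hypothetical helper names.
wilson.ci <- function(x, n, alpha = 0.05) {
  z      <- qnorm(1 - alpha / 2)
  p.hat  <- x / n
  centre <- (p.hat + z^2 / (2 * n)) / (1 + z^2 / n)
  half   <- z / (1 + z^2 / n) * sqrt(p.hat * (1 - p.hat) / n + z^2 / (4 * n^2))
  c(centre - half, centre + half)
}

# MOVER limits for J = Se + Sp - 1 from component estimates and their limits
mover.youden <- function(se, se.ci, sp, sp.ci) {
  est   <- se + sp - 1
  lower <- est - sqrt((se - se.ci[1])^2 + (sp - sp.ci[1])^2)
  upper <- est + sqrt((se.ci[2] - se)^2 + (sp.ci[2] - sp)^2)
  c(estimate = est, lower = lower, upper = upper)
}

# Example: 40 of 50 diseased test positive, 85 of 100 non-diseased test negative
se  <- 40 / 50
sp  <- 85 / 100
res <- mover.youden(se, wilson.ci(40, 50), sp, wilson.ci(85, 100))
print(round(res, 4))
```

Because the component limits need not be symmetric around their estimates, the resulting interval for the Youden index can be asymmetric as well.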
Alonzo, T. A. and Pepe, M. S. (2005). Assessing accuracy of a continuous screening test in the presence of verification bias. Journal of the Royal Statistical Society: Series C (Applied Statistics).
Wang, S., Shi, S., and Qin, G. (2025). Interval estimation for the Youden index of a continuous diagnostic test with verification biased data. Statistical Methods in Medical Research.
set.seed(123)
Test <- abs(rnorm(100))   # simulated test results
A <- abs(rnorm(100))      # simulated covariate
D <- as.logical(Test + A > stats::quantile(Test + A, 0.8))
D[sample(100, 30)] <- NA  # 30 subjects are left unverified
yi.ci.mar(Test, D, A, n.boot = 20, plot = FALSE)  # small n.boot keeps the example fast