Package 'rocvb' reference manual

Title:	ROC-Based Inference for Diagnostic Accuracy Under Verification Bias
Description:	Provides point estimates and confidence intervals for receiver operating characteristic (ROC)–based diagnostic accuracy metrics for tests and biomarkers subject to verification bias. Supported metrics include the Area Under the ROC Curve (AUC), the Youden index, and the sensitivity at a user‑specified specificity level for two‑class continuous tests under missing‑at‑random (MAR) disease verification. Point estimation follows Alonzo and Pepe (2005) <doi:10.1111/j.1467-9876.2005.00477.x>. Multiple types of confidence intervals are implemented and compared, including bootstrap‑based, Method of Variance Estimates Recovery (MOVER)–based, and empirical likelihood (EL)–based intervals; see Wang et al. (2025) <doi:10.1177/09622802251322989> and <https://github.com/swang1021/rocvb>.
Authors:	Shirui Wang [aut, cre]
Maintainer:	Shirui Wang <[email protected]>
License:	MIT + file LICENSE
Version:	0.1.0
Built:	2026-06-05 08:49:20 UTC
Source:	https://github.com/swang1021/rocvb

Confidence Intervals for AUC Under MAR Verification

Description

Computes point estimates and confidence intervals for the AUC of a continuous test when disease verification is missing at random (MAR). The function returns four sets of estimates, one for each bias-corrected estimator proposed by Alonzo and Pepe (2005): full imputation (FI), mean score imputation (MSI), inverse probability weighting (IPW), and semiparametric efficient (SPE).

Usage

auc.ci.mar(
  Test,
  D,
  A,
  alpha = 0.05,
  search_step = 0.01,
  tol = 1e-05,
  precision = 1e-04,
  n.boot = 1000,
  plot = TRUE
)

Arguments

Test

Test results; a positive numeric vector.

D

Verified disease status; a logical vector in which NA marks subjects whose disease status was not verified.

A

Covariate; a positive numeric vector. Only one covariate is allowed.

alpha

Significance level for the confidence interval. Default is 0.05.

search_step

Step size used in root searching. Default is 0.01.

tol

Tolerance used in root searching. Default is 1e-5.

precision

Precision parameter used in the regression model. Default is 1e-4.

n.boot

Number of bootstrap replicates. Default is 1000.

plot

Logical; if TRUE (default), a density plot of the test measurements is produced.

Details

The function computes bootstrap (classic and percentile) and hybrid empirical likelihood (types I and II) confidence intervals for the AUC under verification bias.

The disease model $\rho$ is estimated using a probit regression model linear in $Test$ and $A$ based on verified subjects, given by

$\rho_i = P(D_i = 1 \mid T_i, A_i) = \Phi(\alpha + \beta T_i + \gamma A_i), \quad i = 1, \ldots, n,$

where $\Phi$ denotes the standard normal cumulative distribution function.

The verification model is estimated using a logit regression model linear in $Test$ and $A$ based on all subjects, given by

$\operatorname{logit}(\pi_i) = \log\!\left( \frac{\pi_i}{1 - \pi_i} \right) = \alpha + \beta T_i + \gamma A_i, \quad i = 1, \ldots, n,$

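where $\pi_i = P(V_i = 1 \mid T_i, A_i)$ and $V_i$ indicates whether the disease status of subject $i$ is verified ($V_i = 1$ if D is observed, $V_i = 0$ if D is missing). The coefficients $\alpha$, $\beta$, and $\gamma$ are estimated separately for the two models, and this $\alpha$ is unrelated to the significance level alpha.

When plot = TRUE, the function also produces a density plot of the test measurements.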
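
The two working models described above can be fitted with base R. The following is a minimal sketch on simulated data, not the code used inside the package; the simulation coefficients and the object names (dat, fit_rho, fit_pi, rho_hat, pi_hat) are assumptions made for illustration.

```r
# Simulated data for illustration only (all coefficients are assumptions).
set.seed(1)
n    <- 500
Test <- abs(rnorm(n))
A    <- abs(rnorm(n))
D    <- rbinom(n, 1, pnorm(-1.5 + 1.0 * Test + 0.5 * A))   # true disease status
V    <- rbinom(n, 1, plogis(-0.5 + 1.0 * Test + 0.5 * A))  # 1 = verified
D[V == 0] <- NA                                            # unverified subjects

dat <- data.frame(Test = Test, A = A, D = D, V = V)

# Disease model: probit regression fitted to verified subjects only.
fit_rho <- glm(D ~ Test + A, family = binomial(link = "probit"),
               data = dat, subset = V == 1)

# Verification model: logistic regression fitted to all subjects.
fit_pi <- glm(V ~ Test + A, family = binomial(link = "logit"), data = dat)

# Fitted probabilities for every subject, verified or not.
rho_hat <- predict(fit_rho, newdata = dat, type = "response")
pi_hat  <- predict(fit_pi,  newdata = dat, type = "response")
```

Because verification depends only on Test and A (the MAR assumption), the disease model fitted to verified subjects can be used to predict the disease probability for unverified subjects as well.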

Value

A list with elements:

n.total: Total number of subjects.
n.case: Number of verified diseased subjects.
n.control: Number of verified non-diseased subjects.
p.missing: Proportion of missing verification.
pt.est: Point estimates of AUC.
BC.intervals: Bootstrap classic (BC) confidence intervals.
BP.intervals: Bootstrap percentile (BP) confidence intervals.
HEL1.intervals: Hybrid empirical likelihood confidence intervals, type I.
HEL2.intervals: Hybrid empirical likelihood confidence intervals, type II.
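
The difference between a classic and a percentile bootstrap interval can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: the helper ipw_auc() is hypothetical, it uses only an IPW-type weighted Mann-Whitney statistic, it assumes larger test values indicate disease, and "classic" is read here as the point estimate plus or minus a normal quantile times the bootstrap standard deviation.

```r
# Hypothetical helper (not a rocvb function): IPW-type AUC computed as a
# weighted Mann-Whitney statistic. The verification model is refitted on
# whatever data set is passed in, so it is re-estimated in each replicate.
ipw_auc <- function(dat) {
  fit <- glm(V ~ Test + A, family = binomial, data = dat)
  w   <- dat$V / fitted(fit)               # inverse probability weights
  d   <- ifelse(dat$V == 1, dat$D, 0)      # unverified subjects get weight 0
  wD  <- w * d                             # weights of verified diseased
  wN  <- w * (1 - d)                       # weights of verified non-diseased
  cmp <- outer(dat$Test, dat$Test, ">") + 0.5 * outer(dat$Test, dat$Test, "==")
  sum(outer(wD, wN) * cmp) / (sum(wD) * sum(wN))
}

# Simulated data for illustration only (coefficients are assumptions).
set.seed(2)
n    <- 200
Test <- abs(rnorm(n))
A    <- abs(rnorm(n))
D    <- rbinom(n, 1, pnorm(-1 + Test + 0.5 * A))
V    <- rbinom(n, 1, plogis(-0.5 + Test + 0.5 * A))
D[V == 0] <- NA
dat  <- data.frame(Test = Test, A = A, D = D, V = V)

alpha  <- 0.05
n.boot <- 200
est    <- ipw_auc(dat)

# Resample subjects with replacement and recompute the estimate each time.
boot <- replicate(n.boot, ipw_auc(dat[sample(n, replace = TRUE), ]))

z  <- qnorm(1 - alpha / 2)
bc <- est + c(-1, 1) * z * sd(boot)                           # classic
bp <- quantile(boot, c(alpha / 2, 1 - alpha / 2), names = FALSE)  # percentile
```

The classic interval is symmetric around the point estimate, whereas the percentile interval follows the shape of the bootstrap distribution and stays inside the range of the resampled estimates.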

References

Alonzo, T. A. and Pepe, M. S. (2005). Assessing accuracy of a continuous screening test in the presence of verification bias. Journal of the Royal Statistical Society: Series C (Applied Statistics).

Wang, S., Shi, S., and Qin, G. (2026). Empirical likelihood inference for the area under the ROC curve with verification-biased data. Manuscript under peer review.

Examples

set.seed(123)
Test <- abs(rnorm(100))
A <- abs(rnorm(100))
D <- as.logical(Test + A > stats::quantile(Test + A, 0.8))
D[sample(100, 30)] <- NA
auc.ci.mar(Test, D, A, n.boot = 20, plot = FALSE)

Confidence Intervals for Sensitivity at Fixed Level of Specificity Under MAR Verification

Description

Computes point estimates and confidence intervals for the sensitivity of a continuous test at a fixed level of specificity when disease verification is missing at random (MAR). The function returns four sets of estimates, one for each bias-corrected estimator proposed by Alonzo and Pepe (2005): full imputation (FI), mean score imputation (MSI), inverse probability weighting (IPW), and semiparametric efficient (SPE).

Usage

sen.ci.mar(
  Test,
  D,
  A,
  p,
  alpha = 0.05,
  search_step = 0.01,
  tol = 1e-05,
  precision = 1e-04,
  n.boot = 1000,
  plot = TRUE
)

Arguments

Test

Test results; a positive numeric vector.

D

Verified disease status; a logical vector in which NA marks subjects whose disease status was not verified.

A

Covariate; a positive numeric vector. Only one covariate is allowed.

p

Target specificity level; a number between 0 and 1.

alpha

Significance level for the confidence interval. Default is 0.05.

search_step

Step size used in root searching. Default is 0.01.

tol

Tolerance used in root searching. Default is 1e-5.

precision

Precision parameter used in the regression model. Default is 1e-4.

n.boot

Number of bootstrap replicates. Default is 1000.

plot

Logical; if TRUE (default), a density plot of the test measurements is produced.

Details

The function targets the sensitivity at specificity level p, that is, the sensitivity at the threshold that achieves specificity p. It computes Agresti–Coull, Wilson score, bootstrap, hybrid empirical likelihood, and influence function-based empirical likelihood confidence intervals, each returned as an element of the output list.

The disease model $\rho$ is estimated using a probit regression model linear in $Test$ and $A$ based on verified subjects, given by

$\rho_i = P(D_i = 1 \mid T_i, A_i) = \Phi(\alpha + \beta T_i + \gamma A_i), \quad i = 1, \ldots, n,$

where $\Phi$ denotes the standard normal cumulative distribution function.

The verification model is estimated using a logit regression model linear in $Test$ and $A$ based on all subjects, given by

$\operatorname{logit}(\pi_i) = \log\!\left( \frac{\pi_i}{1 - \pi_i} \right) = \alpha + \beta T_i + \gamma A_i, \quad i = 1, \ldots, n,$

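where $\pi_i = P(V_i = 1 \mid T_i, A_i)$ and $V_i$ indicates whether the disease status of subject $i$ is verified ($V_i = 1$ if D is observed, $V_i = 0$ if D is missing). The coefficients $\alpha$, $\beta$, and $\gamma$ are estimated separately for the two models, and this $\alpha$ is unrelated to the significance level alpha.

When plot = TRUE, the function also produces a density plot of the test measurements.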
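
To make the target quantity concrete, the sketch below computes the sensitivity at specificity p for the four estimators, using the weighting schemes of Alonzo and Pepe (2005). It is an illustration on simulated data under stated assumptions, not the package's code: the helper sens_at_spec() and all object names are hypothetical, larger test values are assumed to indicate disease, and the threshold is taken as the smallest observed cutoff whose estimated specificity reaches p.

```r
# Simulated data for illustration only (coefficients are assumptions).
set.seed(3)
n    <- 500
Test <- abs(rnorm(n))
A    <- abs(rnorm(n))
D    <- rbinom(n, 1, pnorm(-1.5 + Test + 0.5 * A))
V    <- rbinom(n, 1, plogis(-0.5 + Test + 0.5 * A))
D[V == 0] <- NA
dat  <- data.frame(Test = Test, A = A, D = D, V = V)

# Working models: probit disease model (verified only), logit verification model.
fit_rho <- glm(D ~ Test + A, family = binomial(link = "probit"),
               data = dat, subset = V == 1)
fit_pi  <- glm(V ~ Test + A, family = binomial(link = "logit"), data = dat)
rho_hat <- predict(fit_rho, newdata = dat, type = "response")
pi_hat  <- predict(fit_pi,  newdata = dat, type = "response")

d <- ifelse(V == 1, D, 0)   # observed status; 0 as placeholder when unverified

# Per-subject weights for "diseased" (wD) and "non-diseased" (wN).
wD <- cbind(FI  = rho_hat,
            MSI = V * d + (1 - V) * rho_hat,
            IPW = V * d / pi_hat,
            SPE = V * d / pi_hat - (V - pi_hat) / pi_hat * rho_hat)
wN <- cbind(FI  = 1 - rho_hat,
            MSI = V * (1 - d) + (1 - V) * (1 - rho_hat),
            IPW = V * (1 - d) / pi_hat,
            SPE = V * (1 - d) / pi_hat - (V - pi_hat) / pi_hat * (1 - rho_hat))

# Hypothetical helper: sensitivity at the first cutoff with specificity >= p.
sens_at_spec <- function(Test, wD, wN, p) {
  cuts <- sort(unique(Test))
  spec <- sapply(cuts, function(c0) sum(wN[Test < c0]) / sum(wN))
  c_p  <- cuts[which(spec >= p)[1]]
  sum(wD[Test >= c_p]) / sum(wD)
}

sens <- sapply(colnames(wD), function(k) {
  sens_at_spec(Test, wD[, k], wN[, k], p = 0.8)
})
```

Note that the SPE weights can be negative for individual subjects, so the estimated specificity is not guaranteed to be monotone in the cutoff; the sketch simply takes the first cutoff that reaches p.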

Value

A list with elements:

n.total: Total number of subjects.
n.case: Number of verified diseased subjects.
n.control: Number of verified non-diseased subjects.
p.missing: Proportion of missing verification.
pt.est: Point estimates of sensitivity at specificity p.
pt.est.ac: Point estimates of sensitivity at specificity p using the Agresti–Coull method.
AC.intervals: Agresti–Coull-based confidence intervals.
WS.intervals: Wilson score-based confidence intervals.
BTI.intervals: Bootstrap confidence intervals, type I.
BTII.intervals: Bootstrap confidence intervals, type II.
HEL1.intervals: Hybrid empirical likelihood confidence intervals, type I.
HEL2.intervals: Hybrid empirical likelihood confidence intervals, type II.
IFEL1.intervals: Influence function-based empirical likelihood confidence intervals, type I.
IFEL2.intervals: Influence function-based empirical likelihood confidence intervals, type II.
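
For orientation, the textbook Agresti–Coull and Wilson score intervals for a binomial proportion are sketched below. How the package adapts them to verification-biased estimates (for example, which sample size it plugs in) is not described here, so the helpers wilson_ci() and agresti_coull_ci() are hypothetical and cover only the standard fully observed case.

```r
# Wilson score interval for x successes out of n trials.
wilson_ci <- function(x, n, alpha = 0.05) {
  z      <- qnorm(1 - alpha / 2)
  p      <- x / n
  centre <- (p + z^2 / (2 * n)) / (1 + z^2 / n)
  half   <- z * sqrt(p * (1 - p) / n + z^2 / (4 * n^2)) / (1 + z^2 / n)
  c(lower = centre - half, upper = centre + half)
}

# Agresti-Coull interval: a Wald-type interval around the adjusted
# estimate p_t = (x + z^2 / 2) / (n + z^2).
agresti_coull_ci <- function(x, n, alpha = 0.05) {
  z    <- qnorm(1 - alpha / 2)
  n_t  <- n + z^2
  p_t  <- (x + z^2 / 2) / n_t
  half <- z * sqrt(p_t * (1 - p_t) / n_t)
  c(estimate = p_t, lower = p_t - half, upper = p_t + half)
}

# Example: 40 true positives among 50 diseased subjects.
ws <- wilson_ci(40, 50)
ac <- agresti_coull_ci(40, 50)
```

The two intervals share the same centre, the adjusted estimate p_t, which is shifted from x / n towards 0.5. The Agresti–Coull interval is at least as wide as the Wilson interval.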

References

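Alonzo, T. A. and Pepe, M. S. (2005). Assessing accuracy of a continuous screening test in the presence of verification bias. Journal of the Royal Statistical Society: Series C (Applied Statistics).

Wang, S., Shi, S., and Qin, G. (2026). Empirical likelihood-based confidence intervals for sensitivity of a continuous test at a fixed level of specificity with verification bias. Manuscript under peer review.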

Examples

set.seed(123)
Test <- abs(rnorm(100))
A <- abs(rnorm(100))
D <- as.logical(Test + A > stats::quantile(Test + A, 0.8))
D[sample(100, 30)] <- NA
sen.ci.mar(Test, D, A, p = 0.8, n.boot = 20, plot = FALSE)

Confidence Intervals for Youden Index Under MAR Verification

Description

Computes point estimates and confidence intervals for the maximum Youden index of a continuous test when disease verification is missing at random (MAR). The function returns four sets of estimates, one for each bias-corrected estimator proposed by Alonzo and Pepe (2005): full imputation (FI), mean score imputation (MSI), inverse probability weighting (IPW), and semiparametric efficient (SPE).

Usage

yi.ci.mar(
  Test,
  D,
  A,
  alpha = 0.05,
  precision = 1e-04,
  n.boot = 1000,
  plot = TRUE
)

Arguments

Test

Test results; a positive numeric vector.

D

Verified disease status; a logical vector in which NA marks subjects whose disease status was not verified.

A

Covariate; a positive numeric vector. Only one covariate is allowed.

alpha

Significance level for the confidence interval. Default is 0.05.

precision

Precision parameter used in the regression model. Default is 1e-4.

n.boot

Number of bootstrap replicates. Default is 1000.

plot

Logical; if TRUE (default), a density plot of the test measurements is produced.

Details

The function computes Wald, bootstrap, and MOVER-based confidence intervals for the maximum Youden index, where MOVER denotes the Method of Variance Estimates Recovery.

The disease model $\rho$ is estimated using a probit regression model linear in $Test$ and $A$ based on verified subjects, given by

$\rho_i = P(D_i = 1 \mid T_i, A_i) = \Phi(\alpha + \beta T_i + \gamma A_i), \quad i = 1, \ldots, n,$

where $\Phi$ denotes the standard normal cumulative distribution function.

The verification model is estimated using a logit regression model linear in $Test$ and $A$ based on all subjects, given by

$\operatorname{logit}(\pi_i) = \log\!\left( \frac{\pi_i}{1 - \pi_i} \right) = \alpha + \beta T_i + \gamma A_i, \quad i = 1, \ldots, n,$

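where $\pi_i = P(V_i = 1 \mid T_i, A_i)$ and $V_i$ indicates whether the disease status of subject $i$ is verified ($V_i = 1$ if D is observed, $V_i = 0$ if D is missing). The coefficients $\alpha$, $\beta$, and $\gamma$ are estimated separately for the two models, and this $\alpha$ is unrelated to the significance level alpha.

When plot = TRUE, the function also produces a density plot of the test measurements.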
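
The maximum Youden index and its cutoff can be sketched as a search over observed test values, given per-subject weights for the diseased and non-diseased groups. The helper youden_max() is hypothetical and not part of the package; for brevity the example uses IPW weights with the true verification probabilities of the simulation (an assumption, since in practice they are estimated), and it assumes larger test values indicate disease.

```r
# Hypothetical helper: maximise J(c) = TPR(c) - FPR(c) over observed cutoffs.
youden_max <- function(Test, wD, wN) {
  cuts <- sort(unique(Test))
  tpr  <- sapply(cuts, function(c0) sum(wD[Test >= c0]) / sum(wD))
  fpr  <- sapply(cuts, function(c0) sum(wN[Test >= c0]) / sum(wN))
  j    <- tpr - fpr
  k    <- which.max(j)                      # first maximiser if there are ties
  list(J = j[k], cutoff = cuts[k], sens = tpr[k], spec = 1 - fpr[k])
}

# Simulated data for illustration only (coefficients are assumptions).
set.seed(4)
n       <- 400
Test    <- abs(rnorm(n))
A       <- abs(rnorm(n))
D       <- rbinom(n, 1, pnorm(-1.5 + Test + 0.5 * A))
pi_true <- plogis(-0.5 + Test + 0.5 * A)    # known only because data are simulated
V       <- rbinom(n, 1, pi_true)
d       <- ifelse(V == 1, D, 0)             # status is used for verified subjects only

res <- youden_max(Test,
                  wD = V * d / pi_true,
                  wN = V * (1 - d) / pi_true)
```

The returned cutoff plays the role of optimal.cutoff in the Value list below, and J equals sens + spec - 1 at that cutoff.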

Value

A list with elements:

n.total: Total number of subjects.
n.case: Number of verified diseased subjects.
n.control: Number of verified non-diseased subjects.
p.missing: Proportion of missing verification.
pt.est: Point estimates of the maximum Youden index.
pt.est.ac: Point estimates of the maximum Youden index using the Agresti–Coull method.
optimal.cutoff: Optimal cutoff point of test results that maximizes the Youden index.
Wald.intervals: Wald confidence intervals.
BCI.intervals: Bootstrap classic confidence intervals, type I.
BCII.intervals: Bootstrap classic confidence intervals, type II.
BPac.intervals: Bootstrap percentile confidence intervals.
MOVERac.intervals: MOVER confidence intervals using the Agresti–Coull method.
MOVERws.intervals: MOVER confidence intervals using the Wilson score method.

References

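Alonzo, T. A. and Pepe, M. S. (2005). Assessing accuracy of a continuous screening test in the presence of verification bias. Journal of the Royal Statistical Society: Series C (Applied Statistics).

Wang, S., Shi, S., and Qin, G. (2025). Interval estimation for the Youden index of a continuous diagnostic test with verification biased data. Statistical Methods in Medical Research.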

Examples

set.seed(123)
Test <- abs(rnorm(100))
A <- abs(rnorm(100))
D <- as.logical(Test + A > stats::quantile(Test + A, 0.8))
D[sample(100, 30)] <- NA
yi.ci.mar(Test, D, A, n.boot = 20, plot = FALSE)
