Title: | Proportion Estimation with Marginal Proxy Information |
---|---|
Description: | Easy-to-use tools for the conditional estimation of the prevalence of emerging or rare infectious diseases, using the methods proposed in Guerrier et al. (2023) <arXiv:2012.10745>. |
Authors: | Stéphane Guerrier [aut, cre], Maria-Pia Victoria-Feser [aut], Christoph Kuzmics [aut] |
Maintainer: | Stéphane Guerrier <[email protected]> |
License: | AGPL-3 |
Version: | 1.0.0 |
Built: | 2024-10-27 05:49:58 UTC |
Source: | https://github.com/stephaneguerrier/pempi |
Proportion estimated using the MLE, with confidence intervals based on the asymptotic distribution of the estimator.
conditional_mle( R1 = NULL, R2 = NULL, R3 = NULL, R4 = NULL, n = R1 + R2 + R3 + R4, pi0, gamma = 0.05, alpha0 = 0, alpha = 0, beta = 0, V = NULL, ... )
R1 |
A |
R2 |
A |
R3 |
A |
R4 |
A |
n |
A |
pi0 |
A |
gamma |
A |
alpha0 |
A |
alpha |
A |
beta |
A |
V |
A |
... |
Additional arguments. |
A cpreval object with the structure:
estimate: Estimated proportion.
sd: Estimated standard error of the estimator.
ci_asym: Asymptotic confidence interval at the 1 - gamma confidence level.
gamma: Significance level; confidence intervals have confidence level 1 - gamma.
method: Estimation method (in this case mle).
measurement: A vector with (alpha0, alpha, beta).
beta0: Estimated false negative rate of the official procedure.
ci_beta0: Asymptotic confidence interval (1 - gamma confidence level) for beta0.
boundary: A boolean variable indicating whether the estimate falls at the boundary of the parameter space.
pi0: Value of pi0 (input value).
sampling: Type of sampling considered ("random" or "weighted").
V: Average sum of squared sampling weights if weighted/stratified sampling is used (otherwise NULL).
n: Sample size.
avar_beta0: Estimated asymptotic variance of beta0.
...: Additional parameters.
Stephane Guerrier, Maria-Pia Victoria-Feser, Christoph Kuzmics
# Samples without measurement error
X = sim_Rs(theta = 3/100, pi0 = 1/100, n = 1500, seed = 18)
conditional_mle(R1 = X$R1, R2 = X$R2, R3 = X$R3, R4 = X$R4, pi0 = X$pi0)

# With measurement error
X = sim_Rs(theta = 30/1000, pi0 = 10/1000, n = 1500, alpha0 = 0.001,
           alpha = 0.01, beta0 = 0.05, beta = 0.05, seed = 18)
conditional_mle(R1 = X$R1, R2 = X$R2, R3 = X$R3, R4 = X$R4, pi0 = X$pi0)
conditional_mle(R1 = X$R1, R2 = X$R2, R3 = X$R3, R4 = X$R4, pi0 = X$pi0,
                alpha0 = 0.001, alpha = 0.01, beta = 0.05)
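The relationship between the `estimate`, `sd`, `gamma` and `ci_asym` fields can be illustrated with a standard Wald interval. This is a minimal base-R sketch with made-up numbers. It assumes that the asymptotic interval has the usual normal form, and the package may additionally truncate the interval at the boundary of the parameter space.

```r
# Hypothetical values standing in for the `estimate` and `sd` fields of a result.
estimate <- 0.03
sd_hat   <- 0.004
gamma    <- 0.05

# Two-sided standard normal quantile for a 1 - gamma confidence level.
z <- qnorm(1 - gamma / 2)

# Wald interval: estimate plus or minus z standard errors.
ci_wald <- c(lower = estimate - z * sd_hat, upper = estimate + z * sd_hat)
ci_wald
```

With gamma = 0.05 the interval has nominal coverage 0.95, which is why a smaller gamma yields a wider interval.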
Data collected in Austria in 2020 to estimate COVID-19 prevalence (see, e.g., SORA, 2020; Kowarik et al., 2021, for more details).
covid19_austria
A matrix with 2290 rows and 3 variables:
Y: Binary variable, 1 if participant i tested positive in the survey sample, 0 otherwise.
Z: Binary variable, 1 if participant i was declared positive with the official procedure, 0 otherwise.
weights: Sampling weights.
Statistics Austria. 2020. "Prävalenz von SARS-CoV-2-Infektionen liegt bei 0.031."
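The `update_prevalence` example in this manual turns the columns `Y`, `Z` and `weights` into weighted counts and a value for `V`. The same computation is sketched here on a small made-up data frame, so it runs without the package. The toy values are purely illustrative and are not taken from the Austrian data.

```r
# Toy data using the same column names as the manual's examples (values are invented).
toy <- data.frame(
  Y       = c(1, 0, 1, 0, 0),
  Z       = c(1, 1, 0, 0, 0),
  weights = c(0.8, 1.2, 1.0, 0.9, 1.1)
)

# Weighted counts for the four combinations of Y and Z.
R1w <- sum(toy$weights[toy$Y == 1 & toy$Z == 1])
R2w <- sum(toy$weights[toy$Y == 0 & toy$Z == 1])
R3w <- sum(toy$weights[toy$Y == 1 & toy$Z == 0])
R4w <- sum(toy$weights[toy$Y == 0 & toy$Z == 0])

# Mean of the squared weights, as passed to the V argument in the manual's example.
V <- mean(toy$weights^2)

c(R1w = R1w, R2w = R2w, R3w = R3w, R4w = R4w, V = V)
```

The four weighted counts partition the sample, so they add up to the sum of the weights.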
Compute the joint probabilities P(W = j, Y = k) for j, k = 0, 1.
get_prob(theta, pi0, alpha, beta, alpha0)
theta |
A |
pi0 |
A |
alpha |
A |
beta |
A |
alpha0 |
A |
A vector containing tau1, tau2, tau3 and tau4.
Stephane Guerrier
prob1 = get_prob(theta = 0.02, pi0 = 0.01, alpha = 0, beta = 0, alpha0 = 0)
prob1
sum(prob1)

prob2 = get_prob(theta = 0.02, pi0 = 0.01, alpha = 0.001, beta = 0, alpha0 = 0.001)
prob2
sum(prob2)
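For intuition, the case without measurement error can be written down directly. The sketch below uses a hypothetical helper, `tau_no_error`, that is not part of the package. It assumes that tau1 to tau4 are the probabilities of the cells counted by R1 to R4, that every officially declared case is a true case, and that the survey test is perfect. Whether `get_prob` uses the same ordering and parameterization is an assumption.

```r
# Hypothetical helper (not part of the package): cell probabilities when
# alpha = beta = alpha0 = 0, under the assumptions stated above.
tau_no_error <- function(theta, pi0) {
  stopifnot(pi0 >= 0, theta >= pi0, theta <= 1)
  c(tau1 = pi0,          # positive with both procedures
    tau2 = 0,            # positive with the official procedure only
    tau3 = theta - pi0,  # positive in the survey only
    tau4 = 1 - theta)    # negative with both procedures
}

tau <- tau_no_error(theta = 0.02, pi0 = 0.01)
tau
sum(tau)
```

The four probabilities sum to one because the cells partition the population.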
Proportion estimated using the MLE, with confidence intervals based on the asymptotic distribution of the estimator.
marginal_mle( R1, R3, n, pi0, gamma = 0.05, alpha = 0, beta = 0, alpha0 = 0, V = NULL, ... )
R1 |
A |
R3 |
A |
n |
A |
pi0 |
A |
gamma |
A |
alpha |
A |
beta |
A |
alpha0 |
A |
V |
A |
... |
Additional arguments. |
A cpreval object with the structure:
estimate: Estimated proportion.
sd: Estimated standard error of the estimator.
ci_asym: Asymptotic confidence interval at the 1 - gamma confidence level.
gamma: Significance level; confidence intervals have confidence level 1 - gamma.
method: Estimation method (in this case marginal mle).
measurement: A vector with (alpha0, alpha, beta).
beta0: Estimated false negative rate of the official procedure.
ci_beta0: Asymptotic confidence interval (1 - gamma confidence level) for beta0.
boundary: A boolean variable indicating whether the estimate falls at the boundary of the parameter space.
pi0: Value of pi0 (input value).
sampling: Type of sampling considered ("random" or "weighted").
V: Average sum of squared sampling weights if weighted/stratified sampling is used (otherwise NULL).
n: Sample size.
avar_beta0: Estimated asymptotic variance of beta0.
...: Additional parameters.
Stephane Guerrier, Maria-Pia Victoria-Feser, Christoph Kuzmics
# Samples without measurement error
X = sim_Rs(theta = 3/100, pi0 = 1/100, n = 1500, seed = 18)
conditional_mle(R1 = X$R1, R2 = X$R2, R3 = X$R3, R4 = X$R4, n = X$n, pi0 = X$pi0)

# With measurement error
X = sim_Rs(theta = 30/1000, pi0 = 10/1000, n = 1500, alpha0 = 0.001,
           alpha = 0.01, beta0 = 0.05, beta = 0.05, seed = 18)
marginal_mle(R1 = X$R1, R3 = X$R3, n = X$n, pi0 = X$pi0)
marginal_mle(R1 = X$R1, R3 = X$R3, n = X$n, pi0 = X$pi0,
             alpha0 = 0.001, alpha = 0.01, beta0 = 0.05, beta = 0.05)
Proportion estimated using the moment-based estimator, with confidence intervals based on the asymptotic distribution of the estimator as well as on the Clopper-Pearson approach.
moment_estimator( R3, n, pi0, gamma = 0.05, alpha = 0, beta = 0, alpha0 = 0, V = NULL, ... )
R3 |
A |
n |
A |
pi0 |
A |
gamma |
A |
alpha |
A |
beta |
A |
alpha0 |
A |
V |
A |
... |
Additional arguments. |
A cpreval object with the structure:
estimate: Estimated proportion.
sd: Estimated standard error of the estimator.
ci_asym: Asymptotic confidence interval at the 1 - gamma confidence level.
ci_cp: Confidence interval (1 - gamma confidence level) based on the Clopper-Pearson approach.
gamma: Significance level; confidence intervals have confidence level 1 - gamma.
method: Estimation method (in this case moment estimator).
measurement: A vector with (alpha0, alpha, beta).
beta0: Estimated false negative rate of the official procedure.
ci_beta0: Asymptotic confidence interval (1 - gamma confidence level) for beta0.
boundary: A boolean variable indicating whether the estimate falls at the boundary of the parameter space.
pi0: Value of pi0 (input value).
sampling: Type of sampling considered ("random" or "weighted").
V: Average sum of squared sampling weights if weighted/stratified sampling is used (otherwise NULL).
n: Sample size.
avar_beta0: Estimated asymptotic variance of beta0.
...: Additional parameters.
Stephane Guerrier, Maria-Pia Victoria-Feser, Christoph Kuzmics
# Samples without measurement error
X = sim_Rs(theta = 3/100, pi0 = 1/100, n = 1500, seed = 18)
moment_estimator(R3 = X$R3, n = X$n, pi0 = X$pi0)

# With measurement error
X = sim_Rs(theta = 3/100, pi0 = 1/100, n = 1500, alpha0 = 0.001,
           alpha = 0.01, beta = 0.05, seed = 18)
moment_estimator(R3 = X$R3, n = X$n, pi0 = X$pi0)
moment_estimator(R3 = X$R3, n = X$n, pi0 = X$pi0,
                 alpha0 = 0.001, alpha = 0.01, beta = 0.05)
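To see why only R3, n and pi0 are needed, consider the case without measurement error. If the expected share of survey-only positives is theta - pi0, then matching R3 / n to that share gives a simple moment estimator. The helper `moment_sketch` below is hypothetical, and the moment condition is an assumption about the error-free case. The package's estimator also handles measurement error, sampling weights and boundary cases, none of which is shown here.

```r
# Hypothetical helper (not part of the package): solves R3 / n = theta - pi0
# for theta, assuming no measurement error and simple random sampling.
moment_sketch <- function(R3, n, pi0) {
  stopifnot(n > 0, R3 >= 0, R3 <= n, pi0 >= 0, pi0 <= 1)
  pi0 + R3 / n
}

# Invented counts: 30 survey-only positives among 1500 participants.
theta_tilde <- moment_sketch(R3 = 30, n = 1500, pi0 = 0.01)
theta_tilde
```

Under this reading, the estimate can never fall below pi0, since the officially declared cases act as a known lower bound on the prevalence.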
Simulation function for random variables of interest.
sim_Rs(theta, pi0, n, alpha0 = 0, alpha = 0, beta = 0, seed = NULL, ...)
theta |
A |
pi0 |
A |
n |
A |
alpha0 |
A |
alpha |
A |
beta |
A |
seed |
A |
... |
Additional arguments. |
A cpreval_sim object (list) with the structure:
R: the number of participants in the survey sample who tested positive.
R0: the number of participants in the survey sample who tested positive with the first testing device (and are thus members of the sub-population).
R1: the number of participants in the survey sample who tested positive with both (medical) testing devices (and are thus members of the sub-population).
R2: the number of participants in the survey sample who tested positive only with the first testing device (and are thus members of the sub-population).
R3: the number of participants in the survey sample who tested positive only with the second testing device.
R4: the number of participants who tested negative with the second testing device (and are either members of the sub-population and have tested negative with the first testing device or are not members of the sub-population).
n: the sample size.
alpha: the False Negative (FN) rate for the sample R.
beta: the False Positive (FP) rate for the sample R.
alpha0: the alpha0 probability (as defined above).
...: additional arguments.
Stephane Guerrier
# Samples without measurement error
sim_Rs(theta = 3/100, pi0 = 1/100, n = 1500, seed = 18)

# With measurement error
sim_Rs(theta = 3/100, pi0 = 1/100, n = 1500, alpha0 = 0,
       alpha = 0.01, beta = 0.05, seed = 18)
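The counts described above are linked by simple identities, which the following base-R sketch checks on invented test results for eight participants. It reads R4 as the number of participants negative with both devices, which is consistent with the default `n = R1 + R2 + R3 + R4` in `conditional_mle`. That reading and the toy vectors are assumptions.

```r
# Invented test results for 8 hypothetical participants (1 = positive).
first  <- c(1, 1, 0, 0, 0, 1, 0, 0)  # first (official) testing device
second <- c(1, 0, 1, 0, 0, 1, 1, 0)  # second (survey) testing device

R1 <- sum(first == 1 & second == 1)  # positive with both devices
R2 <- sum(first == 1 & second == 0)  # positive with the first device only
R3 <- sum(first == 0 & second == 1)  # positive with the second device only
R4 <- sum(first == 0 & second == 0)  # negative with both devices

R  <- R1 + R3            # all positives in the survey sample
R0 <- R1 + R2            # all positives with the first device
n  <- R1 + R2 + R3 + R4  # sample size

c(R = R, R0 = R0, R1 = R1, R2 = R2, R3 = R3, R4 = R4, n = n)
```

Here R equals the number of positives on the second device and R0 the number of positives on the first, so the four cell counts carry all the information in the two margins.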
Proportion estimated using the survey sample and confidence intervals based on the Clopper-Pearson and the standard asymptotic approach.
survey_mle(R, n, pi0 = 0, alpha = 0, beta = 0, gamma = 0.05, V = NULL, ...)
R |
A |
n |
A |
pi0 |
A |
alpha |
A |
beta |
A |
gamma |
A |
V |
A |
... |
Additional arguments. |
A cpreval object with the structure:
estimate: Estimated proportion.
sd: Estimated standard error of the estimator.
ci_asym: Asymptotic confidence interval at the 1 - gamma confidence level.
gamma: Significance level; confidence intervals have confidence level 1 - gamma.
method: Estimation method (in this case sample survey).
measurement: A vector with (alpha0, alpha, beta).
boundary: A boolean variable indicating whether the estimate falls at the boundary of the parameter space.
pi0: Value of pi0 (input value).
sampling: Type of sampling considered ("random" or "weighted").
V: Average sum of squared sampling weights if weighted/stratified sampling is used (otherwise NULL).
...: Additional parameters.
Stephane Guerrier, Maria-Pia Victoria-Feser, Christoph Kuzmics
# Samples without measurement error
X = sim_Rs(theta = 30/1000, pi0 = 10/1000, n = 1500, seed = 18)
survey_mle(R = X$R, n = X$n)

# With measurement error
X = sim_Rs(theta = 30/1000, pi0 = 10/1000, n = 1500,
           alpha = 0.01, beta = 0.05, seed = 18)
survey_mle(R = X$R, n = X$n)
survey_mle(R = X$R, n = X$n, alpha = 0.01, beta = 0.05)
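The Clopper-Pearson approach mentioned in the description can be reproduced in base R for the simplest setting: a binomial count R out of n, with no measurement error and no sampling weights. The sketch below uses invented counts and compares the textbook beta-quantile formula with `binom.test`, which reports the same exact interval. Whether `survey_mle` computes its interval in exactly this way is an assumption.

```r
# Invented counts: 45 positives among 1500 survey participants.
R     <- 45
n     <- 1500
gamma <- 0.05

# Sample proportion.
estimate <- R / n

# Clopper-Pearson limits via beta quantiles (valid for 0 < R < n).
ci_cp <- c(qbeta(gamma / 2, R, n - R + 1),
           qbeta(1 - gamma / 2, R + 1, n - R))

# Reference: the exact interval reported by stats::binom.test.
ci_ref <- as.numeric(binom.test(R, n, conf.level = 1 - gamma)$conf.int)

estimate
ci_cp
all.equal(ci_cp, ci_ref)
```

Unlike the asymptotic interval, the Clopper-Pearson interval always stays inside [0, 1], which matters when the prevalence is close to zero.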
Update the prevalence estimate and its confidence intervals using new case prevalence rates.
update_prevalence( pi0_new, x, gamma = 0.05, print = NULL, plot = NULL, col_line = "#2e5dc1", col_ci = "#2E5DC133", ... )
pi0_new |
A |
x |
A |
gamma |
A |
print |
A |
plot |
A |
col_line |
Color of the estimated prevalence. |
col_ci |
Color of the estimated prevalence confidence interval. |
... |
Additional arguments. |
A matrix object whose columns correspond to pi0, estimate, sd and CI.
Stephane Guerrier
# Austrian data (November 2020)
pi0 = 93914/7166167
data("covid19_austria")

# Weighted sampling
n = nrow(covid19_austria)
R1w = sum(covid19_austria$weights[covid19_austria$Y == 1 & covid19_austria$Z == 1])
R2w = sum(covid19_austria$weights[covid19_austria$Y == 0 & covid19_austria$Z == 1])
R3w = sum(covid19_austria$weights[covid19_austria$Y == 1 & covid19_austria$Z == 0])
R4w = sum(covid19_austria$weights[covid19_austria$Y == 0 & covid19_austria$Z == 0])

# Assumed measurement errors
alpha0 = 0
alpha = 1/100
beta = 10/100

# MME
mme = moment_estimator(R3 = R3w, n = n, pi0 = pi0, alpha = alpha, beta = beta,
                       alpha0 = alpha0, V = mean(covid19_austria$weights^2))
mme

# Update prevalence using a new pi0, say 1.5%, instead of 1.31%
update_prevalence(1.5/100, mme)
pi0_new = seq(from = 0.005, to = 0.03, length.out = 100)
update_prevalence(pi0_new, mme)