For the findings of a proposed study of diagnostic accuracy to be robust, we require a minimum number of infected and uninfected cases. The total number of samples required to evaluate the diagnostic devices accurately can be calculated using the formula below.
The number \(\LARGE n\) of specimens needed to estimate diagnostic performance with adequate precision is calculated using the formula
\(\LARGE n = \frac{(1.96+1.28)^2 \, p(1-p)}{m \, (p-p_{0})^2}\)
where
\(\LARGE p\) is the expected sensitivity of the novel diagnostic
\(\LARGE p_{0}\) is the minimum acceptable sensitivity of the novel diagnostic
\(\LARGE m\) is the estimated prevalence of infection/disease/condition/state in the population
When \(\LARGE n\) is chosen this way, the lower limit of the confidence interval for the estimate of sensitivity/specificity is unlikely to fall below \(\LARGE p_{0}\). The constants 1.96 and 1.28 are the standard normal quantiles for a two-sided 95% confidence level and 90% power.
This is based on Banoo, S. et al. Evaluation of diagnostic tests for infectious diseases: general principles. Nature Reviews Microbiology 4, S20–S32 (2006).
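The formula above can be sketched as a small R function. This is a minimal illustration, not the document's own implementation: the name `sketch_required_specimens` and its arguments `z_alpha` and `z_beta` are hypothetical, and the sketch assumes that the number of cases needed is divided by the prevalence, so that the total grows as the condition becomes rarer.

```r
# Hypothetical helper (not the document's own function): total number of
# specimens needed so that the lower confidence limit is likely to exceed p0.
sketch_required_specimens <- function(p, p0, m, z_alpha = 1.96, z_beta = 1.28) {
  stopifnot(all(p > p0), all(m > 0 & m <= 1))
  # Number of cases with the condition needed to estimate sensitivity
  n_cases <- (z_alpha + z_beta)^2 * p * (1 - p) / (p - p0)^2
  # Assumption: scale up by prevalence to get the total number to recruit
  ceiling(n_cases / m)
}

# Expected sensitivity 0.80, minimum acceptable 0.70, prevalence 10%
sketch_required_specimens(p = 0.80, p0 = 0.70, m = 0.10)
```

With these inputs, roughly 168 cases with the condition are needed, which at 10% prevalence corresponds to a total of 1680 specimens.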
15.2 Libraries
library(tidyverse)
library(plotly)
15.5.1 At 5%, 10% and 15% tolerance in the lower limit of the estimate
df.5pc <- df %>%
  mutate(
    po = p - 0.05,
    n = required_specimens_min_acceptable(p = p, po = po, m = m),
    lower = "five_percent"
  )
df.10pc <- df %>%
  mutate(
    po = p - 0.10,
    n = required_specimens_min_acceptable(p = p, po = po, m = m),
    lower = "ten_percent"
  )
df.15pc <- df %>%
  mutate(
    po = p - 0.15,
    n = required_specimens_min_acceptable(p = p, po = po, m = m),
    lower = "fifteen_percent"
  )
df <- bind_rows(df.5pc, df.10pc, df.15pc) %>%
  mutate(lower = factor(lower, levels = c("five_percent", "ten_percent", "fifteen_percent")))
rm(df.5pc, df.10pc, df.15pc)
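The same table could also be built in one step by crossing the inputs with a vector of tolerances. The sketch below is only an illustration under stated assumptions: `sketch_required_specimens` is a hypothetical stand-in for `required_specimens_min_acceptable()`, which is defined elsewhere, and the grid values for `p` and `m` are invented for the example.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for required_specimens_min_acceptable();
# assumes the number of cases needed is divided by prevalence m
sketch_required_specimens <- function(p, po, m) {
  ceiling((1.96 + 1.28)^2 * p * (1 - p) / (p - po)^2 / m)
}

# Illustrative grid: these values of p and m are assumptions
grid <- expand_grid(
  p = seq(0.70, 0.95, by = 0.05),
  m = c(0.05, 0.10, 0.20),
  tolerance = c(0.05, 0.10, 0.15)
)

# One row per combination of p, m and tolerance
df_sketch <- grid %>%
  mutate(
    po = p - tolerance,
    n = sketch_required_specimens(p = p, po = po, m = m),
    lower = factor(
      tolerance,
      levels = c(0.05, 0.10, 0.15),
      labels = c("five_percent", "ten_percent", "fifteen_percent")
    )
  )
```

Crossing the inputs avoids the temporary data frames and makes it easy to add or remove a tolerance level by editing a single vector.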
15.6 Draw chart
The chart shows the required number of specimens \(\LARGE n\) (y axis) across a range of values of \(\LARGE p\).
Coloured lines show different underlying prevalence values, and facets show different tolerances for the lower limit of the estimate, here 5%, 10% and 15%. For example, for \(\LARGE p\) = 0.8 the minimum acceptable value \(\LARGE p_{0}\) would be 0.75, 0.70 or 0.65 respectively.
The ggplotly view allows you to explore results visually.
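A chart of the kind described above might be drawn as follows. This is a sketch rather than the document's own plotting code: the data frame `plot_df`, its grid values and the axis labels are assumptions made so that the example runs on its own, and prevalence is assumed to divide the number of cases needed.

```r
library(dplyr)
library(tidyr)
library(ggplot2)
library(plotly)

# Illustrative data (assumed values). Because po = p - tolerance,
# the denominator term (p - po)^2 is simply tolerance^2.
plot_df <- expand_grid(
  p = seq(0.70, 0.95, by = 0.05),
  m = c(0.05, 0.10, 0.20),
  tolerance = c(0.05, 0.10, 0.15)
) %>%
  mutate(
    n = ceiling((1.96 + 1.28)^2 * p * (1 - p) / tolerance^2 / m),
    lower = factor(
      tolerance,
      levels = c(0.05, 0.10, 0.15),
      labels = c("five_percent", "ten_percent", "fifteen_percent")
    )
  )

# One coloured line per prevalence, one facet per tolerance
chart <- ggplot(plot_df, aes(x = p, y = n, colour = factor(m))) +
  geom_line() +
  facet_wrap(~lower) +
  labs(
    x = "Expected sensitivity (p)",
    y = "Specimens required (n)",
    colour = "Prevalence (m)"
  )

# Interactive version with hover tooltips
ggplotly(chart)
```

Because the required numbers span a wide range, adding `scale_y_log10()` to the plot may make the lines easier to compare.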