Site Performance Prediction Models
Pick sites on what they will do, not what they say.
How AI-based site performance prediction is reshaping clinical trial site selection — historical recruitment, eligible-population mapping and operational signals.
"Which sites will actually deliver — and which look good on paper but underperform?"
Site selection used to rely on principal investigator (PI) relationships; it is moving from relationship-driven to data-driven. AI prediction models that combine historical recruitment, EHR-derived eligible-population counts and operational signals consistently pick sites that recruit faster and on time.
What we’re seeing in the data.
Historical recruitment is the strongest signal
Past performance explains roughly 60% of the variance in site recruitment.
EHR-derived eligible counts beat self-reports
Sites overestimate eligible pools by 30–60%.
Operational variables matter
IRB speed, contract turnaround, staff turnover.
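To make the overestimation figure concrete, here is a minimal sketch of how a self-reported eligible count could be deflated into a plausible range. It assumes that "overestimate by X%" means the reported figure equals the true figure times (1 + X); the function name and that interpretation are illustrative assumptions, not a method stated above.

```python
def adjusted_eligible_range(self_reported, over_low=0.30, over_high=0.60):
    """Return (low, high) plausible eligible counts for a self-reported figure.

    Assumption: reported = true * (1 + overestimate), with the overestimate
    somewhere in the 30-60% band quoted in the text.
    """
    if self_reported < 0:
        raise ValueError("self_reported must be non-negative")
    low = self_reported / (1 + over_high)   # worst case: 60% overestimate
    high = self_reported / (1 + over_low)   # best case: 30% overestimate
    return low, high


# Hypothetical site claiming 200 eligible patients.
low, high = adjusted_eligible_range(200)
print(f"Plan for roughly {low:.0f} to {high:.0f} eligible patients")
```

Under that assumption, a claimed pool of 200 shrinks to somewhere between about 125 and 154 patients, which is why an EHR-derived count is the safer planning input.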
How to think about it.
01. Pull historical site performance: recruitment, retention, query rate.
02. Map eligible population: EHR-based, indication-specific.
03. Score operational variables: IRB speed, contract turnaround.
04. Build composite prediction: weighted model.
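The four steps above can be sketched as a small weighted scoring routine. Everything here is an illustrative assumption: the `Site` fields, the min-max scaling, the weights in `WEIGHTS` and the sample sites are hypothetical, and retention, query rate and staff turnover are left out for brevity. A production model would fit its weights to historical outcomes rather than set them by hand.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    hist_rate: float      # step 01: patients enrolled per month in past trials
    ehr_eligible: int     # step 02: EHR-derived eligible count for the indication
    irb_days: float       # step 03: typical IRB approval time
    contract_days: float  # step 03: typical contract turnaround


def min_max(values, higher_is_better=True):
    """Scale values to [0, 1]; invert when smaller raw values are better."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5] * len(values)  # no spread, so no information
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if higher_is_better else [1 - s for s in scaled]


# Hypothetical weights, chosen by hand for illustration only.
WEIGHTS = {"hist": 0.5, "ehr": 0.3, "ops": 0.2}


def composite_scores(sites):
    """Step 04: combine the scaled signals into one score per site."""
    hist = min_max([s.hist_rate for s in sites])
    ehr = min_max([s.ehr_eligible for s in sites])
    irb = min_max([s.irb_days for s in sites], higher_is_better=False)
    contract = min_max([s.contract_days for s in sites], higher_is_better=False)
    scores = {}
    for i, site in enumerate(sites):
        ops = (irb[i] + contract[i]) / 2
        scores[site.name] = (WEIGHTS["hist"] * hist[i]
                             + WEIGHTS["ehr"] * ehr[i]
                             + WEIGHTS["ops"] * ops)
    return scores


# Made-up candidate sites.
sites = [
    Site("A", hist_rate=4.0, ehr_eligible=300, irb_days=30, contract_days=45),
    Site("B", hist_rate=2.0, ehr_eligible=500, irb_days=60, contract_days=90),
    Site("C", hist_rate=1.0, ehr_eligible=100, irb_days=20, contract_days=30),
]
scores = composite_scores(sites)
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

Scaling each signal before weighting keeps a large raw number, such as an eligible count in the hundreds, from swamping a small one, such as a monthly enrollment rate.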
What separates a good answer from a defensible one.
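EHR-based estimates require privacy and data-governance compliance.
Cross-validate the model against held-out sites or past trials before trusting its rankings.
Oncology is not cardiology: build and validate models per indication.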
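As one way to picture the cross-validation point, the sketch below runs leave-one-site-out validation on a deliberately simple one-feature model that predicts observed enrollment rate from historical rate. The model, the function names and the numbers are all hypothetical stand-ins; the same loop applies to whatever predictor is actually in use.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return 0.0, my  # no spread in x: fall back to predicting the mean
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def leave_one_out_mae(xs, ys):
    """Hold out each site in turn, fit on the rest, average the absolute error."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        errors.append(abs(ys[i] - (slope * xs[i] + intercept)))
    return mean(errors)


# Made-up enrollment rates (patients per month) for five sites in one indication.
past_rate = [1.0, 2.0, 3.0, 4.0, 5.0]
observed_rate = [1.2, 1.9, 3.2, 3.8, 5.1]
mae = leave_one_out_mae(past_rate, observed_rate)
print(f"Leave-one-site-out mean absolute error: {mae:.2f} patients/month")
```

Running the loop separately for each indication, rather than pooling oncology and cardiology sites, gives an error estimate that reflects how the model will actually be used.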
Common questions.
Are PI relationships still relevant?
Yes, for engagement rather than selection. Use prediction to select sites and relationships to execute the trial.
What ROI is realistic?
A 10–30% timeline improvement when the models are applied with discipline.