ORACLE-ClinicalTrials-SuccessProb-v1
A probability estimate of clinical trial success, combining population, mechanism, and outcome-pattern signals.
ORACLE is the ensemble classifier in the OntologerMed suite. It takes the three 256-dimensional embedding vectors produced by PACT, MOAt, and FATE, concatenated into a single 768-dimensional feature vector, and outputs a calibrated probability of trial success along with a three-tier risk classification.
Where each embedding model answers a single question (who was enrolled, what mechanism, what outcome pattern), ORACLE combines all three into one number: how likely is this trial to succeed?
Model Overview
| Property | Value |
|---|---|
| Model name | ORACLE-ClinicalTrials-SuccessProb-v1 |
| Model type | Gradient-boosted ensemble + Platt calibration |
| Input | 768-dim concatenated vector: [PACT (256) \| MOAt (256) \| FATE (256)] |
| Output | Calibrated success probability (0–1) + risk tier |
| Training set | 12,692 completed trials with verified outcome labels |
| Validation set | 2,720 trials |
| Test set | 2,720 trials |
| Positive rate | 58.5% (success-labelled trials) |
| License | Apache 2.0 |
| HuggingFace | Ontologer/ORACLE-ClinicalTrials-SuccessProb-v1 |
Performance
| Split | ROC-AUC | Average Precision | Brier Score |
|---|---|---|---|
| Train | 0.791 | 0.829 | 0.185 |
| Validation | 0.736 | 0.776 | 0.202 |
| Test | 0.734 | 0.782 | 0.205 |
- ROC-AUC 0.734 on held-out test set: the model ranks a randomly chosen successful trial above a randomly chosen failed trial 73.4% of the time
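- Average Precision 0.782: performance on the positive class, relevant for screening workflows where identifying likely successes matters most. For reference, a classifier with no ranking ability would score roughly 0.585, the positive rate
- Brier Score 0.205: mean squared error of the probability estimates (lower is better). Always predicting 0.5 scores 0.25, and always predicting the 58.5% base rate scores about 0.243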
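The three metrics above can be recomputed from any set of labels and predicted probabilities. The following is a minimal sketch using scikit-learn on synthetic data. The labels and scores are generated for illustration only and are not ORACLE outputs; the `evaluate` helper is a hypothetical name.

```python
import numpy as np
from sklearn.metrics import average_precision_score, brier_score_loss, roc_auc_score

def evaluate(y_true, y_prob):
    """Return ROC-AUC, Average Precision, and Brier score for one split."""
    return {
        "roc_auc": roc_auc_score(y_true, y_prob),
        "average_precision": average_precision_score(y_true, y_prob),
        "brier": brier_score_loss(y_true, y_prob),
    }

# Synthetic labels with roughly the card's 58.5% positive rate (illustrative only).
rng = np.random.default_rng(0)
y_true = (rng.random(2720) < 0.585).astype(int)

# Brier reference points: constant predictions carry no ranking signal.
brier_half = brier_score_loss(y_true, np.full(y_true.shape, 0.5))
brier_base = brier_score_loss(y_true, np.full(y_true.shape, y_true.mean()))

# A noisy but informative synthetic score, standing in for a real model's output.
noise = rng.normal(0.0, 0.15, y_true.size)
y_prob = np.clip(0.585 + 0.25 * (y_true - 0.585) + noise, 0.0, 1.0)
metrics = evaluate(y_true, y_prob)
print(metrics, brier_half, brier_base)
```

Comparing a model's Brier score against `brier_base` rather than 0.25 gives a stricter reference when the classes are imbalanced.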
Risk Tiers
| Tier | Probability range | Interpretation |
|---|---|---|
| High confidence | ≥ 0.676 | Design profile consistent with historically successful trials |
| Moderate | 0.488 to < 0.676 | Mixed signals; warrants deeper review |
| Caution | < 0.488 | Design profile closer to historical failures |
Thresholds were calibrated on the validation set to balance precision and recall across tiers.
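When many trials are scored at once, the tier boundaries in the table can be applied in a single vectorised step. This is a sketch under the assumption that the Moderate band is half-open (0.488 inclusive, 0.676 exclusive), which is how the ≥ and < signs in the table read; `assign_tiers` is a hypothetical helper name.

```python
import numpy as np

# Thresholds from the table above; tier names ordered from lowest to highest band.
THRESHOLDS = np.array([0.488, 0.676])
TIERS = np.array(["CAUTION", "MODERATE", "HIGH CONFIDENCE"])

def assign_tiers(probs):
    """Map calibrated probabilities to risk tiers.

    np.digitize returns 0 for p < 0.488, 1 for 0.488 <= p < 0.676,
    and 2 for p >= 0.676, which indexes directly into TIERS.
    """
    probs = np.asarray(probs, dtype=float)
    return TIERS[np.digitize(probs, THRESHOLDS)]

tiers = assign_tiers([0.34, 0.488, 0.60, 0.676, 0.81])
print(tiers.tolist())
```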
How It Works
ORACLE is a second-stage model that sits on top of the three OntologerMed embedding models. It does not read text directly. It reads the embedding vectors.
Trial text
    ↓
[PACT] → 256-dim population vector
[MOAt] → 256-dim mechanism vector  → concatenate → 768-dim feature → ORACLE → P(success)
[FATE] → 256-dim outcome-pattern vector
Each embedding model contributes a different type of signal:
- PACT tells ORACLE what population was enrolled: certain populations have historically better success rates than others
- MOAt tells ORACLE what mechanism was targeted: validated mechanisms in proven indications differ from first-in-class attempts
- FATE tells ORACLE what the trial's outcome-pattern neighborhood looks like: the most direct historical comparator signal
The ensemble classifier learns the interaction between these three signals. A trial with a validated mechanism (MOAt) in a well-studied population (PACT) that sits in a high-success historical neighborhood (FATE) scores very differently from a first-in-class mechanism in an underserved population with few historical comparators.
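Because ORACLE reads a single concatenated vector, the order and size of the three embeddings matter. The sketch below shows one way to assemble the feature matrix with basic shape checks. `build_features` is a hypothetical helper, and the placeholder vectors stand in for real model outputs.

```python
import numpy as np

EMBED_DIM = 256  # per-model embedding size stated in this card

def build_features(v_pact, v_moat, v_fate):
    """Concatenate the three embeddings in the order PACT, MOAt, FATE."""
    names = ("PACT", "MOAt", "FATE")
    parts = [np.atleast_2d(np.asarray(v, dtype=np.float32)) for v in (v_pact, v_moat, v_fate)]
    for name, part in zip(names, parts):
        if part.shape[1] != EMBED_DIM:
            raise ValueError(f"{name} embedding has {part.shape[1]} dims, expected {EMBED_DIM}")
    if len({part.shape[0] for part in parts}) != 1:
        raise ValueError("all three embeddings must cover the same number of trials")
    return np.concatenate(parts, axis=1)  # shape: (n_trials, 768)

# Placeholder vectors (not real embeddings) to show the resulting layout.
features = build_features(np.zeros(256), np.ones(256), np.full(256, 2.0))
print(features.shape)
```

Columns 0 to 255 then hold the PACT vector, 256 to 511 the MOAt vector, and 512 to 767 the FATE vector.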
Architecture
768-dim input vector
    ↓ Gradient-boosted ensemble classifier (base_model.joblib)
    ↓ Raw score
    ↓ Platt scaling calibrator (calibrator.joblib)
    ↓ Calibrated probability P(success)
- Classifier: Gradient-boosted ensemble (scikit-learn), trained on 12,692 labeled trials
- Calibration: Platt scaling to ensure probability outputs are well-calibrated, not just discriminative
- Feature space: 768 dimensions (256 each from PACT, MOAt, FATE), concatenated in that order
- Labels: Binary. 1 = primary endpoint success, 0 = completed trial with negative primary endpoint
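The two-stage design can be illustrated with scikit-learn on synthetic data. This is a sketch, not the actual training recipe: the card does not state which gradient-boosting estimator or hyperparameters were used, so `GradientBoostingClassifier` with 50 trees is an assumption, and the feature matrix is a small synthetic stand-in for the 768-dim input. Platt scaling is implemented here as a one-feature logistic regression fitted on held-out raw scores, mirroring the `calibrator.predict_proba(raw_score.reshape(-1, 1))` call in the Usage section.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real features (kept small so the sketch runs quickly).
X, y = make_classification(n_samples=1500, n_features=32, n_informative=8, random_state=0)
X_train, X_cal, y_train, y_cal = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Stage 1: the gradient-boosted classifier produces a raw, uncalibrated score.
base_model = GradientBoostingClassifier(n_estimators=50, random_state=0)
base_model.fit(X_train, y_train)

# Stage 2: Platt scaling. A logistic regression on held-out raw scores, so the
# calibration map is not learned from the data the classifier was fitted on.
raw_cal = base_model.predict_proba(X_cal)[:, 1].reshape(-1, 1)
calibrator = LogisticRegression()
calibrator.fit(raw_cal, y_cal)

def predict_calibrated(features):
    """Run both stages and return calibrated probabilities of the positive class."""
    raw = base_model.predict_proba(features)[:, 1].reshape(-1, 1)
    return calibrator.predict_proba(raw)[:, 1]

probs = predict_calibrated(X_cal[:5])
print(probs)
```

Fitting the calibrator on a held-out split keeps the probability map from inheriting the classifier's overconfidence on its own training data.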
Business Use & Applications
ORACLE is the synthesis layer of the OntologerMed suite, turning three independent similarity signals into a single actionable score for workflows that need to rank, prioritise, or screen large numbers of trials.
Investment & Due Diligence
Pipeline scoring at scale: score an entire company's pipeline in seconds, not weeks
- Embed all pipeline assets through PACT, MOAt, and FATE, then pass through ORACLE to rank by predicted success probability
- Triage a 20-asset portfolio into high/moderate/caution tiers before allocating analyst time
- Compare the probability distribution across a portfolio company's pipeline against sector benchmarks
M&A and licensing prioritisation: rank acquisition targets by evidence-adjusted success probability
- Score assets across multiple acquisition targets simultaneously and rank for further diligence
- Identify assets with high ORACLE scores but low market visibility, which are potential undervalued opportunities
- Use ORACLE scores to stress-test management's stated confidence levels against historical pattern evidence
Fund-level portfolio monitoring: track success probability signals across a portfolio continuously
- Re-score portfolio company assets as new trials register or amend, to identify deteriorating signals early
- Flag assets whose ORACLE score drops significantly following a protocol amendment or competitor readout
- Aggregate portfolio-level success probability distribution for LP reporting
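The monitoring bullets above describe flagging assets whose score drops between scoring runs. A minimal sketch of that comparison follows. The asset names, scores, and the 0.10 drop threshold are all hypothetical; the card does not define what counts as a significant drop.

```python
def flag_score_drops(previous, current, min_drop=0.10):
    """Return assets whose score fell by at least `min_drop` between two runs.

    `previous` and `current` map asset identifiers to ORACLE probabilities.
    The 0.10 default is an arbitrary illustrative value, not a recommendation.
    """
    flagged = []
    for asset, old in previous.items():
        new = current.get(asset)
        if new is not None and old - new >= min_drop:
            flagged.append((asset, old, new))
    # Largest drop first.
    return sorted(flagged, key=lambda row: row[1] - row[2], reverse=True)

# Hypothetical asset names and scores from two scoring runs.
previous = {"asset-A": 0.72, "asset-B": 0.55, "asset-C": 0.61}
current = {"asset-A": 0.70, "asset-B": 0.38, "asset-C": 0.49}
flagged = flag_score_drops(previous, current)
print(flagged)
```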
Pharmaceutical & Biotech R&D
Internal portfolio prioritisation: rank R&D programmes by historical success-pattern strength
- Score internal pipeline assets and rank for capital allocation decisions
- Identify programmes in low-confidence tiers that may warrant additional mechanistic validation before Phase 3
- Use ORACLE scores as one input in go/no-go decisions at phase transitions
Competitive asset monitoring: score competitor trials as they register
- Automatically score all newly registered Phase 2/3 trials in your indication for rapid competitive assessment
- Identify high-scoring competitor trials that may represent significant threats or partnership opportunities
- Track how ORACLE scores evolve for competitor assets as they amend protocols
Clinical Research Organisations (CROs)
Bid/no-bid risk quantification: add an objective risk score to commercial feasibility assessments
- Score a prospective client's trial through ORACLE as part of the bid evaluation process
- Quantify the historical risk pattern to inform resourcing, timeline, and success fee structuring
- Benchmark the client's trial against historical trials in the same tier to set realistic expectations
HealthTech & Clinical Intelligence Platforms
Trial scoring API: expose ORACLE as a risk signal within broader clinical intelligence platforms
- Integrate ORACLE scores into trial registry monitoring tools to automatically flag high/low confidence assets
- Power risk-adjusted trial recommendations for investors, operators, and clinicians
- Combine ORACLE scores with financial and regulatory metadata for richer due diligence products
Example Output
Example 1: High-Confidence Trial
Input: Phase 3 RCT of upadacitinib (JAK inhibitor) vs placebo in bDMARD-failure RA patients. ACR20 primary endpoint at Week 12.
PACT embedding: RA population (well-studied, high historical density)
MOAt embedding: JAK inhibitor mechanism (validated across multiple approved drugs)
FATE embedding: Neighborhood success rate = 85% (17/20 neighbors = success)
ORACLE output:
P(success) = 0.81
Tier: HIGH CONFIDENCE
Signal: Validated mechanism in proven population with strong historical outcome neighborhood.
Example 2: Caution Tier
Input: Phase 2 single-arm study of novel gene therapy in ultra-rare metabolic disorder. 40 patients. No approved comparator. First-in-class mechanism.
PACT embedding: Rare metabolic disease (sparse population neighborhood, few historical trials)
MOAt embedding: Novel gene therapy mechanism (few historical neighbors, early-class signal)
FATE embedding: Neighborhood success rate = 41% (mixed historical outcomes in rare disease gene therapy)
ORACLE output:
P(success) = 0.34
Tier: CAUTION
Signal: First-in-class mechanism, sparse population history, mixed outcome neighborhood.
Note: A low score in rare disease gene therapy does not indicate poor science; it reflects
limited historical precedent. Warrants mechanism-level review, not dismissal.
Usage
import joblib
import numpy as np
from sentence_transformers import SentenceTransformer
# Load embedding models
pact = SentenceTransformer("Ontologer/PACT-ClinicalTrials-Pop-256")
moat = SentenceTransformer("Ontologer/MOAt-ClinicalTrials-MoA-256")
fate = SentenceTransformer("Ontologer/FATE-ClinicalTrials-Outcome-256")
# Load ORACLE classifier
oracle = joblib.load("base_model.joblib")
calibrator = joblib.load("calibrator.joblib")
def score_trial(trial_text: str) -> dict:
    """Embed one trial with all three models and return its calibrated score and tier."""
    v_pop = pact.encode([trial_text])    # (1, 256)
    v_moa = moat.encode([trial_text])    # (1, 256)
    v_fate = fate.encode([trial_text])   # (1, 256)
    # Concatenation order must be PACT, MOAt, FATE
    features = np.concatenate([v_pop, v_moa, v_fate], axis=1)  # (1, 768)
    raw_score = oracle.predict_proba(features)[:, 1]
    calibrated_prob = calibrator.predict_proba(raw_score.reshape(-1, 1))[:, 1]
    prob = float(calibrated_prob[0])
    if prob >= 0.676:
        tier = "HIGH CONFIDENCE"
    elif prob >= 0.488:
        tier = "MODERATE"
    else:
        tier = "CAUTION"
    return {"probability": round(prob, 4), "tier": tier}
trial = """
Phase 3 RCT of upadacitinib 15mg QD vs placebo in 600 adults with
moderate-to-severe RA who failed ≥1 bDMARD. Primary endpoint: ACR20 at Week 12.
"""
result = score_trial(trial)
print(result)
# {'probability': 0.81, 'tier': 'HIGH CONFIDENCE'}
Part of the OntologerMed Suite
| Model | Role |
|---|---|
| OntologerMed-ClinicalTrials-Instruct | Generative LM: reasoning, extraction, and summarisation over trial text |
| FATE-ClinicalTrials-Outcome-256 (TrialPulse) | Outcome-shaped embedding: similarity by historical success/failure pattern |
| MOAt-ClinicalTrials-MoA-256 (TargetLens) | Mechanism-of-action embedding: similarity by biological pathway |
| PACT-ClinicalTrials-Pop-256 (PathFinder) | Population embedding: similarity by patient demographics and disease |
| ORACLE-ClinicalTrials-SuccessProb-v1 | Classifier: probability estimate combining all three embedding dimensions |
ORACLE is the synthesis layer. It requires FATE, MOAt, and PACT to generate features. All three must be run before calling ORACLE.
Limitations
- Probability, not certainty: A score of 0.81 does not mean 81% of such trials will succeed. It means this trial's embedding profile is consistent with historically high-success patterns. Individual trial outcomes depend on many factors the model cannot observe.
- Label selection bias: The 18K labeled trials are a subset of 308K+ completed trials. Trials with formal outcome documentation skew toward commercially sponsored, well-resourced programmes.
- Phase and indication interactions: ORACLE does not explicitly encode phase or indication as structured features. These are captured implicitly through the embedding signals, which may underperform in very sparse areas of the feature space.
- First-in-class limitations: Novel mechanisms with few historical neighbors produce sparse embedding signals. Caution tier scores in genuinely novel areas reflect data scarcity, not scientific weakness.
- Not medical or investment advice: ORACLE scores are a research signal. They should be one input among many in any clinical, regulatory, or financial decision.
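The first limitation above concerns how a probability score should be read. One way to inspect calibration on a labelled hold-out set is a reliability curve, which compares the mean predicted probability in each bin with the observed success rate. The sketch below uses scikit-learn's `calibration_curve` on synthetic scores that are calibrated by construction; it illustrates the check and says nothing about ORACLE's own calibration.

```python
import numpy as np
from sklearn.calibration import calibration_curve

# Synthetic scores that are calibrated by construction: each outcome is
# drawn with exactly the probability that was "predicted" for it.
rng = np.random.default_rng(0)
y_prob = rng.uniform(0.0, 1.0, size=20000)
y_true = (rng.random(20000) < y_prob).astype(int)

# Bin the predictions and compare predicted vs observed rates per bin.
observed, predicted = calibration_curve(y_true, y_prob, n_bins=10)
max_gap = float(np.max(np.abs(observed - predicted)))

for obs, pred in zip(observed, predicted):
    print(f"predicted {pred:.2f} -> observed {obs:.2f}")
```

Large gaps between the two columns in a particular range, for example in sparse regions of the feature space, would indicate where scores should be read with extra care.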
Files
ORACLE-ClinicalTrials-SuccessProb-v1/
├── base_model.joblib    # Trained gradient-boosted classifier
├── calibrator.joblib    # Platt scaling calibrator
├── config.json          # Thresholds and feature dimension info
└── metrics.json         # Train/val/test performance metrics
Citation
@misc{oracle-clinicaltrials-2026,
title = {ORACLE-ClinicalTrials-SuccessProb-v1: Ensemble Probability-of-Success Classifier for Clinical Trial Intelligence},
author = {Mishra, Sid},
year = {2026},
note = {Gradient-boosted ensemble classifier combining PACT, MOAt, and FATE embeddings. Trained and evaluated on 18,132 completed ClinicalTrials.gov trials with verified outcome labels (12,692 train, 2,720 validation, 2,720 test).},
howpublished = {\url{https://huggingface.co/Ontologer/ORACLE-ClinicalTrials-SuccessProb-v1}}
}
About the Author
Sid Mishra, Founder, Ontologer · Convixion AI
Sid is the founder of several AI-native and AI-powered startups and initiatives, based in Singapore. He founded Ontologer as the dedicated AI research arm of Convixion AI, with a focus on building domain-specific language models from the ground up, including data pipelines, training infrastructure, evaluation frameworks, and production deployment.
Ontologer generates novel LLM and embedding models purpose-built for use within Convixion AI's Commercializer.ai platform. ORACLE is the synthesis layer of the OntologerMed suite, combining three specialist embedding models into a single actionable score. Ontologer performs every step of model development (dataset curation, training infrastructure, evaluation, and production deployment) in-house.
Collaboration & Custom Work
Sid is open to collaborating on:
- Custom ensemble classifiers: combining multiple domain-specific embeddings into probability estimates for proprietary use cases
- End-to-end LLM and embedding pipelines: from data curation to training to production deployment
- Evaluation framework design: task-specific benchmarks and calibration assessment
- RAG + embedding system design: pairing domain-adapted models with retrieval systems for production use
- Custom model architecture consulting: base model selection, training strategy, hardware planning
| Site | ontologer.com |
| Email | sid@ontologer.com · sid@convixion.ai |
| LinkedIn | linkedin.com/in/sid-m-427b9865 |