ORACLE-ClinicalTrials-SuccessProb-v1

A probability estimate of clinical trial success, combining population, mechanism, and outcome-pattern signals.

ORACLE is the ensemble classifier in the OntologerMed suite. It takes the three 256-dimensional embedding vectors produced by PACT, MOAt, and FATE — concatenated into a single 768-dimensional feature vector — and outputs a calibrated probability of trial success, along with a three-tier risk classification.

Where each embedding model answers a single question (who was enrolled, what mechanism, what outcome pattern), ORACLE combines all three into one number: how likely is this trial to succeed?

Model Overview

Property	Value
Model name	ORACLE-ClinicalTrials-SuccessProb-v1
Model type	Gradient-boosted ensemble + Platt calibration
Input	768-dim concatenated vector: [PACT (256) | MOAt (256) | FATE (256)]
Output	Calibrated success probability (0–1) + risk tier
Training set	12,692 completed trials with verified outcome labels
Validation set	2,720 trials
Test set	2,720 trials
Positive rate	58.5% (success-labelled trials)
License	Apache 2.0
HuggingFace	`Ontologer/ORACLE-ClinicalTrials-SuccessProb-v1`

Performance

Split	ROC-AUC	Average Precision	Brier Score
Train	0.791	0.829	0.185
Validation	0.736	0.776	0.202
Test	0.734	0.782	0.205

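ROC-AUC 0.734 on the held-out test set: the model ranks a randomly chosen successful trial above a randomly chosen failed trial 73.4% of the time.
Average Precision 0.782: precision averaged over recall levels for the success class, relevant for screening workflows where identifying likely successes matters most. A no-skill ranking would score roughly the positive rate (58.5% overall).
Brier Score 0.205: mean squared error of the probability estimates, so it reflects both calibration and discrimination (lower is better). Always predicting 0.5 scores 0.25, and always predicting the 58.5% positive rate scores about 0.243.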
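
The readings above can be checked with a few lines of plain Python. This is a minimal sketch on invented toy data, not ORACLE outputs; the function names (`pairwise_auc`, `brier`, `brier_constant`) are illustrative and not part of the released files.

```python
from itertools import product

def pairwise_auc(labels, scores):
    """ROC-AUC as the share of (success, failure) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))

def brier(labels, probs):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)

def brier_constant(p, positive_rate):
    """Expected Brier score when every trial receives the same prediction p."""
    return positive_rate * (1 - p) ** 2 + (1 - positive_rate) * p ** 2

# Toy data, invented for illustration only.
toy_labels = [1, 1, 1, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.2]

auc = pairwise_auc(toy_labels, toy_scores)   # 5 of 6 pairs ranked correctly
mse = brier(toy_labels, toy_scores)
coin_flip = brier_constant(0.5, 0.585)       # reference: always predict 0.5
base_rate = brier_constant(0.585, 0.585)     # reference: always predict the positive rate
skill = 1 - 0.205 / base_rate                # test-set Brier relative to the base-rate reference

print(auc, mse, coin_flip, base_rate, skill)
```

Measured against the base-rate reference rather than 0.25, the reported test Brier score of 0.205 corresponds to a relative improvement of roughly 16%.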

Risk Tiers

Tier	Probability range	Interpretation
High confidence	≥ 0.676	Design profile consistent with historically successful trials
Moderate	0.488 ≤ p < 0.676	Mixed signals; warrants deeper review
Caution	< 0.488	Design profile closer to historical failures

Thresholds were calibrated on the validation set to balance precision and recall across tiers.
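
To make the precision/recall trade-off at each cut-off concrete, here is a small sketch that evaluates both published thresholds on a toy validation set. The data are invented, and the helper `precision_recall` is hypothetical; the card does not describe the actual threshold-selection procedure, so this only shows how such a check could look.

```python
def precision_recall(labels, probs, threshold):
    """Precision and recall when trials with prob >= threshold are called successes."""
    flagged = [y for y, p in zip(labels, probs) if p >= threshold]
    true_pos = sum(flagged)
    total_pos = sum(labels)
    precision = true_pos / len(flagged) if flagged else 0.0
    recall = true_pos / total_pos if total_pos else 0.0
    return precision, recall

# Toy validation labels and calibrated probabilities, invented for illustration.
val_labels = [1, 1, 0, 1, 0, 1, 0, 0]
val_probs = [0.82, 0.71, 0.69, 0.60, 0.55, 0.50, 0.40, 0.30]

results = {}
for cut in (0.676, 0.488):  # tier boundaries from the table above
    results[cut] = precision_recall(val_labels, val_probs, cut)
    p, r = results[cut]
    print(f"threshold {cut}: precision={p:.2f}, recall={r:.2f}")
```

Raising the cut-off trades recall for precision among the trials it flags, which is the balance the tier boundaries are meant to strike.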

How It Works

ORACLE is a second-stage model that sits on top of the three OntologerMed embedding models. It does not read text directly. It reads the embedding vectors.

Trial text
    ↓
[PACT]  → 256-dim population vector
[MOAt]  → 256-dim mechanism vector       → concatenate → 768-dim feature → ORACLE → P(success)
[FATE]  → 256-dim outcome-pattern vector

Each embedding contributes a different type of signal:

PACT tells ORACLE what population was enrolled — certain populations have historically better success rates than others
MOAt tells ORACLE what mechanism was targeted — validated mechanisms in proven indications differ from first-in-class attempts
FATE tells ORACLE what the trial's outcome-pattern neighborhood looks like — the most direct historical comparator signal

The ensemble classifier learns the interaction between these three signals. A trial with a validated mechanism (MOAt) in a well-studied population (PACT) that sits in a high-success historical neighborhood (FATE) scores very differently from a first-in-class mechanism in an underserved population with few historical comparators.
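
Because the classifier sees only positions in the 768-dim vector, the concatenation order matters. The sketch below assembles the feature vector from three per-model vectors and rejects wrong sizes. It uses plain lists and dummy values; `build_features` and the dictionary layout are assumptions made for illustration, not part of the released code.

```python
EMBED_DIM = 256
ORDER = ("PACT", "MOAt", "FATE")  # concatenation order stated in this card

def build_features(embeddings):
    """Concatenate per-model vectors into one 768-dim list, in PACT, MOAt, FATE order."""
    features = []
    for name in ORDER:
        vec = list(embeddings[name])
        if len(vec) != EMBED_DIM:
            raise ValueError(f"{name} vector has {len(vec)} dims, expected {EMBED_DIM}")
        features.extend(vec)
    return features

# Dummy vectors (constant values) so each block is easy to spot in the result.
dummy = {"FATE": [0.3] * 256, "PACT": [0.1] * 256, "MOAt": [0.2] * 256}
features = build_features(dummy)
print(len(features), features[0], features[256], features[512])
```

Keying the inputs by model name means the caller cannot accidentally swap the blocks, whatever order the embeddings were computed in.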

Architecture

768-dim input vector
    → Gradient-boosted ensemble classifier (base_model.joblib)
    → Raw score
    → Platt scaling calibrator (calibrator.joblib)
    → Calibrated probability P(success)

Classifier: Gradient-boosted ensemble (scikit-learn), trained on 12,692 labeled trials
Calibration: Platt scaling to ensure probability outputs are well-calibrated, not just discriminative
Feature space: 768 dimensions — 256 each from PACT, MOAt, FATE — concatenated in that order
Labels: Binary — 1 = primary endpoint success, 0 = completed trial with negative primary endpoint
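
For readers unfamiliar with Platt scaling, the idea is to fit a one-dimensional logistic curve, p = sigmoid(a * score + b), that maps raw classifier scores to probabilities. The following is a from-scratch sketch on invented toy data using plain gradient descent. It is not the contents of `calibrator.joblib`, and the learning rate and step count are arbitrary choices.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_platt(scores, labels, lr=0.5, steps=5000):
    """Fit p = sigmoid(a * score + b) by gradient descent on mean log-loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y  # derivative of log-loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

# Toy raw scores and outcomes, invented for illustration (deliberately not separable).
raw_scores = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
outcomes = [0, 0, 1, 0, 1, 0, 1, 1]

a, b = fit_platt(raw_scores, outcomes)
calibrated = [sigmoid(a * s + b) for s in raw_scores]
print(round(a, 3), round(b, 3))
print([round(p, 3) for p in calibrated])
```

The mapping is monotone, so calibration changes the probability values without changing how trials are ranked.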

Business Use & Applications

ORACLE is the synthesis layer of the OntologerMed suite — turning three independent similarity signals into a single actionable score for workflows that need to rank, prioritise, or screen large numbers of trials.

Investment & Due Diligence

Pipeline scoring at scale — score an entire company's pipeline in seconds, not weeks
- Embed all pipeline assets through PACT, MOAt, and FATE, then pass through ORACLE to rank by predicted success probability
- Triage a 20-asset portfolio into high/moderate/caution tiers before allocating analyst time
- Compare the probability distribution across a portfolio company's pipeline against sector benchmarks
M&A and licensing prioritisation — rank acquisition targets by evidence-adjusted success probability
- Score assets across multiple acquisition targets simultaneously and rank for further diligence
- Identify assets with high ORACLE scores but low market visibility — potential undervalued opportunities
- Use ORACLE scores to stress-test management's stated confidence levels against historical pattern evidence
Fund-level portfolio monitoring — track success probability signals across a portfolio continuously
- Re-score portfolio company assets as new trials register or amend — identify deteriorating signals early
- Flag assets whose ORACLE score drops significantly following a protocol amendment or competitor readout
- Aggregate portfolio-level success probability distribution for LP reporting
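
As one way to picture the monitoring workflow, the sketch below compares two scoring runs and lists assets whose probability fell by at least a chosen margin. The function `flag_score_drops`, the `min_drop` default of 0.10, and the asset names and scores are all hypothetical; what counts as a significant drop is a judgment call this card does not define.

```python
def flag_score_drops(previous, current, min_drop=0.10):
    """Return (asset, old, new) rows whose score fell by at least min_drop, largest drop first."""
    flagged = []
    for asset, old in previous.items():
        new = current.get(asset)
        if new is not None and old - new >= min_drop:
            flagged.append((asset, old, new))
    return sorted(flagged, key=lambda row: row[1] - row[2], reverse=True)

# Invented scores from two hypothetical scoring runs.
last_quarter = {"asset A": 0.72, "asset B": 0.60, "asset C": 0.45}
this_quarter = {"asset A": 0.51, "asset B": 0.58, "asset C": 0.30}

drops = flag_score_drops(last_quarter, this_quarter)
for asset, old, new in drops:
    print(f"{asset}: {old:.2f} -> {new:.2f}")
```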

Pharmaceutical & Biotech R&D

Internal portfolio prioritisation — rank R&D programmes by historical success-pattern strength
- Score internal pipeline assets and rank for capital allocation decisions
- Identify programmes in the Caution tier that may warrant additional mechanistic validation before Phase 3
- Use ORACLE scores as one input in go/no-go decisions at phase transitions
Competitive asset monitoring — score competitor trials as they register
- Automatically score all newly registered Phase 2/3 trials in your indication for rapid competitive assessment
- Identify high-scoring competitor trials that may represent significant threats or partnership opportunities
- Track how ORACLE scores evolve for competitor assets as they amend protocols

Clinical Research Organisations (CROs)

Bid/no-bid risk quantification — add an objective risk score to commercial feasibility assessments
- Score a prospective client's trial through ORACLE as part of the bid evaluation process
- Quantify the historical risk pattern to inform resourcing, timeline, and success fee structuring
- Benchmark the client's trial against historical trials in the same tier to set realistic expectations

HealthTech & Clinical Intelligence Platforms

Trial scoring API — expose ORACLE as a risk signal within broader clinical intelligence platforms
- Integrate ORACLE scores into trial registry monitoring tools to automatically flag high/low confidence assets
- Power risk-adjusted trial recommendations for investors, operators, and clinicians
- Combine ORACLE scores with financial and regulatory metadata for richer due diligence products

Example Output

Example 1: High-Confidence Trial

Input: Phase 3 RCT of upadacitinib (JAK inhibitor) vs placebo in bDMARD-failure RA patients. ACR20 primary endpoint at Week 12.

PACT embedding: RA population (well-studied, high historical density)
MOAt embedding: JAK inhibitor mechanism (validated across multiple approved drugs)
FATE embedding: Neighborhood success rate = 85% (17/20 neighbors = success)

ORACLE output:
  P(success) = 0.81
  Tier: HIGH CONFIDENCE
  Signal: Validated mechanism in proven population with strong historical outcome neighborhood.

Example 2: Caution Tier

Input: Phase 2 single-arm study of novel gene therapy in ultra-rare metabolic disorder. 40 patients. No approved comparator. First-in-class mechanism.

PACT embedding: Rare metabolic disease (sparse population neighborhood, few historical trials)
MOAt embedding: Novel gene therapy mechanism (few historical neighbors, early-class signal)
FATE embedding: Neighborhood success rate = 41% (mixed historical outcomes in rare disease gene therapy)

ORACLE output:
  P(success) = 0.34
  Tier: CAUTION
  Signal: First-in-class mechanism, sparse population history, mixed outcome neighborhood.
  Note: Low score in rare disease gene therapy does not indicate poor science — it reflects
        limited historical precedent. Warrants mechanism-level review, not dismissal.

Usage

import joblib
import numpy as np
from sentence_transformers import SentenceTransformer

# Load embedding models
pact  = SentenceTransformer("Ontologer/PACT-ClinicalTrials-Pop-256")
moat  = SentenceTransformer("Ontologer/MOAt-ClinicalTrials-MoA-256")
fate  = SentenceTransformer("Ontologer/FATE-ClinicalTrials-Outcome-256")

# Load ORACLE classifier
oracle     = joblib.load("base_model.joblib")
calibrator = joblib.load("calibrator.joblib")

def score_trial(trial_text: str) -> dict:
    v_pop  = pact.encode([trial_text])   # (1, 256)
    v_moa  = moat.encode([trial_text])   # (1, 256)
    v_fate = fate.encode([trial_text])   # (1, 256)

    features = np.concatenate([v_pop, v_moa, v_fate], axis=1)  # (1, 768)

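    # Uncalibrated positive-class score from the gradient-boosted classifier
    raw_score = oracle.predict_proba(features)[:, 1]
    # The Platt calibrator maps the raw score to a calibrated probability
    calibrated_prob = calibrator.predict_proba(raw_score.reshape(-1, 1))[:, 1]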

    prob = float(calibrated_prob[0])

    if prob >= 0.676:
        tier = "HIGH CONFIDENCE"
    elif prob >= 0.488:
        tier = "MODERATE"
    else:
        tier = "CAUTION"

    return {"probability": round(prob, 4), "tier": tier}


trial = """
Phase 3 RCT of upadacitinib 15mg QD vs placebo in 600 adults with
moderate-to-severe RA who failed ≥1 bDMARD. Primary endpoint: ACR20 at Week 12.
"""
result = score_trial(trial)
print(result)
# Illustrative output: {'probability': 0.81, 'tier': 'HIGH CONFIDENCE'}
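
For portfolio-style use, the single-trial function can be wrapped in a small ranking helper. The sketch below sorts trials by probability and groups them by tier. `rank_and_triage` is a hypothetical helper, and `fake_scorer` is a stand-in with invented scores so the example runs without the model files; in practice `score_trial` would be passed instead.

```python
from collections import defaultdict

def rank_and_triage(trials, scorer):
    """Score each (name, text) pair, sort by probability, and group names by tier."""
    scored = [(name, scorer(text)) for name, text in trials]
    scored.sort(key=lambda item: item[1]["probability"], reverse=True)
    by_tier = defaultdict(list)
    for name, result in scored:
        by_tier[result["tier"]].append(name)
    return scored, dict(by_tier)

def fake_scorer(text):
    """Stand-in for score_trial: returns invented probabilities keyed by text."""
    prob = {"text A": 0.81, "text B": 0.34, "text C": 0.55}[text]
    if prob >= 0.676:
        tier = "HIGH CONFIDENCE"
    elif prob >= 0.488:
        tier = "MODERATE"
    else:
        tier = "CAUTION"
    return {"probability": prob, "tier": tier}

portfolio = [("asset A", "text A"), ("asset B", "text B"), ("asset C", "text C")]
ranked, tiers = rank_and_triage(portfolio, fake_scorer)
for name, result in ranked:
    print(name, result["probability"], result["tier"])
```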

Part of the OntologerMed Suite

Model	Role
OntologerMed-ClinicalTrials-Instruct	Generative LM — reasoning, extraction, and summarisation over trial text
FATE-ClinicalTrials-Outcome-256 (TrialPulse)	Outcome-shaped embedding — similarity by historical success/failure pattern
MOAt-ClinicalTrials-MoA-256 (TargetLens)	Mechanism-of-action embedding — similarity by biological pathway
PACT-ClinicalTrials-Pop-256 (PathFinder)	Population embedding — similarity by patient demographics and disease
ORACLE-ClinicalTrials-SuccessProb-v1	Classifier — probability estimate combining all three embedding dimensions

ORACLE is the synthesis layer. It requires FATE, MOAt, and PACT to generate features. All three must be run before calling ORACLE.

Limitations

Probability, not certainty: A score of 0.81 is not a guarantee for any individual trial. It means this trial's embedding profile is consistent with historically high-success patterns, and a calibrated score is best read as a frequency across many similar trials. Individual trial outcomes depend on many factors the model cannot observe.
Label selection bias: The 18,132 labeled trials (train, validation, and test combined) are a subset of 308K+ completed trials. Trials with formal outcome documentation skew toward commercially sponsored, well-resourced programmes.
Phase and indication interactions: ORACLE does not explicitly encode phase or indication as structured features — these are captured implicitly through the embedding signals, which may underperform in very sparse areas of the feature space.
First-in-class limitations: Novel mechanisms with few historical neighbors produce sparse embedding signals. Caution tier scores in genuinely novel areas reflect data scarcity, not scientific weakness.
Not medical or investment advice: ORACLE scores are a research signal. They should be one input among many in any clinical, regulatory, or financial decision.

Files

ORACLE-ClinicalTrials-SuccessProb-v1/
├── base_model.joblib      # Trained gradient-boosted classifier
├── calibrator.joblib      # Platt scaling calibrator
├── config.json            # Thresholds and feature dimension info
└── metrics.json           # Train/val/test performance metrics

Citation

@misc{oracle-clinicaltrials-2026,
  title        = {ORACLE-ClinicalTrials-SuccessProb-v1: Ensemble Probability-of-Success Classifier for Clinical Trial Intelligence},
  author       = {Mishra, Sid},
  year         = {2026},
  note         = {Gradient-boosted ensemble classifier combining PACT, MOAt, and FATE embeddings. Developed on 18,132 completed ClinicalTrials.gov trials with verified outcome labels (12,692 train, 2,720 validation, 2,720 test).},
  howpublished = {\url{https://huggingface.co/Ontologer/ORACLE-ClinicalTrials-SuccessProb-v1}}
}

About the Author

Sid Mishra — Founder, Ontologer · Convixion AI

Sid is the founder of several AI-native and AI-powered startups and initiatives, based in Singapore. He founded Ontologer as the dedicated AI research arm of Convixion AI, with a focus on building domain-specific language models from the ground up — including data pipelines, training infrastructure, evaluation frameworks, and production deployment.

Ontologer generates novel LLM and embedding models purpose-built for use within Convixion AI's Commercializer.ai platform. ORACLE is the synthesis layer of the OntologerMed suite — combining three specialist embedding models into a single actionable score. Ontologer performs every step of model development — dataset curation, training infrastructure, evaluation, and production deployment — in-house.

Collaboration & Custom Work

Sid is open to collaborating on:

Custom ensemble classifiers — combining multiple domain-specific embeddings into probability estimates for proprietary use cases
End-to-end LLM and embedding pipelines — from data curation to training to production deployment
Evaluation framework design — task-specific benchmarks and calibration assessment
RAG + embedding system design — pairing domain-adapted models with retrieval systems for production use
Custom model architecture consulting — base model selection, training strategy, hardware planning


Site	ontologer.com
Email	sid@ontologer.com · sid@convixion.ai
LinkedIn	linkedin.com/in/sid-m-427b9865
