ORACLE-ClinicalTrials-SuccessProb-v1

A probability estimate of clinical trial success, combining population, mechanism, and outcome-pattern signals.

ORACLE is the ensemble classifier in the OntologerMed suite. It takes the three 256-dimensional embedding vectors produced by PACT, MOAt, and FATE, concatenated into a single 768-dimensional feature vector, and outputs a calibrated probability of trial success, along with a three-tier risk classification.

Where each embedding model answers a single question (who was enrolled, what mechanism, what outcome pattern), ORACLE combines all three into one number: how likely is this trial to succeed?


Model Overview

Property        Value
Model name      ORACLE-ClinicalTrials-SuccessProb-v1
Model type      Gradient-boosted ensemble + Platt calibration
Input           768-dim concatenated vector: [PACT (256) | MOAt (256) | FATE (256)]
Output          Calibrated success probability (0–1) + risk tier
Training set    12,692 completed trials with verified outcome labels
Validation set  2,720 trials
Test set        2,720 trials
Positive rate   58.5% (success-labelled trials)
License         Apache 2.0
HuggingFace     Ontologer/ORACLE-ClinicalTrials-SuccessProb-v1

Performance

Split       ROC-AUC  Average Precision  Brier Score
Train       0.791    0.829              0.185
Validation  0.736    0.776              0.202
Test        0.734    0.782              0.205
  • ROC-AUC 0.734 on held-out test set: the model ranks a randomly chosen successful trial above a randomly chosen failed trial 73.4% of the time
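  • Average Precision 0.782: performance on the positive class, relevant for screening workflows where identifying likely successes matters most (a random ranking would score about 0.585, the positive rate)
  • Brier Score 0.205: probability estimates beat uninformed baselines (always predicting 0.5 scores 0.25; always predicting the 58.5% base rate scores about 0.243; lower is better)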
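
As a rough illustration of how these three metrics are read, the sketch below computes ROC-AUC through its pairwise-ranking definition, average precision, and the Brier score, plus the Brier score of two constant baselines at a 58.5% positive rate. The function names and the toy labels and scores are illustrative assumptions, not part of the ORACLE release; in practice the scikit-learn metric functions would be the usual choice.

```python
def roc_auc_pairwise(y_true, y_prob):
    """Fraction of (success, failure) pairs where the success scores higher.
    Ties count as half a win."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, y_prob):
    """Mean of precision-at-rank over the ranks where a success appears.
    Assumes no tied scores."""
    ranked = sorted(zip(y_prob, y_true), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def constant_brier(p, positive_rate):
    """Expected Brier score of always predicting the same probability p."""
    return positive_rate * (1 - p) ** 2 + (1 - positive_rate) * p ** 2


# Toy labels and scores, made up for illustration only
y_true = [1, 1, 0, 0, 1]
y_prob = [0.9, 0.6, 0.7, 0.2, 0.8]

print("ROC-AUC:", round(roc_auc_pairwise(y_true, y_prob), 3))
print("Average precision:", round(average_precision(y_true, y_prob), 3))
print("Brier:", round(brier_score(y_true, y_prob), 3))
print("Brier, always 0.5:", round(constant_brier(0.5, 0.585), 4))
print("Brier, always base rate:", round(constant_brier(0.585, 0.585), 4))
```

The two constant baselines give a reference point for the reported 0.205: a model that ignores its inputs cannot do better than the base-rate baseline.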

Risk Tiers

Tier             Probability range  Interpretation
High confidence  ≥ 0.676            Design profile consistent with historically successful trials
Moderate         0.488 – 0.676      Mixed signals; warrants deeper review
Caution          < 0.488            Design profile closer to historical failures

Thresholds were calibrated on the validation set to balance precision and recall across tiers.
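
A minimal sketch of how the tier boundaries apply, and of one way to sanity-check them against labelled data. The two thresholds come from the table above; `assign_tier`, `tier_success_rates`, and the validation scores below are hypothetical and only illustrate the idea. The model card does not describe the actual threshold-selection procedure.

```python
HIGH_THRESHOLD = 0.676      # from the Risk Tiers table
MODERATE_THRESHOLD = 0.488  # from the Risk Tiers table


def assign_tier(prob):
    """Map a calibrated probability to a tier; lower bounds are inclusive."""
    if prob >= HIGH_THRESHOLD:
        return "HIGH CONFIDENCE"
    if prob >= MODERATE_THRESHOLD:
        return "MODERATE"
    return "CAUTION"


def tier_success_rates(probs, labels):
    """Observed success rate per tier on trials with known outcomes."""
    counts = {}
    for p, y in zip(probs, labels):
        tier = assign_tier(p)
        n, k = counts.get(tier, (0, 0))
        counts[tier] = (n + 1, k + y)
    return {tier: k / n for tier, (n, k) in counts.items()}


# Hypothetical validation scores and outcomes (1 = success, 0 = failure)
val_probs = [0.82, 0.71, 0.69, 0.60, 0.52, 0.50, 0.45, 0.30]
val_labels = [1, 1, 0, 1, 0, 1, 0, 0]
print(tier_success_rates(val_probs, val_labels))
```

If the tiers are useful, the observed success rate should fall from High confidence to Caution on a realistically sized labelled set.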


How It Works

ORACLE is a second-stage model that sits on top of the three OntologerMed embedding models. It does not read text directly. It reads the embedding vectors.

Trial text
    ↓
[PACT]  → 256-dim population vector
[MOAt]  → 256-dim mechanism vector       → concatenate → 768-dim feature → ORACLE → P(success)
[FATE]  → 256-dim outcome-pattern vector

Each embedding model contributes a different type of signal:

  • PACT tells ORACLE what population was enrolled; certain populations have historically better success rates than others
  • MOAt tells ORACLE what mechanism was targeted; validated mechanisms in proven indications differ from first-in-class attempts
  • FATE tells ORACLE what the trial's outcome-pattern neighborhood looks like, the most direct historical comparator signal

The ensemble classifier learns the interaction between these three signals. A trial with a validated mechanism (MOAt) in a well-studied population (PACT) that sits in a high-success historical neighborhood (FATE) scores very differently from a first-in-class mechanism in an underserved population with few historical comparators.
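
Because the classifier sees only column positions, the concatenation order matters: swapping two embeddings would still produce a valid 768-dim vector but feed the wrong signals into each column. The sketch below assembles the feature vector in the documented PACT | MOAt | FATE order and checks dimensions first. It uses plain Python lists to stay dependency-free; `build_features` and `slice_for` are hypothetical helper names, not part of the release.

```python
EMBED_DIM = 256
ORDER = ("PACT", "MOAt", "FATE")  # concatenation order stated in this model card


def build_features(embeddings):
    """Concatenate the three embeddings in the documented order.

    embeddings: dict mapping model name -> sequence of 256 floats.
    """
    features = []
    for name in ORDER:
        vec = list(embeddings[name])
        if len(vec) != EMBED_DIM:
            raise ValueError(f"{name} vector has {len(vec)} dims, expected {EMBED_DIM}")
        features.extend(vec)
    return features


def slice_for(name):
    """Index range (start, stop) that one model occupies in the 768-dim vector."""
    start = ORDER.index(name) * EMBED_DIM
    return start, start + EMBED_DIM


# Dummy vectors standing in for real PACT / MOAt / FATE outputs
dummy = {"PACT": [0.1] * 256, "MOAt": [0.2] * 256, "FATE": [0.3] * 256}
features = build_features(dummy)
print(len(features), slice_for("MOAt"))
```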


Architecture

768-dim input vector
    → Gradient-boosted ensemble classifier (base_model.joblib)
    → Raw score
    → Platt scaling calibrator (calibrator.joblib)
    → Calibrated probability P(success)
  • Classifier: Gradient-boosted ensemble (scikit-learn), trained on 12,692 labeled trials
  • Calibration: Platt scaling to ensure probability outputs are well-calibrated, not just discriminative
  • Feature space: 768 dimensions (256 each from PACT, MOAt, FATE), concatenated in that order
  • Labels: Binary (1 = primary endpoint success, 0 = completed trial with negative primary endpoint)
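
For readers unfamiliar with Platt scaling, it fits a one-dimensional logistic regression that maps a raw classifier score s to sigmoid(a*s + b). The sketch below fits a and b by plain gradient descent on log loss using a toy held-out set. The data, learning rate, and function names are illustrative assumptions; the shipped calibrator.joblib may have been fitted differently.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_platt(raw_scores, labels, lr=0.1, steps=5000):
    """Fit a and b in sigmoid(a*s + b) by gradient descent on mean log loss."""
    a, b = 1.0, 0.0
    n = len(raw_scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(raw_scores, labels):
            err = sigmoid(a * s + b) - y  # derivative of log loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def platt_calibrate(raw_score, a, b):
    """Apply the fitted Platt map to one raw score."""
    return sigmoid(a * raw_score + b)


# Toy held-out raw scores and outcomes, made up for illustration
raw_scores = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 1, 0, 1]
a, b = fit_platt(raw_scores, labels)
print(round(platt_calibrate(0.3, a, b), 3), round(platt_calibrate(0.8, a, b), 3))
```

The map is monotonic whenever a is positive, so calibration changes the probabilities but not the ranking of trials.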

Business Use & Applications

ORACLE is the synthesis layer of the OntologerMed suite, turning three independent similarity signals into a single actionable score for workflows that need to rank, prioritise, or screen large numbers of trials.

Investment & Due Diligence

  • Pipeline scoring at scale: score an entire company's pipeline in seconds, not weeks

    • Embed all pipeline assets through PACT, MOAt, and FATE, then pass through ORACLE to rank by predicted success probability
    • Triage a 20-asset portfolio into high/moderate/caution tiers before allocating analyst time
    • Compare the probability distribution across a portfolio company's pipeline against sector benchmarks
  • M&A and licensing prioritisation: rank acquisition targets by evidence-adjusted success probability

    • Score assets across multiple acquisition targets simultaneously and rank for further diligence
    • Identify assets with high ORACLE scores but low market visibility (potential undervalued opportunities)
    • Use ORACLE scores to stress-test management's stated confidence levels against historical pattern evidence
  • Fund-level portfolio monitoring: track success probability signals across a portfolio continuously

    • Re-score portfolio company assets as new trials register or amend, to identify deteriorating signals early
    • Flag assets whose ORACLE score drops significantly following a protocol amendment or competitor readout
    • Aggregate portfolio-level success probability distribution for LP reporting
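
The monitoring workflow above can be sketched as a comparison of two scoring runs. The example below flags assets whose score fell by at least a chosen margin between runs. The asset names, scores, and the 0.10 margin are hypothetical; the model card does not define what counts as a significant drop.

```python
def flag_score_drops(previous, current, min_drop=0.10):
    """Return (asset, old, new, drop) for assets whose score fell by at
    least min_drop, largest drop first. Assets missing from the current
    run are skipped."""
    flagged = []
    for asset, old in previous.items():
        new = current.get(asset)
        if new is None:
            continue
        drop = old - new
        if drop >= min_drop:
            flagged.append((asset, old, new, round(drop, 4)))
    return sorted(flagged, key=lambda row: row[3], reverse=True)


# Hypothetical ORACLE scores before and after a protocol amendment
previous = {"asset-A": 0.72, "asset-B": 0.55, "asset-C": 0.40}
current = {"asset-A": 0.58, "asset-B": 0.53, "asset-C": 0.41}
print(flag_score_drops(previous, current))
```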

Pharmaceutical & Biotech R&D

  • Internal portfolio prioritisation: rank R&D programmes by historical success-pattern strength

    • Score internal pipeline assets and rank for capital allocation decisions
    • Identify programmes in the Caution tier that may warrant additional mechanistic validation before Phase 3
    • Use ORACLE scores as one input in go/no-go decisions at phase transitions
  • Competitive asset monitoring: score competitor trials as they register

    • Automatically score all newly registered Phase 2/3 trials in your indication for rapid competitive assessment
    • Identify high-scoring competitor trials that may represent significant threats or partnership opportunities
    • Track how ORACLE scores evolve for competitor assets as they amend protocols

Clinical Research Organisations (CROs)

  • Bid/no-bid risk quantification: add an objective risk score to commercial feasibility assessments
    • Score a prospective client's trial through ORACLE as part of the bid evaluation process
    • Quantify the historical risk pattern to inform resourcing, timeline, and success fee structuring
    • Benchmark the client's trial against historical trials in the same tier to set realistic expectations

HealthTech & Clinical Intelligence Platforms

  • Trial scoring API: expose ORACLE as a risk signal within broader clinical intelligence platforms
    • Integrate ORACLE scores into trial registry monitoring tools to automatically flag high/low confidence assets
    • Power risk-adjusted trial recommendations for investors, operators, and clinicians
    • Combine ORACLE scores with financial and regulatory metadata for richer due diligence products

Example Output

Example 1: High-Confidence Trial

Input: Phase 3 RCT of upadacitinib (JAK inhibitor) vs placebo in bDMARD-failure RA patients. ACR20 primary endpoint at Week 12.

PACT embedding: RA population (well-studied, high historical density)
MOAt embedding: JAK inhibitor mechanism (validated across multiple approved drugs)
FATE embedding: Neighborhood success rate = 85% (17/20 neighbors = success)

ORACLE output:
  P(success) = 0.81
  Tier: HIGH CONFIDENCE
  Signal: Validated mechanism in proven population with strong historical outcome neighborhood.

Example 2: Caution Tier

Input: Phase 2 single-arm study of novel gene therapy in ultra-rare metabolic disorder. 40 patients. No approved comparator. First-in-class mechanism.

PACT embedding: Rare metabolic disease (sparse population neighborhood, few historical trials)
MOAt embedding: Novel gene therapy mechanism (few historical neighbors, early-class signal)
FATE embedding: Neighborhood success rate = 41% (mixed historical outcomes in rare disease gene therapy)

ORACLE output:
  P(success) = 0.34
  Tier: CAUTION
  Signal: First-in-class mechanism, sparse population history, mixed outcome neighborhood.
  Note: Low score in rare disease gene therapy does not indicate poor science; it reflects
        limited historical precedent. Warrants mechanism-level review, not dismissal.
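
The neighborhood success rates quoted in these examples can be read as the share of successes among a trial's nearest historical neighbors in FATE embedding space. The sketch below shows one plausible way to compute such a rate with cosine similarity and k nearest neighbors. This is an assumption for illustration: the model card does not state how the neighborhood is defined, and the 2-dim toy vectors stand in for real 256-dim embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def neighborhood_success_rate(query, history, k=20):
    """Share of successes among the k historical trials most similar to query.

    history: list of (embedding, label) pairs, label 1 = success, 0 = failure.
    """
    ranked = sorted(history, key=lambda item: cosine(query, item[0]), reverse=True)
    top = ranked[:k]
    return sum(label for _, label in top) / len(top)


# Toy history: (embedding, outcome) pairs, made up for illustration
history = [
    ([1.0, 0.0], 1),
    ([0.9, 0.1], 1),
    ([0.8, 0.3], 0),
    ([0.0, 1.0], 0),
    ([-1.0, 0.2], 0),
]
print(neighborhood_success_rate([1.0, 0.05], history, k=3))
```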

Usage

import joblib
import numpy as np
from sentence_transformers import SentenceTransformer

# Load embedding models
pact  = SentenceTransformer("Ontologer/PACT-ClinicalTrials-Pop-256")
moat  = SentenceTransformer("Ontologer/MOAt-ClinicalTrials-MoA-256")
fate  = SentenceTransformer("Ontologer/FATE-ClinicalTrials-Outcome-256")

# Load ORACLE classifier
oracle     = joblib.load("base_model.joblib")
calibrator = joblib.load("calibrator.joblib")

def score_trial(trial_text: str) -> dict:
    v_pop  = pact.encode([trial_text])   # (1, 256)
    v_moa  = moat.encode([trial_text])   # (1, 256)
    v_fate = fate.encode([trial_text])   # (1, 256)

    features = np.concatenate([v_pop, v_moa, v_fate], axis=1)  # (1, 768)

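    # Stage 1: raw positive-class score from the gradient-boosted classifier
    raw_score = oracle.predict_proba(features)[:, 1]
    # Stage 2: the Platt calibrator maps the raw score to a calibrated probability
    calibrated_prob = calibrator.predict_proba(raw_score.reshape(-1, 1))[:, 1]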

    prob = float(calibrated_prob[0])

    if prob >= 0.676:
        tier = "HIGH CONFIDENCE"
    elif prob >= 0.488:
        tier = "MODERATE"
    else:
        tier = "CAUTION"

    return {"probability": round(prob, 4), "tier": tier}


trial = """
Phase 3 RCT of upadacitinib 15mg QD vs placebo in 600 adults with
moderate-to-severe RA who failed ≥1 bDMARD. Primary endpoint: ACR20 at Week 12.
"""
result = score_trial(trial)
print(result)
# {'probability': 0.81, 'tier': 'HIGH CONFIDENCE'}

Part of the OntologerMed Suite

Model                                         Role
OntologerMed-ClinicalTrials-Instruct          Generative LM: reasoning, extraction, and summarisation over trial text
FATE-ClinicalTrials-Outcome-256 (TrialPulse)  Outcome-shaped embedding: similarity by historical success/failure pattern
MOAt-ClinicalTrials-MoA-256 (TargetLens)      Mechanism-of-action embedding: similarity by biological pathway
PACT-ClinicalTrials-Pop-256 (PathFinder)      Population embedding: similarity by patient demographics and disease
ORACLE-ClinicalTrials-SuccessProb-v1          Classifier: probability estimate combining all three embedding dimensions

ORACLE is the synthesis layer. It requires FATE, MOAt, and PACT to generate features. All three must be run before calling ORACLE.


Limitations

  • Probability, not certainty: A score of 0.81 does not mean 81% of such trials will succeed. It means this trial's embedding profile is consistent with historically high-success patterns. Individual trial outcomes depend on many factors the model cannot observe.
  • Label selection bias: The 18K labeled trials are a subset of 308K+ completed trials. Trials with formal outcome documentation skew toward commercially sponsored, well-resourced programmes.
  • Phase and indication interactions: ORACLE does not explicitly encode phase or indication as structured features; these are captured implicitly through the embedding signals, which may underperform in very sparse areas of the feature space.
  • First-in-class limitations: Novel mechanisms with few historical neighbors produce sparse embedding signals. Caution tier scores in genuinely novel areas reflect data scarcity, not scientific weakness.
  • Not medical or investment advice: ORACLE scores are a research signal. They should be one input among many in any clinical, regulatory, or financial decision.

Files

ORACLE-ClinicalTrials-SuccessProb-v1/
├── base_model.joblib      # Trained gradient-boosted classifier
├── calibrator.joblib      # Platt scaling calibrator
├── config.json            # Thresholds and feature dimension info
└── metrics.json           # Train/val/test performance metrics

Citation

@misc{oracle-clinicaltrials-2026,
  title        = {ORACLE-ClinicalTrials-SuccessProb-v1: Ensemble Probability-of-Success Classifier for Clinical Trial Intelligence},
  author       = {Mishra, Sid},
  year         = {2026},
  note         = {Gradient-boosted ensemble classifier combining PACT, MOAt, and FATE embeddings. Developed on 18,132 completed ClinicalTrials.gov trials with verified outcome labels (12,692 train, 2,720 validation, 2,720 test).},
  howpublished = {\url{https://huggingface.co/Ontologer/ORACLE-ClinicalTrials-SuccessProb-v1}}
}

About the Author

Sid Mishra, Founder, Ontologer · Convixion AI

Sid is the founder of several AI-native and AI-powered startups and initiatives, based in Singapore. He founded Ontologer as the dedicated AI research arm of Convixion AI, with a focus on building domain-specific language models from the ground up, including data pipelines, training infrastructure, evaluation frameworks, and production deployment.

Ontologer generates novel LLM and embedding models purpose-built for use within Convixion AI's Commercializer.ai platform. ORACLE is the synthesis layer of the OntologerMed suite, combining three specialist embedding models into a single actionable score. Ontologer performs every step of model development (dataset curation, training infrastructure, evaluation, and production deployment) in-house.

Collaboration & Custom Work

Sid is open to collaborating on:

  • Custom ensemble classifiers: combining multiple domain-specific embeddings into probability estimates for proprietary use cases
  • End-to-end LLM and embedding pipelines: from data curation to training to production deployment
  • Evaluation framework design: task-specific benchmarks and calibration assessment
  • RAG + embedding system design: pairing domain-adapted models with retrieval systems for production use
  • Custom model architecture consulting: base model selection, training strategy, hardware planning