12 ANGRY AGENTS - Product Requirements Document
Overview
Concept: AI-powered jury deliberation simulation where 11 AI agents + 1 human player debate real criminal cases. A Judge narrator (ElevenLabs) orchestrates the experience.
Track: MCP in Action - Creative (potentially also Consumer)
Core Value Prop: True autonomous agent behavior - AI jurors reason, argue, persuade, and change their minds based on deliberation.
Sponsor Integration
| Sponsor | Prize | Integration | Priority |
|---|---|---|---|
| LlamaIndex | $1,000 | Case database RAG | HIGH |
| ElevenLabs | Airpods + $2K | Judge narrator voice | HIGH |
| Blaxel | $2,500 | Sandboxed agent execution | MEDIUM |
| Modal | $2,500 | Agent compute | MEDIUM |
| Gemini | $10K credits | Agent reasoning | HIGH |
User Experience Flow
1. CASE PRESENTATION
└─> Judge (ElevenLabs) narrates case summary
└─> Evidence displayed via LlamaIndex RAG
└─> Player reads case file
2. SIDE SELECTION
└─> Player chooses: DEFEND (not guilty) or PROSECUTE (guilty)
└─> Player commits - cannot change
3. INITIAL VOTE
└─> All 12 jurors vote (randomized split based on case)
└─> Vote tally shown: e.g., "7-5 GUILTY"
4. DELIBERATION LOOP
└─> Random 1-4 agents speak per round
└─> Player gets turn (choose strategy → AI crafts argument)
└─> Conviction scores shift based on arguments
└─> Votes may flip
└─> Repeat until: votes stabilize OR player calls vote
5. FINAL VERDICT
└─> Judge announces verdict (ElevenLabs)
└─> Deliberation transcript available
└─> No "win/lose" - just the experience
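The five steps above correspond to the phase values carried in the game state (see Data Models). A minimal sketch of the linear phase progression, assuming the phase names used later in this document:

```python
# Ordered phases of a session, matching the GameState.phase literals defined below
PHASES = ["setup", "presentation", "side_selection",
          "initial_vote", "deliberation", "final_vote", "verdict"]

def next_phase(current: str) -> str:
    """Advance to the next phase; 'verdict' is terminal."""
    i = PHASES.index(current)
    return PHASES[min(i + 1, len(PHASES) - 1)]
```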
Technical Architecture
System Overview
┌─────────────────────────────────────────────────────────────────────┐
│ 12 ANGRY AGENTS │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ GRADIO UI LAYER │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ │
│ │ │ Jury Box │ │ Chat View │ │ Case File │ │ │
│ │ │ (12 seats) │ │ (dialogue) │ │ (evidence) │ │ │
│ │ └──────────────┘ └──────────────┘ └──────────────┘ │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ ORCHESTRATOR AGENT │ │
│ │ ┌──────────────────────────────────────────────────────┐ │ │
│ │ │ GameStateManager │ │ │
│ │ │ - current_phase: presentation|deliberation|verdict │ │ │
│ │ │ - round_number: int │ │ │
│ │ │ - votes: Dict[agent_id, "guilty"|"not_guilty"] │ │ │
│ │ │ - conviction_scores: Dict[agent_id, float] │ │ │
│ │ │ - speaking_queue: List[agent_id] │ │ │
│ │ │ - deliberation_log: List[Turn] │ │ │
│ │ └──────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ ┌──────────────────────────────────────────────────────┐ │ │
│ │ │ TurnManager │ │ │
│ │ │ - select_speakers(1-4 random) │ │ │
│ │ │ - check_vote_stability() │ │ │
│ │ │ - process_vote_changes() │ │ │
│ │ └──────────────────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌────────────────────┼────────────────────┐ │
│ ▼ ▼ ▼ │
│ ┌─────────────┐ ┌─────────────────┐ ┌─────────────┐ │
│ │ JUDGE │ │ JUROR AGENTS │ │ PLAYER │ │
│ │ AGENT │ │ (11 total) │ │ AGENT │ │
│ │ │ │ │ │ │ │
│ │ ElevenLabs │ │ ┌─────────────┐ │ │ Hybrid I/O │ │
│ │ TTS Output │ │ │ AgentConfig │ │ │ Strategy │ │
│ │ │ │ │ - persona │ │ │ Selection │ │
│ │ Narration │ │ │ - model │ │ │ │ │
│ │ Verdicts │ │ │ - tools[] │ │ │ Argument │ │
│ │ Summaries │ │ │ - memory │ │ │ Crafting │ │
│ └─────────────┘ │ └─────────────┘ │ └─────────────┘ │
│ │ │ │
│ │ ┌─────────────┐ │ │
│ │ │ JurorMemory │ │ │
│ │ │ - case_view │ │ │
│ │ │ - arguments │ │ │
│ │ │ - reactions │ │ │
│ │ │ - conviction│ │ │
│ │ └─────────────┘ │ │
│ └─────────────────┘ │
│ │ │
│ ┌────────────────────┼────────────────────┐ │
│ ▼ ▼ ▼ │
│ ┌─────────────┐ ┌─────────────────┐ ┌─────────────┐ │
│ │ LLAMAINDEX │ │ LITELLM │ │ BLAXEL │ │
│ │ │ │ │ │ │ │
│ │ Case RAG │ │ Model Router │ │ Sandbox │ │
│ │ Evidence │ │ - Gemini │ │ Execution │ │
│ │ Precedents │ │ - Claude │ │ │ │
│ │ │ │ - GPT-4 │ │ Agent Tools │ │
│ └─────────────┘ │ - Local │ │ (future) │ │
│ └─────────────────┘ └─────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ MCP SERVER LAYER │ │
│ │ Tools exposed for external AI agents to play as juror │ │
│ │ - mcp_join_jury(case_id) -> seat_assignment │ │
│ │ - mcp_view_evidence(case_id) -> evidence_list │ │
│ │ - mcp_make_argument(argument_type, content) -> response │ │
│ │ - mcp_cast_vote(vote) -> confirmation │ │
│ │ - mcp_view_deliberation() -> transcript │ │
│ └─────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
Data Models
GameState
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Literal

@dataclass
class GameState:
"""Central game state - managed by Orchestrator."""
# Session
session_id: str
case_id: str
phase: Literal["setup", "presentation", "side_selection",
"initial_vote", "deliberation", "final_vote", "verdict"]
# Rounds
round_number: int = 0
max_rounds: int = 20 # Safety limit
stability_threshold: int = 3 # Rounds without vote change to end
rounds_without_change: int = 0
# Votes
votes: Dict[str, Literal["guilty", "not_guilty"]] = field(default_factory=dict)
vote_history: List[Dict[str, str]] = field(default_factory=list)
# Conviction scores (0.0 = certain not guilty, 1.0 = certain guilty)
conviction_scores: Dict[str, float] = field(default_factory=dict)
# Deliberation
speaking_queue: List[str] = field(default_factory=list)
deliberation_log: List[DeliberationTurn] = field(default_factory=list)
# Player
player_side: Literal["defend", "prosecute"] | None = None
player_seat: int = 7 # Which seat is the player
@dataclass
class DeliberationTurn:
"""A single turn in deliberation."""
round_number: int
speaker_id: str
speaker_name: str
argument_type: str # "evidence", "emotional", "logical", "question", etc.
content: str
target_id: str | None = None # Who they're addressing
impact: Dict[str, float] = field(default_factory=dict) # conviction changes
timestamp: datetime = field(default_factory=datetime.now)
Agent Configuration
@dataclass
class JurorConfig:
"""Configuration for a single juror agent."""
# Identity
juror_id: str
seat_number: int
name: str
emoji: str # For display until sprites ready
# Personality (affects reasoning style)
archetype: str # "rationalist", "empath", "cynic", etc.
personality_prompt: str # Detailed persona prompt
# Behavior modifiers
stubbornness: float # 0.0-1.0, how hard to convince
volatility: float # 0.0-1.0, how much conviction swings
influence: float # 0.0-1.0, how persuasive to others
verbosity: float # 0.0-1.0, how long their arguments are
# Model configuration
model_provider: str # "gemini", "openai", "anthropic", "local"
model_id: str # Specific model ID
temperature: float = 0.7
# Tools (future expansion)
tools: List[str] = field(default_factory=list) # ["web_search", "case_lookup"]
# Memory
memory_window: int = 10 # How many turns to remember in detail
@dataclass
class JurorMemory:
"""Memory state for a single juror."""
juror_id: str
# Case understanding
case_summary: str
key_evidence: List[str]
evidence_interpretations: Dict[str, str] # evidence_id -> interpretation
# Deliberation memory
arguments_heard: List[ArgumentMemory]
arguments_made: List[str]
# Relationships
opinions_of_others: Dict[str, float] # juror_id -> trust/agreement (-1 to 1)
# Internal state
current_conviction: float # 0.0-1.0
conviction_history: List[float]
reasoning_chain: List[str] # Why they believe what they believe
doubts: List[str] # Things that could change their mind
@dataclass
class ArgumentMemory:
"""Memory of a single argument heard."""
speaker_id: str
content_summary: str
argument_type: str
persuasiveness: float # How convincing it was to this juror
counter_points: List[str] # Thoughts against it
round_heard: int
Case Data Model
@dataclass
class CriminalCase:
"""A criminal case for deliberation."""
case_id: str
title: str
summary: str # 2-3 paragraph overview
# Charges
charges: List[str]
# Evidence
evidence: List[Evidence]
# Witnesses
witnesses: List[Witness]
# Arguments
prosecution_arguments: List[str]
defense_arguments: List[str]
# Defendant
defendant: Defendant
# Metadata
difficulty: Literal["clear_guilty", "clear_innocent", "ambiguous"]
themes: List[str] # ["eyewitness", "circumstantial", "forensic", etc.]
# For display
year: int
jurisdiction: str
@dataclass
class Evidence:
"""A piece of evidence."""
evidence_id: str
type: str # "physical", "testimonial", "documentary", "forensic"
description: str
strength_prosecution: float # 0.0-1.0
strength_defense: float # 0.0-1.0
contestable: bool
contest_reason: str | None
@dataclass
class Witness:
"""A witness in the case."""
witness_id: str
name: str
role: str # "eyewitness", "expert", "character", etc.
testimony_summary: str
credibility_issues: List[str]
side: Literal["prosecution", "defense", "neutral"]
The 11 Juror Archetypes
jurors:
- id: "juror_1"
name: "Marcus Webb"
archetype: "rationalist"
emoji: "🧠"
personality: |
You are a retired engineer. You believe only in hard evidence and logical
deduction. Emotional appeals annoy you. You often say "Show me the data."
You change your mind only when presented with irrefutable logical arguments.
stubbornness: 0.8
volatility: 0.2
influence: 0.7
initial_lean: "neutral"
- id: "juror_2"
name: "Sarah Chen"
archetype: "empath"
emoji: "💗"
personality: |
You are a social worker. You always consider the human element - the
defendant's background, circumstances, potential for redemption. You're
easily moved by personal stories but skeptical of cold statistics.
stubbornness: 0.4
volatility: 0.7
influence: 0.5
initial_lean: "defense"
- id: "juror_3"
name: "Frank Russo"
archetype: "cynic"
emoji: "😤"
personality: |
You are a retired cop. You've "seen it all" and believe most defendants
are guilty. You're impatient with naive arguments. You trust law
enforcement evidence highly. Hard to convince toward not guilty.
stubbornness: 0.9
volatility: 0.1
influence: 0.6
initial_lean: "prosecution"
- id: "juror_4"
name: "Linda Park"
archetype: "conformist"
emoji: "😐"
personality: |
You are an accountant who avoids conflict. You tend to agree with whoever
spoke last or with the majority. You rarely initiate arguments but will
echo others. Easy to sway but also easy to sway back.
stubbornness: 0.2
volatility: 0.8
influence: 0.2
initial_lean: "majority"
- id: "juror_5"
name: "David Okonkwo"
archetype: "contrarian"
emoji: "🙄"
personality: |
You are a philosophy professor. You play devil's advocate constantly.
If everyone says guilty, you argue not guilty. You value intellectual
discourse over reaching conclusions. You ask probing questions.
stubbornness: 0.6
volatility: 0.5
influence: 0.8
initial_lean: "minority"
- id: "juror_6"
name: "Betty Morrison"
archetype: "impatient"
emoji: "⏰"
personality: |
You are a busy restaurant owner. You want this over quickly. You make
snap judgments and get frustrated with long debates. You often say
"Can we just vote already?" You're persuaded by confident, brief arguments.
stubbornness: 0.5
volatility: 0.6
influence: 0.3
initial_lean: "first_impression"
- id: "juror_7"
name: "[PLAYER]"
archetype: "player"
emoji: "👤"
personality: "Human player"
stubbornness: null
volatility: null
influence: 0.6
initial_lean: "player_choice"
- id: "juror_8"
name: "Dr. James Wright"
archetype: "detail_obsessed"
emoji: "🔍"
personality: |
You are a forensic accountant. You focus on tiny inconsistencies in
testimony and evidence. You often derail discussions with minutiae.
A single contradiction can completely change your view.
stubbornness: 0.7
volatility: 0.4
influence: 0.5
initial_lean: "neutral"
- id: "juror_9"
name: "Pastor Williams"
archetype: "moralist"
emoji: "⚖️"
personality: |
You are a church leader. You see things in black and white - right and
wrong. You believe in justice but also redemption. Moral arguments
resonate with you more than technical ones.
stubbornness: 0.7
volatility: 0.3
influence: 0.6
initial_lean: "gut_feeling"
- id: "juror_10"
name: "Nancy Cooper"
archetype: "pragmatist"
emoji: "💼"
personality: |
You are a business consultant. You think about consequences - what
happens if we convict an innocent person? What if we free a guilty one?
You weigh costs and benefits. You're persuaded by outcome-focused arguments.
stubbornness: 0.5
volatility: 0.5
influence: 0.6
initial_lean: "calculated"
- id: "juror_11"
name: "Miguel Santos"
archetype: "storyteller"
emoji: "📖"
personality: |
You are a novelist. You think in narratives - does the prosecution's
story make sense? Does the defense's? You're swayed by coherent
narratives and suspicious of stories with plot holes.
stubbornness: 0.4
volatility: 0.6
influence: 0.7
initial_lean: "best_story"
- id: "juror_12"
name: "Robert Kim"
archetype: "wildcard"
emoji: "🎲"
personality: |
You are a retired jazz musician. Your logic is unpredictable - you
might fixate on something no one else noticed, or suddenly change
your mind for unclear reasons. You're creative but inconsistent.
stubbornness: 0.3
volatility: 0.9
influence: 0.4
initial_lean: "random"
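At startup the YAML above needs to become JurorConfig objects. A sketch, assuming PyYAML; it returns plain dicts keyed by JurorConfig field names, skips the human player's seat, and drops fields JurorConfig does not carry (such as initial_lean) for brevity:

```python
import yaml

def load_juror_configs(yaml_text: str) -> list:
    """Parse the jurors YAML into per-juror config dicts (AI jurors only)."""
    data = yaml.safe_load(yaml_text)
    configs = []
    for entry in data["jurors"]:
        if entry["archetype"] == "player":
            continue  # seat 7 is the human player, not an agent
        configs.append({
            "juror_id": entry["id"],
            "name": entry["name"],
            "archetype": entry["archetype"],
            "personality_prompt": entry["personality"],
            "stubbornness": entry["stubbornness"],
            "volatility": entry["volatility"],
            "influence": entry["influence"],
        })
    return configs
```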
Conviction Score Mechanics
How Conviction Changes
import random

def calculate_conviction_change(
juror: JurorConfig,
juror_memory: JurorMemory,
argument: DeliberationTurn,
game_state: GameState
) -> float:
"""
Calculate how much an argument shifts a juror's conviction.
Returns: delta to add to conviction score (-0.3 to +0.3 typically)
"""
# Base impact from argument strength (determined by LLM)
base_impact = evaluate_argument_strength(argument) # -1.0 to 1.0
# Personality modifiers
archetype_modifier = get_archetype_modifier(
juror.archetype,
argument.argument_type
)
# e.g., "rationalist" gets 1.5x from "logical" arguments, 0.5x from "emotional"
# Stubbornness reduces all changes
stubbornness_modifier = 1.0 - (juror.stubbornness * 0.7)
# Volatility adds randomness
volatility_noise = random.gauss(0, juror.volatility * 0.1)
# Relationship modifier - trust the speaker?
trust = juror_memory.opinions_of_others.get(argument.speaker_id, 0.0)
trust_modifier = 1.0 + (trust * 0.3) # -30% to +30%
# Conviction resistance - harder to move extremes
current = juror_memory.current_conviction
extreme_resistance = 1.0 - (abs(current - 0.5) * 0.5)
# Calculate final delta
delta = (
base_impact
* archetype_modifier
* stubbornness_modifier
* trust_modifier
* extreme_resistance
+ volatility_noise
)
# Clamp to reasonable range
return max(-0.3, min(0.3, delta))
def check_vote_flip(juror_memory: JurorMemory) -> bool:
    """Check whether the conviction score warrants a vote change."""
    # The standing vote reflects the conviction *before* the latest update;
    # conviction_history[-1] is the post-update score, so look one step back.
    if len(juror_memory.conviction_history) < 2:
        return False
    previous_conviction = juror_memory.conviction_history[-2]
    current_vote_is_guilty = previous_conviction > 0.5
    new_conviction = juror_memory.current_conviction
    # Hysteresis - the score must cross the threshold by a margin to flip
    if current_vote_is_guilty and new_conviction < 0.4:
        return True  # Flip to not guilty
    if not current_vote_is_guilty and new_conviction > 0.6:
        return True  # Flip to guilty
    return False
Archetype Argument Modifiers
ARCHETYPE_MODIFIERS = {
"rationalist": {
"logical": 1.5,
"evidence": 1.3,
"emotional": 0.4,
"moral": 0.6,
"narrative": 0.7,
"question": 1.2,
},
"empath": {
"logical": 0.6,
"evidence": 0.8,
"emotional": 1.5,
"moral": 1.3,
"narrative": 1.2,
"question": 0.9,
},
"cynic": {
"logical": 0.8,
"evidence": 1.4, # Trusts evidence
"emotional": 0.3,
"moral": 0.5,
"narrative": 0.6,
"question": 0.7,
},
# ... etc for all archetypes
}
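calculate_conviction_change above calls get_archetype_modifier, which isn't defined in this document. A minimal sketch over the ARCHETYPE_MODIFIERS table, defaulting to a neutral 1.0 when an archetype or argument type is missing:

```python
# Subset of the table above; the full dict covers all archetypes
ARCHETYPE_MODIFIERS = {
    "rationalist": {"logical": 1.5, "evidence": 1.3, "emotional": 0.4,
                    "moral": 0.6, "narrative": 0.7, "question": 1.2},
    "empath": {"logical": 0.6, "evidence": 0.8, "emotional": 1.5,
               "moral": 1.3, "narrative": 1.2, "question": 0.9},
}

def get_archetype_modifier(archetype: str, argument_type: str) -> float:
    """How receptive this archetype is to this argument type (1.0 = neutral)."""
    return ARCHETYPE_MODIFIERS.get(archetype, {}).get(argument_type, 1.0)
```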
Agent Memory Architecture
Memory Layers
┌─────────────────────────────────────────────────────────────┐
│ JUROR MEMORY SYSTEM │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ LAYER 1: CASE KNOWLEDGE (LlamaIndex) │ │
│ │ - Full case file indexed │ │
│ │ - Evidence details retrievable │ │
│ │ - Witness statements searchable │ │
│ │ - Persistent across session │ │
│ └─────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ LAYER 2: DELIBERATION MEMORY (Sliding Window) │ │
│ │ - Last N turns in full detail │ │
│ │ - Summarized history beyond window │ │
│ │ - Key moments flagged for long-term │ │
│ └─────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ LAYER 3: REASONING STATE (Agent Internal) │ │
│ │ - Current conviction + reasoning chain │ │
│ │ - Key doubts and certainties │ │
│ │ - Opinions of other jurors │ │
│ │ - Arguments to make / avoid │ │
│ └─────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ LAYER 4: PERSONA (Static) │ │
│ │ - Archetype definition │ │
│ │ - Personality prompt │ │
│ │ - Behavior modifiers │ │
│ └─────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
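Layer 2's sliding window implies compressing turns that age out of the window into a running summary (the memory/summarizer.py module in the file structure below). A sketch, with summarize standing in for an LLM-backed summarizer:

```python
def compress_memory(turns, window, summarize):
    """Keep the last `window` turns verbatim; fold older turns into one summary."""
    recent = turns[-window:] if window > 0 else []
    older = turns[:-window] if len(turns) > window else []
    summary = summarize(older) if older else ""
    return summary, recent
```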
Memory Injection into Agent Prompt
def build_juror_prompt(
juror: JurorConfig,
memory: JurorMemory,
game_state: GameState,
case: CriminalCase,
task: str # "speak" | "react" | "vote"
) -> str:
"""Build the full prompt for a juror agent."""
prompt = f"""
# JUROR IDENTITY
You are {juror.name}, Juror #{juror.seat_number}.
{juror.personality_prompt}
# THE CASE: {case.title}
{case.summary}
# KEY EVIDENCE YOU REMEMBER
{format_evidence_memory(memory.key_evidence, memory.evidence_interpretations)}
# YOUR CURRENT POSITION
- Conviction: {conviction_to_text(memory.current_conviction)}
- Your reasoning: {' '.join(memory.reasoning_chain[-3:])}
- Your doubts: {', '.join(memory.doubts[:3]) if memory.doubts else 'None currently'}
# RECENT DELIBERATION (Last {len(memory.arguments_heard[-juror.memory_window:])} turns)
{format_recent_turns(memory.arguments_heard[-juror.memory_window:])}
# YOUR OPINIONS OF OTHER JURORS
{format_juror_opinions(memory.opinions_of_others)}
# CURRENT VOTE TALLY
Guilty: {list(game_state.votes.values()).count('guilty')}
Not Guilty: {list(game_state.votes.values()).count('not_guilty')}
# YOUR TASK
{get_task_prompt(task, juror.archetype)}
"""
return prompt
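build_juror_prompt relies on a conviction_to_text helper that isn't shown. One plausible mapping from the 0.0-1.0 score to juror-voice phrasing:

```python
def conviction_to_text(score: float) -> str:
    """Translate a conviction score into natural language for the prompt."""
    if score >= 0.85:
        return "firmly convinced of guilt"
    if score >= 0.6:
        return "leaning guilty"
    if score > 0.4:
        return "genuinely torn"
    if score > 0.15:
        return "leaning not guilty"
    return "firmly convinced of innocence"
```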
Orchestration Flow
Smolagents Integration
from smolagents import CodeAgent, Tool, LiteLLMModel
from typing import List
class JurorAgent:
"""Wrapper around smolagents CodeAgent for a juror."""
    def __init__(self, config: JurorConfig, tools: List[Tool] | None = None):
self.config = config
self.memory = JurorMemory(juror_id=config.juror_id)
# Model via LiteLLM for flexibility
self.model = LiteLLMModel(
model_id=f"{config.model_provider}/{config.model_id}",
temperature=config.temperature
)
# Default tools (expandable)
default_tools = [
self.create_evidence_lookup_tool(),
self.create_case_query_tool(),
]
self.agent = CodeAgent(
tools=default_tools + (tools or []),
model=self.model,
max_steps=3, # Limit reasoning steps
)
def create_evidence_lookup_tool(self) -> Tool:
"""Tool to look up specific evidence."""
# LlamaIndex query under the hood
pass
def create_case_query_tool(self) -> Tool:
"""Tool to query case details."""
# LlamaIndex query under the hood
pass
async def generate_argument(
self,
game_state: GameState,
case: CriminalCase
) -> DeliberationTurn:
"""Generate this juror's argument for their turn."""
prompt = build_juror_prompt(
self.config,
self.memory,
game_state,
case,
task="speak"
)
        # CodeAgent.run is synchronous; no await needed
        response = self.agent.run(prompt)
return parse_argument_response(response, self.config, game_state)
async def react_to_argument(
self,
argument: DeliberationTurn,
game_state: GameState,
case: CriminalCase
) -> float:
"""React to another juror's argument, update conviction."""
# Update memory with new argument
self.memory.arguments_heard.append(
ArgumentMemory(
speaker_id=argument.speaker_id,
content_summary=summarize_argument(argument.content),
argument_type=argument.argument_type,
persuasiveness=0.0, # Will be calculated
counter_points=[],
round_heard=game_state.round_number
)
)
# Calculate conviction change
delta = calculate_conviction_change(
self.config,
self.memory,
argument,
game_state
)
self.memory.current_conviction += delta
self.memory.current_conviction = max(0.0, min(1.0, self.memory.current_conviction))
self.memory.conviction_history.append(self.memory.current_conviction)
return delta
class OrchestratorAgent:
"""Master agent that coordinates the deliberation."""
def __init__(
self,
jurors: List[JurorAgent],
judge: JudgeAgent,
case: CriminalCase
):
self.jurors = {j.config.juror_id: j for j in jurors}
self.judge = judge
self.case = case
        self.state = GameState(
            session_id=str(uuid4()),  # from uuid import uuid4
            case_id=case.case_id,
            phase="setup"  # GameState.phase has no default value
        )
async def run_deliberation_round(self) -> List[DeliberationTurn]:
"""Run a single round of deliberation."""
self.state.round_number += 1
turns = []
# Select 1-4 random speakers (not player unless it's their turn)
num_speakers = random.randint(1, 4)
available = [j for j in self.jurors.keys() if j != "juror_7"] # Exclude player
speakers = random.sample(available, min(num_speakers, len(available)))
# Each speaker makes argument
for speaker_id in speakers:
juror = self.jurors[speaker_id]
turn = await juror.generate_argument(self.state, self.case)
turns.append(turn)
# All other jurors react
for other_id, other_juror in self.jurors.items():
if other_id != speaker_id and other_id != "juror_7":
delta = await other_juror.react_to_argument(
turn, self.state, self.case
)
turn.impact[other_id] = delta
# Log turn
self.state.deliberation_log.append(turn)
# Check for vote changes
self._process_vote_changes()
# Check stability
if self._votes_changed_this_round(turns):
self.state.rounds_without_change = 0
else:
self.state.rounds_without_change += 1
return turns
def _process_vote_changes(self):
"""Check all jurors for vote flips."""
for juror_id, juror in self.jurors.items():
if juror_id == "juror_7": # Player votes manually
continue
if check_vote_flip(juror.memory):
old_vote = self.state.votes[juror_id]
new_vote = "guilty" if juror.memory.current_conviction > 0.5 else "not_guilty"
self.state.votes[juror_id] = new_vote
# Could trigger announcement
def check_should_end(self) -> bool:
"""Check if deliberation should end."""
# Unanimous verdict
votes = list(self.state.votes.values())
if len(set(votes)) == 1:
return True
# Votes stabilized
if self.state.rounds_without_change >= self.state.stability_threshold:
return True
# Max rounds reached
if self.state.round_number >= self.state.max_rounds:
return True
return False
ElevenLabs Integration
Judge Narrator
# Note: uses the pre-1.0 elevenlabs SDK (generate/stream helper functions)
from elevenlabs import Voice, generate, stream
class JudgeAgent:
"""The judge/narrator - uses ElevenLabs for voice."""
def __init__(self, voice_id: str = None):
self.voice_id = voice_id or "judge_voice_id" # Configure
self.voice_settings = {
"stability": 0.7,
"similarity_boost": 0.8,
"style": 0.5, # Authoritative
}
async def narrate(self, text: str, stream_output: bool = True) -> bytes:
"""Generate narration audio."""
audio = generate(
text=text,
voice=Voice(voice_id=self.voice_id),
model="eleven_multilingual_v2",
stream=stream_output
)
if stream_output:
return stream(audio)
return audio
def get_case_presentation(self, case: CriminalCase) -> str:
"""Script for presenting the case."""
return f"""
Members of the jury. You are here today to determine the fate of
{case.defendant.name}, who stands accused of {', '.join(case.charges)}.
{case.summary}
You will hear the evidence. You will deliberate. And you will reach
a verdict. The burden of proof lies with the prosecution, who must
prove guilt beyond a reasonable doubt.
Let us begin.
"""
def get_vote_announcement(self, votes: Dict[str, str]) -> str:
"""Script for announcing vote."""
guilty = sum(1 for v in votes.values() if v == "guilty")
not_guilty = 12 - guilty
return f"""
The current vote stands at {guilty} for guilty,
{not_guilty} for not guilty.
{"The jury remains divided." if guilty not in [0, 12] else ""}
{"A unanimous verdict has been reached." if guilty in [0, 12] else ""}
"""
UI Components
Kinetic Text Animation
// For animated text display (like After Effects kinetic typography)
// Will sync with ElevenLabs audio or simulate typing
class KineticText {
constructor(container, options = {}) {
this.container = container;
this.speed = options.speed || 50; // ms per character
this.variance = options.variance || 20; // randomness
}
async display(text, audioUrl = null) {
    // If audio is provided, sync with it
    if (audioUrl) {
      // displayWithAudio (not shown) would align characters to audio playback
      return this.displayWithAudio(text, audioUrl);
    }
// Otherwise, simulate speaking
return this.displaySimulated(text);
}
async displaySimulated(text) {
this.container.innerHTML = '';
for (let i = 0; i < text.length; i++) {
const char = text[i];
const span = document.createElement('span');
span.textContent = char;
span.style.opacity = '0';
span.style.animation = 'fadeInChar 0.1s forwards';
this.container.appendChild(span);
// Variable delay for natural feel
const delay = this.speed + (Math.random() - 0.5) * this.variance;
await this.sleep(delay);
}
}
sleep(ms) {
return new Promise(resolve => setTimeout(resolve, ms));
}
}
Gradio UI Structure
import gradio as gr
def create_ui():
with gr.Blocks(css=CUSTOM_CSS, theme=gr.themes.Base()) as demo:
# State
game_state = gr.State(None)
# Header
gr.HTML("<h1>12 ANGRY AGENTS</h1>")
with gr.Row():
# Left: Jury Box
with gr.Column(scale=1):
gr.Markdown("### The Jury")
jury_box = gr.HTML(render_jury_box) # 12 seats with emojis/votes
vote_tally = gr.HTML() # "7-5 GUILTY"
# Center: Deliberation
with gr.Column(scale=2):
gr.Markdown("### Deliberation Room")
deliberation_chat = gr.Chatbot(
label="Deliberation",
height=400,
show_label=False
)
# Player input
with gr.Row():
strategy_select = gr.Dropdown(
choices=[
"Challenge Evidence",
"Question Witness Credibility",
"Appeal to Reasonable Doubt",
"Present Alternative Theory",
"Address Specific Juror",
"Call for Vote"
],
label="Your Strategy"
)
speak_btn = gr.Button("Speak", variant="primary")
with gr.Row():
pass_btn = gr.Button("Pass Turn")
call_vote_btn = gr.Button("Call Final Vote")
# Right: Case File
with gr.Column(scale=1):
gr.Markdown("### Case File")
case_summary = gr.Markdown()
with gr.Accordion("Evidence", open=False):
evidence_list = gr.HTML()
with gr.Accordion("Witnesses", open=False):
witness_list = gr.HTML()
# Audio player for Judge
audio_output = gr.Audio(label="Judge", autoplay=True, visible=False)
    return demo

demo = create_ui()
demo.launch(mcp_server=True)  # MCP server enabled - exposes tools to external agents
LlamaIndex Case Database
Index Structure
from llama_index.core import VectorStoreIndex, Document
from llama_index.core.node_parser import SentenceSplitter
class CaseDatabase:
"""LlamaIndex-powered case database."""
def __init__(self, cases_dir: str):
self.cases = self._load_cases(cases_dir)
self.index = self._build_index()
def _build_index(self) -> VectorStoreIndex:
"""Build searchable index of all cases."""
documents = []
for case in self.cases:
# Index case summary
documents.append(Document(
text=case.summary,
metadata={"case_id": case.case_id, "type": "summary"}
))
# Index each piece of evidence
for evidence in case.evidence:
documents.append(Document(
text=f"{evidence.type}: {evidence.description}",
metadata={
"case_id": case.case_id,
"type": "evidence",
"evidence_id": evidence.evidence_id
}
))
# Index witness testimonies
for witness in case.witnesses:
documents.append(Document(
text=f"{witness.name} ({witness.role}): {witness.testimony_summary}",
metadata={
"case_id": case.case_id,
"type": "witness",
"witness_id": witness.witness_id
}
))
parser = SentenceSplitter(chunk_size=512, chunk_overlap=50)
nodes = parser.get_nodes_from_documents(documents)
return VectorStoreIndex(nodes)
    def query_evidence(self, case_id: str, query: str) -> List[str]:
        """Query evidence for a specific case."""
        # LlamaIndex expects MetadataFilters, not a plain dict
        from llama_index.core.vector_stores import MetadataFilters, ExactMatchFilter
        query_engine = self.index.as_query_engine(
            filters=MetadataFilters(filters=[ExactMatchFilter(key="case_id", value=case_id)])
        )
        response = query_engine.query(query)
        return [node.get_content() for node in response.source_nodes]
def get_random_case(self, difficulty: str = None) -> CriminalCase:
"""Get a random case, optionally filtered by difficulty."""
if difficulty:
filtered = [c for c in self.cases if c.difficulty == difficulty]
return random.choice(filtered)
return random.choice(self.cases)
Real Case Data Sources
Primary: Old Bailey Online (Historical)
Dataset: 197,745 criminal trials from London's Central Criminal Court (1674-1913)
Access:
- Full XML download: https://orda.shef.ac.uk/articles/dataset/Old_Bailey_Online_XML_Data/4775434
- API: https://www.oldbaileyonline.org/static/API.jsp
- 2,163 trial XML files + 475 Ordinary's Accounts
Data Fields:
- Trial ID, date, defendant name/gender
- Offence category: theft, kill, deception, violent theft, sexual, etc.
- Verdict, punishment
- Full trial transcript text
Why This Works:
- Historical cases avoid sensitivity around modern defendants
- Rich narrative transcripts perfect for agent reasoning
- 18th-century language adds unique flavor
- Verdicts are known (ground truth for comparison)
Integration Example:
import xml.etree.ElementTree as ET
def load_old_bailey_case(xml_path: str) -> CriminalCase:
    """Parse Old Bailey XML into the CriminalCase model."""
    tree = ET.parse(xml_path)
    root = tree.getroot()
    return CriminalCase(
        case_id=root.find(".//trialAccount").get("id"),
        title=f"The Crown v. {root.find('.//persName').text}",
        summary=extract_trial_text(root),
        charges=[root.find(".//offence").get("category")],
        evidence=extract_evidence_from_transcript(root),
        # CriminalCase has no defaults for the fields below; fill them so the
        # constructor succeeds, using parser helpers analogous to those above
        witnesses=extract_witnesses_from_transcript(root),
        prosecution_arguments=[],  # derived in a later enrichment pass
        defense_arguments=[],
        defendant=extract_defendant(root),
        difficulty=infer_difficulty_from_verdict(root),
        themes=[root.find(".//offence").get("category")],
        year=int(root.find(".//date").get("year")),
        jurisdiction="London, England"
    )
Secondary: National Registry of Exonerations (Modern)
Dataset: All U.S. exonerations since 1989 (3,000+ cases)
Access: https://www.law.umich.edu/special/exoneration/Pages/about.aspx
Data Fields:
- Crime type, state, year of conviction/exoneration
- Contributing factors (eyewitness misID, false confession, etc.)
- DNA involvement, sentence served
Why This Works:
- Dramatic "wrongful conviction" cases
- Clear evidence of reasonable doubt
- Tests agents' ability to weigh conflicting evidence types
Fallback: Curated YAML Cases
For demo stability, include 3-5 handcrafted cases in cases/predefined/:
- case_001_robbery.yaml - Clear guilty (baseline test)
- case_002_murder.yaml - Ambiguous (compelling demo)
- case_003_exoneration.yaml - DNA reversal scenario
This ensures the demo works even if external data sources are unavailable.
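Loading a curated case is a plain YAML parse plus a fail-fast check that the demo-critical fields exist. A sketch, assuming PyYAML and YAML keys mirroring the CriminalCase fields:

```python
import yaml

REQUIRED_FIELDS = {"case_id", "title", "summary", "charges", "evidence", "difficulty"}

def load_predefined_case(yaml_text: str) -> dict:
    """Parse a curated case file; raise early if demo-critical fields are missing."""
    data = yaml.safe_load(yaml_text)
    missing = REQUIRED_FIELDS - set(data)
    if missing:
        raise ValueError(f"case file missing fields: {sorted(missing)}")
    return data
```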
File Structure
12_angry_agents/
├── app.py # Gradio entry point
├── PRD.md # This document
├── requirements.txt
├── .env.example
│
├── core/
│ ├── __init__.py
│ ├── game_state.py # GameState, DeliberationTurn models
│ ├── orchestrator.py # OrchestratorAgent
│ ├── conviction.py # Conviction score mechanics
│ └── turn_manager.py # Turn selection, stability check
│
├── agents/
│ ├── __init__.py
│ ├── base_juror.py # JurorAgent base class
│ ├── judge.py # JudgeAgent (ElevenLabs)
│ ├── player.py # PlayerAgent (human interface)
│ └── configs/
│ └── jurors.yaml # 11 juror configurations
│
├── case_db/
│ ├── __init__.py
│ ├── database.py # CaseDatabase (LlamaIndex)
│ ├── models.py # CriminalCase, Evidence, Witness
│ └── cases/
│ ├── case_001.yaml
│ ├── case_002.yaml
│ └── ...
│
├── memory/
│ ├── __init__.py
│ ├── juror_memory.py # JurorMemory management
│ └── summarizer.py # Memory compression
│
├── ui/
│ ├── __init__.py
│ ├── components.py # Gradio components
│ ├── jury_box.py # Jury box renderer
│ ├── chat.py # Deliberation chat
│ └── static/
│ ├── styles.css
│ └── kinetic.js # Text animations
│
├── mcp/
│ ├── __init__.py
│ └── tools.py # MCP tool definitions
│
└── tests/
├── test_conviction.py
├── test_orchestrator.py
└── test_memory.py
Development Phases
Phase 1: Foundation (4-6 hours)
- Project setup, dependencies
- Data models (GameState, Case, Juror)
- Basic Gradio UI skeleton
- Single juror agent working
Phase 2: Multi-Agent (4-6 hours)
- All 11 juror configs
- Orchestrator with turn management
- Conviction score system
- Memory system (basic)
Phase 3: Integration (3-4 hours)
- LlamaIndex case database
- ElevenLabs judge narration
- Player interaction flow
- Vote tracking and stability
Phase 4: Polish (2-3 hours)
- UI animations (kinetic text)
- Jury box visualization
- MCP server tools
- Demo video recording
Success Metrics
- 11 agents deliberating autonomously - TRUE agent behavior
- Judge narrating with ElevenLabs - Audio wow factor
- Conviction scores shifting - Visible persuasion
- Player can influence outcome - Agency
- MCP tools functional - External AI can play
- Runs without crashes - Stability
CRITICAL: Performance Optimizations
The Latency Trap - SOLVED
Problem: if one juror speaks and the other 11 agents each react individually, that is 12 LLM calls per turn - far too slow.
Solution: Batch Jury State Update
class JuryStateManager:
"""
Single LLM call to update ALL silent jurors' conviction scores.
Replaces 11 individual react_to_argument() calls.
"""
async def batch_update_convictions(
self,
argument: DeliberationTurn,
silent_jurors: List[JurorConfig],
juror_memories: Dict[str, JurorMemory],
game_state: GameState
) -> Dict[str, ConvictionUpdate]:
"""
ONE LLM call updates all 11 jurors' reactions.
"""
prompt = f"""
You are simulating how 11 different jurors would react to this argument.
ARGUMENT BY {argument.speaker_name}:
"{argument.content}"
For each juror below, determine:
1. conviction_delta: float (-0.3 to +0.3) - how much their guilt conviction changes
2. reaction: str - brief internal thought (10 words max)
3. persuaded: bool - did this significantly move them?
JURORS:
{self._format_juror_profiles_compact(silent_jurors, juror_memories)}
Respond in JSON:
{{
"juror_1": {{"delta": 0.1, "reaction": "Good point about the timeline", "persuaded": false}},
"juror_2": {{"delta": -0.2, "reaction": "Too emotional, but touching", "persuaded": true}},
...
}}
"""
response = await self.model.generate(prompt)
return parse_batch_response(response)
Result: 1 speaker + 1 batch reaction = 2 LLM calls per turn (not 12)
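The batch updater calls `parse_batch_response`, which is referenced but never defined. A minimal sketch of it, assuming the model may wrap its JSON in markdown fences and that `ConvictionUpdate` is the simple three-field record implied by the prompt (this dataclass is a stand-in, not the project's actual definition):

```python
import json
import re
from dataclasses import dataclass
from typing import Dict


@dataclass
class ConvictionUpdate:
    """Stand-in for the ConvictionUpdate model implied by the prompt."""
    delta: float
    reaction: str
    persuaded: bool


def parse_batch_response(raw: str) -> Dict[str, ConvictionUpdate]:
    """Parse the batch JSON reply, tolerating markdown code fences."""
    # Grab the outermost {...} so stray ```json fences don't break parsing
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match is None:
        return {}
    data = json.loads(match.group(0))
    updates: Dict[str, ConvictionUpdate] = {}
    for juror_id, fields in data.items():
        # Clamp delta to the range the prompt promised (-0.3 to +0.3)
        delta = max(-0.3, min(0.3, float(fields.get("delta", 0.0))))
        updates[juror_id] = ConvictionUpdate(
            delta=delta,
            reaction=str(fields.get("reaction", "")),
            persuaded=bool(fields.get("persuaded", False)),
        )
    return updates
```

Clamping here is a safety net: even if the model returns an out-of-range delta, a single argument can never swing a juror by more than 0.3.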
Active vs Passive Jurors
# Each turn, only 2-3 jurors are "active listeners" (full memory update)
# Others get simplified heuristic updates
def select_active_listeners(
    game_state: GameState,
    num: int = 3,
) -> List[str]:
    """Select jurors who will fully process this turn."""
    # Prioritize: jurors on the fence, recent vote flips, then random fill
    candidates = []
    # On the fence (conviction 0.35-0.65)
    for jid, memory in game_state.juror_memories.items():
        if 0.35 < memory.current_conviction < 0.65:
            candidates.append((jid, 2))  # Priority 2
    # Recently changed vote
    for jid in game_state.recently_flipped:
        candidates.append((jid, 3))  # Priority 3
    # Random others (low priority, keeps selection unpredictable)
    for jid in game_state.juror_memories:
        candidates.append((jid, 1))  # Priority 1
    # Weight and select without replacement
    return weighted_sample(candidates, num)
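`weighted_sample` is likewise undefined. One way to implement it, assuming a juror may appear several times in `candidates` (e.g. both "on the fence" and "random") and that duplicate weights should accumulate:

```python
import random
from typing import Dict, List, Tuple


def weighted_sample(candidates: List[Tuple[str, int]], num: int) -> List[str]:
    """Sample up to `num` distinct juror ids, weighted by priority.

    Duplicate entries for the same juror have their priorities summed,
    so a fence-sitter who was also recently addressed is extra likely
    to be picked. Selection is without replacement.
    """
    weights: Dict[str, int] = {}
    for jid, priority in candidates:
        weights[jid] = weights.get(jid, 0) + priority

    chosen: List[str] = []
    pool = dict(weights)
    while pool and len(chosen) < num:
        ids = list(pool)
        pick = random.choices(ids, weights=[pool[j] for j in ids], k=1)[0]
        chosen.append(pick)
        del pool[pick]  # without replacement
    return chosen
```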
Context Window Bloat - SOLVED
Problem: deliberation_log grows unbounded
Solution: Aggressive Rolling Summarization
class MemorySummarizer:
"""Compresses old deliberation history."""
SUMMARY_INTERVAL = 5 # Summarize every 5 rounds
KEEP_RECENT = 3 # Keep last 3 turns in full detail
async def maybe_summarize(self, memory: JurorMemory, round_num: int):
"""Compress old turns if needed."""
if round_num % self.SUMMARY_INTERVAL != 0:
return
# Split: recent (keep full) vs old (summarize)
old_turns = memory.arguments_heard[:-self.KEEP_RECENT]
recent_turns = memory.arguments_heard[-self.KEEP_RECENT:]
if not old_turns:
return
# Summarize old turns into compact form
summary = await self._compress_turns(old_turns)
# Replace old turns with summary object
memory.deliberation_summary = summary
memory.arguments_heard = recent_turns
async def _compress_turns(self, turns: List[ArgumentMemory]) -> str:
"""LLM call to compress multiple turns into summary."""
prompt = f"""
Summarize these {len(turns)} deliberation turns into 3-5 bullet points.
Focus on: key arguments made, who was persuasive, major position shifts.
TURNS:
{self._format_turns(turns)}
Respond with bullet points only.
"""
return await self.model.generate(prompt)
# Memory structure with summary
@dataclass
class JurorMemory:
# ... existing fields ...
# Compressed history (replaces old arguments_heard entries)
deliberation_summary: str = "" # "• Juror 3 argued about timeline..."
# Only recent turns in full detail
arguments_heard: List[ArgumentMemory] # Max ~10 entries
LLM Call Budget Per Round
| Action | Calls | Notes |
|---|---|---|
| 1-4 speakers generate arguments | 1-4 | Parallelizable |
| Batch conviction update | 1 | All 11 reactions |
| Memory summarization | 0-1 | Every 5 rounds |
| Judge narration (ElevenLabs) | 1 | Audio only |
| TOTAL | 3-7 | Down from 12-48 |
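The "Parallelizable" note in the budget table matters: the 1-4 speaker generations are independent of each other, so they can be awaited together, while the batch conviction update must run afterwards because it needs every argument as input. A minimal sketch (the callable signature is illustrative):

```python
import asyncio
from typing import Awaitable, Callable, List


async def run_speakers_in_parallel(
    speaker_fns: List[Callable[[], Awaitable[str]]],
) -> List[str]:
    """Fire this round's 1-4 speaker generations concurrently.

    asyncio.gather preserves input order, so arguments come back in
    the same order the speakers were scheduled.
    """
    return await asyncio.gather(*(fn() for fn in speaker_fns))
```

With this shape, a round's wall-clock LLM latency is roughly one speaker call plus one batch call, regardless of how many jurors speak.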
External Participant System (MCP + Human)
Architecture: Swappable Juror Seats
Any of the 11 AI juror seats can be replaced by:
- External AI Agent (via MCP) - Another AI system joins as juror
- Human Player (via UI) - Additional human joins
- Default AI (Gemini) - Predefined personality
@dataclass
class JurorSeat:
"""A seat in the jury that can be filled by different participant types."""
seat_number: int
participant_type: Literal["ai_default", "ai_external", "human"]
participant_id: str | None = None
# For AI default
config: JurorConfig | None = None
agent: JurorAgent | None = None
# For external (MCP or human)
external_connection: ExternalConnection | None = None
class JuryManager:
"""Manages the 12 jury seats with mixed participant types."""
def __init__(self):
self.seats: Dict[int, JurorSeat] = {}
self._init_default_seats()
def _init_default_seats(self):
"""Initialize all 12 seats with default AI jurors."""
for i in range(1, 13):
if i == 7: # Reserved for primary player
self.seats[i] = JurorSeat(
seat_number=i,
participant_type="human",
participant_id="player_1"
)
else:
config = load_juror_config(i)
self.seats[i] = JurorSeat(
seat_number=i,
participant_type="ai_default",
config=config,
agent=JurorAgent(config)
)
def replace_with_external(
self,
seat_number: int,
participant_type: Literal["ai_external", "human"],
participant_id: str
) -> bool:
"""Replace a default AI with external participant."""
if seat_number == 7:
return False # Primary player seat protected
if seat_number not in self.seats:
return False
self.seats[seat_number] = JurorSeat(
seat_number=seat_number,
participant_type=participant_type,
participant_id=participant_id,
external_connection=ExternalConnection(participant_id)
)
return True
def get_participant_for_turn(self, seat_number: int) -> TurnHandler:
"""Get appropriate handler for a seat's turn."""
seat = self.seats[seat_number]
if seat.participant_type == "ai_default":
return AITurnHandler(seat.agent)
elif seat.participant_type == "ai_external":
return MCPTurnHandler(seat.external_connection)
else: # human
return HumanTurnHandler(seat.participant_id)
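`AITurnHandler`, `MCPTurnHandler`, and `HumanTurnHandler` are returned above but never defined. A sketch of the shared interface they might implement; the method names (`take_turn`, `generate_argument`, `await_argument`, `wait_for_ui_input`) are assumptions, not a fixed API:

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class TurnHandler(ABC):
    """Uniform interface so the orchestrator never cares who fills a seat."""

    @abstractmethod
    async def take_turn(self, game_state: Dict[str, Any]) -> Dict[str, Any]:
        """Return an argument payload, or a pass marker."""


class AITurnHandler(TurnHandler):
    def __init__(self, agent):
        self.agent = agent

    async def take_turn(self, game_state):
        # Default AI juror generates its argument in-process
        return await self.agent.generate_argument(game_state)


class MCPTurnHandler(TurnHandler):
    def __init__(self, connection):
        self.connection = connection

    async def take_turn(self, game_state):
        # External agent is told it's their turn; we wait for their
        # mcp_make_argument / mcp_pass_turn call, with a timeout so a
        # silent external agent cannot stall the deliberation
        return await self.connection.await_argument(timeout=60)


class HumanTurnHandler(TurnHandler):
    def __init__(self, participant_id: str):
        self.participant_id = participant_id

    async def take_turn(self, game_state):
        # Assumed UI hook: blocks until the human submits via Gradio
        return await wait_for_ui_input(self.participant_id)
```

The timeout-then-pass behavior in `MCPTurnHandler` is a design choice worth keeping: it doubles as the crash-resilience path when an external participant disconnects mid-game.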
MCP Tools for External Participants
# MCP Server exposes these tools for external AI agents
def mcp_join_as_juror(
case_id: str,
preferred_seat: int | None = None
) -> Dict:
"""
Join an active case as a juror.
An external AI agent can take over any non-player seat.
Returns seat assignment and case briefing.
Args:
case_id: The case to join
preferred_seat: Preferred seat number (2-6, 8-12), or None for auto-assign
Returns:
seat_number: Your assigned seat
case_briefing: Summary of the case
your_persona: Suggested personality (can ignore)
current_state: Vote tally, round number
"""
pass
def mcp_get_deliberation_state(case_id: str, seat_number: int) -> Dict:
"""
Get current state of deliberation.
Returns:
recent_arguments: Last 5 arguments made
vote_tally: Current guilty/not-guilty count
your_conviction: Your current conviction score
pending_speakers: Who speaks next
is_your_turn: Whether you should speak now
"""
pass
def mcp_make_argument(
case_id: str,
seat_number: int,
argument_type: str, # "evidence", "emotional", "logical", "question"
content: str,
target_juror: int | None = None
) -> Dict:
"""
Make an argument during your turn.
Returns:
accepted: Whether argument was processed
reactions: Brief summary of jury reactions
vote_changes: Any votes that flipped
"""
pass
def mcp_cast_vote(
case_id: str,
seat_number: int,
vote: Literal["guilty", "not_guilty"]
) -> Dict:
"""
Cast or change your vote.
Returns:
recorded: Confirmation
new_tally: Updated vote count
"""
pass
def mcp_pass_turn(case_id: str, seat_number: int) -> Dict:
"""Pass your turn without speaking."""
pass
Human Join Flow (Additional Players)
1. Primary player starts game (seat 7)
2. Game generates shareable room code
3. Additional humans can join via:
- URL with room code
- Gradio UI "Join as Juror" button
4. They get assigned available seat (2-6, 8-12)
5. When it's their turn, UI prompts for input
6. They see same case file, deliberation history
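Step 2's shareable room code can be generated with the standard library; a small sketch (the 6-character length and the excluded characters are choices, not requirements):

```python
import secrets
import string


def generate_room_code(length: int = 6) -> str:
    """Generate a shareable room code like 'K7PMQ2'.

    Ambiguous characters (0/O and 1/I) are excluded so codes are easy
    to read aloud; secrets.choice keeps codes unguessable.
    """
    alphabet = "".join(
        c for c in string.ascii_uppercase + string.digits if c not in "0O1I"
    )
    return "".join(secrets.choice(alphabet) for _ in range(length))
```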
Model Configuration
Default: Gemini Flash 2.5
# config/models.yaml
default_model:
provider: "gemini"
model_id: "gemini-2.5-flash"
temperature: 0.7
max_tokens: 1024
# Easily swappable per-agent or globally
model_overrides:
judge:
provider: "gemini"
model_id: "gemini-2.5-flash" # Fast for narration scripts
batch_updater:
provider: "gemini"
model_id: "gemini-2.5-flash" # Handles all conviction updates
# Individual juror overrides (optional)
juror_5: # The contrarian philosopher
provider: "anthropic"
model_id: "claude-sonnet-4-20250514"
temperature: 0.9
LiteLLM Integration
from litellm import acompletion

class ModelRouter:
    """Route to any model via LiteLLM."""

    def __init__(self, config_path: str = "config/models.yaml"):
        self.config = load_yaml(config_path)
        self.default = self.config["default_model"]

    def get_model_for(self, agent_id: str) -> Dict:
        """Get model config for specific agent."""
        overrides = self.config.get("model_overrides", {})
        return overrides.get(agent_id, self.default)

    async def generate(
        self,
        agent_id: str,
        prompt: str,
        **kwargs
    ) -> str:
        """Generate completion using appropriate model."""
        config = self.get_model_for(agent_id)
        # litellm's completion() is synchronous; acompletion() is the async variant
        response = await acompletion(
            model=f"{config['provider']}/{config['model_id']}",
            messages=[{"role": "user", "content": prompt}],
            temperature=config.get("temperature", 0.7),
            max_tokens=config.get("max_tokens", 1024),
            **kwargs
        )
        return response.choices[0].message.content
Case Data Architecture
Dual Source: Real + Fallback
class CaseLoader:
"""Load cases from real data or fallback to predefined."""
def __init__(
self,
real_data_path: str | None = None,
fallback_path: str = "cases/predefined/"
):
self.real_data_path = real_data_path
self.fallback_path = fallback_path
# Try to load real data
self.real_cases = self._load_real_cases() if real_data_path else []
self.fallback_cases = self._load_fallback_cases()
def get_case(self, case_id: str | None = None, use_real: bool = True) -> CriminalCase:
"""Get a case, preferring real data if available."""
if case_id:
# Specific case requested
return self._find_case(case_id)
# Random case
if use_real and self.real_cases:
return random.choice(self.real_cases)
return random.choice(self.fallback_cases)
def _load_real_cases(self) -> List[CriminalCase]:
"""Load from real case database (future: LlamaIndex over court records)."""
# TODO: Integrate with real case API/database
# For now, returns empty - falls back to predefined
return []
def _load_fallback_cases(self) -> List[CriminalCase]:
"""Load predefined cases from YAML files."""
cases = []
for file in Path(self.fallback_path).glob("*.yaml"):
case_data = yaml.safe_load(file.read_text())
cases.append(CriminalCase(**case_data))
return cases
# Future: Real case integration
class RealCaseConnector:
"""
Connect to real case databases.
Designed for easy integration later.
"""
def __init__(self):
self.sources = {
"court_listener": CourtListenerAPI(), # Future
"justia": JustiaAPI(), # Future
"local_files": LocalCaseFiles(), # CSV/JSON dumps
}
async def search_cases(
self,
query: str,
filters: Dict = None
) -> List[CriminalCase]:
"""Search across all connected sources."""
pass
async def get_case_details(
self,
source: str,
case_id: str
) -> CriminalCase:
"""Get full case from specific source."""
pass
Execution Environment
Local First, Blaxel Ready
# config/execution.yaml
execution:
mode: "local" # "local" | "blaxel" | "docker"
local:
# No sandbox, runs in process
timeout_seconds: 30
blaxel:
api_key: "${BLAXEL_API_KEY}"
sandbox_id: "12-angry-agents"
persistent: true # Keep sandbox warm
docker:
image: "12-angry-agents:latest"
memory_limit: "2g"
# Usage in code
class ExecutionManager:
"""Swappable execution environment."""
def __init__(self, config_path: str = "config/execution.yaml"):
self.config = load_yaml(config_path)
self.mode = self.config["execution"]["mode"]
def get_executor(self) -> Executor:
if self.mode == "local":
return LocalExecutor()
elif self.mode == "blaxel":
return BlaxelExecutor(self.config["execution"]["blaxel"])
elif self.mode == "docker":
return DockerExecutor(self.config["execution"]["docker"])
async def run_agent_code(self, code: str, context: Dict) -> str:
"""Execute agent-generated code safely."""
executor = self.get_executor()
return await executor.run(code, context)
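`load_yaml` is used by both `ModelRouter` and `ExecutionManager` but never shown, and the `"${BLAXEL_API_KEY}"` placeholder above implies environment-variable expansion that plain `yaml.safe_load` does not do. A sketch of a loader that handles both (assumes PyYAML, which the case files already require):

```python
import os
import re
from pathlib import Path

import yaml  # PyYAML, already used for case/juror configs

_ENV_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def load_yaml(path: str) -> dict:
    """Load a YAML config file, expanding ${ENV_VAR} placeholders.

    Unset variables are left as literal "${NAME}" so a missing key
    fails loudly downstream instead of silently becoming empty.
    """
    text = Path(path).read_text()

    def _expand(match: re.Match) -> str:
        return os.environ.get(match.group(1), match.group(0))

    return yaml.safe_load(_ENV_PATTERN.sub(_expand, text))
```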
Player Input: Strategy + Optional Free Text
# Hybrid input: Low friction strategy selection + optional elaboration
ARGUMENT_STRATEGIES = [
{
"id": "challenge_evidence",
"label": "Challenge Evidence",
"prompt_hint": "Point out weaknesses in a specific piece of evidence",
"allows_free_text": True,
},
{
"id": "question_witness",
"label": "Question Witness Credibility",
"prompt_hint": "Raise doubts about a witness's reliability",
"allows_free_text": True,
},
{
"id": "reasonable_doubt",
"label": "Appeal to Reasonable Doubt",
"prompt_hint": "Emphasize the burden of proof",
"allows_free_text": False, # AI handles this
},
{
"id": "alternative_theory",
"label": "Present Alternative Theory",
"prompt_hint": "Suggest what might have really happened",
"allows_free_text": True,
},
{
"id": "address_juror",
"label": "Address Specific Juror",
"prompt_hint": "Respond to or persuade a specific juror",
"requires_target": True,
"allows_free_text": True,
},
{
"id": "free_argument",
"label": "Make Custom Argument",
"prompt_hint": "Say whatever you want",
"allows_free_text": True,
"required_free_text": True,
},
]
# UI Component
def player_input_ui():
with gr.Row():
strategy = gr.Dropdown(
choices=[s["label"] for s in ARGUMENT_STRATEGIES],
label="Your Strategy",
value="Challenge Evidence"
)
target_juror = gr.Dropdown(
choices=["None"] + [f"Juror {i}" for i in range(1, 13) if i != 7],
label="Target (optional)",
visible=False # Show only for "address_juror"
)
free_text = gr.Textbox(
label="Add details (optional)",
placeholder="e.g., 'Focus on the timeline inconsistency'",
max_lines=2,
visible=True
)
return strategy, target_juror, free_text
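The flow's "choose strategy → AI crafts argument" step needs the selected strategy turned into a drafting prompt. A sketch of that glue, taking one `ARGUMENT_STRATEGIES` entry as input (the prompt wording and the `build_player_argument_prompt` name are illustrative):

```python
from typing import Dict, Optional


def build_player_argument_prompt(
    strategy: Dict,
    free_text: str = "",
    target_juror: Optional[int] = None,
) -> str:
    """Turn a selected ARGUMENT_STRATEGIES entry into a drafting prompt.

    The result is sent to the player's model so it can craft the actual
    spoken argument; free text is only attached when the strategy
    allows it, mirroring the allows_free_text flag in the UI.
    """
    lines = [
        "Draft a short, persuasive jury argument for the player.",
        f"Strategy: {strategy['label']} - {strategy['prompt_hint']}",
    ]
    if strategy.get("requires_target") and target_juror is not None:
        lines.append(f"Address Juror {target_juror} directly.")
    if strategy.get("allows_free_text") and free_text:
        lines.append(f"Player's notes: {free_text}")
    lines.append("Keep it under 80 words, spoken in first person.")
    return "\n".join(lines)
```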
Open Questions
- Exact ElevenLabs voice ID for judge?
- Should external AI participants see other AI jurors' internal conviction scores? Yes - configurable in code.
- Max simultaneous external participants (performance)? 12
- Case difficulty selector in UI? No - cases are selected at random.