What is WorldMind Evidence?
WorldMind Evidence is an entity-based licensing system: it decides whether a clinical query may be answered from the available evidence. It combines schema-level entity matching with semantic similarity so that answers are accurate and false positives are avoided.
Traditional RAG systems answer every query, which leads to hallucinations when the evidence does not cover the question. WorldMind Evidence instead uses entity extraction to confirm that a query matches the available evidence before it answers.
Entity Extraction: The Key to Accuracy
WorldMind Evidence extracts three key entities from every query to ensure precise matching with clinical evidence:
Endpoint
The clinical measurement or outcome being queried
- NT50 (neutralization titer)
- Seroresponse rate
- Adverse events
- GMT (geometric mean titer)
Population
The patient group or demographic being studied
- Children 5-11 Years
- Adults ≥12 Years
- Vaccine-experienced
- Overall study population
Timepoint
When the measurement was taken in the study timeline
- 1 month post-dose 2
- Pre-dose 1 (baseline)
- 7 days after vaccination
- 6 months follow-up
Why this matters: By extracting and matching these entities, WorldMind Evidence answers only from evidence whose endpoint, population, and timepoint match the query's clinical context. In the evaluation below, this cut false positives from 45 (traditional RAG) to 11.
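To make the matching idea concrete, here is a minimal sketch of the three entities as a data structure, with a strict comparison between a query and a piece of evidence. The names `ClinicalEntities` and `entities_match` are hypothetical, and two behaviours are assumptions rather than documented facts: comparison is case-insensitive, and a query with a missing entity never matches.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClinicalEntities:
    """Endpoint, population, and timepoint for a query or a piece of evidence."""
    endpoint: Optional[str]
    population: Optional[str]
    timepoint: Optional[str]


def _norm(value: Optional[str]) -> Optional[str]:
    # Assumption: compare case- and whitespace-insensitively. A real system
    # would more likely map synonyms onto canonical labels.
    return " ".join(value.lower().split()) if value else None


def entities_match(query: ClinicalEntities, evidence: ClinicalEntities) -> bool:
    """True only when all three query entities are present and equal the evidence's."""
    pairs = [
        (query.endpoint, evidence.endpoint),
        (query.population, evidence.population),
        (query.timepoint, evidence.timepoint),
    ]
    return all(_norm(q) is not None and _norm(q) == _norm(e) for q, e in pairs)


evidence = ClinicalEntities("GMT", "Children 5-11 Years", "1 month post-dose 2")
same_context = ClinicalEntities("gmt", "children 5-11 years", "1 month post-dose 2")
wrong_population = ClinicalEntities("GMT", "Adults ≥12 Years", "1 month post-dose 2")

print(entities_match(same_context, evidence))      # True
print(entities_match(wrong_population, evidence))  # False
```

The second query shares the endpoint and timepoint with the evidence but asks about a different population, so it is rejected rather than answered with a value from the wrong group.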
Evaluation Results: WorldMind Evidence vs Traditional RAG
Important: RAG's 98% in-scope precision is misleading on its own because it covers only in-scope queries. Once out-of-scope queries are included (queries RAG answers when it should abstain), WorldMind Evidence achieves 90% overall accuracy vs RAG's 76.5%.
| Metric | WorldMind Evidence | Traditional RAG | Winner |
|---|---|---|---|
| Overall Accuracy | 90% | 76.5% | WorldMind |
| In-Scope Precision | 91% | 98% | RAG |
| Out-of-Scope Recall | 89% | 55% | WorldMind |
| False Positives | 11 | 45 | WorldMind (about 4x fewer) |
| False Negatives | 9 | 2 | RAG |
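The table's figures can be cross-checked with a few lines of arithmetic. The sketch below assumes an evaluation set of 100 in-scope and 100 out-of-scope queries. That split is not stated in the source; it is inferred because it reproduces every percentage in the table from the false positive and false negative counts.

```python
def overall_accuracy(false_positives: int, false_negatives: int, total_queries: int) -> float:
    """Share of queries handled correctly: answered when in scope, abstained when out of scope."""
    return (total_queries - false_positives - false_negatives) / total_queries


# Assumed evaluation set (inferred from the reported numbers, not stated in the source).
IN_SCOPE, OUT_OF_SCOPE = 100, 100
TOTAL = IN_SCOPE + OUT_OF_SCOPE

# (false positives, false negatives) as reported in the table.
systems = {"WorldMind Evidence": (11, 9), "Traditional RAG": (45, 2)}

for name, (fp, fn) in systems.items():
    accuracy = overall_accuracy(fp, fn, TOTAL)
    out_of_scope_recall = (OUT_OF_SCOPE - fp) / OUT_OF_SCOPE
    in_scope_rate = (IN_SCOPE - fn) / IN_SCOPE
    print(f"{name}: accuracy={accuracy:.1%}, "
          f"out-of-scope recall={out_of_scope_recall:.0%}, in-scope={in_scope_rate:.0%}")
```

Under that assumption the counts give 90.0% and 76.5% overall accuracy, a gap of 13.5 percentage points.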
How WorldMind Evidence Works
Query Analysis & Entity Extraction
Extract endpoint, population, and timepoint from the user's query using LLM-based entity recognition.
Schema-Level Filtering
Match query entities against evidence holons to find candidates with matching endpoint, population, and timepoint.
Semantic Similarity Check
Compute the cosine similarity between the query embedding and the embedding of each schema-matched holon. After a schema match, the similarity threshold is 0.35.
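For reference, cosine similarity is the dot product of two vectors divided by the product of their lengths. The sketch below uses toy three-dimensional vectors in place of real embeddings. The embedding model is not specified in the source, and treating the 0.35 threshold as inclusive is an assumption.

```python
import math
from typing import Sequence

SCHEMA_MATCH_THRESHOLD = 0.35  # threshold applied after a schema match


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here: a zero vector is similar to nothing
    return dot / (norm_a * norm_b)


# Toy vectors standing in for the query and holon embeddings.
query_vec = [0.2, 0.7, 0.1]
holon_vec = [0.1, 0.9, 0.3]

score = cosine_similarity(query_vec, holon_vec)
print(score >= SCHEMA_MATCH_THRESHOLD)  # True
```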
Licensing Decision
Approve when a schema match exists and the similarity threshold is met; otherwise abstain. Fallback: when there is no schema match, approve only if similarity is very high (≥0.70).
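The decision rule can be written as a small function. This is a sketch, not the system's implementation: `Decision` and `license_answer` are hypothetical names, and treating the 0.35 threshold as inclusive is an assumption (the source states "≥" only for the 0.70 fallback).

```python
from enum import Enum

SCHEMA_MATCH_THRESHOLD = 0.35  # applies when the entities match the evidence schema
FALLBACK_THRESHOLD = 0.70      # applies when there is no schema match


class Decision(Enum):
    APPROVE = "approve"
    ABSTAIN = "abstain"


def license_answer(schema_match: bool, similarity: float) -> Decision:
    """Approve or abstain based on schema match and cosine similarity."""
    if schema_match and similarity >= SCHEMA_MATCH_THRESHOLD:
        return Decision.APPROVE
    if not schema_match and similarity >= FALLBACK_THRESHOLD:
        return Decision.APPROVE  # fallback: no schema match, very high similarity
    return Decision.ABSTAIN


print(license_answer(True, 0.42))   # Decision.APPROVE
print(license_answer(False, 0.42))  # Decision.ABSTAIN
print(license_answer(False, 0.81))  # Decision.APPROVE
```

The same similarity score of 0.42 leads to different outcomes depending on the schema match. This is what allows the system to abstain on queries that only look similar to the evidence.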
Answer Generation
Generate natural language answer from matched holons, handling multiple values and timepoints appropriately.
Key Findings
4x Fewer False Positives
WorldMind Evidence produces 11 false positives vs RAG's 45, making it significantly more reliable for clinical decision support.
13.5 Percentage Points Higher Overall Accuracy
Measured across both in-scope and out-of-scope queries, WorldMind achieves 90% accuracy vs RAG's 76.5%, a gap of 13.5 percentage points.
89% Out-of-Scope Recall
WorldMind correctly abstains on 89% of out-of-scope queries. RAG abstains on only 55% and answers the remaining 45% when it should not.
Entity-Based Validation Prevents Hallucinations
By requiring the query's endpoint, population, and timepoint to align with the evidence, WorldMind avoids the "subtle misbinding" problem, in which RAG gives a plausible but incorrect answer.