Why assurance, not hype, will decide AI’s fate in healthcare
Artificial intelligence is already reading images, summarising notes, predicting risk, and guiding triage. The promise is powerful: earlier diagnosis, lower administrative burden, more personalised care, and better use of scarce resources.
But in health and care, the question isn’t “Can we buy it?” but “Can we trust it?” That’s where assurance comes in.
Assurance is the repeatable, evidenced process of showing that an AI system is safe, effective, secure, fair, and fit for its intended use, and that it stays that way as data, models, and services change. It combines pre‑market checks (requirements, risk management, clinical evaluation, privacy, security) with post‑market obligations (monitoring, incident response, performance and drift surveillance, and change control). In other words, assurance turns a clever model into a dependable medical tool.
For healthcare, AI sits at the intersection of medical device regulation, data protection, and information security, plus new, AI‑specific expectations. Around the world, suppliers and buyers navigate seemingly overlapping frameworks, including:
- Medical device & clinical safety: risk management and safety cases, software lifecycle, usability, clinical evaluation approaches and post‑market vigilance, and local clinical safety processes (e.g., DCB0129/0160 in the NHS).
- Digital health evidence & effectiveness: proportionate, claim‑linked evidence frameworks (e.g., NICE’s Evidence Standards Framework) and health technology assessment guidance.
- Data protection & security: privacy law, DPIAs, data governance, organisational security management (e.g., ISO/IEC 27001), and local requirements such as the NHS DSP Toolkit.
- AI‑specific risk & governance: emerging AI management and risk frameworks (e.g., AI risk management, model transparency, human oversight, dataset and bias controls), plus algorithm change protocols for adaptive systems.
- Interoperability & deployment: standards for data exchange (e.g., FHIR), auditability, and safe integration into clinical pathways.
No single framework covers everything (yet!). The practical task is mapping your product’s claims and risk to the most relevant obligations, then evidencing them in a way that buyers and regulators recognise.
The potential benefits are real:
- Earlier detection and prioritisation in pathways under pressure.
- Reduced administrative load and more time for patient contact.
- Consistency at scale where human variation is high.
- New insights from data, enabling targeted interventions.

So are the risks:
- Bias and inequity baked into training data or labels.
- Performance drift when practice, data, or populations shift.
- Over‑reliance by users when outputs look authoritative but confidence is misplaced.
- Opaque models that can’t support clinical reasoning or challenge.
- Security and privacy incidents, especially with large data flows or third‑party integrations.
Six considerations shape what good assurance looks like in practice:
1) Dynamic models: AI changes with data; static, one‑off certification isn’t enough. Build a change protocol and agree what can change without re‑approval versus what needs re‑assessment.
2) Evidence for claims: Match the level and design of evidence to the claim and risk. Not every feature needs an RCT, but every claim needs proportionate proof and transparent limitations.
3) Fairness and generalisability: Define your intended users and settings precisely. Test performance across relevant sub‑groups, and be explicit about where the model should not be used.
4) Human factors: Design for oversight, contestability, and fail‑safes. Good UX and clear guidance make safe use easier than unsafe use.
5) Operational maturity: Evidence isn’t just papers. Buyers look for incident processes, security controls, monitoring plans, and support that stand up in real services.
6) Post‑market monitoring: Implement ‘living’ assurance by logging model versions, monitoring real‑world performance and drift, publishing change notes, and closing the loop with safety governance (a minimal sketch follows this list).
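To make point 6 concrete, here is a minimal sketch, in Python, of the kind of post‑market check a monitoring plan might run each reporting period. The metric names, thresholds, and the triage‑model example are illustrative assumptions rather than part of any specific framework; a real deployment would agree these values in its post‑market plan and route alerts into its own incident and safety governance processes.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class MonitoringWindow:
    """Aggregated real-world performance for one model version over one reporting period."""
    model_version: str
    period_end: date
    n_cases: int
    sensitivity: float    # proportion of true positives the model detected
    positive_rate: float  # fraction of cases the model flagged


# Illustrative values only; real thresholds are agreed in the post-market plan.
MIN_CASES = 500                 # below this, estimates are too noisy to act on
MIN_SENSITIVITY = 0.85          # agreed performance floor
BASELINE_POSITIVE_RATE = 0.22   # positive rate observed during validation
MAX_POSITIVE_RATE_SHIFT = 0.10  # absolute drift that triggers investigation


def review_window(window: MonitoringWindow) -> list[str]:
    """Return the alerts triggered by one monitoring window."""
    if window.n_cases < MIN_CASES:
        return [f"{window.model_version}: only {window.n_cases} cases this period; "
                "check data feeds before drawing conclusions"]
    alerts = []
    if window.sensitivity < MIN_SENSITIVITY:
        alerts.append(f"{window.model_version}: sensitivity {window.sensitivity:.2f} "
                      f"is below the agreed floor of {MIN_SENSITIVITY:.2f}")
    drift = abs(window.positive_rate - BASELINE_POSITIVE_RATE)
    if drift > MAX_POSITIVE_RATE_SHIFT:
        alerts.append(f"{window.model_version}: positive rate has shifted by {drift:.2f} "
                      "against the validation baseline; possible population or data drift")
    return alerts


if __name__ == "__main__":
    window = MonitoringWindow("triage-model 2.3.1", date(2025, 6, 30),
                              n_cases=4120, sensitivity=0.81, positive_rate=0.35)
    for alert in review_window(window):
        print("Escalate to safety governance:", alert)
```

The value here is not in the few lines of code, but in the agreed thresholds, the regular cadence, and the escalation route that sits behind them.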
For teams building AI for health and care, the practical starting points are:
- Start with clear, bounded claims. What problem, for whom, in what setting?
- Map risks early. Use a structured hazard/risk approach and build mitigations into the design.
- Collect proportionate evidence. Tie each claim to a study or evaluation with relevant outcomes and populations.
- Make it legible. Publicly explain data sources, validation, known limitations, and the human‑in‑the‑loop model.
- Engineer for change. Version control, audit trails, and a documented algorithm change protocol (see the sketch after this list).
- Plan the post‑market work. Monitoring thresholds, triggers, and routes for user feedback or incident escalation.
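As a companion to “Engineer for change”, the sketch below shows one way an audit‑trail entry might be recorded against a pre‑agreed algorithm change protocol. The field names, categories, and the example product are assumptions for illustration rather than a mandated schema; the point is that every change is versioned, classified against the protocol, and traceable to its evidence.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum


class ChangeCategory(Enum):
    """Illustrative split between changes agreed in advance and those needing re-assessment."""
    WITHIN_PROTOCOL = "pre-agreed in the algorithm change protocol"
    REQUIRES_REVIEW = "outside the protocol; needs re-assessment before release"


@dataclass
class ModelChangeRecord:
    """One entry in the audit trail for a deployed model."""
    model_name: str
    old_version: str
    new_version: str
    released: date
    summary: str
    category: ChangeCategory
    evidence_refs: list[str] = field(default_factory=list)  # links to validation reports


# A hypothetical change-log entry for an imaginary product.
change_log = [
    ModelChangeRecord(
        model_name="discharge-summary-assistant",
        old_version="1.4.0",
        new_version="1.4.1",
        released=date(2025, 9, 1),
        summary="Retrained on refreshed data; no change to intended use or claims.",
        category=ChangeCategory.WITHIN_PROTOCOL,
        evidence_refs=["validation-report-2025-09.pdf"],
    ),
]

for entry in change_log:
    print(f"{entry.model_name} {entry.old_version} -> {entry.new_version}: {entry.category.value}")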
Health systems face a flood of AI proposals but limited capacity to repeat deep technical checks locally. The answer is shared, independent assurance that buyers can trust, complemented by local onboarding.
That’s the philosophy behind ORCHA’s tiered assurance: a fast, recognisable signal at the point of search and shortlisting, with deeper compliance evidence available when a buyer needs it. Suppliers benefit from a single, credible route to show readiness across multiple markets; buyers benefit from reduced duplication and clearer risk visibility.
Whichever route they take, the fundamentals for suppliers are the same:
- Treat assurance as a design input, not a late gate.
- Align claims, risk, and evidence to recognised frameworks.
- Be specific about intended use and contraindications.
- Publish meaningful model cards or equivalent transparency notes (a minimal sketch follows this list).
- Demonstrate operational controls: security, safety, monitoring, and support.
- Embrace continuous assessment, so improvements can be deployed without losing trust.
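On model cards, a transparency note can start as nothing more elaborate than a structured summary. The sketch below, with entirely hypothetical product details and figures, shows the kind of fields a minimal note might cover: intended use, data sources, validation, known limitations, and the human oversight model.

```python
# All names, figures, and claims below are hypothetical and for illustration only.
model_card = {
    "name": "chest-x-ray-triage",
    "version": "3.2.0",
    "intended_use": "Prioritise adult chest X-rays for radiologist review in secondary care.",
    "not_intended_for": ["paediatric imaging", "autonomous reporting without clinician review"],
    "training_data": "De-identified imaging from three hospital sites, 2018-2023.",
    "validation": {
        "design": "Retrospective multi-site evaluation against radiologist consensus.",
        "headline_result": "Sensitivity 0.93, specificity 0.88 on a held-out test set.",
        "subgroups": "Performance reported by age band, sex, and acquisition device.",
    },
    "known_limitations": [
        "Lower performance on portable (bedside) images.",
        "Not evaluated outside the populations represented in the training data.",
    ],
    "human_oversight": "All outputs reviewed by a reporting radiologist before action.",
}

for heading, detail in model_card.items():
    print(f"{heading}: {detail}")
```

Published alongside the product, and kept in step with the change log above, this is what makes claims legible to buyers and regulators.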
At ORCHA, we recognised early that broad AI principles alone would not be enough for health systems or suppliers. What the sector needs is operationalised assurance: a structured, repeatable way to translate regulation, risk, and evidence into something assessable, comparable, and trusted across markets.
That’s why we are developing a dedicated AI Assurance framework for digital health. Rather than treating “AI” as a single, uniform capability, the model distinguishes between techniques (such as large language models or predictive algorithms), clinical context, and real-world functionality. This enables proportionate scrutiny: differentiating low-risk automation from high-risk clinical inference, and aligning oversight accordingly.
Grounded in international frameworks (including the EU AI Act, FDA guidance for AI/ML-based software, ISO standards, and global risk management approaches), the module maps regulatory expectations into clear, assessable criteria designed specifically for health technologies.
Crucially, this work is not theoretical. It is being shaped through international collaboration, partner engagement, beta testing, and academic validation, forming part of a wider ambition to establish a globally aligned AI assurance ecosystem.
The result is a layered, function-level assessment model that translates complex technical governance into clear trust signals for buyers, supporting safe adoption without stifling innovation.
To follow the development of ORCHA’s AI assurance work - including upcoming milestones, insights, and opportunities to engage - sign up for AI updates here.
There are no silver bullets in healthcare. AI will succeed when it becomes a reliable safety net, a set of tools we can depend on because they are well‑designed, well‑evidenced, and well‑governed.
That’s not the glamorous part of innovation, but it’s the part that determines whether promising pilots turn into safe, scaled services. When suppliers and buyers meet in the middle, with innovation guided by assurance and assurance designed for innovation, AI can deliver on its promise for patients, clinicians, and systems alike.