AI model performance degradation

Overview

AI models reflect the patterns in their training data. As the regulatory landscape evolves — new guidance, new device categories, new vocabulary — a model that was accurate at release can become less accurate over time if it is not retrained. Without explicit monitoring, this drift can remain undetected.

Hazardous situation: AI generates outdated or inaccurate regulatory recommendations because its training data has not kept pace with regulatory change.

How we mitigate AI performance degradation

Periodic re-validation. AI features are re-validated against current data as part of the release process; see How is Flinn conducting software validation? and Flinn Release & Validation Process.
Benchmarked extraction. Extraction logic is benchmarked against representative data with each major update; see the AI Extraction Guide.
Release notes. Each release documents what changed in AI behaviour. Browse the Release Documentation for per-version notes.
Cross-checks remain on the user side. AI outputs should never be used uncritically; see AI Clinical Report Writing, AI Screening Support, and the related guidance in Misinterpretation of data.
User feedback loop. When an AI output looks wrong, let us know via Report a problem or a bug. User feedback is one of the primary signals we use to detect drift early.
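
For readers who want a concrete picture of what benchmarking against representative data can look like, here is a minimal Python sketch. It is an illustration only, not Flinn's actual validation tooling: the names BenchmarkResult, score and flag_degradation, the field names in the sample data, and the 2-percentage-point tolerance are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkResult:
    """Outcome of one benchmark run (hypothetical structure)."""
    release: str   # illustrative release label, not a real version
    correct: int   # number of benchmark items answered correctly
    total: int     # number of benchmark items evaluated

    @property
    def accuracy(self) -> float:
        # Guard against an empty benchmark set.
        return self.correct / self.total if self.total else 0.0


def score(predictions: dict, expected: dict) -> tuple:
    """Count how many expected values the model output reproduces exactly."""
    correct = sum(
        1 for key, value in expected.items() if predictions.get(key) == value
    )
    return correct, len(expected)


def flag_degradation(baseline: BenchmarkResult,
                     current: BenchmarkResult,
                     max_drop: float = 0.02) -> bool:
    """Return True when accuracy fell more than max_drop below the baseline.

    The 0.02 default is an assumed tolerance chosen for illustration.
    """
    return (baseline.accuracy - current.accuracy) > max_drop


# Example usage with made-up reference data and model output.
expected = {"device_class": "IIa", "event_type": "malfunction", "country": "DE"}
predictions = {"device_class": "IIa", "event_type": "injury", "country": "DE"}

correct, total = score(predictions, expected)
baseline = BenchmarkResult(release="previous", correct=95, total=100)
current = BenchmarkResult(release="candidate", correct=90, total=100)

if flag_degradation(baseline, current):
    print(f"Accuracy dropped from {baseline.accuracy:.2f} to {current.accuracy:.2f}")
```

The point of the sketch is the comparison against a fixed baseline: a drop is only visible if the same reference set is scored again at each update, which is why re-validation and benchmarking are tied to the release cycle.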

Combining recurring re-validation, transparent release notes and user-side cross-checks keeps the residual risk of acting on a degraded model low.