
The Model Card Template That Passes FDA Pre-Cert Review
The FDA Submission That Got Rejected
Startup: "We're submitting our AI diagnostic tool for FDA Pre-Cert."
FDA Reviewer: "Provide documentation: training data, model architecture, evaluation metrics, clinical validation."
Startup: "We have a white paper..."
FDA: "We need structured documentation. Model card, data card, and clinical evaluation report. Resubmit in 6 months."
The Delay: 6 months of scrambling to create documentation that should've existed from day one.
What FDA Pre-Cert Requires (The Checklist)
Three Documents:
- Model Card: What the AI does, how it was trained, limitations
- Data Card: Where training data came from, bias testing, quality control
- Clinical Evaluation Report: Real-world validation, safety monitoring
Timeline:
- Without documentation: 12-18 months to approval
- With documentation: 6-9 months
Cost Savings: roughly six months of engineering time, plus faster time to market.

The FDA-Ready Model Card Template
Section 1: Intended Use
What FDA Wants:
- Medical condition/disease targeted
- Patient population (age, sex, comorbidities)
- Clinical setting (hospital, clinic, home use)
- User (physician, nurse, patient)
Example:
INTENDED USE
Medical Condition: Type 2 Diabetes screening
Patient Population: Adults 18-75, no prior diabetes diagnosis
Clinical Setting: Primary care clinic
Primary User: Primary care physician
Decision Support: AI flags high-risk patients for lab testing (HbA1c)
What NOT to Say: "General health screening" (too vague—FDA will reject)
Section 2: Model Architecture
What FDA Wants:
- Algorithm type (e.g., "Gradient boosting classifier")
- Input features (e.g., "Age, BMI, blood pressure, family history")
- Output (e.g., "Risk score 0-100, with threshold at 70 for high-risk")
Example:
MODEL ARCHITECTURE
Algorithm: XGBoost (gradient boosting decision trees)
Version: XGBoost 1.7.0
Inputs: 12 clinical features (age, BMI, systolic BP, fasting glucose, etc.)
Output: Diabetes risk score (0-100)
Threshold: Score ≥70 = High Risk (recommend HbA1c lab test)
Why This Matters: FDA needs to understand how the AI makes decisions (interpretability requirement).
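The output contract above is simple enough to pin down in code, which makes the card and the implementation trivially comparable in an audit. A minimal sketch, assuming the card's threshold and labels (the function name and the range check are illustrative, not part of any FDA template):

```python
# Sketch of the model card's output contract: a 0-100 risk score
# mapped to two labels at a fixed threshold. Only the threshold and
# labels come from the card; everything else is illustrative.
HIGH_RISK_THRESHOLD = 70  # card: score >= 70 = High Risk

def classify_risk(score: float) -> str:
    """Map a 0-100 diabetes risk score to the card's output labels."""
    if not 0 <= score <= 100:
        raise ValueError(f"score outside documented 0-100 range: {score}")
    return "High Risk" if score >= HIGH_RISK_THRESHOLD else "Low Risk"
```

Keeping the threshold as a single named constant means a reviewer can diff the documented value against the deployed one in seconds.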
Section 3: Training Data
What FDA Wants:
- Source (where data came from)
- Volume (how many patients)
- Demographics (age, sex, race, ethnicity)
- Date range (when data was collected)
- Quality control (how you ensured data accuracy)
Example:
TRAINING DATA
Source: Electronic Health Records from [Hospital System], IRB-approved (Protocol #12345)
Volume: 50,000 patients (2018-2023)
Demographics:
- Age: Mean 52 (range 18-75), SD 14
- Sex: 52% female, 48% male
- Race: 60% White, 20% Black, 12% Hispanic, 8% Asian
- Ethnicity: 85% Non-Hispanic, 15% Hispanic
Data Quality:
- Missing data: <5% per feature (imputed using median)
- Outliers: Values >99th percentile reviewed by clinician, corrected or removed
De-Identification: HIPAA-compliant (dates shifted, names removed, rare diagnoses aggregated)
Red Flag: If demographics don't match US population, FDA will ask about bias.
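The two quality-control rules in the example (median imputation for missing values, clinician review of anything above the 99th percentile) are mechanical enough to sketch. A minimal version, assuming plain Python lists with `None` marking a missing value; the nearest-rank percentile here is a simplification:

```python
import statistics

def impute_median(values):
    """Fill missing entries (None) with the median of observed values,
    per the card's Data Quality policy."""
    observed = [v for v in values if v is not None]
    med = statistics.median(observed)
    return [med if v is None else v for v in values]

def flag_outliers(values, pct=0.99):
    """Return indices of values above the given percentile cutoff,
    queued for clinician review (the card's outlier policy).
    Uses a simple nearest-rank percentile."""
    ordered = sorted(values)
    cutoff = ordered[min(len(ordered) - 1, int(pct * len(ordered)))]
    return [i for i, v in enumerate(values) if v > cutoff]
```

Writing the policy as code (rather than only prose) gives you something you can unit-test and cite in the quality-control section.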
Section 4: Evaluation Metrics
What FDA Wants:
- Accuracy, sensitivity, specificity (clinical gold standards)
- Performance by demographic subgroup (fairness testing)
- Comparison to human clinicians (is AI better?)
- Clinical impact (does AI improve patient outcomes?)
Example:
EVALUATION METRICS
Test Set: 10,000 patients (held out, not used in training)
Overall Performance:
- Sensitivity (Recall): 87% (95% CI: 85-89%)
- Specificity: 82% (95% CI: 80-84%)
- AUC: 0.91
Subgroup Performance (Fairness Testing):
- Female: Sensitivity 88%, Specificity 83%
- Male: Sensitivity 86%, Specificity 81%
- White: Sensitivity 89%, Specificity 84%
- Black: Sensitivity 84%, Specificity 79% (within 5pp, acceptable)
Comparison to Physician:
- Physician sensitivity: 78% (AI +9pp improvement)
- Physician specificity: 85% (AI -3pp, acceptable trade-off)
Clinical Impact:
- Early detection: AI flags 12% more high-risk patients than physician alone
- Estimated prevented complications: 200 cases/year per 10,000 patients screened
Why This Matters: FDA cares about patient outcomes, not just model accuracy.
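The headline numbers reduce to confusion-matrix arithmetic, and keeping that arithmetic next to the card lets a reviewer re-derive every figure. A sketch (the counts are illustrative, chosen to match the card's percentages):

```python
def sensitivity(tp: int, fn: int) -> float:
    """Of patients who truly have the condition, the fraction flagged."""
    return tp / (tp + fn)

def specificity(tn: int, fp: int) -> float:
    """Of patients without the condition, the fraction correctly cleared."""
    return tn / (tn + fp)

def max_subgroup_gap_pp(metric_by_group: dict) -> float:
    """Largest spread of a metric across demographic groups, in
    percentage points -- the figure the card reports as 'within 5pp'.
    Rounded to absorb floating-point noise."""
    vals = list(metric_by_group.values())
    return round((max(vals) - min(vals)) * 100, 6)

# 87% sensitivity means e.g. 870 of 1,000 true positives flagged:
# sensitivity(870, 130) -> 0.87
```

For the card's subgroup table, `max_subgroup_gap_pp({"White": 0.89, "Female": 0.88, "Male": 0.86, "Black": 0.84})` gives exactly the 5pp spread the card calls acceptable.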
Section 5: Limitations and Warnings
What FDA Wants:
- Known failure modes (when AI is unreliable)
- Contraindications (when NOT to use AI)
- Required human oversight (physician must review)
Example:
LIMITATIONS
Known Failure Modes:
- Lower accuracy for patients with rare comorbidities (<1% of population)
- Not validated for patients under 18 or over 75
- Not validated for Type 1 Diabetes (only Type 2)
Contraindications:
- Do NOT use for patients with pre-existing diabetes diagnosis
- Do NOT use as sole diagnostic tool (lab confirmation required)
Required Human Oversight:
- Physician must review all high-risk flags before ordering lab tests
- AI is decision support, not autonomous diagnosis
- Physician retains final clinical decision authority
Why This Matters: FDA wants proof you're not overselling the AI's capabilities.
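Contraindications like these are cheapest to enforce as a hard gate in front of the model rather than as a warning label buried in documentation. A hypothetical sketch of that gate, using the validated range and contraindication from the example above:

```python
def eligible_for_screening(age: int, prior_diabetes_diagnosis: bool) -> bool:
    """Gate derived from the Limitations section: the tool is only
    validated for adults 18-75 with no pre-existing diabetes diagnosis.
    Ineligible patients are routed to standard clinical workup instead
    of the AI -- the model never scores them."""
    return 18 <= age <= 75 and not prior_diabetes_diagnosis
```

Gating upstream means the "not validated for" populations can never generate a risk score, which is a much stronger claim to make in an FDA submission than "physicians are warned."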
Section 6: Post-Market Surveillance
What FDA Wants:
- How you'll monitor AI performance in production
- What triggers a safety alert (accuracy drop, adverse events)
- How often you'll retrain/update the model
Example:
POST-MARKET SURVEILLANCE
Monitoring Plan:
- Monthly accuracy tracking on production data (random sample of 500 patients)
- Alert trigger: Sensitivity drops below 80% OR specificity drops below 75%
- Physician feedback: Track overrides, false positives, false negatives
Safety Reporting:
- Adverse events (patient harm) reported to FDA within 30 days
- Quarterly summary report to FDA (performance metrics, user feedback)
Model Updates:
- Annual retraining with new data (subject to FDA review)
- Version control: All model versions documented, old versions archived
Why This Matters: FDA Pre-Cert assumes continuous improvement (not "set it and forget it").
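The alert trigger in the monitoring plan is a two-line predicate, and encoding it exactly as the card states it keeps the surveillance pipeline auditable against the submission. A sketch (constant names are mine; the floors are the card's):

```python
SENSITIVITY_FLOOR = 0.80  # from the surveillance plan's alert trigger
SPECIFICITY_FLOOR = 0.75

def needs_safety_alert(monthly_sensitivity: float,
                       monthly_specificity: float) -> bool:
    """True if the monthly production metrics breach either floor,
    which under the plan triggers an investigation."""
    return (monthly_sensitivity < SENSITIVITY_FLOOR
            or monthly_specificity < SPECIFICITY_FLOOR)
```

Usage: the monthly job computes the two metrics on its 500-patient sample and pages the clinical safety team whenever this returns `True`.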
Real Example: Diabetic Retinopathy Detection AI
Product: AI analyzes retinal images, flags diabetic retinopathy.
FDA Submission:
Intended Use: Screen diabetic patients for retinopathy in primary care settings (not ophthalmology clinics).
Model: Convolutional neural network (ResNet-50 architecture)
Training Data: 120,000 retinal images from 5 hospital systems (2015-2020)
Evaluation:
- Sensitivity: 92% (FDA target: >85%)
- Specificity: 88%
- Comparison: Ophthalmologist sensitivity 95% (AI -3pp, acceptable for screening)
Limitations:
- Not for patients with cataracts (image quality too poor)
- Requires human ophthalmologist to confirm positive findings
Post-Market:
- Monthly monitoring: Random sample of 1,000 images re-reviewed by ophthalmologist
- Alert: If AI sensitivity drops below 88%, auto-disable pending investigation
FDA Decision: Approved (6 months from submission to clearance).
Why It Worked: Documentation was complete upfront. No back-and-forth with FDA.
The Data Card (Companion to Model Card)
What FDA Wants (separate document):
- Data provenance: IRB approval, patient consent, HIPAA compliance
- Bias testing: Performance by race, sex, age, socioeconomic status
- Data retention: How long you keep training data, why
- Data security: Encryption, access controls, audit logs
Example Snippet:
DATA CARD
Provenance:
- Source: [Hospital System] EHR database
- IRB: Approved under Protocol #12345, waiver of consent (de-identified data)
- HIPAA: Compliant (Business Associate Agreement signed)
Bias Testing:
- Racial parity: Sensitivity within 5pp across racial groups
- Gender parity: Sensitivity within 3pp (female 88%, male 86%)
- Age: Lower sensitivity for patients >70 (79% vs. 87% for 40-60 age group)
→ Mitigation: Added warning for physicians treating elderly patients
Data Retention:
- Training data: Retained for 10 years (FDA device record requirement)
- Production data: De-identified logs retained for 3 years (monitoring)
Data Security:
- Encryption: AES-256 at rest, TLS 1.3 in transit
- Access: Role-based (PM, ML engineer, clinical validator—7 people total)
- Audit logs: Reviewed quarterly by compliance team
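The bias-testing rule in the snippet above (any group trailing the best performer by more than 5 percentage points needs a documented mitigation, as happened with the >70 age group) can be sketched as a small check. A hypothetical version; the tolerance and the numbers in the tests are the card's:

```python
def parity_violations(sensitivity_by_group: dict,
                      tolerance_pp: float = 5.0) -> list:
    """Return the demographic groups whose sensitivity trails the
    best-performing group by more than `tolerance_pp` percentage
    points -- the threshold this data card treats as acceptable.
    Gaps are rounded to absorb floating-point noise."""
    best = max(sensitivity_by_group.values())
    return sorted(
        group for group, sens in sensitivity_by_group.items()
        if round((best - sens) * 100, 6) > tolerance_pp
    )
```

Run against the card's age numbers this flags the >70 group (an 8pp gap), which is exactly the case that forced the mitigation warning for elderly patients.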
Checklist: Is Your Model Card FDA-Ready?
- [ ] Intended use (specific medical condition, patient population, clinical setting)
- [ ] Model architecture (algorithm, inputs, outputs, threshold)
- [ ] Training data (source, volume, demographics, quality control)
- [ ] Evaluation metrics (sensitivity, specificity, AUC, subgroup performance)
- [ ] Comparison to human clinician (is AI better/worse?)
- [ ] Clinical impact (does AI improve patient outcomes?)
- [ ] Limitations (failure modes, contraindications, required oversight)
- [ ] Post-market surveillance (monitoring plan, safety reporting, update schedule)
If any box is unchecked, FDA will request more documentation.
Common PM Mistakes
Mistake 1: Claiming "General Purpose" AI
- Reality: FDA requires narrow, well-defined medical use cases
- Fix: Specify exact condition, population, setting (not "health screening")
Mistake 2: No Bias Testing
- Reality: FDA will reject if you haven't tested performance across demographics
- Fix: Report sensitivity/specificity by race, sex, age (minimum)
Mistake 3: No Post-Market Plan
- Reality: FDA Pre-Cert assumes you'll monitor and update the AI
- Fix: Document monitoring frequency, alert triggers, update process
Alex Welcing is a Senior AI Product Manager in New York who writes FDA-ready model cards before submitting medical device AI. His regulatory approvals take 6 months, not 18, because documentation is a product requirement from day one.