
Subject: SOC2 Audit Kickoff — Nov 1
Auditor: "We'll need documentation for all AI/ML systems deployed in 2025. Please provide by Oct 25:
You (PM, realizing none of this exists): "I'll… get back to you."
September is AI compliance prep month. If you ship AI features in regulated industries (healthcare, finance, legal, enterprise SaaS), Q4 audits are coming. The companies that pass on the first try? They spent September building the artifacts.
October-December: Peak audit season
September: Last chance to fix gaps before auditors arrive.
What Happens If You're Not Ready:
What Happens If You Are Ready:

Goal: Know what you shipped this year.
Tasks:
Template:
| Feature | Launch Date | Risk Level | Compliance Scope | DRI |
|---|---|---|---|---|
| AI email suggestions | Jan 2025 | Medium | SOC2 | PM: Sarah |
| Patient diagnosis assistant | Mar 2025 | High | HIPAA, SOC2 | PM: Alex |
| Resume screening | Jun 2025 | High | GDPR, EU AI Act | PM: Jordan |
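A spot check like the one auditors run can be scripted. Here is a minimal sketch, assuming the inventory lives in CSV form with the same columns as the table above (the `find_gaps` helper and the inline CSV are illustrative, not a standard tool):

```python
import csv
import io

# Hypothetical inventory in the same shape as the table above.
# The missing DRI on the last row is deliberate, to show the check firing.
INVENTORY_CSV = """feature,launch_date,risk_level,compliance_scope,dri
AI email suggestions,Jan 2025,Medium,SOC2,PM: Sarah
Patient diagnosis assistant,Mar 2025,High,"HIPAA, SOC2",PM: Alex
Resume screening,Jun 2025,High,"GDPR, EU AI Act",
"""

def find_gaps(csv_text: str) -> list[str]:
    """Return one message per inventory row that would fail a spot check."""
    gaps = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if not row["dri"].strip():
            gaps.append(f"{row['feature']}: no DRI assigned")
        if row["risk_level"] == "High" and not row["compliance_scope"].strip():
            gaps.append(f"{row['feature']}: high-risk but no compliance scope")
    return gaps

for gap in find_gaps(INVENTORY_CSV):
    print(gap)  # Resume screening: no DRI assigned
```

Run it weekly in September; an empty output means every shipped feature has an owner and a scope before the auditor asks.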
Why This Matters: Auditors will ask, "Show me all AI systems." If you say "I don't know," the audit fails immediately.
Goal: Document what the AI does, how it was trained, and what risks exist.
Tasks (per AI feature):
Model Card Template:
```
MODEL CARD: [Feature Name]

1. MODEL DETAILS
- Architecture: [e.g., Fine-tuned GPT-4, BERT classifier, XGBoost]
- Version: [e.g., v2.3, deployed Aug 15, 2025]
- Training Date: [e.g., Aug 1-10, 2025]
- Compute: [e.g., 8 A100 GPUs, 24 hours]

2. INTENDED USE
- Primary Use Case: [e.g., Suggest email responses for support tickets]
- Users: [e.g., Customer support team, 50 agents]
- Out-of-Scope Uses: [e.g., Not for legal advice, medical diagnosis]

3. TRAINING DATA
- Source: [e.g., Internal support ticket database, 2020-2024]
- Volume: [e.g., 500,000 tickets, 2M tokens]
- Sampling: [e.g., Random sample, stratified by ticket category]
- Preprocessing: [e.g., De-identified PII, removed spam tickets]
- Bias Risks: [e.g., Overrepresents US English, underrepresents non-English]

4. EVALUATION
- Metrics: [e.g., Accuracy 89%, F1 0.87, Precision 0.85, Recall 0.90]
- Test Set: [e.g., 10,000 held-out tickets, same time range]
- Fairness Testing: [e.g., Demographic parity within 5pp across user segments]

5. LIMITATIONS
- Edge Cases: [e.g., Struggles with sarcasm, multi-language tickets]
- Known Failures: [e.g., 8% false positive rate on urgent tickets]
- Not Suitable For: [e.g., Legal/medical content, customer complaints]

6. HUMAN OVERSIGHT
- Review Process: [e.g., Agent reviews all AI suggestions before sending]
- Override Rate: [e.g., 15% of suggestions modified or rejected]
- Escalation: [e.g., Agents flag bad suggestions → PM reviews weekly]
```
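Completeness of a card is easy to check automatically. This is a minimal sketch, assuming cards are stored as plain text with the six numbered section headers above (`missing_sections` and the `draft` card are illustrative names, not part of any standard):

```python
# The six sections every model card in the template above must cover.
REQUIRED_SECTIONS = [
    "MODEL DETAILS", "INTENDED USE", "TRAINING DATA",
    "EVALUATION", "LIMITATIONS", "HUMAN OVERSIGHT",
]

def missing_sections(card_text: str) -> list[str]:
    """Return the required sections absent from a model card document."""
    return [s for s in REQUIRED_SECTIONS if s not in card_text.upper()]

# A hypothetical half-finished draft: two sections still to write.
draft = """MODEL CARD: AI email suggestions
1. MODEL DETAILS
2. INTENDED USE
3. TRAINING DATA
4. EVALUATION
"""
print(missing_sections(draft))  # ['LIMITATIONS', 'HUMAN OVERSIGHT']
```

Wire this into CI or a weekly cron and an incomplete card becomes a ticket in September instead of a finding in November.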
Time Investment: 2-4 hours per feature.
Why This Matters: SOC2 and HIPAA auditors will ask, "How do you ensure AI quality?" Model card = your answer.
Goal: Prove you've identified risks and mitigated them.
Tasks (per AI feature):
Risk Register Template:
| Risk | Likelihood | Impact | Mitigation | Evidence | Status |
|---|---|---|---|---|---|
| AI hallucinates citation | High | High | Human review required; citation validator | Eval report (95% accuracy) | Mitigated |
| Bias against non-English speakers | Medium | Medium | Demographic parity testing; quarterly audit | Fairness audit (within 5pp) | Mitigated |
| Data leak (PII in training set) | Low | Critical | De-identification pipeline; access controls | Penetration test (passed) | Mitigated |
| Model degrades over time | Medium | Medium | Monthly accuracy tracking; auto-alert if under 85% | Monitoring dashboard (live) | Active |
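The register above can also gate your release process. Here is a minimal sketch, assuming the register is kept as structured data; the `audit_blockers` helper and the "Prompt injection exposure" row are hypothetical additions for illustration:

```python
# Hypothetical risk register mirroring the table above, plus one
# deliberately uncovered high-impact risk to show the gate firing.
RISKS = [
    {"risk": "AI hallucinates citation", "impact": "High",
     "status": "Mitigated", "evidence": "Eval report (95% accuracy)"},
    {"risk": "Data leak (PII in training set)", "impact": "Critical",
     "status": "Mitigated", "evidence": "Penetration test (passed)"},
    {"risk": "Model degrades over time", "impact": "Medium",
     "status": "Active", "evidence": "Monitoring dashboard (live)"},
    {"risk": "Prompt injection exposure", "impact": "High",
     "status": "Active", "evidence": ""},
]

def audit_blockers(risks: list[dict]) -> list[str]:
    """High/Critical risks that are unmitigated or lack evidence."""
    return [
        r["risk"] for r in risks
        if r["impact"] in ("High", "Critical")
        and (r["status"] != "Mitigated" or not r["evidence"])
    ]

print(audit_blockers(RISKS))  # ['Prompt injection exposure']
```

The rule of thumb it encodes: a Medium risk can stay "Active" with monitoring, but a High or Critical risk without both a mitigation and evidence is exactly the gap an auditor will write up.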
Time Investment: 3-5 hours per feature.
Why This Matters: Auditors will ask, "What could go wrong?" Risk register = proof you thought about it (and fixed it).
Goal: Simulate the audit. Find gaps before the auditor does.
Tasks:
Dry Run Checklist:
If any answer is "no," you have gaps. Fix them before Oct 1.
Bad Answer: "We test the model before launch."
Good Answer: "We use a three-layer eval process:
Evidence: Eval reports, A/B test results, monitoring dashboard screenshots."
Bad Answer: "We'd fix it."
Good Answer: "We have a documented incident response plan:
Evidence: Runbook (link), feature flag screenshot, past incident post-mortem (if applicable)."
Bad Answer: "We use diverse training data."
Good Answer: "We test for demographic parity across user segments:
Evidence: Fairness audit report (latest: Aug 2025), demographic parity test results."
Bad Answer: "Our data science team."
Good Answer: "Access is role-based with audit logs:
Evidence: Access control policy (doc link), access log export (last 90 days)."
Bad Answer: "We de-identify it."
Good Answer: "We follow a documented data lifecycle:
Evidence: Privacy policy (link), de-identification code (GitHub), retention schedule (table), deletion job logs (cron output)."
Feature: AI-generated patient summaries for physicians.
Audit Date: Nov 15, 2025
September Prep:
Week 1: Inventoried feature (high-risk, HIPAA scope, DRI: PM Alex)
Week 2: Built model card
Week 3: Documented risk register
Week 4: Dry run
Audit Result (Nov 15): Zero findings. HIPAA certification renewed.
Time Investment: 12 hours (Sept prep) vs. 40+ hours (remediation if gaps found).

Documentation:
Evidence:
Processes:
If any box is unchecked, you have gaps. Fix them in September, not October.
Sprint Goal: Make all AI features audit-ready by Oct 1.
Week 1 Tasks:
Week 2 Tasks:
Week 3 Tasks:
Week 4 Tasks:
Standup Questions:
PM: "We need to spend September building compliance artifacts for our AI features. 12-16 hours of team time total."
Eng Lead: "We're already behind on roadmap. Can this wait?"
PM: "If we don't have this ready, the audit will find gaps. Remediation takes 40+ hours. We'll lose Q4 to fire drills. And if we fail SOC2, enterprise deals freeze."
Eng Lead: "What's the alternative?"
PM: "12 hours now, or 40+ hours in November. Your call."
Alex Welcing is a Senior AI Product Manager who treats compliance like a product feature, not an afterthought. His AI systems pass audits on the first try because September is documentation month, not scramble season.