
Opposing Counsel: "You used AI to select documents for production. Explain your methodology."
Attorney: "We trained a model on relevant documents and it ranked the rest."
OC: "How many documents did you train it on?"
Attorney: "I... don't know. The vendor handled that."
OC: "So you can't prove you didn't withhold relevant documents? I move to compel full manual review—all 500,000 documents."
Judge: "Motion granted."
The Cost: $2M in manual review that could've been avoided with a defensible TAR protocol.
**Three Requirements** (established in *Da Silva Moore v. Publicis*, 2012, and subsequent cases):
Key Insight: You don't need 100% recall. Courts accept 75-80% if the process is defensible.

What: A senior attorney reviews 500-2,000 documents, labeling each as relevant or not relevant.
Why: This "trains" the AI on what relevance looks like.
Documentation:
Mistake to Avoid: Using junior associates (opposing counsel will argue they're not qualified).
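The documentation point above can be made concrete. Here is a minimal sketch of a seed-set labeling record that captures who labeled each document, when, and why; the structure, field names, and sample entries are all illustrative assumptions, not any vendor's schema.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

# Hypothetical seed-set audit record: who labeled what, when, and why.
# This is exactly the paper trail opposing counsel will ask for.
@dataclass
class SeedLabel:
    doc_id: str
    relevant: bool
    reviewer: str          # should be a senior attorney, per the protocol
    reviewer_role: str
    rationale: str         # one line explaining the relevance call
    labeled_at: str        # ISO-8601 timestamp

def record_label(doc_id: str, relevant: bool, reviewer: str,
                 role: str, rationale: str) -> SeedLabel:
    return SeedLabel(doc_id, relevant, reviewer, role, rationale,
                     datetime.now(timezone.utc).isoformat())

# Illustrative entries (document IDs and names are made up).
seed_set = [
    record_label("EM-000123", True, "J. Partner", "Senior Partner",
                 "Discusses contract deliverables directly"),
    record_label("EM-000456", False, "J. Partner", "Senior Partner",
                 "Routine scheduling email, no case issues"),
]
audit_log = [asdict(label) for label in seed_set]  # export with the protocol file
```

Keeping the rationale with each label is what lets you later answer "how was the seed set created?" with specifics rather than "the vendor handled that."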
What: AI learns from seed set, ranks all 500,000 documents by predicted relevance.
Documentation:
Mistake to Avoid: "Black box" vendor models (you must be able to explain how it works).
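To show what an explainable (non-black-box) ranking looks like, here is a toy sketch: score each unreviewed document by cosine similarity to the centroid of the relevant seed documents, using plain TF-IDF weights you could walk a judge through. The corpus, tokenization, and weighting are simplified assumptions, not a production model.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Plain TF-IDF so every weight can be explained, term by term."""
    tokens = [d.lower().split() for d in docs]
    n = len(docs)
    df = Counter(t for toks in tokens for t in set(toks))       # document frequency
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}            # smoothed IDF
    return [{t: c * idf[t] for t, c in Counter(toks).items()} for toks in tokens]

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Toy corpus: the first two docs are the labeled-relevant seed set.
docs = [
    "breach of contract deliverables missed deadline",
    "contract penalty clause breach notice",
    "lunch menu for friday team outing",
    "deliverables under the contract were never shipped",
]
vecs = tfidf_vectors(docs)

# Centroid of the relevant seed vectors.
centroid = Counter()
for v in vecs[:2]:
    for t, w in v.items():
        centroid[t] += w / 2

# Rank the unreviewed docs (indices 2 and 3) by similarity to the centroid.
ranking = sorted(range(2, len(docs)),
                 key=lambda i: cosine(vecs[i], centroid), reverse=True)
```

The contract email outranks the lunch email, and you can point to the exact shared terms ("contract", "deliverables") that drove the score, which is the kind of explanation a black-box vendor model cannot give.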
What: Attorney reviews top-ranked documents first. AI learns from each review, re-ranks remaining docs.
Example:
Documentation:
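The review-rank-retrain cycle above can be sketched as a loop. This is a simulation with a stand-in scorer and made-up ground truth, just to show the shape of continuous active learning and the per-round audit trail; a real system retrains its actual model between rounds.

```python
import random

random.seed(7)
TRUE_RELEVANT = {i for i in range(1000) if i % 10 == 0}   # hidden ground truth (simulated)

def score(doc_id, labeled):
    # Stand-in for model scoring: favor docs "similar" to known relevant ones
    # (here, crudely, sharing the same last digit). Purely illustrative.
    hits = sum(1 for d, rel in labeled.items() if rel and d % 10 == doc_id % 10)
    return hits + random.random() * 0.1

labeled = {0: True, 3: False}            # tiny seed set
unreviewed = set(range(1000)) - labeled.keys()
log = []
for round_no in range(1, 6):             # 5 review rounds of 50 docs each
    batch = sorted(unreviewed, key=lambda d: score(d, labeled), reverse=True)[:50]
    for d in batch:
        labeled[d] = d in TRUE_RELEVANT  # attorney review, simulated
        unreviewed.discard(d)
    found = sum(labeled.values())
    log.append((round_no, found))        # per-round stats for the audit trail
```

Note how quickly the relevant documents concentrate in the early rounds: that front-loading is the whole point of CAL, and the per-round log is what you produce when asked to justify stopping.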
What: After CAL, randomly sample 200-500 docs from the "not relevant" pile. Have attorney review.
Why: Proves you didn't miss relevant docs.
Documentation:
Red Flag: If more than 10% of the "not relevant" sample turns out to be relevant, the model is underperforming and needs to be retrained.
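The QC check reduces to a short computation: draw a random sample from the discard pile, have an attorney review it, and compare the error rate to the 10% red-flag threshold. The pile size, sample outcome, and simulated review below are illustrative numbers.

```python
import random

random.seed(42)
not_relevant_pile = list(range(450_000))        # doc IDs the model discarded (illustrative)
sample = random.sample(not_relevant_pile, 500)  # random QC sample

def attorney_review(doc_id):
    # Simulated attorney call: pretend ~1% of the discard pile is truly relevant.
    return doc_id % 100 == 0

misses = sum(attorney_review(d) for d in sample)
error_rate = misses / len(sample)
needs_retraining = error_rate > 0.10            # the >10% red-flag threshold
```

Because the sample is random (not cherry-picked), the observed error rate is a defensible estimate of how much relevant material the discard pile still contains.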
What: Estimate what percentage of all the relevant documents in the 500,000-doc corpus the review actually found.
Formula (simplified):
Estimated Recall = Relevant Docs Found / (Relevant Docs Found + Relevant Docs Missed in Sample)
Example:
Target: 75%+ recall
Mistake to Avoid: Not estimating recall (opposing counsel will assume you missed everything).
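Here is the formula worked through with illustrative numbers: suppose the review found 38,000 relevant documents, and a 500-doc QC sample of the 450,000-doc discard pile turned up 5 relevant documents. All figures are assumptions for the arithmetic, not from a real matter.

```python
relevant_found = 38_000                  # relevant docs produced by the review

# Extrapolate the misses from the QC sample: 5 relevant out of a 500-doc
# random sample of a 450,000-doc "not relevant" pile.
sample_size, sample_relevant, pile_size = 500, 5, 450_000
estimated_missed = sample_relevant / sample_size * pile_size   # 4,500 docs

estimated_recall = relevant_found / (relevant_found + estimated_missed)
defensible = estimated_recall >= 0.75    # the 75%+ target above
```

Here the estimate lands around 89% recall, comfortably above the 75% target; the key is that the number comes from a documented random sample, not a guess.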
What: Prove TAR saved money compared to manual review.
Documentation:
Manual Review Cost:
- 500,000 docs × 6 min/doc × $40/hour = $2,000,000

TAR Cost:
- Seed set: 2,000 docs × 6 min/doc × $40/hour = $8,000
- CAL: 10,000 docs × 6 min/doc × $40/hour = $40,000
- QC sample: 500 docs × 6 min/doc × $40/hour = $2,000
- Vendor fee: $50,000
- Total: $100,000

Savings: $1,900,000 (95% cost reduction)
Why This Matters: Judge weighs cost vs. benefit. If TAR saves $1.9M and achieves 78% recall, it's defensible.
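The proportionality math is simple enough to show in a few lines. The 6 min/doc review pace and a $40/hour blended reviewer rate are assumptions chosen to reproduce the totals in the comparison; swap in your own matter's numbers.

```python
MIN_PER_DOC = 6      # assumed review pace
RATE = 40            # assumed blended hourly reviewer rate, in dollars

def cost(docs):
    """Review cost in dollars for a given document count."""
    return docs * MIN_PER_DOC * RATE / 60

manual = cost(500_000)                                   # full manual review
tar = cost(2_000) + cost(10_000) + cost(500) + 50_000    # seed + CAL + QC + vendor fee
savings = manual - tar
pct = savings / manual                                   # fraction saved
```

This is the cost-benefit table a judge weighs: the same arithmetic, reproducible on demand, belongs in the protocol documentation.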
What: Disclose methodology before starting TAR. Invite opposing counsel to negotiate protocol.
Example Communication:
Subject: Proposed TAR Protocol for Document Review

Opposing Counsel,

We propose using Technology-Assisted Review for this case. Proposed methodology:

1. Seed set: 2,000 docs, labeled by Senior Partner [Name]
2. Algorithm: SVM with TF-IDF (vendor: Relativity)
3. Continuous active learning: 5-10 rounds
4. QC sampling: 500 docs from "not relevant" pile
5. Target recall: 75%+

We're open to discussing this protocol. Please advise if you have concerns.

Regards,
[Attorney]
Why This Matters: Courts favor cooperation. If you negotiate the protocol upfront, it is far harder for opposing counsel to challenge it later.
Case: Vendor sues client for breach of contract. 500,000 emails to review.
TAR Protocol:
Seed Set (Week 1):
Model Training (Week 1):
Continuous Active Learning (Weeks 2-4):
Quality Control (Week 5):
Recall Estimation:
Fix (Week 6):
Production (Week 7):
Opposing Counsel's Challenge:
Attorney's Response:
Judge's Ruling: TAR protocol is defensible. No additional review required.
Disclose these to opposing counsel:
If you withhold any of these, opposing counsel will cry foul.

Mistake 1: Not Documenting Seed Set Creation
Mistake 2: No Recall Estimation
Mistake 3: "Black Box" Models
Alex Welcing is a Senior AI Product Manager in New York who builds legal tech products that pass court scrutiny. His TAR workflows are defensible because transparency, proportionality, and quality control are product requirements, not afterthoughts.