Recommender Systems in Insurance

Personalizing coverage recommendations while respecting regulatory constraints
Chaïmae Sriti • August 2025
Contents
1. Introduction
2. Collaborative Filtering
3. Content-Based Filtering
4. Hybrid Approaches
5. Regulatory Constraints
6. Practical Applications
1. Introduction
Recommender systems in insurance serve a fundamentally different purpose than in e-commerce or entertainment. Instead of maximizing engagement or purchases, they must balance personalization with risk management, regulatory compliance, and customer protection.
The core challenge: recommend coverages that match customer needs without creating adverse selection, violating anti-discrimination laws, or undermining the insurer's loss ratio.
Key Use Cases:
Cross-sell and upsell recommendations (e.g., suggesting an umbrella policy to homeowners)
Coverage limit optimization (e.g., recommending appropriate liability limits)
Bundling strategies (e.g., auto + home discounts)
Retention interventions (e.g., suggesting policy adjustments before renewal)
2. Collaborative Filtering
Collaborative filtering identifies patterns in customer behavior: "Customers similar to you also purchased..."
User-Based Collaborative Filtering:
Find customers with similar profiles and recommend what they purchased.
# Example: User similarity based on demographics
Customer A: Age 35, Married, 2 Cars, Homeowner
Customer B: Age 37, Married, 2 Cars, Homeowner ← Similar
Customer C: Age 24, Single, 1 Car, Renter ← Different

If Customer B has Umbrella policy → Recommend to Customer A
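The similarity step above can be sketched in a few lines of Python. This is a minimal illustration, not a production method: the numeric encoding of each customer, the scaling of age and car count, the `holdings` data, and the 0.95 threshold are all assumptions made up for the example. Attributes such as age and marital status would also need compliance review before being used this way.

```python
import math

# Hypothetical encoding: (age / 100, married, cars / 5, homeowner)
customers = {
    "A": (0.35, 1.0, 0.4, 1.0),
    "B": (0.37, 1.0, 0.4, 1.0),
    "C": (0.24, 0.0, 0.2, 0.0),
}

# Products each customer already holds (illustrative data).
holdings = {
    "A": {"Auto", "Home"},
    "B": {"Auto", "Home", "Umbrella"},
    "C": {"Auto"},
}


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend_from_neighbors(target, threshold=0.95):
    """Products held by sufficiently similar customers that the target lacks."""
    recs = set()
    for other, vec in customers.items():
        if other != target and cosine(customers[target], vec) >= threshold:
            recs |= holdings[other] - holdings[target]
    return recs


print(recommend_from_neighbors("A"))
```

Customer B is the only neighbor of A above the threshold, so the umbrella policy that B holds is the only suggestion for A.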
Item-Based Collaborative Filtering:
Find products frequently purchased together.
# Example: Product co-occurrence matrix
                Auto    Home    Umbrella    Life
Auto            1       0.65    0.42        0.28
Home            0.65    1       0.73        0.31
Umbrella        0.42    0.73    1           0.19
Life            0.28    0.31    0.19        1

→ If customer has Auto + Home, recommend Umbrella (0.73 co-occurrence score with Home, the highest among products not yet held)
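One way such a matrix could be computed is item-item cosine similarity over a customer-by-product ownership table. The sketch below assumes a tiny, made-up `ownership` matrix, so its values will not match the illustrative table above. Scoring candidates by summed similarity to the products already held is one common choice among several.

```python
import math

PRODUCTS = ["Auto", "Home", "Umbrella", "Life"]

# Hypothetical ownership matrix: one row per customer, 1 = holds the product.
ownership = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
]


def column(j):
    """Ownership vector for product j across all customers."""
    return [row[j] for row in ownership]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Item-item similarity matrix (symmetric, 1.0 on the diagonal).
sim = [[cosine(column(i), column(j)) for j in range(len(PRODUCTS))]
       for i in range(len(PRODUCTS))]


def recommend(held, top_n=1):
    """Rank products not held by their summed similarity to held products."""
    held_idx = [PRODUCTS.index(p) for p in held]
    scores = {
        p: sum(sim[i][j] for i in held_idx)
        for j, p in enumerate(PRODUCTS) if p not in held
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


print(recommend({"Auto", "Home"}))
```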
⚠️ Challenges in Insurance:
Sparse data: Customers don't buy insurance frequently (unlike Netflix watching patterns)
Cold start problem: New customers have no purchase history
Regulatory risk: Similarity between customers may be driven by protected attributes or their proxies (age, gender, location)
Adverse selection: Recommending high-risk products to high-risk customers worsens loss ratios
3. Content-Based Filtering
Content-based methods recommend products based on customer attributes and product features, rather than relying on the behavior of other customers.
Feature Engineering for Recommendations:
Customer Features:
- Risk profile: Credit score, claims history, driving record
- Life stage: Age, marital status, number of dependents
- Assets: Home value, vehicle count, business ownership
- Coverage gaps: Uninsured/underinsured exposures

Product Features:
- Coverage type: Property, Liability, Life, Health
- Premium range: Budget, Standard, Premium
- Complexity: Simple (Term Life) vs Complex (Whole Life)
- Required prerequisites: Must have home to buy umbrella
Example Rule-Based Recommendation:
IF (has_home AND home_value > $500k AND has_auto) THEN
    recommend(Umbrella_Policy,
              min_limit = max(home_value, 1M),
              priority = HIGH)

IF (age > 30 AND has_dependents AND no_life_insurance) THEN
    recommend(Term_Life,
              coverage = 10 * annual_income,
              priority = CRITICAL)

IF (liability_limit < 100k AND net_worth > 250k) THEN
    recommend(Liability_Increase,
              suggested_limit = min(net_worth, 500k),
              priority = MEDIUM)
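The three rules above translate fairly directly into Python. The sketch below is one possible rendering: the `Customer` dataclass, its field names, and the tuple format of the output are assumptions for illustration, while the thresholds are the ones from the rules.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    age: int
    has_home: bool = False
    home_value: float = 0.0
    has_auto: bool = False
    has_dependents: bool = False
    has_life_insurance: bool = False
    annual_income: float = 0.0
    liability_limit: float = 0.0
    net_worth: float = 0.0


def rule_based_recommendations(c):
    """Return (product, parameters, priority) for every rule that fires."""
    recs = []
    if c.has_home and c.home_value > 500_000 and c.has_auto:
        recs.append(("Umbrella_Policy",
                     {"min_limit": max(c.home_value, 1_000_000)}, "HIGH"))
    if c.age > 30 and c.has_dependents and not c.has_life_insurance:
        recs.append(("Term_Life",
                     {"coverage": 10 * c.annual_income}, "CRITICAL"))
    if c.liability_limit < 100_000 and c.net_worth > 250_000:
        recs.append(("Liability_Increase",
                     {"suggested_limit": min(c.net_worth, 500_000)}, "MEDIUM"))
    return recs


# Hypothetical customer for whom all three rules fire.
customer = Customer(age=42, has_home=True, home_value=650_000, has_auto=True,
                    has_dependents=True, annual_income=90_000,
                    liability_limit=50_000, net_worth=800_000)
recs = rule_based_recommendations(customer)
for product, params, priority in recs:
    print(priority, product, params)
```

Because each rule is an explicit condition, the fired rule can double as the reason code shown to the customer or the regulator.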
✓ Advantages:
Transparent and explainable to regulators
No cold start problem—works for new customers
Can encode actuarial expertise and business rules
Easier to audit for fairness and compliance
4. Hybrid Approaches
Hybrid systems combine collaborative filtering, content-based methods, and contextual signals to produce more robust recommendations.
Weighted Ensemble:
final_score = (
    0.4 * collaborative_score +    # What similar customers bought
    0.3 * content_score +           # Match customer-product features
    0.2 * business_rules_score +    # Actuarial constraints
    0.1 * contextual_score          # Time, location, external events
)

# Filter recommendations:
- Remove products customer already has
- Enforce prerequisites (e.g., need home for umbrella)
- Check underwriting eligibility
- Respect opt-out preferences
- Cap cross-sell attempts per period
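The scoring and filtering steps above could be combined as follows. This is a sketch under stated assumptions: the function names, the shape of the `candidates` dictionary, and the example scores are hypothetical, and `max_recs` only caps the number of recommendations per call. Capping attempts per period would additionally require contact history, which is not modeled here.

```python
WEIGHTS = {
    "collaborative": 0.4,
    "content": 0.3,
    "business_rules": 0.2,
    "contextual": 0.1,
}


def final_score(scores):
    """Weighted sum of component scores; missing components count as 0."""
    return sum(WEIGHTS[name] * scores.get(name, 0.0) for name in WEIGHTS)


def rank_products(candidates, owned, prerequisites, eligible, opted_out,
                  max_recs=3):
    """Apply the filters, then return the top products by ensemble score."""
    ranked = []
    for product, scores in candidates.items():
        if product in owned or product in opted_out:
            continue  # already held, or customer opted out
        if not prerequisites.get(product, set()) <= owned:
            continue  # a required underlying product is missing
        if product not in eligible:
            continue  # fails underwriting eligibility
        ranked.append((final_score(scores), product))
    ranked.sort(reverse=True)
    return [product for _, product in ranked[:max_recs]]


# Hypothetical component scores in [0, 1].
candidates = {
    "Umbrella": {"collaborative": 0.8, "content": 0.9,
                 "business_rules": 1.0, "contextual": 0.5},
    "Life": {"collaborative": 0.3, "content": 0.6,
             "business_rules": 0.5, "contextual": 0.2},
    "Home": {"collaborative": 0.9, "content": 0.9,
             "business_rules": 0.9, "contextual": 0.9},
    "Commercial": {"collaborative": 0.7, "content": 0.7,
                   "business_rules": 0.7, "contextual": 0.7},
}
top = rank_products(
    candidates,
    owned={"Auto", "Home"},
    prerequisites={"Umbrella": {"Home"}},
    eligible={"Umbrella", "Life", "Home"},
    opted_out=set(),
)
print(top)
```

Here "Home" is dropped because it is already owned and "Commercial" because it fails the eligibility check, leaving Umbrella ahead of Life.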
Matrix Factorization with Constraints:
Use collaborative filtering but inject domain constraints as regularization.
# Factorize customer-product matrix into latent features
User embeddings: [risk_aversion, wealth, life_stage, ...]
Product embeddings: [complexity, premium, coverage_breadth, ...]

Prediction: score = user_embedding · product_embedding

# Add constraints during training:
- Penalize recommendations that violate business rules
- Add fairness constraints (demographic parity, equalized odds)
- Incorporate profitability signals (expected LTV - acquisition cost)
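One possible reading of "constraints as regularization" is an extra penalty that pushes the predicted score toward zero for customer-product pairs that violate a business rule. The sketch below implements only that piece, using plain SGD matrix factorization with L2 regularization in pure Python. The function name, hyperparameters, and toy data are assumptions, and the fairness and profitability terms are left out.

```python
import random


def factorize(ratings, forbidden, n_users, n_items, k=2,
              lr=0.05, reg=0.02, penalty=1.0, epochs=2000, seed=0):
    """SGD matrix factorization with a business-rule penalty.

    ratings:   {(user, item): observed value in [0, 1]}
    forbidden: (user, item) pairs that violate a rule; their predicted
               score is pulled toward 0 with weight `penalty`.
    Returns a predict(user, item) function.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in range(n_items)]

    def predict(u, i):
        return sum(a * b for a, b in zip(P[u], Q[i]))

    def sgd_step(u, i, target, weight):
        err = weight * (target - predict(u, i))
        for f in range(k):
            pu, qi = P[u][f], Q[i][f]
            P[u][f] += lr * (err * qi - reg * pu)
            Q[i][f] += lr * (err * pu - reg * qi)

    for _ in range(epochs):
        for (u, i), r in ratings.items():
            sgd_step(u, i, r, 1.0)
        for (u, i) in sorted(forbidden):
            sgd_step(u, i, 0.0, penalty)
    return predict


# Toy data: items 0/1/2 stand for Auto/Home/Umbrella (illustrative).
ratings = {
    (0, 0): 1.0, (0, 1): 1.0,
    (1, 0): 1.0, (1, 1): 1.0, (1, 2): 1.0,
    (2, 0): 1.0, (2, 2): 0.0,
}
forbidden = {(0, 2)}  # hypothetical: customer 0 fails an umbrella rule
predict = factorize(ratings, forbidden, n_users=3, n_items=3)
print(round(predict(1, 2), 2), round(predict(0, 2), 2))
```

Without the penalty, customer 0 would tend to inherit a high umbrella score from the similar customer 1. In practice a hard eligibility filter at serving time is still advisable, since a soft penalty only discourages a violation.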
Contextual Bandits:
Treat recommendations as a reinforcement learning problem: learn which products to recommend in which contexts to maximize long-term customer value.
Context: Customer profile + time + channel
Action: Recommend product X
Reward: +1 if purchased, +5 if retained, -2 if churned

Thompson Sampling or UCB to balance:
- Exploitation: Recommend known high-converting products
- Exploration: Test new recommendations to discover better strategies
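A minimal sketch of Thompson Sampling for this setting is shown below. It makes two simplifying assumptions: the context is a discrete segment label, and the reward is binary (purchased or not), unlike the graded rewards listed above. The class name, the segment "homeowner", and the simulated conversion rates are hypothetical.

```python
import random
from collections import defaultdict


class ThompsonRecommender:
    """Beta-Bernoulli Thompson Sampling, one posterior per (context, product)."""

    def __init__(self, products, seed=None):
        self.products = list(products)
        self.rng = random.Random(seed)
        # [alpha, beta] parameters, starting from the uniform prior Beta(1, 1).
        self.posteriors = defaultdict(lambda: [1.0, 1.0])

    def recommend(self, context):
        """Sample a conversion rate per product and pick the highest sample."""
        samples = {
            p: self.rng.betavariate(*self.posteriors[(context, p)])
            for p in self.products
        }
        return max(samples, key=samples.get)

    def update(self, context, product, purchased):
        """Increment alpha on a purchase, beta otherwise."""
        params = self.posteriors[(context, product)]
        params[0 if purchased else 1] += 1.0


# Simulation with made-up conversion rates for a single segment.
true_rates = {"Umbrella": 0.30, "Life": 0.05}
bandit = ThompsonRecommender(true_rates, seed=0)
sim_rng = random.Random(1)
counts = {p: 0 for p in true_rates}
for _ in range(2000):
    choice = bandit.recommend("homeowner")
    counts[choice] += 1
    bandit.update("homeowner", choice, sim_rng.random() < true_rates[choice])
print(counts)
```

Sampling from the posterior handles the trade-off implicitly: uncertain products still get picked occasionally, while products with strong evidence of conversion are picked most of the time.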
5. Regulatory Constraints
Insurance recommendations must navigate strict regulations around fairness, transparency, and consumer protection.
Protected Classes:
Cannot use race, religion, or national origin; gender is also prohibited in some states and for some product lines
Age restrictions vary by product and jurisdiction
Credit-based features face increasing scrutiny
ZIP code can be proxy for protected classes → must audit for disparate impact
Explainability Requirements:
Regulators and customers may ask: "Why was this product recommended to me?"
Black-box models (deep neural nets) are harder to defend
Provide reason codes: "Recommended because you own a home valued over $500k"
Allow customers to challenge or opt out of automated recommendations
Anti-Steering Regulations:
In some jurisdictions, insurers cannot systematically recommend cheaper or worse coverage to certain demographic groups.
Monitor recommendation distribution across protected classes
Ensure high-value products are recommended equitably
Test for disparate impact using A/B tests and statistical parity metrics
Fairness Metrics for Recommendations:
# Demographic Parity:
P(recommend_premium_product | group=A) ≈ P(recommend_premium_product | group=B)

# Equalized Odds:
P(recommend | qualified, group=A) ≈ P(recommend | qualified, group=B)
P(recommend | not qualified, group=A) ≈ P(recommend | not qualified, group=B)

# Calibration:
For customers scored at 70% likelihood to need umbrella policy,
~70% should actually need it (across all groups)
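The first two checks can be computed directly from a log of recommendation decisions. The sketch below assumes each record carries a `group` label, a `recommended` flag, and a `qualified` flag. The field names and the eight sample records are invented for illustration.

```python
def rate(flags):
    """Share of 1s in a list of 0/1 flags."""
    return sum(flags) / len(flags) if flags else 0.0


def demographic_parity_gap(records):
    """Largest difference in recommendation rate between any two groups."""
    by_group = {}
    for r in records:
        by_group.setdefault(r["group"], []).append(r["recommended"])
    rates = {g: rate(v) for g, v in by_group.items()}
    return max(rates.values()) - min(rates.values()), rates


def true_positive_rates(records):
    """Recommendation rate among qualified customers, per group."""
    by_group = {}
    for r in records:
        if r["qualified"]:
            by_group.setdefault(r["group"], []).append(r["recommended"])
    return {g: rate(v) for g, v in by_group.items()}


# Hypothetical audit log.
records = [
    {"group": "A", "recommended": 1, "qualified": 1},
    {"group": "A", "recommended": 1, "qualified": 1},
    {"group": "A", "recommended": 0, "qualified": 1},
    {"group": "A", "recommended": 1, "qualified": 0},
    {"group": "B", "recommended": 1, "qualified": 1},
    {"group": "B", "recommended": 0, "qualified": 1},
    {"group": "B", "recommended": 0, "qualified": 0},
    {"group": "B", "recommended": 0, "qualified": 0},
]
gap, rates = demographic_parity_gap(records)
tpr = true_positive_rates(records)
print(gap, rates, tpr)
```

In this sample, group A receives the premium recommendation three times as often as group B (0.75 vs 0.25), and the gap persists among qualified customers, which is the kind of pattern an audit would flag for review.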
6. Practical Applications
Use Case 1: Cross-Sell at Renewal
Goal: Recommend additional products when a customer renews an existing policy.
Pipeline:
1. Identify renewal cohort 60 days before expiration
2. Score each customer for cross-sell propensity (logistic regression)
3. Generate top-3 product recommendations per customer
4. Filter by underwriting rules and profitability thresholds
5. Serve recommendations via email, mobile app, or agent dashboard
6. Track conversion and adjust weights monthly

Metrics:
- Cross-sell conversion rate: 8% → 12% after deploying the recommendation system
- Average products per customer: 1.4 → 1.7
- Customer retention: +3% (bundled customers churn less)
Use Case 2: Coverage Gap Analysis
Goal: Identify customers who are underinsured and recommend appropriate increases.
Example:
Customer has $100k liability on auto policy
Net worth estimated at $800k (home value + retirement accounts)
→ High risk of being sued beyond coverage limits

Recommendation:
- Increase liability to $300k (+$15/month)
- OR add umbrella policy with $1M limit (+$20/month)

Delivery:
- Show calculator: "If sued for $500k, you'd pay $400k out of pocket"
- Emphasize protection of assets
- Provide easy one-click upgrade option
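The arithmetic behind the calculator is simple enough to sketch. The function below is a simplification: it treats the umbrella limit as stacking directly on top of the auto liability limit and ignores any minimum underlying limit an umbrella carrier may require. The function and key names are hypothetical; the dollar figures are the ones from the example.

```python
def out_of_pocket(judgment, liability_limit, umbrella_limit=0):
    """Amount the customer pays personally once coverage is exhausted."""
    return max(0, judgment - liability_limit - umbrella_limit)


# The options from the example, with their extra monthly premium.
options = {
    "current ($100k liability)":
        {"liability": 100_000, "umbrella": 0, "extra_monthly": 0},
    "raise liability to $300k":
        {"liability": 300_000, "umbrella": 0, "extra_monthly": 15},
    "add $1M umbrella":
        {"liability": 100_000, "umbrella": 1_000_000, "extra_monthly": 20},
}

judgment = 500_000
exposure = {
    name: out_of_pocket(judgment, o["liability"], o["umbrella"])
    for name, o in options.items()
}
for name, o in options.items():
    print(f"{name}: +${o['extra_monthly']}/month, "
          f"out of pocket ${exposure[name]:,}")
```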
Use Case 3: Life Event Triggers
Goal: Detect life events and proactively recommend relevant coverage.
Signals:
- New vehicle added → Recommend comprehensive/collision
- Address change to more expensive home → Recommend higher dwelling coverage
- New driver added (teenage child) → Recommend higher liability limits
- Marriage detected → Recommend life insurance, umbrella policy
- Business ownership → Recommend commercial policy

Timing:
- Trigger recommendation within 30 days of detected event
- Use gentle nudge messaging (not aggressive sales)
- Respect communication preferences and frequency caps
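The signal-to-recommendation mapping and the timing rules could be wired together as below. This is a sketch: the event type names, the monthly frequency cap of 2, and the function signature are assumptions, while the 30-day window and the product mapping follow the lists above.

```python
from datetime import date, timedelta

# Event types are hypothetical labels for the signals listed above.
EVENT_RECOMMENDATIONS = {
    "new_vehicle": ["Comprehensive/Collision"],
    "home_upgrade": ["Higher dwelling coverage"],
    "teen_driver_added": ["Higher liability limits"],
    "marriage": ["Life insurance", "Umbrella policy"],
    "business_ownership": ["Commercial policy"],
}
TRIGGER_WINDOW = timedelta(days=30)


def triggered_recommendations(events, today, contacts_this_month=0,
                              frequency_cap=2, opted_out=False):
    """Recommendations for events detected within the trigger window.

    events: list of (event_type, detected_on) tuples.
    Returns an empty list if the customer opted out or the cap is reached.
    """
    if opted_out or contacts_this_month >= frequency_cap:
        return []
    recs = []
    for event_type, detected_on in events:
        if timedelta(0) <= today - detected_on <= TRIGGER_WINDOW:
            for product in EVENT_RECOMMENDATIONS.get(event_type, []):
                if product not in recs:
                    recs.append(product)
    return recs


events = [
    ("marriage", date(2025, 8, 1)),      # 14 days ago: inside the window
    ("new_vehicle", date(2025, 5, 1)),   # too old: ignored
]
today = date(2025, 8, 15)
print(triggered_recommendations(events, today))
```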
Implementation Best Practices:
Start simple: Rule-based recommendations first, add ML incrementally
A/B test everything: Measure impact on conversion, retention, profitability, fairness
Monitor for drift: Customer behavior and product mix change over time
Audit for fairness: Quarterly reviews of recommendation distribution across demographics
Respect customer preferences: Allow opt-out, control frequency, provide explanations
Integrate with underwriting: Don't recommend products the customer won't qualify for
Measure long-term value: Not just immediate conversion, but retention and profitability
Evaluation Metrics:
Business Metrics:
- Conversion rate: % of recommendations accepted
- Revenue per recommendation: Average premium increase
- Customer lifetime value: Long-term retention + cross-sell
- Loss ratio impact: Are recommended products profitable?

Model Metrics:
- Precision@K: Of top-K recommendations, how many convert?
- Recall@K: Of all products customer needs, how many in top-K?
- NDCG: Normalized Discounted Cumulative Gain (ranking quality)
- Coverage: % of customers receiving relevant recommendations

Fairness Metrics:
- Demographic parity across protected classes
- Equal opportunity (true positive rate parity)
- Calibration (predicted propensity matches actual conversion)
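The ranking metrics listed under Model Metrics can be computed as follows for a single customer. The sketch uses binary relevance (a product is either relevant or not) and the standard log2 discount for NDCG. The example ranking and relevant set are made up.

```python
import math


def precision_at_k(ranked, relevant, k):
    """Share of the top-k recommendations that are relevant."""
    return sum(1 for p in ranked[:k] if p in relevant) / k


def recall_at_k(ranked, relevant, k):
    """Share of all relevant products that appear in the top k."""
    if not relevant:
        return 0.0
    return sum(1 for p in ranked[:k] if p in relevant) / len(relevant)


def ndcg_at_k(ranked, relevant, k):
    """DCG of the ranking divided by the DCG of an ideal ranking."""
    dcg = sum(1.0 / math.log2(i + 2)
              for i, p in enumerate(ranked[:k]) if p in relevant)
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg else 0.0


# Hypothetical ranking and ground truth for one customer.
ranked = ["Umbrella", "Life", "Pet", "Renters"]
relevant = {"Umbrella", "Pet"}
print(precision_at_k(ranked, relevant, 3),
      recall_at_k(ranked, relevant, 3),
      round(ndcg_at_k(ranked, relevant, 3), 3))
```

Both relevant products appear in the top 3, so recall is 1.0, but NDCG is below 1 because "Pet" sits in third position rather than second.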
Recommender systems in insurance must balance competing objectives: personalization vs. fairness, short-term conversion vs. long-term profitability, automation vs. transparency. Success requires not just strong ML models, but deep integration with underwriting, regulatory compliance, and customer experience design.