AML Model Gym — Model Arena

Scenarios

All Scenarios (5 CIDs)

Aggregate view across all typologies

CID001 · Easy

Romance Fraud / Money Mule

EMT, Wire, Cash, Connex

CID002 · Easy

Drug Trafficking (Opioids)

Cash, EMT, Wire, EFT, Connex

CID003 · Moderate

Human Trafficking / Sexual Exploitation

EMT, Cash, Connex, EFT

CID004 · Moderate

Underground Banking / Real Estate

Wire, Cash, EFT, Cheque

CID005 · Easy

Illicit Cannabis Proceeds

EMT, Cash, EFT, Cheque, Connex

Models Compared

3 Deep / 2 Tree / 1 Neural

Scenarios Loaded

3 Easy / 2 Moderate

CAVE

Best Overall

Avg. anomaly score 0.834

CID003

Hardest Scenario

Avg. detection 0.663

Detection Heatmap — Models × Scenarios

Model	CID001 Romance Fraud	CID002 Drug Trafficking	CID003 Human Trafficking	CID004 Underground Banking	CID005 Cannabis	AVG
CWAE	0.92 (#2)	0.88 (#3)	0.71 (#2)	0.83 (#2)	0.79 (#5)	0.826
Isolation Forest	0.85 (#5)	0.91 (#1)	0.62 (#5)	0.74 (#6)	0.87 (#1)	0.798
Extended IF	0.87 (#4)	0.89 (#2)	0.65 (#4)	0.77 (#4)	0.85 (#2)	0.806
Deep IF	0.89 (#3)	0.86 (#5)	0.68 (#3)	0.80 (#3)	0.82 (#3)	0.810
AE	0.83 (#6)	0.84 (#6)	0.59 (#6)	0.76 (#5)	0.81 (#4)	0.766
CAVE	0.94 (#1)	0.87 (#4)	0.73 (#1)	0.85 (#1)	0.78 (#6)	0.834

Reading the heatmap: Each cell shows the anomaly score (0-1) a model assigns to a scenario, followed by that model's rank among the six models for that scenario (#1 = highest score). Darker red means a higher anomaly score, i.e. the model considers the scenario more suspicious.
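The AVG column, the per-scenario ranks, and the "Best Overall" / "Hardest Scenario" cards can all be derived from the score matrix. The following is a minimal sketch of that arithmetic, assuming simple unweighted means and rank 1 for the highest score; the function and variable names are illustrative and are not taken from the dashboard's own code.

```python
# Scores copied from the heatmap; rows are models, columns are scenarios.
SCENARIOS = ["CID001", "CID002", "CID003", "CID004", "CID005"]
SCORES = {
    "CWAE":             [0.92, 0.88, 0.71, 0.83, 0.79],
    "Isolation Forest": [0.85, 0.91, 0.62, 0.74, 0.87],
    "Extended IF":      [0.87, 0.89, 0.65, 0.77, 0.85],
    "Deep IF":          [0.89, 0.86, 0.68, 0.80, 0.82],
    "AE":               [0.83, 0.84, 0.59, 0.76, 0.81],
    "CAVE":             [0.94, 0.87, 0.73, 0.85, 0.78],
}


def model_averages(scores):
    """Unweighted mean score per model, rounded to 3 decimals."""
    return {m: round(sum(v) / len(v), 3) for m, v in scores.items()}


def scenario_averages(scores, scenarios):
    """Unweighted mean score per scenario across all models."""
    n_models = len(scores)
    return {
        s: round(sum(v[i] for v in scores.values()) / n_models, 3)
        for i, s in enumerate(scenarios)
    }


def ranks_for(scores, scenarios, scenario):
    """Rank models on one scenario; 1 = highest score (ties not handled)."""
    i = scenarios.index(scenario)
    ordered = sorted(scores, key=lambda m: scores[m][i], reverse=True)
    return {m: rank for rank, m in enumerate(ordered, start=1)}


avg = model_averages(SCORES)
best_model = max(avg, key=avg.get)      # model with the highest mean score
col = scenario_averages(SCORES, SCENARIOS)
hardest = min(col, key=col.get)         # scenario with the lowest mean score

print(best_model, avg[best_model])
print(hardest, col[hardest])
print(ranks_for(SCORES, SCENARIOS, "CID003"))
```

Recomputing ranks this way is also a quick consistency check on the rank labels printed in each cell.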

Typology Detection Profile — Radar Comparison

Radar chart legend: CWAE, Isolation Forest, CAVE, AE (dashed). Extended IF and Deep IF are omitted for readability (similar to IF).

Insight: CWAE and CAVE (deep generative models) show broad, balanced coverage across all typologies, with particular strength in Behavioral and Network anomaly detection. Isolation Forest excels at Structuring and Placement (tabular point anomalies) but underperforms on network and layering patterns. AE is a moderate generalist with no dominant strength.

Channel Capture Analysis

Bias Check

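Stacked bar chart: share of each model's anomaly signal by channel. Only some segments carry a printed percentage, so the shares listed per model do not sum to 100%.

Model	Labeled segment shares
CWAE	22%, 20%, 18%, 15%
Isolation Forest	38%, 24%, 14%
Extended IF	32%, 25%, 15%
Deep IF	28%, 26%, 16%
AE	30%, 25%, 15%
CAVE	20%, 19%, 18%, 16%

Channels (legend): ABM, Cash, Cheque, EFT, EMT, Wire, Connex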

⚠ Isolation Forest derives 38% of anomaly signal from Cash channel alone — potential over-capture bias toward structured cash deposits.
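One simple way to express a bias check like this is a threshold on each model's largest single-channel share. The sketch below assumes exactly that; the 0.35 cutoff and the function name are hypothetical, since the dashboard does not state the rule it uses. The shares are the largest labeled segment per model from the chart above.

```python
# Largest labeled single-channel share per model, read from the chart.
TOP_CHANNEL_SHARE = {
    "CWAE": 0.22,
    "Isolation Forest": 0.38,
    "Extended IF": 0.32,
    "Deep IF": 0.28,
    "AE": 0.30,
    "CAVE": 0.20,
}


def over_capture(top_share, threshold=0.35):
    """Return models whose top channel share exceeds `threshold`.

    The default threshold is an illustrative assumption, not a value
    taken from the dashboard.
    """
    return sorted(m for m, share in top_share.items() if share > threshold)


print(over_capture(TOP_CHANNEL_SHARE))
print(over_capture(TOP_CHANNEL_SHARE, threshold=0.29))
```

Lowering the threshold widens the set of flagged models, which is why the cutoff should be agreed with investigators rather than hard-coded.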

Graph vs Tabular Detection

Anomaly Domain

Model	Graph	Tabular
CWAE	65%	35%
Isolation Forest	15%	85%
Extended IF	20%	80%
Deep IF	35%	65%
AE	25%	75%
CAVE	70%	30%

Key takeaway: Tree-based models (IF, Extended IF) are almost entirely tabular — they detect point anomalies in feature space but miss network/relational patterns. CWAE and CAVE leverage graph structure, making them better at detecting layering and money mule networks. Deep IF sits in between as a hybrid.

Anomaly Type Distribution

Type A vs B

Model	Type A	Type B
CWAE	30%	70%
Isolation Forest	75%	25%
Extended IF	68%	32%
Deep IF	50%	50%
AE	55%	45%
CAVE	25%	75%

Type A: Point anomalies (outlier amounts, single unusual txns)

Type B: Contextual / collective (behavioral patterns over time)

Why this matters: AML investigators care primarily about Type B (behavioral patterns like structuring, rapid in/out, escalation). Models biased toward Type A may generate excessive false positives on legitimate large transactions while missing subtle schemes.
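To make the Type A / Type B distinction concrete, here is a toy sketch with one detector of each kind. It is not how any model in the arena works: the point detector uses a median-absolute-deviation rule, the collective detector counts deposits just under a reporting limit within a time window, and every name, parameter, and data value (including the 10,000 limit) is a hypothetical chosen for illustration.

```python
from statistics import median


def point_anomalies(amounts, k=5.0):
    """Type A: indices of amounts more than k MADs away from the median."""
    med = median(amounts)
    mad = median(abs(a - med) for a in amounts)
    if mad == 0:
        return []  # no spread to measure against
    return [i for i, a in enumerate(amounts) if abs(a - med) / mad > k]


def structuring_windows(txns, limit=10_000, margin=0.10,
                        window_days=7, min_count=3):
    """Type B: start days of windows holding >= min_count near-limit deposits.

    `txns` is a list of (day, amount) tuples sorted by day. A deposit is
    "near-limit" if it falls within `margin` below `limit`.
    """
    near = [d for d, a in txns if limit * (1 - margin) <= a < limit]
    flagged = []
    for i, start in enumerate(near):
        count = sum(1 for d in near[i:] if d - start < window_days)
        if count >= min_count:
            flagged.append(start)
    return flagged


# One unusually large but isolated transaction: a point anomaly.
everyday = [120, 80, 95, 110, 25_000, 105, 90]
print(point_anomalies(everyday))

# Four similar deposits just under the limit: no single amount stands out,
# but the pattern over time does.
deposits = [(1, 9_500), (2, 9_800), (4, 9_200), (6, 9_600)]
print(point_anomalies([a for _, a in deposits]))
print(structuring_windows(deposits))
```

The second dataset shows the failure mode described above: a detector that only looks at individual amounts reports nothing, while the windowed check flags the sequence.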

Model Consensus

Agreement

Scenario	Typology	Agreement	Consensus
CID001	Romance Fraud	6/6	Strong
CID002	Drug Trafficking	6/6	Strong
CID003	Human Trafficking	3/6	Weak
CID004	Underground Banking	5/6	Moderate
CID005	Cannabis	6/6	Strong

CID003 (Human Trafficking) shows weak consensus — only contextual models (CWAE, CAVE, Deep IF) detect it. This is expected: HT patterns involve subtle behavioral signals (late-night hotels, multi-city movement) that tabular models miss. CID004 is moderate: IF struggles with underground banking patterns that span cheque + wire channels.
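The agreement counts and labels above can be reproduced with a short sketch. Two things here are assumptions rather than facts from the dashboard: the label cutoffs (all models agree = Strong, more than half = Moderate, otherwise Weak), and the reading that Isolation Forest is the single model that does not detect CID004, which is inferred from the note that IF struggles on that scenario.

```python
MODELS = ["CWAE", "Isolation Forest", "Extended IF", "Deep IF", "AE", "CAVE"]

# Which models detect each scenario. CID003 is stated in the text;
# the CID004 entry is inferred (see the assumptions above).
DETECTED_BY = {
    "CID001": set(MODELS),
    "CID002": set(MODELS),
    "CID003": {"CWAE", "CAVE", "Deep IF"},
    "CID004": set(MODELS) - {"Isolation Forest"},
    "CID005": set(MODELS),
}


def consensus_label(n_detect, n_models):
    """Map an agreement count to a label using assumed cutoffs."""
    if n_detect == n_models:
        return "Strong"
    if 2 * n_detect > n_models:
        return "Moderate"
    return "Weak"


def consensus_table(detected_by, models):
    """Return {scenario: ("k/n", label)} for every scenario."""
    n = len(models)
    return {
        cid: (f"{len(found)}/{n}", consensus_label(len(found), n))
        for cid, found in detected_by.items()
    }


for cid, (agreement, label) in consensus_table(DETECTED_BY, MODELS).items():
    print(cid, agreement, label)
```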

Model Summary Cards

Quick Reference

CWAE

Contextual Wasserstein Autoencoder · Deep Generative

Strengths

Contextual anomaly detection via distributional matching
Strong on layering and network patterns
Balanced channel coverage (no over-capture)

Best Scenarios

CID001 Romance Fraud (0.92)
CID002 Drug Trafficking (0.88)
CID004 Underground Banking (0.83)

Isolation Forest

Tree Ensemble · Tabular Isolation

Strengths

Fast, interpretable, low compute cost
Excellent at point anomalies (structured cash)
Best on CID002 Drug Trafficking (0.91)

Weaknesses

38% signal from Cash — over-capture risk
Misses network-based patterns entirely
Worst on CID003, CID004 (contextual schemes)

Extended IF

Extended Tree Ensemble · Improved Isolation

Strengths

Handles feature interactions better than IF
Consistent performer across scenarios
Lower channel bias than standard IF

Best Scenarios

CID002 Drug Trafficking (0.89)
CID001 Romance Fraud (0.87)
CID005 Cannabis (0.85)

Deep IF

Neural + Tree Hybrid · Latent Isolation

Strengths

Combines neural representation with isolation
Balanced Type A / Type B detection (50/50)
Moderate graph awareness (35%)

Best Scenarios

CID001 Romance Fraud (0.89)
CID002 Drug Trafficking (0.86)
CID005 Cannabis (0.82)

AE

Autoencoder · Reconstruction Error

Strengths

Simple architecture, easy to train
Decent generalist — no catastrophic failures
Good baseline model for comparison

Weaknesses

Lowest avg. score (0.766); never ranks first on any scenario
Struggles most on CID003 (0.59)
EMT over-capture at 30%

CAVE

Contextual Anomaly via Variational Encoding · Deep Generative

Strengths

Highest avg. score (0.834) — best overall
Best on CID001 (0.94) and CID003 (0.73)
Most balanced channel coverage of all models

Weaknesses

High compute cost for training
Less interpretable than tree models
Ranks last of the six models on CID005 Cannabis (0.78)
