AURA

Iteration 2
Build Architect
Schema Analyzer Agent

Provide your dataset's schema as DDL for feature development. The agent analyzes each column for AML relevance.

[Schema Analyzer] Received DDL input (1 table)
[Schema Analyzer] Sending to Gemini Flash for analysis...
[Schema Analyzer] Parsed 12 columns from `transactions`
[Schema Analyzer] 10 usable, 2 warnings. Schema ready.
1 table parsed (SCH-20260320)
transactions: 12 cols, 5,242,816 rows

Column           Type           Inference                                  Status
account_id       VARCHAR(20)    Primary account identifier                 Usable
txn_date         DATE           Transaction timestamp                      Usable
txn_amt          DECIMAL(15,2)  Transaction amount (local currency)        Usable
txn_type         VARCHAR(10)    CR/DR indicator                            Usable
channel          VARCHAR(20)    Transaction channel (ATM, branch, online)  Usable
counterparty_id  VARCHAR(20)    Beneficiary/sender ID                      Usable
branch_id        VARCHAR(10)    Originating branch                         Usable
country_code     CHAR(2)        Destination country (ISO 3166)             Usable
currency         CHAR(3)        Transaction currency                       Usable
is_cash          BOOLEAN        Cash-based transaction flag                Usable
memo             TEXT           Free-text memo field                       Skip (92% null)
internal_flag    INT            Internal processing flag                   Skip (system field)
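
The two Skip rows (memo at 92% null, internal_flag as a system field) suggest a simple screening rule. The following is a minimal sketch of such a rule, not the Schema Analyzer's actual logic: the 50% cutoff, the `SYSTEM_FIELDS` set, and the `column_status` helper are all assumptions made for illustration.

```python
# Hypothetical screening rule; the cutoff and the system-field list are
# assumptions for illustration, not the Schema Analyzer's actual criteria.
NULL_CUTOFF = 0.50
SYSTEM_FIELDS = {"internal_flag"}

def column_status(name, null_fraction):
    """Return 'Usable' or a 'Skip' reason for one column."""
    if name in SYSTEM_FIELDS:
        return "Skip (system field)"
    if null_fraction > NULL_CUTOFF:
        return f"Skip ({null_fraction:.0%} null)"
    return "Usable"

# Null fractions are illustrative, except memo's 92%, which comes from the table.
print(column_status("txn_amt", 0.0))        # Usable
print(column_status("memo", 0.92))          # Skip (92% null)
print(column_status("internal_flag", 0.0))  # Skip (system field)
```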
Signal Normalizer & Red Flag Mapper Agents

The Signal Normalizer extracts structured red flags from AML indicators, then the Red Flag Mapper maps each one to your schema columns.

[Signal Normalizer] Received raw signal input (paragraph format)
[Signal Normalizer] Sending to Gemini Flash for extraction...
[Signal Normalizer] Extracted 5 red flags, confidence: 0.92
[Red Flag Mapper] Mapping 5 red flags against 10 usable columns...
[Red Flag Mapper] Complete: 4 mapped, 1 partial, 0 gaps
Cash Structuring via Multiple Branches (SIG-A1B2C3)
Input format: paragraph
Red Flags (5)
  • Multiple cash deposits just below $10,000 reporting threshold (95%)
  • Transactions structured across multiple branches within short time periods (92%)
  • Rapid movement of funds to high-risk jurisdictions after deposit (88%)
  • Unusually high volume of cash transactions relative to account profile (90%)
  • Use of multiple accounts with no apparent business relationship (85%)
Signal Impact Landscape
  • Geographic: Canada, Caribbean
  • Channels: Branch, ATM
  • Entities: Individual, Shell company
  • Sector: Banking, MSB
Red Flag Mappings — 5 red flags
Schema has strong coverage for cash structuring patterns. All key columns (txn_amt, is_cash, branch_id, channel) are available.

Multiple cash deposits just below $10,000 reporting threshold [MAPPED]
  Columns: txn_amt, is_cash, txn_type
  Filter txn_type='CR' AND is_cash=TRUE, count deposits in [$8,000-$9,999] range per account within rolling 30-day window

Transactions across multiple branches within short time periods [MAPPED]
  Columns: branch_id, txn_date, account_id
  Count distinct branch_id per account within 7-day rolling window; flag if >3 unique branches

Rapid fund movement to high-risk jurisdictions after deposit [MAPPED]
  Columns: country_code, txn_date, txn_amt
  Detect outbound transfers to FATF high-risk countries within 48 hours of large cash deposit

Unusually high cash volume relative to account profile [MAPPED]
  Columns: txn_amt, is_cash, account_id
  Compute 90-day rolling cash volume per account, flag if >3 standard deviations above peer mean

Use of multiple accounts with no apparent business relationship [PARTIAL]
  Columns: counterparty_id, account_id
  Identify clusters of accounts sharing counterparties but lacking common business identifiers
  Partial: no KYC/business-type column in schema to verify business relationship
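
The MAPPED / PARTIAL / gap outcome can be read as a check of each red flag's required columns against the usable columns. Here is a minimal sketch under that assumption; the `mapping_status` helper and the `business_type` column name are hypothetical, and the Red Flag Mapper may decide differently.

```python
# The 10 usable columns reported by the Schema Analyzer.
USABLE = {
    "account_id", "txn_date", "txn_amt", "txn_type", "channel",
    "counterparty_id", "branch_id", "country_code", "currency", "is_cash",
}

def mapping_status(required, usable=USABLE):
    """Classify a red flag by how many of its required columns exist."""
    required = set(required)
    found = required & usable
    if found == required:
        return "MAPPED"
    return "PARTIAL" if found else "GAP"

print(mapping_status(["txn_amt", "is_cash", "txn_type"]))  # MAPPED
# 'business_type' is a hypothetical KYC column that the schema lacks.
print(mapping_status(["counterparty_id", "account_id", "business_type"]))  # PARTIAL
print(mapping_status(["business_type"]))  # GAP
```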
Feature Generator Agent

The agent generates Python feature code for each mapped red flag using a 3-stage pipeline: Perceive (analyze the indicator), Reason (write the code), Act (validate and self-correct).

[Feature Generator] Processing 4 mapped red flags...
[Perceive] Analyzing: cash deposits below threshold
[Reason] Generating Python code...
[Act] Validation passed. Feature: rf_cash_structuring_below_threshold
[Perceive] Analyzing: multiple branch usage
[Reason] Generating Python code...
[Act] Validation passed. Feature: rf_multi_branch_rapid
[Feature Generator] 4/4 features generated. All passed validation.
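
The Perceive, Reason, Act stages in the log can be pictured as a generate-validate-retry loop. The sketch below is an illustration under stated assumptions, not the agent's implementation: `generate_feature` and `stub_writer` are hypothetical names, the model call is replaced by a stub, and validation is reduced to a syntax check with Python's `ast` module.

```python
import ast

def generate_feature(indicator, write_code, max_attempts=3):
    """Run a Perceive -> Reason -> Act loop for one red flag.

    write_code(summary, feedback) is a hypothetical stand-in for the
    model call; it returns Python source as a string.
    """
    summary = indicator.strip().lower()          # Perceive: normalize the indicator
    feedback = None
    for attempt in range(1, max_attempts + 1):
        source = write_code(summary, feedback)   # Reason: draft the code
        try:                                     # Act: validate, or collect feedback
            tree = ast.parse(source)
            if not any(isinstance(n, ast.FunctionDef) for n in tree.body):
                raise ValueError("no function definition found")
            return source, attempt
        except (SyntaxError, ValueError) as exc:
            feedback = str(exc)
    raise RuntimeError(f"validation failed after {max_attempts} attempts: {feedback}")

# Stub generator: the first draft has a syntax error, the retry is valid.
def stub_writer(summary, feedback):
    if feedback is None:
        return "def rf_example(df:\n    return 0"
    return "def rf_example(df):\n    return 0"

source, attempts = generate_feature("Multiple branch usage", stub_writer)
print(attempts)  # 2
```

Feeding the validation error back into the next draft is what makes the Act stage self-correcting rather than a plain retry.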

Generated Features (4)

rf_cash_structuring_below_threshold
PASS | new

Cash Structuring Below Reporting Threshold

import pandas as pd

def rf_cash_structuring_below_threshold(df, account_id, window_days=30):
    """Max count of cash deposits in the $8,000-$9,999 range within any rolling window."""
    # Select the account first so the mask and the frame share the same index.
    acct = df[df['account_id'] == account_id]
    mask = (
        (acct['txn_type'] == 'CR') &
        (acct['is_cash'] == True) &
        (acct['txn_amt'] >= 8000) &
        (acct['txn_amt'] < 10000)
    )
    acct_filtered = acct[mask].sort_values('txn_date')
    # Time-based rolling requires txn_date to be a datetime64 column.
    count = acct_filtered.rolling(
        f'{window_days}D', on='txn_date'
    )['txn_amt'].count().max()
    return count if pd.notna(count) else 0
n_accounts: 518,370 | nonzero %: 4.2% | mean: 0.3841 | std: 1.7203
Time window: 30 days
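
Each feature card reports n_accounts, nonzero %, mean, and std. A minimal sketch of how such a summary could be computed from one feature value per account follows. The `summarize_feature` helper is hypothetical, the input values are toy data, and the use of the sample standard deviation is an assumption because the tool's convention is not stated.

```python
import statistics

def summarize_feature(values):
    """Summary statistics over one feature value per account."""
    n = len(values)
    nonzero = sum(1 for v in values if v != 0)
    return {
        "n_accounts": n,
        "nonzero_pct": 100.0 * nonzero / n,
        "mean": statistics.fmean(values),
        # Sample standard deviation (n - 1); the convention used by the
        # tool is not stated, so this choice is an assumption.
        "std": statistics.stdev(values),
    }

# Toy values, not taken from the build above.
stats = summarize_feature([0, 0, 0, 2, 0, 0, 0, 0, 3, 0])
print(stats["n_accounts"], stats["nonzero_pct"], stats["mean"])  # 10 20.0 0.5
```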
rf_multi_branch_rapid
PASS | new

Multi-Branch Transaction Velocity

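import pandas as pd

def rf_multi_branch_rapid(df, account_id, window_days=7):
    """Max number of distinct branches used within any rolling window."""
    acct = df[df['account_id'] == account_id].sort_values('txn_date')
    if len(acct) == 0:
        return 0
    # rolling().apply() only accepts numeric data, so encode the string
    # branch_id values as numeric codes before counting distinct values.
    codes = pd.Series(pd.factorize(acct['branch_id'])[0],
                      index=pd.DatetimeIndex(acct['txn_date']), dtype='float64')
    branches_per_window = codes.rolling(f'{window_days}D').apply(
        lambda x: x.nunique(), raw=False
    )
    return int(branches_per_window.max())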
n_accounts: 518,370 | nonzero %: 12.7% | mean: 1.2054 | std: 0.8931
Time window: 7 days
rf_rapid_outflow_high_risk
PASS | new

Rapid Outflow to High-Risk Jurisdictions

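import pandas as pd

def rf_rapid_outflow_high_risk(df, account_id, hours=48):
    """Count outbound transfers to high-risk countries within `hours` of a cash deposit."""
    # Hard-coded snapshot; FATF lists change, so refresh this set from the current publication.
    HIGH_RISK = {'IR', 'KP', 'MM', 'PK', 'SY', 'YE'}
    acct = df[df['account_id'] == account_id].sort_values('txn_date')
    deposits = acct[(acct['is_cash']) & (acct['txn_type'] == 'CR')]
    # Restrict to debits so that only outbound transfers are counted.
    outbound = acct[(acct['txn_type'] == 'DR') & acct['country_code'].isin(HIGH_RISK)]
    count = 0
    # An outbound transfer is counted once per deposit it follows within the window.
    for _, dep in deposits.iterrows():
        window = outbound[
            (outbound['txn_date'] > dep['txn_date']) &
            (outbound['txn_date'] <= dep['txn_date'] + pd.Timedelta(hours=hours))
        ]
        count += len(window)
    return count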
n_accounts: 518,370 | nonzero %: 0.8% | mean: 0.0142 | std: 0.2318
Time window: 48 hours
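
The generated function loops over one account's deposits, so scoring all 518,370 accounts means one call per account. As a hedged alternative, the sketch below scores every account in one pass with `pandas.merge_asof`. It is not the agent's output: `rapid_outflow_all_accounts` is a hypothetical name, txn_type 'DR' is assumed to mark outbound transfers, and each outbound transfer is counted at most once (when any cash deposit precedes it within the window), which differs from the per-deposit counting above.

```python
import pandas as pd

# Country codes copied from the generated feature above; treat them as a snapshot.
HIGH_RISK = {"IR", "KP", "MM", "PK", "SY", "YE"}

def rapid_outflow_all_accounts(df, hours=48):
    """Per account, count high-risk outbound transfers preceded by a cash deposit within `hours`."""
    deposits = (
        df.loc[(df["txn_type"] == "CR") & df["is_cash"], ["account_id", "txn_date"]]
        .rename(columns={"txn_date": "dep_date"})
        .sort_values("dep_date")
    )
    # Assumption: txn_type == 'DR' marks an outbound transfer.
    outbound = df[(df["txn_type"] == "DR") & df["country_code"].isin(HIGH_RISK)]
    outbound = outbound.sort_values("txn_date")
    # For each outbound row, find the latest earlier deposit on the same account
    # that is no more than `hours` old; unmatched rows get NaT in dep_date.
    matched = pd.merge_asof(
        outbound, deposits,
        left_on="txn_date", right_on="dep_date", by="account_id",
        tolerance=pd.Timedelta(hours=hours),
        direction="backward", allow_exact_matches=False,
    )
    return matched.dropna(subset=["dep_date"]).groupby("account_id").size()

# Toy data: account A deposits cash, then sends funds to IR 23 hours later
# and again 9 days later; account B sends funds to IR with no prior deposit.
toy = pd.DataFrame({
    "account_id": ["A", "A", "A", "B"],
    "txn_date": pd.to_datetime(
        ["2024-01-01 10:00", "2024-01-02 09:00", "2024-01-10 12:00", "2024-01-02 09:00"]
    ),
    "txn_type": ["CR", "DR", "DR", "DR"],
    "is_cash": [True, False, False, False],
    "country_code": ["CA", "IR", "IR", "IR"],
    "txn_amt": [9500.0, 9000.0, 500.0, 700.0],
})
result = rapid_outflow_all_accounts(toy)
print(int(result["A"]))  # 1
```

Accounts with no qualifying transfer, such as B here, are absent from the result and would be filled with 0 when joined back to the account list.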
rf_cash_volume_anomaly
PASS | new

Anomalous Cash Volume vs Peer Group

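import pandas as pd

def rf_cash_volume_anomaly(df, account_id, window_days=90):
    """Z-score of an account's cash volume vs. the mean across all accounts with cash activity."""
    # Keep only cash transactions in the trailing window ending at the latest txn_date.
    cutoff = df['txn_date'].max() - pd.Timedelta(days=window_days)
    cash = df[(df['is_cash'] == True) & (df['txn_date'] > cutoff)]
    peer_vol = cash.groupby('account_id')['txn_amt'].sum()
    acct_vol = peer_vol.get(account_id, 0)
    std = peer_vol.std()
    # std is NaN with fewer than two accounts and 0 when all volumes match.
    if pd.isna(std) or std == 0:
        return 0
    z = (acct_vol - peer_vol.mean()) / std
    return max(z, 0)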
n_accounts: 518,370 | nonzero %: 31.4% | mean: 0.4927 | std: 1.1042
Time window: 90 days
Synthetic Tester Agent

Test your generated features against synthetic customer profiles with known red flags. Coverage measures how many of a profile's ground-truth red flags your features detect.

Select Customer Profile

Profile  Difficulty  Result       Typology             Channels        Red flags
CID001   Easy        PASS         Romance Fraud        Branch, Online  4
CID002   Moderate    CONDITIONAL  Drug Trafficking     Cash, ATM       6
CID003   Hard        FAIL         Human Trafficking    Wire, Cash      5
CID004   Moderate    PASS         Underground Banking  Cash, Wire      7
CID005   Easy        PASS         Cannabis Proceeds    Cash, Branch    3

[Synthetic Tester] Loading profile CID001 (Romance Fraud, Easy)
[Synthetic Tester] Running rf_cash_structuring_below_threshold... triggered (value: 3)
[Synthetic Tester] Running rf_multi_branch_rapid... triggered (value: 4)
[Synthetic Tester] Running rf_rapid_outflow_high_risk... not triggered
[Synthetic Tester] Running rf_cash_volume_anomaly... triggered (value: 2.41)
[Synthetic Tester] Coverage: 75.0% (3/4 red flags covered)
Coverage Results — CID001
75.0% coverage
Features Tested: 4
Red Flags Covered: 3 / 4
Typology: Romance Fraud

Feature                              Triggered  Value  Match
rf_cash_structuring_below_threshold  YES        3.00   MATCH
rf_multi_branch_rapid                YES        4.00   MATCH
rf_rapid_outflow_high_risk           NO
rf_cash_volume_anomaly               YES        2.41   MATCH

Uncovered Red Flags (1)
Rapid fund movement to high-risk jurisdictions — CID001 profile uses domestic transfers only, no international component
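
The coverage figure is the number of ground-truth red flags detected divided by the number in the profile (3 / 4 = 75.0% for CID001). A minimal sketch of that calculation follows. The `coverage` helper, the short red-flag labels, and the feature-to-flag mapping are hypothetical; the trigger results are taken from the log above.

```python
def coverage(ground_truth, feature_results, feature_to_flag):
    """Share of ground-truth red flags detected by at least one triggered feature."""
    detected = {
        feature_to_flag[name]
        for name, triggered in feature_results.items()
        if triggered and feature_to_flag.get(name) in ground_truth
    }
    uncovered = sorted(set(ground_truth) - detected)
    return len(detected) / len(ground_truth), uncovered

# Hypothetical short labels for the four red flags in the CID001 profile.
ground_truth = ["structuring", "multi_branch", "rapid_outflow", "cash_volume"]
feature_to_flag = {
    "rf_cash_structuring_below_threshold": "structuring",
    "rf_multi_branch_rapid": "multi_branch",
    "rf_rapid_outflow_high_risk": "rapid_outflow",
    "rf_cash_volume_anomaly": "cash_volume",
}
# Trigger results as reported in the Synthetic Tester log for CID001.
feature_results = {
    "rf_cash_structuring_below_threshold": True,
    "rf_multi_branch_rapid": True,
    "rf_rapid_outflow_high_risk": False,
    "rf_cash_volume_anomaly": True,
}
share, uncovered = coverage(ground_truth, feature_results, feature_to_flag)
print(f"{share:.1%}", uncovered)  # 75.0% ['rapid_outflow']
```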
AURA Chat Agent
AURA: Build BUILD-M1X8K initialized. Schema loaded (12 cols), 5 red flags extracted, 4 features generated. How can I help?
User: Why didn't rf_rapid_outflow_high_risk trigger on CID001?
AURA: CID001 (Romance Fraud) primarily involves domestic transfers. The profile has no outbound transactions to FATF high-risk countries, so the feature correctly returned 0. This is expected behavior: the feature targets a different typology pattern.
User: Should I try CID004 for better coverage?
AURA: Yes. CID004 (Underground Banking) includes wire transfers to high-risk jurisdictions. I'd expect all 4 features to trigger. Want me to re-run the test on CID004?