Results


Out of 713 wallets analyzed:

  • 87% were classified as low-risk (score 0–4)
  • 58 wallets (≈8%) were medium-risk (score 5–9)
  • Only 8 wallets (≈1.1%) were high-risk (score ≥10)
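
The tier boundaries above can be sketched as a small bucketing function. This is a minimal illustration, not the study's implementation: the names risk_tier and tier_shares and the sample scores are hypothetical, and scores are assumed to be non-negative integers.

```python
from collections import Counter

TIERS = ("low", "medium", "high")

def risk_tier(score: int) -> str:
    """Map a score to the tiers reported above: 0-4, 5-9, and 10 or more."""
    if score >= 10:
        return "high"
    if score >= 5:
        return "medium"
    return "low"

def tier_shares(scores):
    """Return {tier: (count, percent of all wallets)} for a list of scores."""
    if not scores:
        raise ValueError("scores must not be empty")
    counts = Counter(risk_tier(s) for s in scores)
    total = len(scores)
    return {t: (counts[t], 100 * counts[t] / total) for t in TIERS}

# Hypothetical scores for illustration only (not the study's data).
example_scores = [0, 2, 4, 5, 9, 10, 28]
for tier, (count, pct) in tier_shares(example_scores).items():
    print(f"{tier}: {count} wallets ({pct:.1f}%)")
```

Computing counts and percentages from the same list keeps the two figures consistent with each other.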

One wallet scored 28 due to a high transaction count and repeated trading pairs.

This distribution suggests that the model is conservative: it flags an account only when several red flags occur together.

[Figure: Risk Score Distribution]

Graph-Based Risk Structures

Using NetworkX, transaction graphs revealed several laundering-like structures:

  • Circular flows: NFT transfers that return to the sender
  • Strongly Connected Components (SCCs): wallets forming closed mutual loops
  • Star-shaped dispersal hubs: single wallet sending to multiple others
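
The three structures listed above map onto standard NetworkX calls. The sketch below is an assumption about how such detection could look, not the analysis code itself: the function name find_structures, the hub_min_out threshold, and the sample transfers are all hypothetical.

```python
import networkx as nx

def find_structures(transfers, hub_min_out=3):
    """Detect cycles, SCCs and dispersal hubs in (sender, receiver) pairs.

    hub_min_out is an assumed threshold; the source does not state one.
    """
    graph = nx.DiGraph()
    graph.add_edges_from(transfers)

    # Circular flows: directed cycles, i.e. transfers that return to the sender.
    cycles = list(nx.simple_cycles(graph))

    # SCCs with more than one wallet: each member can reach every other member.
    sccs = [c for c in nx.strongly_connected_components(graph) if len(c) > 1]

    # Star-shaped hubs: wallets sending to at least hub_min_out distinct receivers.
    # A DiGraph merges repeated edges, so out-degree counts distinct receivers.
    hubs = [node for node, deg in graph.out_degree() if deg >= hub_min_out]
    return cycles, sccs, hubs

# Hypothetical transfers for illustration only.
transfers = [
    ("A", "B"), ("B", "C"), ("C", "A"),   # a three-wallet loop
    ("H", "X"), ("H", "Y"), ("H", "Z"),   # one wallet fanning out to three
]
cycles, sccs, hubs = find_structures(transfers)
print(cycles, sccs, hubs)
```

Enumerating every simple cycle can be slow on dense graphs, so on larger datasets the SCC pass alone may be a more practical first filter.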

Wallets involved in these structures had risk scores 4.6× higher than average.

This suggests that transaction structure and behavior scoring reinforce each other.
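
One way to read the 4.6× figure is as a ratio of mean scores. The source does not say which baseline was used, so the sketch below assumes the baseline is the mean over all wallets; the name score_ratio and the sample data are hypothetical.

```python
from statistics import mean

def score_ratio(scores, flagged):
    """Mean score of flagged wallets divided by the mean score of all wallets.

    scores: dict mapping wallet -> risk score
    flagged: wallets found in at least one graph structure
    """
    flagged_scores = [scores[w] for w in flagged if w in scores]
    if not flagged_scores:
        raise ValueError("no flagged wallet has a score")
    overall = mean(scores.values())
    if overall == 0:
        raise ValueError("overall mean score is zero; ratio is undefined")
    return mean(flagged_scores) / overall

# Hypothetical data for illustration only.
scores = {"A": 12, "B": 2, "C": 1, "D": 1}
print(score_ratio(scores, flagged={"A"}))
```

Using the unflagged wallets as the baseline instead would give a larger ratio, so the choice of baseline should be stated alongside the figure.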

Statistical Validation

Three checks were applied to validate the risk scoring system:

  1. K-means Cluster Analysis
    • Wallets were grouped based on behavioral features
    • High-risk wallets fell into a distinct cluster with average score 9.07
    • 100% of wallets with scores >10 were in this cluster
  2. Temporal Consistency Testing
    • Data split into 3 time periods
    • Average consistency across periods: only 5.6
    • Suggests burst-like laundering activity, not long-term trading
  3. Simulated Expert Rating Correlation
    • Manual scoring of top wallets compared to model outputs
    • Achieved Pearson correlation of 0.96, showing strong agreement
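
The correlation in the third check can be reproduced from two aligned lists of scores. The sketch below implements the standard Pearson formula directly; the function name pearson and both score lists are hypothetical, not the ratings used in the study.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical scores for illustration only.
model_scores = [28, 14, 11, 10, 7]
expert_scores = [9, 6, 5, 5, 3]
print(round(pearson(model_scores, expert_scores), 2))
```

Pearson measures linear agreement only, so a high value shows that the two rankings move together, not that the scores share a scale.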