Results – FYP24027 WordPress Site

Out of 713 wallets analyzed:

87% were classified as low-risk (score 0–4)
58 wallets (≈8%) were medium-risk (score 5–9)
Only 8 wallets (≈1.1%) were high-risk (score ≥10)
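
The tier boundaries above can be expressed as a small classification helper. The following is a minimal sketch, not the project's actual code: the function names `risk_tier` and `tier_shares` and the example scores are hypothetical, and only the thresholds (0–4, 5–9, ≥10) come from the results listed here.

```python
from collections import Counter


def risk_tier(score: int) -> str:
    """Map a wallet's risk score to the tier boundaries listed above."""
    if score >= 10:
        return "high"
    if score >= 5:
        return "medium"
    return "low"


def tier_shares(scores):
    """Return {tier: (wallet count, percent of all wallets)}."""
    counts = Counter(risk_tier(s) for s in scores)
    total = len(scores)
    return {
        tier: (counts[tier], round(100 * counts[tier] / total, 1))
        for tier in ("low", "medium", "high")
    }


# Hypothetical scores for illustration only (not the project's data).
example_scores = [0, 2, 4, 5, 9, 10, 28, 1, 3, 0]
print(tier_shares(example_scores))
```

Computing counts and percentages from the same list in one place keeps the reported shares consistent with the wallet total.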

One wallet scored 28 due to a high transaction count and repeated trading pairs.

This distribution suggests that the model is conservative: it flags a wallet only when multiple red flags combine.

[Figure: Risk Score Distribution]

Graph-Based Risk Structures

Transaction graphs built with NetworkX revealed several laundering-like structures:

Circular flows: NFT transfers that return to the sender
Strongly Connected Components (SCCs): wallets forming closed mutual loops
Star-shaped dispersal hubs: single wallet sending to multiple others

Wallets involved in these structures had risk scores 4.6× higher than average.

This suggests that transaction structure and behavioral scoring reinforce each other.
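
As an illustration of how the three structures listed above might be detected, here is a minimal NetworkX sketch. It is not the project's pipeline: the wallet names and transfers are made up, the hub threshold `HUB_MIN_OUT` is an assumed value, and the "no incoming transfers" condition is only one possible definition of a dispersal hub.

```python
import networkx as nx

# Hypothetical transfers as (sender, receiver); wallet names are made up.
transfers = [
    ("A", "B"), ("B", "C"), ("C", "A"),                   # circular flow
    ("H", "W1"), ("H", "W2"), ("H", "W3"), ("H", "W4"),   # dispersal hub
]

G = nx.DiGraph()
G.add_edges_from(transfers)

# Circular flows: directed cycles, i.e. transfers that return to the sender.
cycles = [sorted(cycle) for cycle in nx.simple_cycles(G)]

# Strongly connected components with more than one wallet:
# every wallet in the component can reach every other one.
sccs = [set(c) for c in nx.strongly_connected_components(G) if len(c) > 1]

# Star-shaped dispersal hubs: many outgoing transfers, none incoming.
HUB_MIN_OUT = 4  # assumed threshold, not taken from the project
hubs = [
    n for n in G.nodes
    if G.out_degree(n) >= HUB_MIN_OUT and G.in_degree(n) == 0
]

print("cycles:", cycles)
print("SCCs:", sccs)
print("hubs:", hubs)
```

A `DiGraph` collapses repeated transfers between the same pair of wallets into one edge. If repeated transfers matter, `nx.MultiDiGraph` or an edge weight would be a reasonable alternative.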

Statistical Validation

Three validation analyses were applied to the risk scoring system:

K-means Cluster Analysis
- Wallets were grouped based on behavioral features
- High-risk wallets fell into a distinct cluster with average score 9.07
- 100% of wallets with scores >10 were in this cluster
Temporal Consistency Testing
- Data split into 3 time periods
- Average consistency across periods: only 5.6
- Suggests burst-like laundering activity, not long-term trading
Simulated Expert Rating Correlation
- Manual scoring of top wallets compared to model outputs
- Achieved Pearson correlation of 0.96, showing strong agreement
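
The correlation check in the last item can be reproduced with a few lines of standard-library Python. This is a sketch under stated assumptions: the `pearson` helper is written for illustration, and the two score lists are hypothetical placeholders rather than the ratings used in the project.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for illustration only (not the project's ratings).
model_scores = [28, 14, 12, 11, 10]
expert_ratings = [9, 6, 5, 5, 4]

r = pearson(model_scores, expert_ratings)
print(f"Pearson r = {r:.2f}")
```

A value near 1 means the two rankings move together linearly. It does not show that the absolute scores match, so a high correlation is best read alongside the raw ratings.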

These validation methods indicate that the scoring model is not arbitrary: it aligns with natural behavioral groupings, exposes time-bound laundering bursts, and approximates human expert judgment.
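
The clustering check described in this section could look roughly like the sketch below. It uses a plain Lloyd's-algorithm k-means written in standard-library Python, which is a stand-in and not necessarily the implementation used in the project. The two behavioral features (transaction count, share of repeated trading pairs), the scores, and the choice of k = 2 are all assumptions for illustration.

```python
import math


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical wallets: (transaction count, share of repeated trading pairs).
# In practice the features would be standardized before clustering.
features = [(2, 0.0), (3, 0.1), (1, 0.0), (4, 0.1), (40, 0.8), (55, 0.9), (60, 0.7)]
scores = [1, 2, 0, 3, 12, 28, 11]  # hypothetical risk scores

labels, _ = kmeans(features, k=2)


def mean_score(cluster):
    """Average risk score of the wallets assigned to one cluster."""
    vals = [s for s, lab in zip(scores, labels) if lab == cluster]
    return sum(vals) / len(vals)


# The cluster with the highest average score, and whether every wallet
# scoring above 10 was assigned to it.
high_cluster = max(set(labels), key=mean_score)
all_high_together = all(
    lab == high_cluster for s, lab in zip(scores, labels) if s > 10
)
print(high_cluster, mean_score(high_cluster), all_high_together)
```

Because the risk scores are not passed to `kmeans`, agreement between the clusters and the scores is a check on the scoring model and not a by-product of the clustering itself.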