Real-time fraud detection with quantum-inspired feature selection
Catching bad credit-card transactions as they happen
- ROLE: Team of 3
- TIMEFRAME: 2026
- STACK: Python, Spark MLlib, Kafka, QIEA
- LINKS: github ↗
−90%
INPUTS NEEDED
The problem
Fraud models fight two constraints at once: severe class imbalance, and an inference-latency budget — a score that arrives after the transaction clears is worthless. Every feature you keep costs you at serving time.
Approach
A Spark MLlib pipeline over the ULB credit-card dataset, with feature selection handled by a quantum-inspired evolutionary algorithm (QIEA): candidate feature subsets are encoded as qubit-style probability amplitudes. Each generation, the amplitudes collapse into concrete subsets, the subsets are evaluated, and the amplitudes are updated toward the best subsets observed so far. Logistic regression, random forest, and gradient-boosted trees are trained on the selected subset, and a Kafka-simulated stream demonstrates real-time inference end to end.
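The collapse, evaluate, update loop described above can be sketched in a few lines of plain Python. This is a minimal illustration and not the project's code. The names `observe`, `qiea_select`, and `toy_fitness`, the rotation step `delta`, the clamp `EPS`, and the population size are all assumptions. The toy fitness stands in for whatever the real pipeline scores, which would presumably be a validation metric such as AUC-ROC from a Spark model with some penalty on subset size.

```python
import math
import random

EPS = 0.05  # assumed clamp: keeps angles off 0 and pi/2 so no feature is frozen


def observe(thetas, rng):
    """Collapse one qubit individual into a 0/1 feature mask.

    Feature j is kept with probability sin(theta_j)^2.
    """
    return [1 if rng.random() < math.sin(t) ** 2 else 0 for t in thetas]


def qiea_select(n_features, fitness, pop_size=10, generations=50,
                delta=0.05 * math.pi, seed=0):
    """Return (best_mask, best_score) found by a basic QIEA loop."""
    rng = random.Random(seed)
    # theta = pi/4 gives every feature a 50% chance of being kept at the start.
    population = [[math.pi / 4] * n_features for _ in range(pop_size)]
    best_mask, best_score = None, float("-inf")

    for _ in range(generations):
        # Collapse: sample one concrete feature mask per individual.
        masks = [observe(thetas, rng) for thetas in population]
        # Evaluate: track the best mask seen so far across all generations.
        for mask in masks:
            score = fitness(mask)
            if score > best_score:
                best_mask, best_score = mask, score
        # Update: rotate each disagreeing qubit a small step toward the best mask.
        for thetas, mask in zip(population, masks):
            for j, (bit, best_bit) in enumerate(zip(mask, best_mask)):
                if bit != best_bit:
                    step = delta if best_bit == 1 else -delta
                    thetas[j] = min(max(thetas[j] + step, EPS), math.pi / 2 - EPS)
    return best_mask, best_score


# Hypothetical toy problem: three "useful" features, every kept feature costs 0.1.
INFORMATIVE = {2, 7, 11}


def toy_fitness(mask):
    hits = sum(mask[j] for j in INFORMATIVE)
    return hits - 0.1 * sum(mask)


best_mask, best_score = qiea_select(30, toy_fitness)
print([j for j, bit in enumerate(best_mask) if bit], round(best_score, 2))
```

The per-feature cost in the fitness is what pushes the search toward small subsets. Without it, nothing in the loop rewards dropping a feature, so the serving-time saving would not appear.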
Results
QIEA kept AUC-ROC competitive with the full feature set while cutting the number of input features by 90%, which is what brings the latency budget within reach. [Replace with the exact AUC/latency table from results_summary.csv.]
What broke
[Convergence behaviour? Spark partitioning pain? What you'd change about the fitness function?]