Paul Irofti | Graphomaly

Graphomaly – software package for anomaly detection in graphs modeling financial transactions

News

2022

[Aug'22] -- Presented the GSI-DL paper at CCTA'22.
[May'22] -- Presented Uniform Support DL Anomaly Detection at ICASSP'22.
[Apr'22] -- Final report submitted, all objectives met!
[Mar'22] -- Graphomaly open-source software package released
[Feb'22] -- Working on public software package; finalizing unsupervised grid-search module with voting support.
[Jan'22] -- Uniform Support DL Anomaly Detection paper preprint available.

2021

[Dec'21] -- Released dictlearn! Presented Efficient and Parallel Separable Dictionary Learning at ICPADS'21.
[Nov'21] -- Visited Vicenç Puig at IRI-UPC and finished the GSI-DL paper.
[Oct'21] -- Submitted paper to ICASSP'22.
[Sep'21] -- ICPADS paper accepted!
[Aug'21] -- Holder paper preprint available.
[Jul'21] -- Submitted paper to ICPADS'21.
[Apr'21] -- Excellent results reported on the BRD Toys dataset
[Mar'21] -- Sent out initial batch of results for BRD Toys dataset and Libra Bank dataset
[Feb'21] -- Analysis and testing on NAD Challenge dataset provided by ZYELL group and National Chiao Tung University at IEEE ICASSP 2021
[Jan'21] -- Analysis and preprocessing on Libra Bank dataset

2020

[Dec'20] -- New partnership with Libra Bank
[Nov'20] -- Framework iterators for PyOD, TODS, Deep-OCSVM, and Deep-SVDD
[Oct'20] -- Initial framework and experiments on the BRD Toys dataset
[Sep'20] -- Established data format and explored community detection techniques

Project

Project ID: PN-III-P2-2.1-PED-2019-3248
Consortium: UPB (coordinator), UB (partner), Tremend Software Consulting SRL (partner).
Team: 4 positions at UB (10 positions total)
Funder: UEFISCDI
Budget: 228,925 lei UB (669,825 lei total)
Duration: 03 August 2020 - 28 April 2022

Team (University of Bucharest)

Paul Irofti -- Principal Investigator
Andrei Pătrașcu -- Senior Researcher
Marius Popescu -- Senior Researcher
Andra Băltoiu -- Research Assistant

Coordinator Bogdan Dumitrescu (University Politehnica of Bucharest),
Industry Partner Ioan Cocan (Tremend).

Papers

[1] C. Rusu and P. Irofti, “Efficient and Parallel Separable Dictionary Learning,” in Proceedings of the IEEE 2021 27th International Conference on Parallel and Distributed Systems (ICPADS). 2021, pp. 1-6, IEEE Computer Society. [ bib | http ]
[2] A. Pătrașcu and P. Irofti, “Computational complexity of Inexact Proximal Point Algorithm for Convex Optimization under Holderian Growth,” pp. 1-42, 2021. [ bib | arXiv ]
[3] P. Irofti, L. Romero-Ben, F. Stoican, and V. Puig, “Data-driven Leak Localization in Water Distribution Networks via Dictionary Learning and Graph-based Interpolation,” 2021, pp. 1-6. [ bib | arXiv ]
[4] P. Irofti, C. Rusu, and A. Pătrașcu, “Dictionary Learning with Uniform Sparse Representations for Anomaly Detection,” 2021, pp. 1-6. [ bib | arXiv ]

Software

Graphomaly Framework (source) (documentation) (pypi).

Python Dictionary Learning Toolbox (source) (documentation) (pypi).

About

The Graphomaly project aims to create a Python software package for anomaly detection in graphs that model financial transactions, with the purpose of discovering fraudulent behavior such as money laundering, illegal networks, tax evasion, and scams. Such a toolbox is needed in banks, where fraud detection departments still rely mostly on human experts.

The main tool that we propose is dictionary learning for sparse representations, which will be used to model sub-graphs derived from the full transactions graph through community detection. Other machine learning tools will be used for comparison, together with a set of data processing tools that are customary for dimensionality reduction.
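The idea of scoring samples by how well a learned dictionary represents them can be illustrated with a minimal NumPy sketch. It is not the Graphomaly or dictlearn API: the function names (`omp`, `learn_dictionary`, `anomaly_scores`) are hypothetical, the training loop is a plain Method of Optimal Directions (MOD) update chosen for brevity, and the columns of `X_normal` are synthetic stand-ins for feature vectors that would, in practice, be extracted from sub-graphs.

```python
import numpy as np

def omp(D, x, n_nonzero):
    """Orthogonal Matching Pursuit: code x with at most n_nonzero atoms of D."""
    coef = np.zeros(D.shape[1])
    support, sol = [], np.zeros(0)
    residual = x.copy()
    for _ in range(n_nonzero):
        if np.linalg.norm(residual) < 1e-10:
            break  # x is already represented exactly
        corr = np.abs(D.T @ residual)
        if support:
            corr[support] = -1.0  # never pick the same atom twice
        support.append(int(np.argmax(corr)))
        # Refit all selected coefficients by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    if support:
        coef[support] = sol
    return coef

def learn_dictionary(X, n_atoms, n_nonzero, n_iter=10, seed=0):
    """Alternate sparse coding (OMP) and a MOD dictionary update."""
    rng = np.random.default_rng(seed)
    # Initialise the atoms with randomly chosen, normalised training signals.
    D = X[:, rng.choice(X.shape[1], n_atoms, replace=False)].copy()
    D /= np.linalg.norm(D, axis=0)
    for _ in range(n_iter):
        G = np.column_stack([omp(D, x, n_nonzero) for x in X.T])
        D = X @ np.linalg.pinv(G)   # least-squares fit of D to X ~ D G
        norms = np.linalg.norm(D, axis=0)
        norms[norms == 0] = 1.0     # leave unused (all-zero) atoms untouched
        D /= norms
    return D

def anomaly_scores(D, X, n_nonzero):
    """Score each column of X by its sparse representation error."""
    return np.array([np.linalg.norm(x - D @ omp(D, x, n_nonzero)) for x in X.T])

def unit_columns(A):
    return A / np.linalg.norm(A, axis=0)

# Synthetic data (assumption): normal signals are 2-sparse combinations of
# hidden atoms, anomalous signals are dense random vectors.
rng = np.random.default_rng(42)
n_features, n_atoms, sparsity = 40, 10, 2
true_atoms = unit_columns(rng.standard_normal((n_features, n_atoms)))
codes = np.zeros((n_atoms, 300))
for j in range(300):
    codes[rng.choice(n_atoms, sparsity, replace=False), j] = rng.standard_normal(sparsity)
X_normal = unit_columns(true_atoms @ codes)
X_anomalous = unit_columns(rng.standard_normal((n_features, 50)))

D = learn_dictionary(X_normal, n_atoms, sparsity)
normal_scores = anomaly_scores(D, X_normal, sparsity)
anomalous_scores = anomaly_scores(D, X_anomalous, sparsity)
print("mean score, normal:   ", normal_scores.mean())
print("mean score, anomalous:", anomalous_scores.mean())
```

Signals that share the structure of the training data are reconstructed well from a few atoms, while signals that do not fit the learned dictionary leave a large residual, which serves here as the anomaly score. For brevity the sketch scores the training signals themselves; a real evaluation would use held-out data.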

There are two main working scenarios. In the first, fraud patterns are known, but their shape can vary in size and can also be affected by other activities. In the second, unsupervised learning is used to detect anomalies, possibly of new types, that may be related to fraud.
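In the unsupervised scenario no labels are available, so a common approach is to flag the highest-scoring fraction of samples and, when several detectors are run, to combine their decisions. The news items above mention voting support; the sketch below shows one simple way such a scheme could look. The function names, the `contamination` parameter, and the scores are illustrative assumptions, not the Graphomaly API.

```python
def flag_top_fraction(scores, contamination):
    """Label the highest-scoring fraction of samples 1 (anomaly), the rest 0."""
    n_flagged = max(1, round(contamination * len(scores)))
    threshold = sorted(scores, reverse=True)[n_flagged - 1]
    return [1 if s >= threshold else 0 for s in scores]

def majority_vote(label_lists):
    """Flag a sample when more than half of the detectors flag it."""
    n_detectors = len(label_lists)
    return [1 if 2 * sum(votes) > n_detectors else 0 for votes in zip(*label_lists)]

# Made-up scores from three hypothetical detectors on the same 10 samples.
# The detectors use different scales, so each is thresholded separately.
scores_a = [0.1, 0.2, 0.1, 0.9, 0.3, 0.2, 0.8, 0.1, 0.2, 0.3]
scores_b = [5, 4, 6, 30, 5, 7, 6, 25, 5, 4]
scores_c = [0.0, 0.1, 0.0, 0.7, 0.1, 0.0, 0.6, 0.2, 0.1, 0.0]

labels = [flag_top_fraction(s, contamination=0.2) for s in (scores_a, scores_b, scores_c)]
combined = majority_vote(labels)
print(combined)  # [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]
```

Sample 3 is flagged by all three detectors and sample 6 by two of them, so both survive the vote; sample 7 is flagged only by the second detector and is dropped.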

The implemented methods will be able to process large graphs. Online and distributed forms of the algorithms will be derived, so that reaction time decreases and fraud can be discovered in its incipient stages.

The consortium consists of two universities and a software firm, and has the support of a bank that will provide relevant transaction data and will directly validate some of the results. The team members have relevant expertise in dictionary learning and related techniques, software architecture, and data management and processing.