Graphomaly – software package for anomaly detection in graphs modeling financial transactions
News
2022
- [Aug'22] -- Presented the GSI-DL paper at CCTA'22.
- [May'22] -- Presented Uniform Support DL Anomaly Detection at ICASSP'22.
- [Apr'22] -- Final report submitted, all objectives met!
- [Mar'22] -- Graphomaly open-source software package released.
- [Feb'22] -- Working on public software package; finalizing unsupervised grid-search module with voting support.
- [Jan'22] -- Uniform Support DL Anomaly Detection paper preprint available.
2021
- [Dec'21] -- Released dictlearn! Presented Efficient and Parallel Separable Dictionary Learning at ICPADS'21.
- [Nov'21] -- Visited Vicenç Puig at IRI-UPC and finished the GSI-DL paper.
- [Oct'21] -- Submitted paper to ICASSP'22.
- [Sep'21] -- ICPADS paper accepted!
- [Aug'21] -- Holder paper preprint available.
- [Jul'21] -- Submitted paper to ICPADS'21.
- [Apr'21] -- Excellent results reported on the BRD Toys dataset.
- [Mar'21] -- Sent out initial batch of results for BRD Toys dataset and Libra Bank dataset.
- [Feb'21] -- Analysis and testing on NAD Challenge dataset provided by ZYELL group and National Chiao Tung University at IEEE ICASSP 2021.
- [Jan'21] -- Analysis and preprocessing on Libra Bank dataset.
2020
- [Dec'20] -- New partnership with Libra Bank.
- [Nov'20] -- Framework iterators for PyOD, TODS, Deep-OCSVM, and Deep-SVDD.
- [Oct'20] -- Initial framework and experiments on the BRD Toys dataset.
- [Sep'20] -- Established data format and explored community detection techniques.
Project
- Project ID: PN-III-P2-2.1-PED-2019-3248
- Consortium: UPB (coordinator), UB (partner), Tremend Software Consulting SRL (partner).
- Team: 4 positions at UB (10 positions total)
- Funder: UEFISCDI
- Budget: 228,925 lei UB (669,825 lei total)
- Duration: 03 August 2020 - 28 April 2022
Team (University of Bucharest)
Paul Irofti -- Principal Investigator
Andrei Pătrașcu -- Senior Researcher
Marius Popescu -- Senior Researcher
Andra Băltoiu -- Research Assistant
Coordinator: Bogdan Dumitrescu (University Politehnica of Bucharest)
Industry Partner: Ioan Cocan (Tremend)
Papers
[1] C. Rusu and P. Irofti,
“Efficient and Parallel Separable Dictionary Learning,”
in Proceedings of the IEEE 2021 27th International Conference on
Parallel and Distributed Systems (ICPADS), 2021, pp. 1--6, IEEE Computer
Society.
[ bib | http ]
[2] A. Pătrașcu and P. Irofti,
“Computational complexity of Inexact Proximal Point Algorithm for
Convex Optimization under Holderian Growth,”
pp. 1--42, 2021.
[ bib | arXiv ]
[3] P. Irofti, L. Romero-Ben, F. Stoican, and V. Puig,
“Data-driven Leak Localization in Water Distribution Networks via
Dictionary Learning and Graph-based Interpolation,”
2021, pp. 1--6.
[ bib | arXiv ]
[4] P. Irofti, C. Rusu, and A. Pătrașcu,
“Dictionary Learning with Uniform Sparse Representations for Anomaly
Detection,”
2021, pp. 1--6.
[ bib | arXiv ]
Software
Graphomaly Framework (source) (documentation) (pypi).
Python Dictionary Learning Toolbox (source) (documentation) (pypi).
Algorithms:
About
The proposed project, called Graphomaly, aims to create a Python
software package for anomaly detection in graphs that model financial
transactions, with the purpose of discovering fraudulent behavior such
as money laundering, illegal networks, tax evasion, and scams. Such a
toolbox is needed in banks, where fraud detection departments still
rely mostly on human experts.
The main tool that we propose is dictionary learning for sparse
representations, which will be used to model sub-graphs derived from
the full transactions graph through community detection. Other
machine learning tools will be used for comparison, together with a set
of data processing tools that are customary for dimensionality
reduction.
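As an illustration of the community detection step, the following minimal sketch splits a toy transaction graph into sub-graphs using label propagation. Label propagation is only a simple stand-in here: the text does not say which community detection method the project uses, and the function name `label_propagation`, the tie-breaking rule, and the account numbering are assumptions made for this example.

```python
from collections import Counter


def label_propagation(edges, max_sweeps=20):
    """Group nodes into communities by asynchronous label propagation.

    Each node repeatedly adopts the most frequent label among its
    neighbors; ties are broken by taking the largest label so that the
    result is deterministic (an arbitrary choice for this sketch).
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    # Every node starts in its own community.
    labels = {node: node for node in neighbors}

    for _ in range(max_sweeps):
        changed = False
        for node in sorted(neighbors):
            counts = Counter(labels[n] for n in neighbors[node])
            best = max(counts.values())
            new_label = max(l for l, c in counts.items() if c == best)
            if new_label != labels[node]:
                labels[node] = new_label
                changed = True
        if not changed:
            break

    communities = {}
    for node, label in labels.items():
        communities.setdefault(label, set()).add(node)
    return sorted(communities.values(), key=min)


# Hypothetical toy graph: accounts 0-2 and 3-5 transact heavily among
# themselves, with a single transaction linking the two groups.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
communities = label_propagation(edges)
print(communities)
```

On this toy graph the two tightly connected groups of accounts end up in separate communities; each community could then be turned into a feature vector and passed to a learning method.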
There are two main working scenarios. In the first, fraud patterns are
known, but their shape can vary in size and can also be altered by
other activities. In the second, unsupervised learning is used to
detect anomalies, possibly of new types, that may be related to
fraud.
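One common way to use dictionary learning for anomaly detection is to learn a dictionary from feature vectors and flag samples whose sparse reconstruction error is large. The sketch below illustrates that idea with NumPy on synthetic data: a dictionary is fit on synthetic "normal" signals and then used to score new signals. It is not the Graphomaly or dictlearn API. The helper names (`omp`, `learn_dictionary`, `anomaly_scores`), the MOD dictionary update, and all sizes are assumptions chosen for brevity.

```python
import numpy as np


def omp(D, y, sparsity):
    """Orthogonal Matching Pursuit: code y with at most `sparsity` atoms of D."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(sparsity):
        # Pick the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k not in support:
            support.append(k)
        # Re-fit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x


def learn_dictionary(Y, n_atoms, sparsity, n_iter=10, seed=0):
    """Alternate OMP coding with a MOD (least-squares) dictionary update."""
    rng = np.random.default_rng(seed)
    # Initialize atoms with randomly chosen, normalized training signals.
    idx = rng.choice(Y.shape[1], size=n_atoms, replace=False)
    D = Y[:, idx] / np.linalg.norm(Y[:, idx], axis=0)
    for _ in range(n_iter):
        X = np.column_stack([omp(D, y, sparsity) for y in Y.T])
        D_new = Y @ np.linalg.pinv(X)
        norms = np.linalg.norm(D_new, axis=0)
        used = norms > 1e-12  # atoms that were never selected keep their old value
        D[:, used] = D_new[:, used] / norms[used]
    return D


def anomaly_scores(D, Y, sparsity):
    """Reconstruction error of each column of Y; larger means more anomalous."""
    return np.array([np.linalg.norm(y - D @ omp(D, y, sparsity)) for y in Y.T])


def sparse_signals(D0, n_signals, sparsity, rng):
    """Synthetic 'normal' data: random sparse combinations of the atoms of D0."""
    X = np.zeros((D0.shape[1], n_signals))
    for j in range(n_signals):
        support = rng.choice(D0.shape[1], size=sparsity, replace=False)
        X[support, j] = rng.uniform(-1.0, 1.0, size=sparsity)
    return D0 @ X


rng = np.random.default_rng(42)
D0 = rng.standard_normal((16, 8))
D0 /= np.linalg.norm(D0, axis=0)

train = sparse_signals(D0, 200, 2, rng)
test_normal = sparse_signals(D0, 5, 2, rng)
outlier = 5.0 * rng.standard_normal((16, 1))  # unstructured signal
test = np.hstack([test_normal, outlier])      # the outlier is the last column

D = learn_dictionary(train, n_atoms=8, sparsity=2)
scores = anomaly_scores(D, test, sparsity=2)
print("most anomalous column:", int(np.argmax(scores)))
```

In practice the decision threshold on the scores has to be chosen, for example from a quantile of the scores on the training data; that choice is outside this sketch.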
The implemented methods will be able to process large graphs. Online
and distributed versions of the algorithms will be derived so that
reaction time is reduced and fraud can be discovered in its
early stages.
The consortium consists of two universities and a software firm and has
the support of a bank that will provide relevant transaction data and
directly validate some of the results. The team members have
relevant expertise in dictionary learning and related techniques,
software architecture, and data management and processing.