Paul Irofti


LEGAT – Advanced computer system based on artificial intelligence (AI) for identifying and extracting entities from unstructured data collections

News

2024

Project

Main Objective
The main objective of this project is to create LEGAT, a hardware-software IT system based on artificial intelligence. Using dedicated training data sets, LEGAT will semi-automatically structure the historical data collected at the MAI/DGPI level, relying on three essential components:

  1. extracting data from unstructured datasets;
  2. characterizing the entities and the links between them;
  3. identifying patterns and retrieving the information of the entities in focus.

End Result
Prototype hardware-software platform delivered to the Beneficiary at the end of the project.

Team

Paul Irofti -- Project Coordinator

University of Bucharest:

Paul Irofti -- Principal Investigator
Radu Ionescu -- Senior Researcher
Marius Popescu -- Senior Researcher
Iulia Timofte -- Researcher
Roxana Voicu -- Researcher
Eduard Poesina -- Assistant Researcher
Ana Cristina Rogoz -- Assistant Researcher
Silviu Gheorghe -- Master Student

Open Positions: 1 Researcher position, 2 PhD or Master's student positions.
Contact me if interested!

Military Technical Academy:

Luciana Morogan -- Principal Investigator
Ion Bica -- Senior Researcher
Ștefan-Adrian Toma -- Senior Researcher
Mihai Coca -- Researcher
Iulian Tiță -- Assistant Researcher
Mirabela Medvei -- Assistant Researcher
George Hariga -- Assistant Researcher
Alexandra Buzățoiu -- Master Student
Paul-Florinel Căsăndroiu -- Master Student
Ilie-Cosmin Bilțan -- Master Student
Florina Conchințoiu -- L1 Technician
Andrei Brînzea -- L1 Technician

Nextgen Software SRL:

Bogdan Legănaru -- Principal Investigator
Vlad Gladin -- Senior Researcher
Daniel Tache -- Researcher
Alexandru Cocosila -- Researcher
Viorel Tiganescu -- L2 Technology Engineer
Bonciu Emilian Cristian -- L2 Technology Engineer
Adrian Bogdan Sandu -- L2 Technician

Documentation

Papers

About

LEGAT aims to create a hardware-software computer system based on artificial intelligence that, using training data sets, will semi-automatically structure the historical data collected at the MAI/DGPI level. The system relies on three essential components:

(i) extracting data from unstructured data sets

We will train large language models (LLMs) for a high degree of accuracy and efficiency, starting from pre-trained models that we will adapt to our data sets. The adaptation will use efficient training techniques such as Low-Rank Adaptation (LoRA), Direct Preference Optimization (DPO) or combinations thereof. To maximize performance, we will manually annotate the data and fine-tune the LLMs in a supervised manner.
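As a rough illustration of the Low-Rank Adaptation idea mentioned above (not the project's training code), the sketch below keeps a pre-trained weight matrix W frozen and learns only a low-rank update, so the adapted layer computes W x + (alpha / r) B A x. The class name `LoRALinear`, the toy 8x8 layer and the initialization constants are assumptions made for this example; a real setup would rely on a deep-learning framework and an actual pre-trained model.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


class LoRALinear:
    """Frozen linear layer plus a trainable low-rank update (hypothetical helper)."""

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight          # pre-trained weights, kept frozen
        self.scale = alpha / rank     # usual LoRA scaling factor
        # A starts small and random, B starts at zero, so the adapted layer
        # initially behaves exactly like the pre-trained one.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        column = [[value] for value in x]
        base = matmul(self.weight, column)
        update = matmul(self.B, matmul(self.A, column))
        return [b[0] + self.scale * u[0] for b, u in zip(base, update)]

    def trainable_parameters(self):
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


# Toy 8x8 "pre-trained" layer (identity matrix) adapted with rank 2.
frozen = [[1.0 if i == j else 0.0 for j in range(8)] for i in range(8)]
layer = LoRALinear(frozen, rank=2, alpha=4.0)
full_parameters = len(frozen) * len(frozen[0])   # 64 weights in the frozen matrix
lora_parameters = layer.trainable_parameters()   # 2 * (8 + 8) = 32 trainable weights
```

Only `A` and `B` would be updated during fine-tuning. The saving grows with layer size, since the trainable count scales with r * (d_in + d_out) rather than d_in * d_out.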

(ii) characterizing the entities and the links between them

Starting from the latent representation produced by the language model trained by our team, we will add an entity-extraction module made of neural layers. Its final classification layer assigns each language token either a class representing an entity type or a class representing ordinary words (non-entities). A module with a similar architecture will identify the attributes of entities, and a third neural module will extract relationships.
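To make the token-level classification step concrete, here is a minimal sketch under simplifying assumptions: each token already has a latent vector, a single linear layer scores every class, and the highest score wins. The label set, the two-dimensional toy vectors, the weights and the function names are all hypothetical. In the system described above the vectors would come from the trained language model and the layer would be learned from annotated data.

```python
LABELS = ["O", "PERSON", "ORG"]  # hypothetical classes; "O" marks non-entities


def classify_tokens(hidden_states, weights, bias):
    """Assign each token the label with the highest linear score."""
    predictions = []
    for h in hidden_states:
        logits = [sum(w * x for w, x in zip(row, h)) + b
                  for row, b in zip(weights, bias)]
        predictions.append(LABELS[logits.index(max(logits))])
    return predictions


def group_entities(tokens, labels):
    """Merge consecutive tokens sharing the same entity label into one span."""
    spans, previous = [], "O"
    for token, label in zip(tokens, labels):
        if label != "O":
            if label == previous:
                spans[-1] = (spans[-1][0] + " " + token, label)
            else:
                spans.append((token, label))
        previous = label
    return spans


# Toy input: hand-made 2-d "latent representations", one per token.
tokens = ["Ana", "Pop", "works", "at", "Acme"]
hidden = [[1.0, 0.0], [1.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 1.0]]
weights = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # one row per label
bias = [0.5, 0.0, 0.0]

labels = classify_tokens(hidden, weights, bias)
entities = group_entities(tokens, labels)
# entities == [("Ana Pop", "PERSON"), ("Acme", "ORG")]
```

The attribute and relationship modules could follow the same pattern with different label sets, although relationship extraction usually needs to score pairs of entities rather than single tokens.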

(iii) identifying patterns and retrieving the information of the entities in focus

The resulting data will be inserted into dedicated database tables, together with the metadata of the processed file and any other information needed to create and identify predefined topics. The web interface module will make the entities stored in the database available through multiple query criteria, such as links or property values. Users can thus investigate entities, the links between them, and a map of those links.
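One possible shape for such storage and queries is sketched below with Python's built-in `sqlite3` module. The schema (tables `documents`, `entities`, `links`), the column names, the sample rows and the helper functions are assumptions chosen for illustration only; the source does not specify the actual database engine or layout.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical entity/link schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE documents (id INTEGER PRIMARY KEY, filename TEXT, processed_at TEXT);
        CREATE TABLE entities (
            id INTEGER PRIMARY KEY,
            document_id INTEGER REFERENCES documents(id),
            name TEXT,
            type TEXT
        );
        CREATE TABLE links (
            source_id INTEGER REFERENCES entities(id),
            target_id INTEGER REFERENCES entities(id),
            relation TEXT
        );
    """)
    # File metadata is stored next to the entities extracted from that file.
    conn.execute("INSERT INTO documents VALUES (1, 'report.txt', '2024-01-01')")
    conn.executemany(
        "INSERT INTO entities VALUES (?, ?, ?, ?)",
        [(1, 1, "Ana Pop", "PERSON"), (2, 1, "Acme", "ORG"), (3, 1, "Bucharest", "LOCATION")],
    )
    conn.executemany(
        "INSERT INTO links VALUES (?, ?, ?)",
        [(1, 2, "works_at"), (2, 3, "located_in")],
    )
    conn.commit()
    return conn


def entities_by_type(conn, entity_type):
    """Query by property value: names of all entities of a given type."""
    rows = conn.execute(
        "SELECT name FROM entities WHERE type = ? ORDER BY name", (entity_type,)
    )
    return [row[0] for row in rows]


def linked_entities(conn, name):
    """Query by link: (relation, target name) pairs leaving the named entity."""
    rows = conn.execute(
        """
        SELECT l.relation, t.name
        FROM links l
        JOIN entities s ON s.id = l.source_id
        JOIN entities t ON t.id = l.target_id
        WHERE s.name = ?
        ORDER BY t.name
        """,
        (name,),
    )
    return rows.fetchall()
```

For example, `linked_entities(build_demo_db(), "Ana Pop")` returns `[("works_at", "Acme")]`. Following such rows repeatedly from one entity to the next is one way a map of links could be assembled for the web interface.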