Autoencoder-Based Anomaly Detection and Analysis in Log Data Generated in Cloud Systems Using Natural Language Processing

Date

2025

Publisher

Institute of Electrical and Electronics Engineers Inc.

Abstract

In this study, an Autoencoder-based model was developed to detect anomalies in log data obtained from cloud systems. The dataset consists of log records from the Blue Gene/L (BGL) supercomputer. In the preprocessing phase, log messages were vectorized using the TF-IDF method, and structural features such as content length, word count, and the presence of component/type information were extracted to create an enriched feature matrix. The model reconstructed each log entry, and a reconstruction error was computed for each record. Records were then classified as normal or anomalous using a threshold set at the 95th percentile of these errors. The model achieved an accuracy of 99.61%, along with strong precision, recall, and F1-score results. Additional evaluations using ROC-AUC and Precision-Recall curves further confirmed the model's robustness. The results demonstrate that the Autoencoder architecture can effectively detect anomalies in large and complex log datasets. The proposed model was also compared against recent approaches such as DeepLog, LogRobust, MLP, and LogEvent2Vec, and it outperformed them on all performance metrics. These findings highlight the Autoencoder-based method as a strong alternative in terms of both computational efficiency and anomaly detection capability. © 2025 IEEE.
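The thresholding and scoring steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`percentile`, `flag_anomalies`, `precision_recall_f1`) are hypothetical, the reconstruction errors are assumed to come from an already trained autoencoder, and the percentile is assumed to be computed over the same set of errors that is being classified, which the abstract does not specify.

```python
from typing import List, Sequence, Tuple


def percentile(values: Sequence[float], q: float) -> float:
    """Return the q-th percentile (0-100) using linear interpolation."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def flag_anomalies(errors: Sequence[float], q: float = 95.0) -> Tuple[List[bool], float]:
    """Flag records whose reconstruction error exceeds the q-th percentile.

    Assumption: `errors` holds one reconstruction error per log record,
    produced by some autoencoder that is not shown here.
    """
    threshold = percentile(errors, q)
    return [error > threshold for error in errors], threshold


def precision_recall_f1(flags: Sequence[bool], labels: Sequence[bool]) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1, treating 'anomalous' as the positive class."""
    tp = sum(1 for f, y in zip(flags, labels) if f and y)
    fp = sum(1 for f, y in zip(flags, labels) if f and not y)
    fn = sum(1 for f, y in zip(flags, labels) if not f and y)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Toy data: 19 low-error records and one record that reconstructs poorly.
    errors = [0.1] * 19 + [5.0]
    labels = [False] * 19 + [True]
    flags, threshold = flag_anomalies(errors)
    print("threshold:", round(threshold, 3))
    print("precision, recall, F1:", precision_recall_f1(flags, labels))
```

When the threshold is taken over the same errors being classified, roughly the top 5% of records are flagged by construction, so the reported metrics depend on how closely that fraction matches the true anomaly rate in the data.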

Keywords

Anomaly Detection, Autoencoder, Cloud Logs, Natural Language Processing, TF-IDF, Unstructured Data Analysis

WoS Q

N/A

Scopus Q

N/A

Source

9th International Symposium on Innovative Approaches in Smart Technologies, ISAS 2025 -- 2025-06-27 through 2025-06-28 -- Gaziantep -- 211342
