Paper 1 – TLDKS Journal

Role-Based Access Classification: Evaluating the Performance of Machine Learning Algorithms

Authors: Randy Julian, Edward Guyot, Shaowen Zhou, Geong Sen Poh, Stéphane Bressan

Volume 43 (2020)

Abstract

The analysis of relational database access for the purpose of audit and anomaly detection can be based on the classification of queries according to user roles. One such approach is DBSAFE, a database anomaly detection system, which uses a Naïve Bayes classifier to detect anomalous queries in Role-based Access Control (RBAC) environments. We propose to consider the usual machine learning algorithms for classification tasks: K-Nearest Neighbours, Random Forest, Support Vector Machine and Convolutional Neural Network, as alternatives to DBSAFE’s Naïve Bayes classifier. We identify the need for an effective representation of the input to the classifiers. We propose the utilisation of a query embedding mechanism with the classifiers. We comparatively and empirically evaluate the performance of different algorithms and variants with two benchmarks: the comprehensive off-the-shelf OLTP-Bench benchmark and a variant of the CH-benCHmark that we extended with hand-crafted user roles for database access classification. The empirical comparative evaluation shows clear benefits in the utilisation of the machine learning tools.

Keywords: Database, RBAC, Machine learning, Classification, DBSAFE, Benchmark, Query2Vec.