Paper 3

Parallel Learning Algorithms of Local Support Vector Regression for Dealing with Large Datasets

Authors: Thanh-Nghi Do, Le-Diem Bui

Volume 41 (2019)

Abstract

This paper proposes new parallel algorithms for local support vector regression (local SVR), called kSVR and krSVR, to efficiently handle prediction tasks on large datasets. The kSVR learning strategy performs regression in two main steps. The first partitions the training data into k clusters; the second learns an SVR model from each cluster, in parallel on multi-core computers, to predict the data locally. The krSVR learning algorithm trains an ensemble of T random kSVR models to improve on the generalization capacity of a single kSVR model. A performance analysis in terms of algorithmic complexity and generalization capacity illustrates that the kSVR and krSVR algorithms are faster than standard SVR for non-linear regression on large datasets while maintaining high prediction correctness. Numerical test results on five large datasets from the UCI repository show that the proposed kSVR and krSVR algorithms are efficient compared to standard SVR. On average, kSVR and krSVR train 183.5 and 43.3 times faster than standard SVR, respectively; they also improve relative prediction correctness by 62.10% and 63.70%, respectively, compared to standard SVR.

Keywords: Support vector regression (SVR), Local support vector regression (local SVR), Ensemble learning, Large datasets.
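
The two-step strategy described in the abstract (partition the training data into k clusters, then fit one SVR per cluster and predict each point with its local model) can be sketched as follows. This is a minimal illustration and not the authors' implementation. It assumes NumPy, scikit-learn, and joblib are available, that k-means is the clustering method, and that an RBF kernel is used. The class names `LocalSVR` and `RandomLocalSVREnsemble` are hypothetical. The abstract does not say how the T kSVR models in krSVR are randomized, so the ensemble below assumes bootstrap resampling with averaged predictions.

```python
import numpy as np
from joblib import Parallel, delayed
from sklearn.cluster import KMeans
from sklearn.svm import SVR


def _fit_one(X, y, svr_params):
    # Fit one local SVR on the points of a single cluster.
    return SVR(kernel="rbf", **svr_params).fit(X, y)


class LocalSVR:
    """Hypothetical kSVR-style model: k-means partition plus one SVR per cluster."""

    def __init__(self, k=4, n_jobs=2, random_state=0, **svr_params):
        self.k = k
        self.n_jobs = n_jobs
        self.random_state = random_state
        self.svr_params = svr_params

    def fit(self, X, y):
        # Step 1: partition the training data into k clusters (assumed: k-means).
        self.kmeans_ = KMeans(
            n_clusters=self.k, n_init=10, random_state=self.random_state
        ).fit(X)
        labels = self.kmeans_.labels_
        # Step 2: the k local models are independent, so fit them in parallel.
        # The thread backend is chosen here only to keep the sketch simple.
        self.models_ = Parallel(n_jobs=self.n_jobs, prefer="threads")(
            delayed(_fit_one)(X[labels == c], y[labels == c], self.svr_params)
            for c in range(self.k)
        )
        return self

    def predict(self, X):
        # Route each point to the model of its nearest cluster center.
        nearest = self.kmeans_.predict(X)
        y_pred = np.empty(X.shape[0])
        for c in range(self.k):
            mask = nearest == c
            if mask.any():
                y_pred[mask] = self.models_[c].predict(X[mask])
        return y_pred


class RandomLocalSVREnsemble:
    """Hypothetical krSVR-style ensemble of T LocalSVR models.

    Assumption: each member sees a bootstrap sample and a different
    k-means seed; predictions are averaged.
    """

    def __init__(self, T=5, k=4, random_state=0, **svr_params):
        self.T = T
        self.k = k
        self.random_state = random_state
        self.svr_params = svr_params

    def fit(self, X, y):
        rng = np.random.default_rng(self.random_state)
        n = X.shape[0]
        self.members_ = []
        for t in range(self.T):
            idx = rng.integers(0, n, size=n)  # bootstrap sample, with replacement
            member = LocalSVR(
                k=self.k, random_state=self.random_state + t, **self.svr_params
            )
            self.members_.append(member.fit(X[idx], y[idx]))
        return self

    def predict(self, X):
        # Average the predictions of the T members.
        return np.mean([m.predict(X) for m in self.members_], axis=0)
```

A call such as `LocalSVR(k=10, C=10.0).fit(X_train, y_train).predict(X_test)` would exercise the sketch, with `X_train`, `y_train`, and `X_test` as NumPy arrays. The intuition behind the reported speedup is that each local SVR trains on only a fraction of the data, and kernel SVR training cost grows faster than linearly with the number of training points.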