Paper 4

Semantic Attack on Anonymised Transactions

Authors: Jianhua Shao, Hoang Ong

Volume 23 (2015)

Abstract

A transaction is a data record that contains items associated with an individual. For example, a set of movies rated by an individual forms a transaction. Transaction data are important to applications such as marketing analysis and medical studies, but they may contain sensitive information about individuals, which must be sanitised before the data are used. One popular approach to anonymising transaction data is set-based generalisation, which attempts to hide an original item by replacing it with a set of items. In this paper, we study how well this method can protect transaction data. We propose an attack that aims to reconstruct original transaction data from their set-generalised version by analysing semantic relationships that exist among the items. Our experiments show that set-based generalisation may not provide adequate protection for transaction data, and about 50% of the items added to the transactions during generalisation can be detected by our method with a precision greater than 80%.
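
To make the idea concrete, the following is a minimal toy sketch of set-based generalisation and of a naive semantic filter over its output. It is not the method proposed in the paper: the movie titles, the generalisation map, the genre labels, the genre-match similarity measure, and the function names are all hypothetical, and the sketch assumes that each generalised set hides exactly one original item.

```python
# Toy illustration only; all data and names below are hypothetical.

# Each original item is published as a set of items that contains it.
GENERALISATION = {
    "Alien": frozenset({"Alien", "Bambi"}),
    "Blade Runner": frozenset({"Blade Runner", "Dumbo"}),
    "Solaris": frozenset({"Solaris", "Casablanca"}),
}

# Stand-in for semantic knowledge about the items (assumed, not from the paper).
GENRE = {
    "Alien": "sci-fi",
    "Blade Runner": "sci-fi",
    "Solaris": "sci-fi",
    "Bambi": "animation",
    "Dumbo": "animation",
    "Casablanca": "romance",
}


def generalise(transaction):
    """Replace every item in the transaction with its generalised item set."""
    return [GENERALISATION[item] for item in sorted(transaction)]


def similarity(a, b):
    """Crude semantic similarity: 1.0 if the genres match, otherwise 0.0."""
    return 1.0 if GENRE[a] == GENRE[b] else 0.0


def guess_original(generalised):
    """For each generalised set, keep the item most similar to the other sets.

    Assumes one original item per generalised set; the remaining items are
    treated as having been added by generalisation.
    """
    guess = set()
    for i, group in enumerate(generalised):
        # Items from every other generalised set form the semantic context.
        context = [x for j, other in enumerate(generalised) if j != i for x in other]

        def score(item):
            if not context:
                return 0.0
            return sum(similarity(item, c) for c in context) / len(context)

        guess.add(max(sorted(group), key=score))
    return guess


original = {"Alien", "Blade Runner", "Solaris"}
published = generalise(original)
recovered = guess_original(published)
```

In this toy data the original items share a genre while the added items do not, so the filter singles them out. Real transactions are far less clean, which is why the abstract reports detection rates and precision rather than full reconstruction.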