Paper 4

A Pattern-based Approach for Efficient Query Processing over RDF Data

Authors: Yuan Tian, Haofen Wang, Wei Jin, Yuan Ni, and Yong Yu

Volume 5 (2012)

Abstract

The recent prevalence of Linked Data has attracted research interest in the efficiency of query execution over the web of data. Search and query engines crawl and index triples into a centralized repository, and queries are executed locally. Prior work has shown that the performance bottleneck of large-scale query execution lies in joins and unions. Based on the observation that a large share of join operations produce a much smaller binding set that can be precomputed and stored, we propose to augment RDF indexes to store the bindings of complex patterns and to exploit these patterns to improve performance. In addition to the index, we introduce two strategies for selecting these patterns: one relies on heuristic rules we developed, and the other uses query history to optimize the time-space ratio. Our empirical study demonstrates that the proposed pattern index outperforms a traditional triple index by up to three orders of magnitude while keeping the overhead low.
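
As a rough illustration of the idea of storing precomputed bindings for a join pattern, the following minimal Python sketch may help. It is not the authors' implementation: the toy triples, the restriction to two-step chain patterns (?x p1 ?y . ?y p2 ?z), and all function names (build_predicate_index, join_chain, build_pattern_index, evaluate) are assumptions made for this example only.

```python
from collections import defaultdict

# Toy triple set (hypothetical data, not taken from the paper).
TRIPLES = [
    ("alice", "worksAt", "acme"),
    ("bob", "worksAt", "acme"),
    ("carol", "worksAt", "initech"),
    ("acme", "locatedIn", "berlin"),
    ("initech", "locatedIn", "austin"),
]


def build_predicate_index(triples):
    """Plain triple index: predicate -> list of (subject, object) pairs."""
    index = defaultdict(list)
    for s, p, o in triples:
        index[p].append((s, o))
    return index


def join_chain(index, p1, p2):
    """Evaluate the pattern ?x p1 ?y . ?y p2 ?z with a hash join on ?y."""
    by_subject = defaultdict(list)
    for y, z in index[p2]:
        by_subject[y].append(z)
    return [(x, y, z) for x, y in index[p1] for z in by_subject.get(y, [])]


def build_pattern_index(index, patterns):
    """Precompute and store the bindings of each selected chain pattern."""
    return {pattern: join_chain(index, *pattern) for pattern in patterns}


def evaluate(index, pattern_index, p1, p2):
    """Return stored bindings if the pattern is indexed, else join on the fly."""
    if (p1, p2) in pattern_index:
        return pattern_index[(p1, p2)]
    return join_chain(index, p1, p2)


triple_index = build_predicate_index(TRIPLES)
# Which patterns to precompute is assumed to be given here; the abstract
# describes heuristic and query-history strategies for making that choice.
pattern_index = build_pattern_index(triple_index, [("worksAt", "locatedIn")])
result = evaluate(triple_index, pattern_index, "worksAt", "locatedIn")
```

In this sketch, a query matching an indexed pattern is answered by a single dictionary lookup, while any other pattern falls back to the ordinary join over the triple index. The stored bindings cost extra space, which is the trade-off the pattern-selection strategies are meant to control.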