CSC5120 Course Project Homepage

Survey on view materialization and indexing methods for OLAP and the warehouse maintenance problem

[home][intro][progress][document][links][people][FAQ]
INTRODUCTION

Data warehousing and on-line analytical processing (OLAP) are essential elements of decision support, which has increasingly become a focus of the database industry. A data warehouse collects data from source databases throughout an enterprise. After this data is processed, multidimensional views of it are generated for a variety of OLAP servers to answer OLAP queries.

A data warehouse stores a large amount of data for analysis. In complex OLAP queries, joining large tables, searching the data and computing results account for most of the execution time. Since decision support is time critical, analysts need OLAP queries to finish in a reasonable time. This motivates our survey of methods for improving query performance.

There are many issues in the performance tuning of OLAP queries. In this survey, we focus on speeding up OLAP queries using different view materialization methods. Since maintenance is an essential task in a data warehouse, we also discuss several issues and strategies involved in data warehouse maintenance.

In the study of view materialization methods, we will examine various algorithms for improving OLAP query performance, including different indexing methods and optimal view generation methods. The indexing methods include Bit-Sliced indexes, Projection indexes and join index hierarchies. For optimal view generation, we will study greedy search, A* search, point intersection and genetic algorithms. Finally, we will study data structures for representing the data cube other than the traditional lattice structure, such as cubetrees.
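To make two of the indexing methods above concrete, here is a minimal Python sketch of how a SUM aggregate might be evaluated with a Projection index and with a Bit-Sliced index. It is an illustration under our own assumptions, not an implementation from any of the surveyed papers: the function names, the use of Python integers as bitmaps, the restriction to non-negative integer values and the sample data are all hypothetical.

```python
from typing import List


def build_bit_slices(values: List[int]) -> List[int]:
    """Build a bit-sliced index over a column of non-negative integers.

    Slice i is a bitmap (stored as a Python int) whose bit r is set
    when bit i of values[r] is set.
    """
    if any(v < 0 for v in values):
        raise ValueError("this sketch handles non-negative integers only")
    width = max(values, default=0).bit_length()
    slices = [0] * width
    for row, v in enumerate(values):
        for i in range(width):
            if (v >> i) & 1:
                slices[i] |= 1 << row
    return slices


def popcount(bitmap: int) -> int:
    """Count the set bits in a bitmap."""
    return bin(bitmap).count("1")


def sum_bit_sliced(slices: List[int], found: int) -> int:
    """SUM over the rows selected by the bitmap `found`.

    Each slice contributes 2**i times the number of selected rows
    that have bit i set, so no individual value is ever read.
    """
    return sum((1 << i) * popcount(s & found) for i, s in enumerate(slices))


def sum_projection(values: List[int], found: int) -> int:
    """SUM using a projection index: the column values stored in row
    order, read directly for each selected row."""
    return sum(v for row, v in enumerate(values) if (found >> row) & 1)


# Hypothetical sample column and a selection of rows 0, 1 and 3.
sales = [5, 12, 7, 3]
found = 0b1011
slices = build_bit_slices(sales)
assert sum_bit_sliced(slices, found) == sum_projection(sales, found) == 20
```

Both functions return the same result. The projection index reads one value per selected row, while the bit-sliced index performs one bitmap AND and one bit count per slice, whatever the number of selected rows.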

Data maintenance covers several main areas, including data cleaning, data extraction from source databases, data updating strategies and data warehouse self-maintainability. We will focus on warehouse consistency (data cleaning) and self-maintainability. After studying the methods mentioned above, we will try to compare the performance gains that the various algorithms bring to OLAP queries, for example by comparing index types for evaluating aggregate functions in queries.


Copyright 2001