Piggyback Query Optimization with Statistics Collection for Database Management Systems

Supported by the Centre for Advanced Studies at the IBM Toronto Laboratory
and The University of Michigan (Rackham, UMD Research, and CEEP)

Principal Investigator

Dr. Qiang Zhu
Department of Computer and Information Science,
The University of Michigan, Dearborn, MI 48128
qzhu@umich.edu

Co-Investigator

Dr. Nandit Soparkar
Department of Electrical Engineering and Computer Science
The University of Michigan, Ann Arbor, MI 48109
soparkar@umich.edu

Industrial Collaborators

Dr. Suyun Chen & Berni Schiefer
Database Technology
IBM Toronto Laboratory
suyun@ca.ibm.com

Graduate Students

Brian Dunkel, Ph.D. candidate, IBM CAS fellowship
Wing Lau, M.Sc. student
Wahyudi Gunawan, M.Sc. student
Jung-uk Kim, M.Sc. student

Undergraduate Student

Nikola Markovic

Project Overview

Most database management systems (DBMSs) perform query optimization based on statistical information about the underlying database. Out-of-date statistics may lead to inefficient query processing. Existing solutions to this problem have serious drawbacks, such as a heavy administrative burden, high system load, and tardy updates. To overcome these drawbacks, this project investigates a new approach called the piggyback method. The key idea is to piggyback some additional retrievals onto the processing of a user query in order to collect more up-to-date statistics. The collected statistics are then used to optimize the processing of subsequent queries.

There are several types of piggybacking: vertical piggybacking, which fetches additional columns; horizontal piggybacking, which retrieves additional rows; mixed vertical and horizontal piggybacking, which combines the two; and multi-query piggybacking, which utilizes data from multiple queries. Basic piggybacking operators are defined so that user queries can easily be converted into the different types of piggybacked queries. Issues such as obtainable statistics, collecting levels, piggyback timing, parallel piggybacking, and a starvation problem are being investigated. The application of the piggyback technique to other database areas, such as data mining, is also being explored.
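
As a rough illustration of the vertical and horizontal cases, the following minimal sketch uses Python's built-in sqlite3 module. It is not the project's implementation: the function names vertical_piggyback and horizontal_piggyback, the emp table, and the particular statistics gathered are all assumptions made up for this example. A real DBMS would do this inside the query executor rather than in client code.

```python
import sqlite3

# Hypothetical helpers for illustration only; not the operators defined by the project.

def vertical_piggyback(conn, table, user_cols, extra_col, where, params=()):
    """Fetch one extra column with the user's query and summarize it."""
    sql = f"SELECT {', '.join(user_cols)}, {extra_col} FROM {table} WHERE {where}"
    user_rows, values = [], []
    for row in conn.execute(sql, params):
        user_rows.append(row[:-1])        # columns the user actually asked for
        if row[-1] is not None:
            values.append(row[-1])        # piggybacked column value
    # These statistics cover only the rows that satisfy the user's predicate.
    stats = {
        "count": len(values),
        "min": min(values, default=None),
        "max": max(values, default=None),
        "distinct": len(set(values)),
    }
    return user_rows, stats

def horizontal_piggyback(conn, table, user_cols, where, params=()):
    """Scan every row, flag the ones the user wants, and measure selectivity."""
    sql = f"SELECT {', '.join(user_cols)}, ({where}) AS matches_user FROM {table}"
    user_rows, total = [], 0
    for row in conn.execute(sql, params):
        total += 1
        if row[-1]:                       # row satisfies the user's predicate
            user_rows.append(row[:-1])
    selectivity = len(user_rows) / total if total else None
    return user_rows, {"table_rows": total, "selectivity": selectivity}

# Made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [("Ann", "db", 90), ("Bob", "db", 70), ("Cy", "os", 70), ("Di", "os", 80)],
)

rows, col_stats = vertical_piggyback(conn, "emp", ["name"], "salary", "dept = ?", ("db",))
print(sorted(rows), col_stats)

rows, tab_stats = horizontal_piggyback(conn, "emp", ["name"], "dept = ?", ("db",))
print(sorted(rows), tab_stats)
```

In both cases the user receives exactly the rows and columns requested, while the extra retrieval yields statistics (here, a column summary and a predicate selectivity) that an optimizer could consult for later queries. The trade-off is the added cost of the extra retrieval, which is presumably why issues such as collecting levels and piggyback timing are under investigation.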

Project References

Qiang Zhu, Brian Dunkel, Wing Lau, Suyun Chen and Berni Schiefer, Piggyback Statistics Collection for Query Optimization: Towards a Self-Maintaining Database Management System, The Computer Journal, Vol. 47, No. 2, pp 218 - 241, Oxford, 2004.

Brian Dunkel, Qiang Zhu, Wing Lau, and Suyun Chen, Multiple-Granularity Interleaving for Piggyback Query Processing, Proceedings of CASCON'99, pp 24 - 39, 1999.

Qiang Zhu, Brian Dunkel, Nandit Soparkar, Suyun Chen, Berni Schiefer, and Tony Lai, A Piggyback Method to Collect Statistics for Query Optimization in Database Management Systems, Proceedings of CASCON'98, pp 67 - 82, 1998.

Qiang Zhu and P.-A. Larson, Global Query Processing and Optimization in the CORDS Multidatabase System, Proceedings of the 9th International Conference on Parallel and Distributed Computing Systems, pp 640 - 646, 1996.

Qiang Zhu, An Integrated Method for Estimating Selectivities in a Multidatabase System, Proceedings of CASCON'93, Vol. II, pp 832 - 847, 1993.

Qiang Zhu, N. Soparkar, Suyun Chen, Berni Schiefer, B. Dunkel and W. Lau, Piggyback Statistics Collection for Query Optimization: Towards an Auto-Maintaining Database Management System, Proceedings of UMD Technology Day 2002, pp 17 - 20, Dearborn, MI, June 2002.

Qiang Zhu, N. Soparkar, Suyun Chen, Berni Schiefer, B. Dunkel and W. Lau, A Piggyback Method for Database Statistics Collection: Towards a Maintenance-Free Database Management System, Proceedings of UMD Technology Day 2001, pp 11 - 14, Dearborn, MI, June 2001.

Qiang Zhu, N. Soparkar, Suyun Chen, Berni Schiefer, B. Dunkel, W. Lau, J.-U. Kim and W. Gunawan, Automating Statistics Collection for Query Optimization in Database Management Systems, Proceedings of UMD Technology Day 2000, pp 8 - 11, Dearborn, MI, June 2000.

Qiang Zhu, N. Soparkar, Suyun Chen, Berni Schiefer, B. Dunkel, W. Lau and N. Markovic, Piggyback Query Optimization with Statistics Collection for Database Management Systems, Proceedings of UMD Technology Day 1999, pp 9 - 11, Dearborn, MI, June 1999.