PLASTIC: PLAn Selection Through Incremental Clustering

Database Systems Lab
Indian Institute of Science



[About] [Publications] [Demonstration] [People] [Contact] [Acknowledgements]

About PLASTIC

Query optimization is a computationally intensive process, especially for the complex queries that are typical in current data warehousing and mining applications. We have recently proposed a tool called PLASTIC (PLAn Selection Through Incremental Clustering) that helps query optimizers significantly amortize these overheads through a technique of "plan recycling". Specifically, PLASTIC groups similar queries into clusters and uses the optimizer-generated plan for the cluster representative to execute all future queries assigned to the cluster. Queries are characterized by a feature vector that captures both query structures and statistics, and query similarity is evaluated using a distance function on these feature vectors.

PLASTIC is capable of recycling plans even across queries that differ in projection, selection and join predicates, as also in the base tables themselves, if these queries happen to have identical "plan templates" -- that is, they share a common database operator tree, although the specific inputs to these operators could be different. By identifying such similarities in the plan space, PLASTIC materially improves the utility of plan cacheing.

Experiments with a variety of queries (based on the TPC-H benchmark) on commercial optimizers show that PLASTIC predicts the correct plan choice in most cases, thereby providing significantly improved query optimization times. Further, even when errors are made, only marginal additional execution costs are typically incurred due to the sub-optimal plan choices.

Apart from the obvious advantage of speeding up optimization time, PLASTIC also improves query execution efficiency because optimizers can now always run at their highest optimization level -- the cost of such optimization is amortized over all future queries that reuse these plans. Further, the benefits of "plan hints", a common technique for influencing optimizer plan choices for specific queries, automatically percolate to the entire set of queries that are associated with this plan. Lastly, and most importantly, since the assignment of queries to clusters is based on database statistics, the plan choice for a given query is adaptive to the current state of the database.

Click here for a schematic of the PLASTIC-integrated query optimizer architecture

Publications

Green Query Optimization: Taming Query Optimization Overheads through Plan Recycling (demo)
P. Sarda and J. Haritsa
Proc. of 30th Intl. Conf. on Very Large Data Bases (VLDB), Toronto, Canada, September 2004, pgs. 1333-1336

PLASTIC: Reducing Query Optimization Overheads through Plan Recycling (demo)
V. Sengar and J. Haritsa
Proc. of ACM SIGMOD Intl. Conf. on Management of Data, San Diego, California, USA, June 2003, pg. 676

Plan Selection based on Query Clustering
A. Ghosh, J. Parikh, V. Sengar and J. Haritsa
Proc. of 28th Intl. Conf. on Very Large Data Bases (VLDB), Hong Kong, China, August 2002, pgs. 179-190

Reports and Theses

Green Query Optimization: Taming Query Optimization Overheads through Plan Recycling
Sarda Parag Kachurlal
Master's Thesis, July 2004

Reducing Query Optimization Overheads through Plan Recycling
Vibhuti Singh Sengar
Master's Thesis, March 2003

Plastic (in German) (PPT)
Elena Hensinger
U. of Hannover, Germany

Demonstration

Download avi file from here

People Involved

  • Jayant Haritsa
  • Gopal Das
  • Mohammed Aslam
      Alumni:
  • Naveen Reddy
  • Parag Sarda
  • Vibhuti Sengar
  • Jignashu Parikh
  • Antara Ghosh

Contact

Email: haritsa [AT] dsl [dot] serc [dot] iisc [dot] ernet [dot] in

Acknowledgments

This work is supported in part by a Swarnajayanti Fellowship from the Dept. of Science & Technology, Govt. of India., and by a research grant from the Dept. of Bio-technology, Govt. of India.