Query Optimization

Type: Lecture
Lecturer: Prof. Dr. Guido Moerkotte
Interval: Spring semester
Credit: 6 ECTS (4 SWS)
Time and place: Tues. 13:45-15:15, B 6 A 303
First session: 12.02.2019

Prerequisites: Knowledge of database systems and algorithms

 

Query optimization is a feature of many relational database management systems. The query optimizer attempts to determine the most efficient way to execute a given query by considering the possible query plans.

A query is a request for information from a database. It can be as simple as "find the address of the person with SS# 123-45-6789" or as complex as "find the average salary of all employed married men in California between the ages of 30 and 39 who earn less than their wives."
Query results are generated by accessing the relevant data and manipulating it so that it yields the requested information. Because database structures are complex, the data needed for all but the simplest queries can usually be collected in different ways, through different data structures, and in different orders.
Each of these ways typically requires a different processing time. Processing times for the same query can vary widely, from a fraction of a second to hours, depending on the way selected.
The purpose of query optimization, which is an automated process, is to find the way to process a given query in minimum time. This large variance justifies the effort of optimizing, but finding the exact optimum among all possibilities is typically very complex, time-consuming in itself, and often practically impossible. Query optimization therefore usually approximates the optimum: it compares several plausible alternatives and returns, in reasonable time, a "good enough" plan that typically does not deviate much from the best possible one.
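
To make the idea of comparing alternative plans concrete, here is a minimal sketch in Python. It is not taken from the lecture material. The relations R, S and T, their cardinalities and the join selectivities are invented for illustration, and the cost function (the sum of all intermediate result sizes of a left-deep plan) is just one simple choice among many. The sketch compares an exhaustive search over all join orders with a greedy heuristic.

```python
from itertools import permutations

# Hypothetical statistics, chosen only for illustration.
CARD = {"R": 10, "S": 1000, "T": 100}            # relation cardinalities
SEL = {frozenset(("R", "S")): 0.01,              # join predicate selectivities
       frozenset(("S", "T")): 0.1}


def selectivity(a, b):
    """Selectivity of the join predicate between a and b (1.0 = cross product)."""
    return SEL.get(frozenset((a, b)), 1.0)


def c_out(order):
    """Cost of a left-deep plan: the sum of all intermediate result sizes."""
    joined = [order[0]]
    size = CARD[order[0]]
    cost = 0.0
    for rel in order[1:]:
        size *= CARD[rel]
        for prev in joined:
            size *= selectivity(prev, rel)
        joined.append(rel)
        cost += size
    return cost


def best_left_deep(relations):
    """Exhaustive search: try every left-deep join order, keep the cheapest."""
    return min(permutations(relations), key=c_out)


def greedy_order(relations):
    """Heuristic: start with the smallest relation, then repeatedly add the
    relation that produces the smallest next intermediate result."""
    remaining = sorted(relations)
    order = [min(remaining, key=lambda r: CARD[r])]
    remaining.remove(order[0])
    while remaining:
        nxt = min(remaining, key=lambda r: c_out(order + [r]))
        order.append(nxt)
        remaining.remove(nxt)
    return order


if __name__ == "__main__":
    for order in permutations(CARD):
        print(order, c_out(order))
    print("exhaustive:", best_left_deep(CARD))
    print("greedy:    ", greedy_order(CARD))
```

With these made-up numbers the cheapest and the most expensive join order differ in cost by roughly a factor of ten, although all of them compute the same result. The exhaustive search inspects n! orders for n relations, which is why heuristics such as the greedy one are attractive for larger queries, even though they carry no guarantee of finding the optimum.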

In this lecture we discuss join ordering, usage of index structures, subquery unnesting and basic formulas for cost estimation.
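
As a small taste of cost estimation, the following Python sketch shows common textbook cardinality estimates. It assumes uniformly distributed values and independent predicates, and the formulas covered in the lecture may differ. All function names and the example numbers are hypothetical.

```python
def eq_selectivity(distinct_values):
    """Estimated selectivity of `A = constant`, assuming uniform values."""
    return 1.0 / distinct_values


def join_selectivity(distinct_left, distinct_right):
    """Estimated selectivity of the equi-join `R.A = S.B`."""
    return 1.0 / max(distinct_left, distinct_right)


def join_cardinality(card_left, card_right, sel):
    """Estimated result size of a join: selectivity times cross product size."""
    return sel * card_left * card_right


if __name__ == "__main__":
    # Hypothetical example: 10000 employees, 50 departments,
    # 50 distinct department ids on both sides of the join.
    sel = join_selectivity(50, 50)
    print(join_cardinality(10000, 50, sel))
```

Estimates like these feed the cost function of the optimizer, so errors in the statistics or in the uniformity and independence assumptions can lead it to choose a poor plan.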

 

  • Script - PDF   legacy: PDF 
  • Slides - PDF 
  • Extra slides - PDF

Exercise

Lecturer: Magnus Müller
Time and place: Wed. 15:30-17:00, B 6 A 304
First session: 20.02.2019

NOTE:
If you are interested in C++ programming, I highly recommend attending the first exercise sessions in DBSII, which take place in the time slot right before this exercise session.
If you are interested in Go programming, this 1h video and this tutorial might be a good entry point.

 

Exercise sheets:

 

 

Last year's exercise sheets (German):

  • Sheet 1   (pdf), Solution (pdf), Code (java)
  • Sheet 2   (pdf), Solution (pdf), Code (java), TinyDB-C++ Thomas Neumann TUM (tar.gz)
  • Sheet 3   (pdf), Solution (pdf), Join translation (tar.gz)
  • Sheet 4   (pdf), Solution (pdf), IKKBZ sample (pdf), GreedyJoin (zip), dp_examples (tar.gz)
  • Sheet 5   (pdf), Solution (pdf), GreedyJoin (zip), DPsize (pdf)
  • Sheet 6   (pdf), Solution (pdf), DPJoin (java)
  • Sheet 7   (pdf), Solution (pdf), Cheatsheet on Join Rewrites (pdf)
  • Sheet 8   (pdf), Solution (pdf), Skippy (C)
  • Sheet 9   (pdf), Solution (pdf), IndexSample (java)
  • Sheet 10 (pdf), Solution (pdf), SimpleBenchmark (java)

 

Additional information