The purpose of the following assignment is to explore how indexes can be used to improve the performance of SQL queries, the di erent types of indexes supported by Postgres, and the issues involved around indexes and the optimizer.
Having read chapter 8 in the textbook will help you understand many aspects of this lab.
For this assignment you will need to download and unzip lab7.zip from Google Drive. Execute the following commands to setup your environment.
s o u r c e . / s t a r t P o s t g r e S Q L . sh
s o u r c e . / c r e a t e P o s t g r e D B . sh
Write the following SQL queries in queries.sql both with and without using indexes:
Count how many parts in NYC have more than 70 parts on hand
Count how many total parts on hand, in both NYC and SFO, are Red
List all the suppliers that have more total on hand parts in NYC than they do in SFO.
List all suppliers that supply parts in NYC that aren’t supplied by anyone in SFO.
Update all of the NYC on hand values to on hand – 10.
Delete all parts from NYC which have less than 30 parts on hand.
Some notes about the data model and the questions:
part number is the primary key for each part table. But it is not unique across both tables.
If a part has the same number in NYC and SFO it is the same part, regardless of color, etc.
If I say, e.g. \Red parts”, I mean color name = \Red” not color = 0. Di erent suppliers may supply the same part in NYC and SFO.
If you want to measure execution time of individual query you need to execute \ntiming” command in interactive psql terminal. I have provided you with a measure.sh script, which measures this time for you. Execution time can vary depending on the multiple factors, hence be sure that you re-port average time across several executions.
Write all your queries in queries.sql le. File create indexes.sql should contain create statements for all the indexes you decide are best for the queries you have written.
Notice that the last two questions are changing the data. To make the experiment reasonable, make sure that your queries run on the same data, with and without indexes.
You should experiment with B-tree indexes, built on di erent columns. Create a lab7.txt le and record the execution time of the queries before the index creation and after index creation. Use the following syntax for create index statement:
CREATE INDEX index name
ON t a b l e n a m e
( i n d e x c o l n a m e )
You can nd the Postgres documentation on indexes at http://www.postgresql.org/docs/8.1/static/indexes.html