Your cart is currently empty!
General Instructions Feel free to talk to other members of the class in doing the homework. You should, however, write down your solutions yourself. List the names of everyone you worked with at the top of your submission. Keep your solutions brief and clear. Please use Piazza if you have questions about…
General Instructions
Feel free to use private posts or come to the office hours.
Homework Submission
attempt to submit well in advance of the due date/time.
1 Query Execution (25 pts)
Consider the following relations with no indexes on them:
The number of blocks in memory is 10. Say, S is as large as possible (within the limits that main memory can afford) for the two pass sort-merge join (slide 57). That is, if S was larger, the two-pass sort-merge join would not work.
Answer the following questions:
A How many tuples per block does S have? (Do not forget to show your calculations.)
B Using your answer from A, what is the cost of joining R and S using the two pass sort-merge join algorithm (slide 57)?
C Using your answer from A, what is the cost of joining R and S using the optimized block nested-loop join algorithm?
D What is the cost of joining R and S using a two pass hash-based join?
E Based on questions 2, 3, and 4, explain which variant of the algorithm you would choose in terms of I/O cost. If multiple algorithms have the same I/O cost, explain other considerations that may influence your choice.
2 Query Optimization (35 pts)
Consider the relations A(x,y,z), B(w,x), and C(u,v,w), with the following properties:
A(x,y) | B(y,z) | C(z,x,u) |
T(A) = 2500
V(A, x) = 30 V(A, y) = 500 |
T(B) = 1000
V(B, y) = 250 V(B, z) = 100 |
T(C) = 6000
V(C, z) = 20 V(C, x) = 60 V(C, u) = 40 |
where, T(R) = number of tuples in relation R and V(R, a) = number of distinct values of attribute a in relation R. Estimate the sizes (measured in number of tuples) of the result of the following expressions:
3 Dynamic Programming (40 pts)
Consider the following relations, where T(R) = number of tuples in relation R and V(R, a)
= number of distinct values of attribute a in relation R.
A(x,y) | B(y,z) | C(z,x,u) | D(u,v) |
T(A) = 2500
V(A, x) = 30 V(A, y) = 200 |
T(B) = 1000
V(B, y) = 250 V(B, z) = 100 |
T(C) = 6000
V(C, z) = 20 V(C, x) = 60 V(C, u) = 40 |
T(D) = 2000
V(D, u) = 100 V(D, v) = 40 |
We want to join all these relations as efficiently as possible. Determine the most efficient way to do the join. Clearly state any assumptions you have made. Show your work by completing the following table (each step in the dynamic programming algorithm should be one row):
Subset | Size | Lowest Cost | Lowest cost plan |
. . . | . . . | . . . | . . . |