$30.00
Description
Problem 1 – Dependencies
Suppose we have the following program (a sequence of instructions), with each instruction labeled as
S<num>:

_{1} : a := 4;

_{2} : b := 2;

_{3} : c := 5;

_{4} : read ( d );

_{5} : a := b + 3;

_{6} : b := a – 3;

_{7} : c := d * b ;

_{8} : e := a + 6;

_{9} : print ( c );

_{10} : print ( e );


Give the statementlevel dependence graph for the above program. A node in the statementlevel dependence graph represents a statement, an edge represents dependence between the statements (i.e. the nodes). Label each edge as a true data dependence, an anti data dependence, or an output data dependence.



Assume that each statement takes 1 cycle to execute. What is the execution time of the sequential code? What is the fastest parallel execution time of the program (i.e. the critical path)? You may assume that I/O operations (read and print) can be done in parallel.

Problem 2 – Dependence Analysis
Give the source and sink references, the type (whether a dependence is true, anti, or output), and the distance vectors for all dependences in the following loops.

do i = 3, 100
a(i) = a(i – 1) + a(i + 1) – a(i – 2)
enddo

do i = 2, 100
a(3 * i) = a(3 * i – 3) + a(3 * i + 3)
enddo

do i = 1, 10
a(i) = a(5) + a(i)
enddo
Use a_{W} (i) and a_{R}(i) to annotate the write access to a(i) and the read access to a(i) respectively.
Problem 3 – Loop Parallelization
Given the following nested loop:
1
2 
, 
100 

do 
j 
= 
2 , 
100 
S_{1} : 
a (i , j ) = b ( i – 1 , j – 1) + 2 

_{2} :b (i , j ) = i + j – 1 enddo
enddo

Give the statementlevel dependence graph. Show the dependence graph for statement instances in a part of the iteration space: i = 2 … 5, j = 2 … 5.

In its current form, can any loop be parallelized? If so, which loop(s)? If not, justify your answer.

Provide one valid a ne schedule for statements S_{1} and S_{2} such that p(S_{1})= C_{11}*i + C_{12}*j + d_{1} and p(S_{2})= C_{21}*i + C_{22}*j + d_{2} in order to achieve synchronizationfree parallelism. There could be many possible solutions for fC_{11}; C_{12}; C_{21}; C_{22}; d_{1}; d_{2}g. You only need to provide one feasible solution. (Hint: You can let d_{1} = d_{2} = 0.)

Generate twolevel loop code for the a ne schedule you provided. Please use p as outermost loop and i as innermost loop in the transformed loop. Calculate the loop bounds for p and i using FourierMotzkin elimination. You might need to calculate the overlapping polyhedron for S_{1} and S_{2} in order to eliminate the j loop. Please refer to the techniques for code generation in lecture 20 and lecture 21.
2