Consider the 5-instruction loop code in the table below, where CC means clock cycle, #n denotes an integer constant, Rx names an integer register, and Fx names a double-precision floating-point register. ADD.D Fx, Fy, Fz is defined as Fx
Using the latency information from the first table, fill in the “Given code showing stalls” column in the table below. Do not show any stall(s) that may occur after the branch. To further simplify your answer, please show only the instruction mnemonics and the stalls.
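The stall-filling step can be sketched as a small simulation of a single-issue, in-order pipeline. This is only an illustration: the latency values below are placeholders and are not the exercise's first table, the names `LATENCY`, `LOOP`, and `show_stalls` are hypothetical, and "latency" is assumed to mean the number of intervening cycles required between a producer and a dependent consumer.

```python
# Placeholder latencies (NOT the exercise's first table): intervening cycles
# required between a producing instruction and a dependent consumer.
LATENCY = {
    ("LOAD.D", "ADD.D"): 1,
    ("ADD.D", "STORE.D"): 2,
    ("ADDI", "BNE"): 1,
}

# The given loop as (mnemonic, destination register or None, source registers).
LOOP = [
    ("LOAD.D", "F2", ["R1"]),
    ("ADD.D", "F2", ["F2", "F0"]),
    ("STORE.D", None, ["F2", "R1"]),
    ("ADDI", "R1", ["R1"]),
    ("BNE", None, ["R1"]),
]


def show_stalls(code, latency):
    """Return one entry per clock cycle: an instruction mnemonic or 'stall'.

    Only read-after-write dependences are modeled; pairs missing from
    `latency` are assumed to need no intervening cycles.
    """
    schedule = []
    last_writer = {}  # register -> (producer mnemonic, cycle index of issue)
    for op, dest, srcs in code:
        earliest = len(schedule)
        for reg in srcs:
            if reg in last_writer:
                producer, issued = last_writer[reg]
                gap = latency.get((producer, op), 0)
                earliest = max(earliest, issued + gap + 1)
        # Pad with stalls until every source operand is ready.
        schedule.extend(["stall"] * (earliest - len(schedule)))
        schedule.append(op)
        if dest is not None:
            last_writer[dest] = (op, len(schedule) - 1)
    return schedule


for cc, entry in enumerate(show_stalls(LOOP, LATENCY), start=1):
    print(cc, entry)
```

Replacing the placeholder entries in `LATENCY` with the values from the exercise's latency table gives the contents of column A, one entry per clock cycle.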
Assume that the loop executes an even number of times, and can thus be unrolled two times. Fill in the “Unrolled 2x …” columns of the table below. Show this code in complete detail; omit nothing. Again, ignore any stalls after the branch.
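To make the unrolling transformation concrete, here is a minimal sketch of what the loop computes and what a 2x-unrolled version computes. It assumes that ADD.D Fx, Fy, Fz adds Fy and Fz into Fx, that LOAD.D and STORE.D address memory at R1 plus the constant, and that memory can be modeled as a dictionary keyed by byte address. The second register (`f4`) and the function names are hypothetical. The sketch shows only the semantics, not the scheduled answer for column B.

```python
def run_loop(mem, r1, f0):
    """Original loop: one element per iteration."""
    while True:
        f2 = mem[r1]       # LOAD.D  F2, R1, #0
        f2 = f2 + f0       # ADD.D   F2, F2, F0
        mem[r1] = f2       # STORE.D F2, R1, #0
        r1 -= 8            # ADDI    R1, R1, #-8
        if r1 == 0:        # BNE     R1, #0, Loop (falls through when R1 == 0)
            break


def run_unrolled_2x(mem, r1, f0):
    """Unrolled 2x: two elements per iteration, one pointer update, one branch.

    Correct only when the original loop runs an even number of times.
    """
    while True:
        f2 = mem[r1]       # first copy, offset 0
        f4 = mem[r1 - 8]   # second copy, offset -8, renamed register
        f2 = f2 + f0
        f4 = f4 + f0
        mem[r1] = f2
        mem[r1 - 8] = f4
        r1 -= 16           # the two ADDI #-8 updates merged into one
        if r1 == 0:
            break


# Four doubles at byte addresses 8, 16, 24, 32 (an even iteration count).
a = {addr: float(addr) for addr in range(8, 40, 8)}
b = dict(a)
run_loop(a, 32, 1.5)
run_unrolled_2x(b, 32, 1.5)
print(a == b)
```

The unrolled body keeps the two loads, adds, and stores independent of each other by using a different register and offset for the second copy, which is what lets a scheduler interleave them to hide latency.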
| Given code | CC | A. Given code showing stalls and instruction mnemonic only | B. Unrolled 2x and scheduled to reduce or eliminate stalls |
|---|---|---|---|
| Loop: LOAD.D F2, R1, #0 | 1 | | |
| ADD.D F2, F2, F0 | 2 | | |
| STORE.D F2, R1, #0 | 3 | | |
| ADDI R1, R1, #-8 | 4 | | |
| BNE R1, #0, Loop | 5 | | |
| | 6 | | |
| | 7 | | |
| | 8 | | |
| | 9 | | |
| | 10 | | |
| | 11 | | |