cosc 2425 computer organization and architecture homework 4 chapter 4
Search for question
Question
COSC 2425: Computer Organization and Architecture
Homework 4 (Chapter 4)
Due: April 18th, 2024, 11:59 PM
Ques #
Points
Q1
20
Q2
20
Q3
20
Total
Grade
/60
Please use the solution sheet to submit your answers.
1. Submissions are only allowed in the format provided in the solution sheet.
2. The solution sheet is a word document (COA_HW4_Solution_Sheet.docx).
3. Download the solution sheet. The sheet has a table with question numbers.
4. Edit the word document by filling in your solution in the corresponding boxes in the
table.
5. Points are allocated both for final answer and your work/formula.
6. Please show your work clearly and how you arrived at the solution. Pipelining
Question 1: Consider an LEGv8 processor with pipelined implementation consisting of five
stages:
1. IF: Instruction Fetch
2.
ID: Instruction Decode
3. EXE: Execution
4. MEM: Memory access
5.
WB: Write Back
Write operations occur in the first half of the clock cycle and reads occur in the second half of
the clock cycle. The table below shows the pipelining diagram for three instructions.
Inst/CC
1
2
3
4
5
6
7
8
Inst. 1
IF
ID
EXE
MEM
WB
Stall
Inst. 2
Inst. 3
IF
ID
EXE
MEM
WB
IF
ID
EXE
MEM
WB
Where CC stands for clock cycle, stalls are shown as empty rows, and forwarding is indicated
between associated stages using an arrow.
LDUR X0, [X0, #0]
ADD X1, X0, XO
SUB X2, X1, X2
SUB X3, X2, X3
ADD X4, X3, X2
a. First assume forwarding is not available, show a similar table for the following sequence
of instructions and indicate stalls with empty rows. How many cycles are needed to
execute the above sequence of instructions?
b. This time, assume that forwarding is available. Show a similar table for the following
sequence of instructions and indicate stalls with empty rows and forwarding with
arrows. How many cycles are needed to execute the above sequence of instructions? Data Hazards
Question 2: Consider the following loop
1. LDUR X0, [X11, #0]
2.
LDUR X1, [X10, #4]
3.
SUB X3, XO, XO
4.
ADD X3, X1, X3
5.
ADD X4, X3, X1
1. List the true data dependencies (Read-after-write) in the above code. Use the number
next to the instruction to identify instructions. For example, If the Instruction with line
number x has a dependency on instruction with line number y. Then state that
"Instruction x on instruction y".
Note: List all dependencies, respective of whether they cause stalls in the pipeline or
not.
2. Assume 5-stage pipeline with no forwarding, and each stage takes 1 cycle.
a. Show the pipeline diagram for the instruction sequence.
b. Assuming that the processor stalls on a hazard. How many times does the processor
stall? How long is each stall (in cycles)? What is the execution time (in cycles) for
the whole program?
c. Assume the 5-stage pipeline with full forwarding. Show the pipeline diagram with
stalls if needed. What is the execution time (in cycles) for the whole program? Branch Prediction
Question 3: In the context of control hazards, branch prediction strategies are used to improve
performance. Given a branch, the following example shows the branch prediction table for
some strategy and the outcome of the branch. Predictor value of 1 indicates the prediction that
the branch will be taken and 0 indicates the prediction that the branch will not be taken.
Value of
register
Branch
predictor for
Branch b
Prediction
(T/NT)
Actual outcome Misprediction?
of branch bl
(yes/no)
1
0
NT
NT
No
2345
2
0
NT
T
Yes
3
1
T
NT
Yes
4
0
NT
NT
No
5
0
NT
NT
No
Branch predictor accuracy is defined as the percentage of predictions that are correct. In the
example above, the prediction is correct 3 out of 5 times (shown in bold). The accuracy can be
computed as shown below:
Accuracy = 3/5 = 0.6
Let's assume that A is an array stored in the main memory with values [6, 4, 3, 7, 1, 4, 9]
associated with array indices 0 through 6, respectively. Furthermore, assume that the base
address for array variable A is associated with register XO, and that of i and k are associated
with X1, X2, respectively. Consider the following LEGv8 code.
AND X1, X1, XZR
Loop: LSL X3, X1, #3
ADD X3, X0, X3
LDUR X4, [X3, #0]
SUBI X5, X4, #4
CBNZ X5, Cond
ADDI X2, X2, #1
Cond: SUBI X6, X1, #7
// Branch B1
CBZ X6, Exit
ADDI X1, X1,
#1
B Loop
Exit:
Consider the following strategies and show branch prediction strategies tables and accuracies
for branch B1 for each of the strategy A. Strategy 1 → Branch is always taken: In this, the strategy is to assume that the branch
will always be taken. For this strategy show the branch prediction table, comparing it
with the actual branch outcome and compute the branch predictor accuracy.
B. Strategy 2 → Branch is always taken: In this, the strategy is to assume that the branch
will never be taken. For this strategy show the branch prediction table, comparing it
with the actual branch outcome and compute the branch predictor accuracy.
C. Strategy 3 → 1-bit branch predictor: Consider a 1-bit prediction strategy as discussed
in class. If the initial value for the predictor is 0, then for this strategy show the branch
prediction table and compute the branch predictor accuracy.
D. Strategy 4 → 2-bit branch predictor: Consider a 2-bit prediction strategy as discussed
in class. If the initial value for the predictor is 01, then for this strategy show the branch
prediction table and compute the branch predictor accuracy./n