I need help working out this problem.
Assume that computer C implements the Larc architecture, which is a load-store architecture (or register-register architecture) in which both source operands for all ALU operations come from registers.
Therefore, to add two data elements in memory, a typical program P, running on C, would look like this:
lw R1 <address1>
lw R2 <address2>
add R3 R1 R2
sw R3 <address3>
We measure the instruction mix on an average workload for C and obtain the percentages shown in the middle column of the following table:
Instruction type| Frequency| CPI
ALU ops| 40%| 1
Loads| 20%| 2
Stores| 15%| 3
Branches| 25%| 3
The last column of this table includes the clock cycle count per instruction for each type. During our measurements, we also notice that 25% of all ALU operations directly use a value loaded from memory that is not used again.
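As a starting point, the average CPI of C can be computed directly from the table as a frequency-weighted sum. The sketch below only uses the numbers given above; the names `larc_mix` and `average_cpi` are made up for illustration.

```python
# Instruction mix for Larc, copied from the table: type -> (frequency, CPI).
larc_mix = {
    "ALU ops": (0.40, 1),
    "Loads": (0.20, 2),
    "Stores": (0.15, 3),
    "Branches": (0.25, 3),
}


def average_cpi(mix):
    """Weighted average CPI: sum of frequency * CPI over all instruction types."""
    return sum(freq * cpi for freq, cpi in mix.values())


baseline_cpi = average_cpi(larc_mix)
print(f"Average CPI on Larc: {baseline_cpi:.2f}")
```

This gives an average of 2.0 cycles per instruction for the original machine.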
Therefore, we are considering building C2, a computer that implements a modified Larc architecture, called Larc2, that keeps all previous instructions but adds a new type of ALU operation for which one of the operands is in memory (both the other source operand and the result of the ALU operation are still stored in registers). These new register-memory ALU instructions have a CPI of 2. Furthermore, the new implementation increases the CPI of branch and load instructions by 1 (the CPI of other instructions is not affected).
a. (10 points) Build a new table for Larc2, similar to the table given above for Larc. To do so, you will have to compute the new frequencies in the average workload for all instruction types, given that some of the code fragments like the one above will be replaced by code fragments like this one:
lw R1 <address1>
newadd R3 R1 <address2>
sw R3 <address3>
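One way to approach part (a) is to count instructions per 100 original Larc instructions and then renormalize. The sketch below assumes that every converted ALU operation removes exactly one load (as in the code fragment above) and that stores and branches are unchanged; that is an interpretation of the problem statement, not something it spells out. The label "Reg-mem ALU ops" and the variable names are made up for illustration.

```python
from fractions import Fraction

# Counts per 100 original Larc instructions, taken from the table.
old_counts = {"ALU ops": 40, "Loads": 20, "Stores": 15, "Branches": 25}

# Assumption: 25% of ALU ops become register-memory ops,
# and each converted op eliminates exactly one load.
converted = Fraction(25, 100) * old_counts["ALU ops"]

new_counts = {
    "ALU ops": old_counts["ALU ops"] - converted,
    "Reg-mem ALU ops": converted,
    "Loads": old_counts["Loads"] - converted,
    "Stores": Fraction(old_counts["Stores"]),
    "Branches": Fraction(old_counts["Branches"]),
}

# CPI on C2: loads and branches cost one extra cycle, the new ops cost 2.
new_cpi = {"ALU ops": 1, "Reg-mem ALU ops": 2, "Loads": 3, "Stores": 3, "Branches": 4}

# Renormalize, because the program is now shorter than 100 instructions.
total = sum(new_counts.values())
new_freq = {name: count / total for name, count in new_counts.items()}

for name, freq in new_freq.items():
    print(f"{name:16s} {float(freq):6.1%}  CPI {new_cpi[name]}")

larc2_cpi = sum(new_freq[name] * new_cpi[name] for name in new_freq)
print(f"Average CPI on Larc2: {float(larc2_cpi):.2f}")
```

Under these assumptions the program shrinks to 90 instructions per original 100, the frequencies come out to roughly 33.3% ALU, 11.1% register-memory ALU, 11.1% loads, 16.7% stores and 27.8% branches, and the average CPI is 2.5.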
b. (5 points) By how much does the clock cycle time for C2 have to be improved (compared to the clock cycle time for C) in order to make C2 have a higher performance than C?
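For part (b), execution time is instruction count × CPI × clock cycle time, so it is enough to compare total cycles for the same amount of work. The sketch below counts cycles per 100 original Larc instructions and again assumes that 10 of the 40 ALU operations become register-memory operations and that 10 loads disappear with them; the variable names are made up for illustration.

```python
# Total cycles on C for 100 original instructions (count * CPI per type).
cycles_c = 40 * 1 + 20 * 2 + 15 * 3 + 25 * 3

# Total cycles on C2 for the same work.
# Assumption: 30 plain ALU ops, 10 register-memory ALU ops (CPI 2),
# 10 remaining loads (CPI 3), 15 stores (CPI 3), 25 branches (CPI 4).
cycles_c2 = 30 * 1 + 10 * 2 + 10 * 3 + 15 * 3 + 25 * 4

# C2 is faster only if cycles_c2 * T2 < cycles_c * T, i.e. T2 / T < cycles_c / cycles_c2.
max_ratio = cycles_c / cycles_c2
required_reduction = 1 - max_ratio

print(f"Cycles on C: {cycles_c}, cycles on C2: {cycles_c2}")
print(f"T2 must be below {max_ratio:.4f} of T")
print(f"Clock cycle time must shrink by more than {required_reduction:.1%}")
```

With these counts, C needs 200 cycles and C2 needs 225, so the clock cycle time of C2 would have to be less than 8/9 of C's, a reduction of more than about 11.1%.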
Hello Sir/Madam,
I have taken several courses in software architecture and have also served as a TA for this course. I am familiar with different architectures: I simulated Patterson's architecture using VHDL and built Morris Mano's architecture on a breadboard for 5 instructions. You can have your solution in less than one day.
Hope to work with you,
Iman Amirtaheri