Individual stages of the datapath have the following latencies:
Also, assume that instructions executed by the processor are broken down as
1) What is the clock cycle time in a pipelined and non-pipelined processor?
2) What is the total latency of an LW instruction in a pipelined and non-pipelined processor?
3) If we can split one stage of the pipelined datapath into two new stages, each with half the latency of the original stage, which stage would you split and what is the new clock cycle time of the processor?
4) Assuming there are no stalls or hazards, what is the utilization of the data memory?
5) Assuming there are no stalls or hazards, what is the utilization of the write-register port of the Register unit?
6) Instead of a single-cycle organization, we can use a multi-cycle organization where each instruction takes multiple cycles but one instruction finishes before another is fetched. In this organization, an instruction only goes through the stages it actually needs (e.g., SW only takes 4 cycles because it does not need the WB stage). Compare clock cycle times and execution times for the single-cycle, multi-cycle, and pipelined organizations.
| IF | ID | EX | Mem | WB |
| 250ps | 350ps | 150ps | 300ps | 200ps |
Solution
1. Pipelined processor: Pipelining into 5 stages reduces the cycle time to the latency of the longest stage (ID, 350ps). The cycle time also needs to be slightly longer to accommodate the pipeline register at the end of each stage; this solution assumes 20ps for that register.
Cycle time = 350 + 20 = 370ps
Non-pipelined processor: Without pipelining, the cycle time must allow an instruction to go through all stages in one cycle.
Cycle time = 250 + 350 + 150 + 300 + 200 = 1250ps
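The two cycle-time calculations above can be checked with a short Python sketch. The stage latencies come from the table in the question; the 20ps register overhead is the assumption this solution makes, and the function names are only illustrative.

```python
# Stage latencies in picoseconds, from the table in the question.
STAGE_LATENCY_PS = {"IF": 250, "ID": 350, "EX": 150, "Mem": 300, "WB": 200}

# Assumption made by this solution: each pipeline register adds 20 ps.
PIPELINE_REGISTER_PS = 20


def pipelined_cycle_time(stages, register_ps):
    """The clock period is set by the slowest stage plus the register overhead."""
    return max(stages.values()) + register_ps


def single_cycle_time(stages):
    """One cycle must cover every stage back to back."""
    return sum(stages.values())


pipelined_ct = pipelined_cycle_time(STAGE_LATENCY_PS, PIPELINE_REGISTER_PS)
single_ct = single_cycle_time(STAGE_LATENCY_PS)
print(pipelined_ct, single_ct)  # 370 1250
```

Setting `PIPELINE_REGISTER_PS` to 0 gives the idealized answer of 350ps for the pipelined case.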
2. In the pipelined processor, an LW instruction passes through all 5 stages at one cycle each, so its latency is 5 * (pipelined cycle time):
Latency = 5 * 370 = 1850ps
In the non-pipelined processor, the latency is the same as the cycle time:
Latency = 1250ps
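As a quick check of the LW latency, here is a minimal Python sketch. It assumes the pipelined cycle time of 370ps from part 1 (350ps plus the assumed 20ps register overhead) and the non-pipelined cycle time of 1250ps; the variable names are illustrative.

```python
# Cycle times in picoseconds, taken from part 1 of this solution.
PIPELINED_CYCLE_PS = 370   # includes the assumed 20 ps register overhead
SINGLE_CYCLE_PS = 1250
PIPELINE_STAGES = 5

# In the pipeline, LW occupies each of the 5 stages for one full cycle.
lw_latency_pipelined = PIPELINE_STAGES * PIPELINED_CYCLE_PS

# Without pipelining, LW completes in exactly one (long) cycle.
lw_latency_single = SINGLE_CYCLE_PS

print(lw_latency_pipelined, lw_latency_single)  # 1850 1250
```

Pipelining makes a single LW slower (1850ps versus 1250ps) because every stage is stretched to the clock period; the benefit is in throughput, not in the latency of one instruction.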
3. We split the longest stage in half. The new cycle time is then set by what was originally the second-longest stage, Mem (300ps), plus the 20ps register overhead. Latency and throughput follow from the new cycle time, remembering that there are now 6 stages instead of 5.
The old longest stage is ID (350ps), which becomes two 175ps stages. New cycle time = 300 + 20 = 320ps
Latency = 6 * (CT) = 6 * 320 = 1920ps
Throughput = 1/CT = 1/320 inst/ps
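The stage-splitting step can be sketched in Python as follows. The latencies are from the table in the question, the 20ps register overhead is this solution's assumption, and `split_longest_stage` is a hypothetical helper written for illustration.

```python
# Stage latencies in picoseconds, from the table in the question.
STAGE_LATENCY_PS = {"IF": 250, "ID": 350, "EX": 150, "Mem": 300, "WB": 200}
PIPELINE_REGISTER_PS = 20  # assumed register overhead, as in part 1


def split_longest_stage(stages):
    """Return the name of the longest stage and a new stage map with it halved."""
    longest = max(stages, key=stages.get)
    half = stages[longest] / 2
    new_stages = {}
    for name, latency in stages.items():
        if name == longest:
            new_stages[name + "1"] = half
            new_stages[name + "2"] = half
        else:
            new_stages[name] = latency
    return longest, new_stages


split_name, six_stages = split_longest_stage(STAGE_LATENCY_PS)
new_ct = max(six_stages.values()) + PIPELINE_REGISTER_PS
new_latency = len(six_stages) * new_ct
print(split_name, new_ct, new_latency)  # ID 320 1920
```

Splitting any stage other than ID would leave ID as the bottleneck, so the cycle time would stay at 370ps.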
4. LW and SW instructions use the data memory. As a result, the utilization of the data memory is
20% + 15% = 35%.
5. ALU and LW instructions use the register block's write port. As a result, the utilization of the register block's write port is 45% + 20% = 65%.
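Both utilization figures are sums over the instruction classes that use a given resource, which a small Python sketch makes explicit. Only the three percentages quoted in this solution (ALU 45%, LW 20%, SW 15%) are used; the dictionary and function names are illustrative assumptions.

```python
# Instruction mix in percent, limited to the classes quoted in this solution.
INSTRUCTION_MIX_PCT = {"ALU": 45, "LW": 20, "SW": 15}

# Which instruction classes use each resource.
USES_DATA_MEMORY = {"LW", "SW"}      # loads read it, stores write it
USES_REG_WRITE_PORT = {"ALU", "LW"}  # both write a result register


def utilization_pct(mix, users):
    """With no stalls, a resource is busy in the fraction of cycles its users occupy."""
    return sum(pct for name, pct in mix.items() if name in users)


data_memory_util = utilization_pct(INSTRUCTION_MIX_PCT, USES_DATA_MEMORY)
write_port_util = utilization_pct(INSTRUCTION_MIX_PCT, USES_REG_WRITE_PORT)
print(data_memory_util, write_port_util)  # 35 65
```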
