
A processor has two levels of caches:

A 2-way set-associative L1 cache that can house 4 blocks in total. The access latency to this cache is 1 cycle.

A 16-way set-associative L2 cache that can house 128 blocks in total. The access latency to this cache is 20 cycles.

Both the L1 and L2 caches employ the LRU replacement policy. The processor does not employ any prefetching mechanism.

A programmer writes a test program that repeatedly accesses only the following data cache blocks in a loop (assume billions of iterations are executed):

A, B, C, D, E, F

where A, ..., F are different cache block addresses.

In the steady state (i.e., after the loop has executed for a large number of iterations), the programmer observes that the average memory access time (AMAT) is 1 cycle.

Then, the programmer writes another program that repeatedly accesses only the following data cache blocks in a loop:

A, B, C, D, E, F, G, H

In the steady state, the programmer observes that the AMAT is 20 cycles.

Question 3.A (3 points)

Why do the two programs have different AMAT? Please explain.

Question 3.B (3 points)

Based on the above information, what do you expect the average memory access time to be for yet another program that repeatedly accesses only the following data cache blocks in a loop?

A, B, C, D, E

Please explain.

Question 3.C

Again, based on the above information, what do you expect the average memory access time to be for yet another program that repeatedly accesses only the following data cache blocks in a loop?

A, B, C, D, E, F, G

Please explain.

Question 3.D (3 points)

Finally, based on the above information, what do you expect the average memory access time to be for yet another program that repeatedly accesses only the following data cache blocks in a loop?

A, B, C, D, E, F, G, H, I

Please explain.

Solution

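3.A) The L1 cache can hold only 4 blocks, so on its own it cannot keep all 6 blocks of the first program resident, yet every access in that program completes in 1 cycle. Therefore, a victim cache with an access latency of 1 cycle is present. This victim cache is accessed in parallel with the L1 cache, and cache blocks evicted from the L1 cache are inserted into it. The first program's 6 blocks fit in the L1 cache and victim cache together, so all of its accesses hit in 1 cycle. The second program's 8 blocks do not fit, so its accesses miss in both and are served by the L2 cache in 20 cycles. Based on these two observations, the victim cache can hold either 2 or 3 cache blocks (4 + 2 = 6 blocks fit, but 8 blocks do not).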
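
The bound on the victim cache size can be checked with a few lines of Python. This is a minimal sketch under the solution's simplifying assumption that the L1 cache and the victim cache together behave like one pool of 4 + v blocks, so a loop either fits entirely (all hits) or does not (all misses). The names `L1_BLOCKS`, `fits`, and `candidate_victim_sizes` are hypothetical and chosen only for illustration.

```python
L1_BLOCKS = 4  # total blocks in the L1 cache (from the problem statement)

def fits(n_blocks, victim_blocks):
    """True if a loop over n_blocks fits in L1 plus the victim cache.

    Assumption: L1 and the victim cache are treated as one pooled capacity.
    """
    return n_blocks <= L1_BLOCKS + victim_blocks

def candidate_victim_sizes(hit_loop=6, miss_loop=8, max_size=16):
    """Victim cache sizes consistent with both observations:
    the 6-block loop always hits and the 8-block loop always misses."""
    return [v for v in range(max_size + 1)
            if fits(hit_loop, v) and not fits(miss_loop, v)]

print(candidate_victim_sizes())  # [2, 3]
```

Only sizes 2 and 3 satisfy both conditions: the size must be at least 2 for the 6-block loop to fit and less than 4 for the 8-block loop not to fit.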

3.B) 1 cycle. The 5 cache blocks from above can all fit in the L1 cache and victim cache, and thus all cache block accesses are hits (in the steady state).


3.C) Either 1 or 20 cycles.
If the victim cache holds 2 blocks, the 7 cache blocks from above cannot all fit in the L1 cache and victim cache (4 + 2 = 6 blocks), resulting in all accesses missing there and being served by the L2 cache in 20 cycles.
If the victim cache holds 3 blocks, the 7 cache blocks from above can all fit in the L1 cache and victim cache (4 + 3 = 7 blocks), resulting in all accesses being hits in 1 cycle.

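3.D) 20 cycles. It is impossible for all 9 cache blocks to fit in the L1 cache and victim cache (at most 4 + 3 = 7 blocks), so all accesses miss there in the steady state and are served by the L2 cache, which is large enough to hold all 9 blocks.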
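
The answers to 3.B through 3.D can be reproduced with a small simulation. This is a sketch, not a model of real hardware: it assumes the L1 cache and the victim cache act as a single LRU pool of 4 + v blocks (set indexing is ignored), that a hit in the pool costs 1 cycle, and that every miss is served by the L2 cache in 20 cycles. The function name `steady_state_amat` and the constants are hypothetical.

```python
from collections import OrderedDict

L1_BLOCKS = 4      # from the problem statement
FAST_LATENCY = 1   # cycles: hit in L1 or in the assumed victim cache
L2_LATENCY = 20    # cycles: hit in L2

def steady_state_amat(n_blocks, victim_blocks, iterations=100):
    """Loop over n_blocks distinct blocks; return the AMAT of the last iteration.

    Assumption: L1 and the victim cache are modeled as one LRU pool of
    L1_BLOCKS + victim_blocks entries; set indexing is ignored.
    """
    capacity = L1_BLOCKS + victim_blocks
    pool = OrderedDict()  # keys ordered from least to most recently used
    cycles = 0
    for _ in range(iterations):
        cycles = 0  # keep only the cost of the most recent iteration
        for block in range(n_blocks):
            if block in pool:
                pool.move_to_end(block)       # hit: mark as most recently used
                cycles += FAST_LATENCY
            else:
                if len(pool) >= capacity:
                    pool.popitem(last=False)  # evict the least recently used block
                pool[block] = True
                cycles += L2_LATENCY          # miss is served by L2
    return cycles / n_blocks

for n in (5, 6, 7, 8, 9):
    print(n, [steady_state_amat(n, v) for v in (2, 3)])
```

Under these assumptions the loops of 5 and 6 blocks give 1 cycle for both victim cache sizes, the loop of 7 blocks gives 20 cycles with 2 victim entries and 1 cycle with 3, and the loops of 8 and 9 blocks give 20 cycles for both sizes. The all-or-nothing behavior comes from LRU under a cyclic access pattern: once the loop exceeds the pool capacity, each block is evicted before it is accessed again.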
