Consider a computer system with a single-core CPU, a single level of cache of size $4 \text{ MB}$, and main memory. It takes one CPU cycle to access a memory byte if it is in the cache, and $145$ cycles if the memory access incurs a cache miss and the data must be fetched from main memory. The size of a cache line is $64$ bytes. Consider two arrays $A$ and $B$, each of $N=2^{20}$ integers (assume that an integer requires $4$ bytes of storage). The arrays are stored contiguously in memory and are aligned at cache line boundaries. The code below shows an access pattern of the arrays $A$ and $B$. Calculate the average time (in CPU cycles) required to access a single element of array $A$ (averaged over all accesses to $A$). Assume that the cache is empty at the start of every scenario, that no other process is using the cache, and that the cache does not use any optimizations such as prefetching.
The cache is direct-mapped, and every element of $A$ and $B$ is read in sequence as follows:
for (i = 0; i < N; i++)
{
    read A[i];
    read B[i];
}
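One way to check a hand calculation is to simulate the cache directly. The following is a minimal sketch in Python, not part of the original question. It assumes that $B$ is placed immediately after $A$ in memory (one reading of "stored contiguously") and that a miss costs a flat $145$ cycles with no extra hit cycle added. The function name `average_cycles_for_a` and its parameters are hypothetical, chosen only for illustration.

```python
def average_cycles_for_a(n=2**20, elem_bytes=4, cache_bytes=4 * 2**20,
                         line_bytes=64, hit_cycles=1, miss_cycles=145,
                         base_a=0):
    """Simulate a direct-mapped cache for the interleaved A[i], B[i] reads.

    Returns the average number of cycles per access to A.
    Assumption: B starts right after the last byte of A.
    """
    num_lines = cache_bytes // line_bytes
    # One entry per cache line: the memory block currently held (None = empty).
    held_block = [None] * num_lines
    base_b = base_a + n * elem_bytes  # assumed layout: B follows A directly

    def access(addr):
        block = addr // line_bytes   # which 64-byte memory block is touched
        index = block % num_lines    # direct-mapped: exactly one candidate line
        if held_block[index] == block:
            return hit_cycles
        held_block[index] = block    # miss: evict whatever was there
        return miss_cycles

    total_a = 0
    for i in range(n):
        total_a += access(base_a + i * elem_bytes)  # read A[i]
        access(base_b + i * elem_bytes)             # read B[i]
    return total_a / n


if __name__ == "__main__":
    print(average_cycles_for_a())  # prints 145.0
```

Under these assumptions each array is exactly $2^{20} \times 4 = 4 \text{ MB}$, the same size as the cache, so $A[i]$ and $B[i]$ always map to the same cache line. Each read of $B[i]$ evicts the line just loaded for $A[i]$, every access to $A$ misses, and the simulated average is $145$ cycles.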