Number of cache accesses when a miss occurs - caching

In the case of a cache miss, the cache is accessed a second time, once the data becomes available.
If I want to compute the miss rate, do I have to count this second access in the total number of cache accesses?
This leads to two possible scenarios:
Second access is counted: 1 miss out of 2 accesses -> miss rate = 0.5
Second access is NOT counted: 1 miss out of 1 access -> miss rate = 1
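To make the two conventions concrete, here is a minimal Python sketch with an invented toy direct-mapped cache (the function, the cache model, and the sizes are all illustrative, not from the question):

def miss_rate(addresses, lines=4, count_refill=False):
    # Toy direct-mapped cache: one tag per line, no data payload.
    cache = [None] * lines
    accesses = misses = 0
    for addr in addresses:
        accesses += 1
        if cache[addr % lines] != addr:
            misses += 1
            cache[addr % lines] = addr      # refill the line on a miss
            if count_refill:
                accesses += 1               # count the refill as a second access
    return misses / accesses

print(miss_rate([8], count_refill=True))    # 1 miss / 2 accesses = 0.5
print(miss_rate([8], count_refill=False))   # 1 miss / 1 access  = 1.0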

Related

How to validate increased performance when using Centralized Cache Management in HDFS

(On a single machine.)
I installed Hadoop 2.4.1 and wrote a program that reads a 28.6 MB sequence file, iterating it 10,000 times.
I got these results:
Without Centralized Cache:
Run  Time (ms)
1    19840
2    15096
3    14091
4    14222
5    14576

With Centralized Cache:
Run  Time (ms)
1    19158
2    14649
3    14461
4    14302
5    14715
I also wrote a MapReduce job and iterated it 25 times. Results:
Without Centralized Cache:
Run  Time (ms)
1    909265
2    922750
3    898311

With Centralized Cache:
Run  Time (ms)
1    898550
2    897663
3    926033
I see no meaningful difference in performance with and without the Centralized Cache.
How should I analyze the performance increase from using the Centralized Cache?
Can you suggest any other way to measure the performance increase from the Centralized Cache?
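One simple first step is to compare the mean and the run-to-run spread of the two samples rather than individual runs. A small Python sketch using the numbers from the question (variable names are mine):

from statistics import mean, stdev

without_cache = [19840, 15096, 14091, 14222, 14576]  # run times in ms
with_cache    = [19158, 14649, 14461, 14302, 14715]

for name, runs in [("without cache", without_cache), ("with cache", with_cache)]:
    print(f"{name}: mean = {mean(runs):.0f} ms, stdev = {stdev(runs):.0f} ms")

# If the difference between the means is much smaller than the
# run-to-run spread, the benchmark cannot distinguish the two setups.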

Comparison of MFU and LRU page replacement algorithms

When does the MFU (Most Frequently Used) page replacement algorithm have better performance than LRU (Least Recently Used)? When is it worse than LRU?
Where can I find information beyond the basic definition of the MFU page replacement algorithm?
Typically, I've seen an MFU cache used as the primary cache, backed by a secondary cache that uses an LRU replacement algorithm (that is, a cache that retains the most recently used items; call it an MRU cache). The idea is that the most frequently used things remain in the primary cache, giving very quick access. This reduces the "churn" that you see in an MRU cache when a small number of items are used very frequently, and it prevents those commonly used items from being evicted from the cache just because they haven't been used for a while.
MFU works well if you have a small number of items that are referenced very frequently and a large number of items that are referenced infrequently. A typical desktop user, for example, might have three or four programs that he uses many times a day, and hundreds of programs that he uses very infrequently. If you wanted to improve his experience by caching programs in memory so that they start quickly, you're better off caching the things he uses very frequently.
On the other hand, if you have a large number of items that are referenced essentially randomly, or some items are accessed only slightly more often than others, or items are typically referenced in batches (i.e., item A is accessed many times over a short period and then not at all), then an LRU cache eviction scheme will likely be better.
Least Recently Used (LRU) Page Replacement Algorithm
In this algorithm, the page that has not been used for the longest period of time is replaced.
Advantages of the LRU page replacement algorithm:
It is amenable to full statistical analysis.
It never suffers from Belady's anomaly.
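As a concrete illustration (a minimal sketch of my own, not from the answer above), LRU can be simulated with an ordered dict used as a recency queue:

from collections import OrderedDict

def lru_faults(refs, frames=3):
    memory = OrderedDict()   # insertion order doubles as recency order
    faults = 0
    for page in refs:
        if page in memory:
            memory.move_to_end(page)        # mark as most recently used
        else:
            faults += 1
            if len(memory) == frames:
                memory.popitem(last=False)  # evict the least recently used
            memory[page] = True
    return faults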
Most Frequently Used (MFU) Page Replacement Algorithm
The MFU algorithm assumes that the page which has been used most frequently will not be needed immediately, so it replaces the page with the highest reference count.
Example: consider the following reference string: 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1
Buffer size: 3

Reference: 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1
Frame 1:   7 7 7 2 2 2 0 4 2 2 0 0 2 2 2 0 0 7 7 7
Frame 2:   - 0 0 0 0 3 3 3 3 3 3 3 3 3 3 3 3 3 0 0
Frame 3:   - - 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
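The trace above can be reproduced with a short simulation. The sketch below makes two assumptions I chose to match the table (they are not part of any standard MFU definition): reference counts are retained across evictions, and ties are broken by evicting the page in the lowest-numbered frame.

def mfu_trace(refs, frames=3):
    memory, counts, faults = [], {}, 0
    for page in refs:
        counts[page] = counts.get(page, 0) + 1
        if page not in memory:
            faults += 1
            if len(memory) < frames:
                memory.append(page)
            else:
                # Evict the resident page with the highest reference count;
                # max() breaks ties in favor of the lowest-numbered frame.
                victim = max(memory, key=lambda p: counts[p])
                memory[memory.index(victim)] = page
    return faults

refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(mfu_trace(refs))   # 13 page faults, matching the trace above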
I have struggled to find a use case for MFU; everywhere, MFU is confused with MRU. The most cited use case of MFU is:
The most frequently used (MFU) page-replacement algorithm is based on
the argument that the page with the smallest count was probably just
brought in and has yet to be used.
But it can clearly be seen that this argument is really describing an MRU (most recently used) cache.
What I did find was a paper describing the use of both MFU and LFU: most frequently used references are moved to a primary cache for faster access, and least frequently used references are moved to a secondary cache. That's the only use case I could find for MFU.

Windows Server AppFabric Caching Timeouts

We have an application that uses Windows Server AppFabric Caching. The cache host is on the local machine, and local cache is not enabled. Here is the configuration, set in code (there is none in .config):
DataCacheFactoryConfiguration configuration = new DataCacheFactoryConfiguration();
configuration.Servers = servers;
configuration.MaxConnectionsToServer = 100; // 100 is the maximum
configuration.RequestTimeout = TimeSpan.FromMilliseconds(1000);
Object expiration on PutAndUnlock is two minutes.
Here are some typical performance monitor values:
Total Data Size Bytes: 700 MB
Total GetAndLock Requests/sec: average 4
Total Eviction Runs: 0
Total Evicted Objects: 0
Total Object Count: either 0 or 1.8447e+19 (suspicious, eh? That's about 2^64, which looks like an unsigned counter wrapping below zero.) I think the active object count should be about 500.
This runs on a virtual machine; I don't think we are hardware-constrained at all.
The problem: every few minutes (the interval varies from 1 to 20 minutes), all requests (Get, GetAndLock, Put, PutAndLock) time out for a period of about one second.
The only remedy I've found online is to increase RequestTimeout. If we increase it to 2 seconds, the problem seems to occur somewhat less frequently, but it still occurs. We can't increase the timeout further, because we need that time to recreate the object from scratch after it expires from the cache.

Time Complexity/Cost of External Merge Sort

I got this from a link which talks about external merge sort.
From slide 6 -- Example: with 5 buffer pages, sort a 108-page file:
Pass 0: ceil(108/5) = 22 sorted runs of 5 pages each (last run only 3 pages)
Pass 1: ceil(22/4) = 6 sorted runs of 20 pages each (last run only 8 pages)
Pass 2: ceil(6/4) = 2 sorted runs, of 80 pages and 28 pages
Pass 3: ceil(2/4) = 1 sorted file of 108 pages
Question: My understanding is that in external merge sort, in pass 0 you create chunks and sort each chunk; in the remaining passes you keep merging them.
So, applying that to the above example: since we have only 5 buffer pages, it is clear that pass 0 produces 22 sorted runs of 5 pages each.
Now, why do the remaining passes speak of "sorted runs" instead of merging?
How can pass 1 have 6 sorted runs of 20 pages each when we have only 5 buffer pages?
Where exactly does the merge happen here? And how does the number of runs reduce in each pass, i.e., from 108 pages to 22 runs, then 6, then 2?
External merge sort is necessary when you cannot fit all the data into memory. The best you can do is break the data into sorted runs and merge the runs in subsequent passes. The length of a run is tied to your available buffer size.
Pass 0: you do the operations IN PLACE. You load 5 pages of data into the buffers and sort them in place using an in-place sorting algorithm.
These 5 pages are stored together as one run.
Following passes: you can no longer operate in place, since you are merging runs of many pages. 4 buffer pages hold input and the 5th is the write buffer. The merging is identical to the merge step of merge sort, except that you divide and conquer by a factor of B-1 instead of 2. When the write buffer fills, it is written to disk and the next output page is started.
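Here is a minimal Python sketch of the whole procedure, with "disk pages" modeled as slices of an in-memory list (PAGE and B are illustrative constants, not from the slides):

import heapq

PAGE = 4   # records per page (illustrative)
B = 5      # buffer pages: all B sort in pass 0, then B-1 inputs + 1 output

def pass0(records):
    # Load B pages at a time and sort them in place, producing one run each.
    chunk = B * PAGE
    return [sorted(records[i:i + chunk]) for i in range(0, len(records), chunk)]

def merge_pass(runs):
    # Merge groups of B-1 runs into one longer run per group.
    return [list(heapq.merge(*runs[i:i + B - 1]))
            for i in range(0, len(runs), B - 1)]

def external_sort(records):
    runs = pass0(records)
    while len(runs) > 1:    # each pass shrinks the run count by a factor of B-1
        runs = merge_pass(runs)
    return runs[0] if runs else []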
Complexity:
When analyzing the complexity of external merge sort, the number of I/Os is what is being considered. In each pass, you must read and write every page. Let N be the number of pages and let B be the number of pages you can hold in buffer space. Each pass then costs 2N I/Os: read each page, write each page.
There will be 1 + ceil(log_{B-1}(ceil(N/B))) passes (pass 0 plus the merge passes), each with 2N I/Os, so the total is O(N log N) I/Os.
In each pass, the page length of a run increases by a factor of B-1, and the number of sorted runs decreases by a factor of B-1.
Pass 0: ceil(108/5) = 22 runs, 5 pages per run
Pass 1: ceil(22/4) = 6 runs, 20 pages per run
Pass 2: ceil(6/4) = 2 runs (80 pages and 28 pages)
Pass 3: ceil(2/4) = 1 run of 108 pages - done
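If it helps, the pass count and total I/O cost can be checked in a couple of lines (a sketch under the same assumptions as above; sort_cost is my own name):

from math import ceil, log

def sort_cost(N, B):
    runs = ceil(N / B)                      # sorted runs after pass 0
    passes = 1 + (ceil(log(runs, B - 1)) if runs > 1 else 0)
    return passes, 2 * N * passes           # every page read and written once per pass

print(sort_cost(108, 5))   # -> (4, 864): 4 passes, 864 page I/Os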
A. Since the slide never mentions merging, I'd assume (hope) that the later "sorting" passes are in fact doing merges.
B. Again, assuming this is merging: you need one buffer to hold the merged output, and each remaining buffer feeds one run being merged. Thus 4 input runs of 5 pages each produce one 20-page run.
C. I think I've now answered where the merge happens twice :)

external sorting

In this web page: CS302 --- External Sorting
"Merge the resulting runs together into successively bigger runs, until the file is sorted."
As quoted, how can we merge the resulting runs together? We don't have that much memory.
Imagine you have the numbers 1 - 9
9 7 2 6 3 4 8 5 1
and suppose that only 3 fit in memory at a time.
You'd break them into chunks of 3, sort each chunk, and store each result in a separate file:
2 7 9
3 4 6
1 5 8
Now you'd open each of the three files as streams and read the first value from each:
2 3 1
Output the lowest value (1) and read the next value from that stream; now you have:
2 3 5
Output the next lowest value (2), and continue onwards until you've output the entire sorted list.
If you merge two runs A and B into some larger run C, you can do this value by value, generating progressively larger runs while still reading at most two values at a time (one per run). Because the process is iterative, and because you're working on streams of data rather than full copies of the data, you don't need to worry about memory usage. On the other hand, disk access might make the whole process slow -- but it sure beats not being able to do the work at all.
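As a sketch of the streaming merge just described (the file handling and names are mine; heapq.merge keeps only one value per run in memory at a time):

import heapq

def merge_runs(run_paths, out_path):
    # Open every sorted run as a lazy stream of integers, one value per line.
    files = [open(p) for p in run_paths]
    try:
        streams = [(int(line) for line in f) for f in files]
        with open(out_path, "w") as out:
            for value in heapq.merge(*streams):   # k-way merge, lazily evaluated
                out.write(f"{value}\n")
    finally:
        for f in files:
            f.close()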
