Least Frequently Used (LFU) cache tracing - algorithm

I'm wondering if I've answered this question right:
The page references are in this sequence: *ABCBADACEBEFBEFBA
With LFU page replacement, how many page faults would occur?
SLOT | A | B | C | B | A | D | A | C | E | B | E | F | B | E | F | B | A
  1  | A |   |   |   | x |   | x |   |   |   |   |   |   |   |   |   | x
  2  |   | B |   | x |   |   |   |   |   | x |   |   | x |   |   | x |
  3  |   |   | C |   |   | D |   | C | E |   | x | F |   | E | F |   |
From the tracing I've done, I've come to the conclusion that there are 9 page faults. I count how many times each page has been used and reset that count to 0 whenever the page is removed from its slot (swapped out). Is this the right way to do this?
SLOT | A | B | C | B | A | D | A | C | E | B | E | F | B | E | F | B | A
  1  | A |   |   |   | x |   | x |   |   |   |   |   |   |   |   |   | x
  2  |   | B |   | x |   |   |   | C |   | B |   | F | B |   | F | B |
  3  |   |   | C |   |   | D |   |   | E |   | x |   |   | x |   |   |
The solution I've been given looks like this and gives 11 page faults. However, I can't understand why the second C would be placed in slot 2 when the frequency of B is 2 and the frequency of D in slot 3 is only 1.

You should go back to the definition of LFU that you were given in class. It seems that you interpret it as
evict the entry with the least number of hits since it was populated,
in which case your answer (first table) is indeed correct.
However, it seems that the LFU policy used in the expected answer (second table) is
evict the entry X with the smallest ratio freq(X) = (number of hits to X) / (age of X).
In such a case, at the 2nd C, you have
freq(A) = 3/7 ≈ 0.429
freq(B) = 2/6 ≈ 0.333
freq(D) = 1/2 = 0.500
and the entry with the lowest frequency is, indeed, B.
I'd expect LFU to implement the 2nd strategy, because once you have entries of different ages in your cache, you have to account for the fact that they have had less or more time to accumulate statistics. Your approach would give correct frequencies only in the limit where entries are never evicted, which is not a practically interesting case.
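To make the two readings concrete, here is a small Python sketch (mine, not part of the original exchange). The first part simulates your interpretation (hit counters that reset on eviction) and reproduces the 9 faults; the second part just re-computes the hits/age ratios at the second C, which is the step where the expected answer diverges from yours:

REFS = list("ABCBADACEBEFBEFBA")
SLOTS = 3

# Interpretation 1: evict the page with the fewest hits since it was loaded;
# the counter is discarded whenever a page is swapped out.
cache = {}          # page -> hits since it was loaded
faults = 0
for page in REFS:
    if page in cache:
        cache[page] += 1
        continue
    faults += 1
    if len(cache) == SLOTS:
        victim = min(cache, key=cache.get)
        del cache[victim]
    cache[page] = 1
print(faults)       # 9, matching the first table

# Interpretation 2: compare hits / age at the moment the second C is referenced.
stats = {"A": (3, 7), "B": (2, 6), "D": (1, 2)}   # page -> (hits, age in references)
victim = min(stats, key=lambda p: stats[p][0] / stats[p][1])
print(victim)       # B, matching the eviction in the expected answer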

Related

Search - find closest node to n different start points (uniform cost)

Suppose I had a node path where the cost of travelling between each node is uniform. I'm trying to find the closest node that 2 or more nodes can travel to. Closest being measured as the cumulative cost of reaching the common node from all start points.
If I wanted to find the closest common node to node A and B, that node would be E.
A -> E (2 cost)
B -> E (1 cost)
If I wanted to find the closest common node to node A, B, C, that node would be F.
A -> F (3 cost)
B -> F (2 cost)
C -> F (1 cost)
And if I wanted to find the closest common node between node G, E, no node is possible.
So there should be two possible outputs: either the closest common node, or an error message stating that no common node is reachable.
I would appreciate being given an algorithm that can achieve this; a link to an article, pseudocode, or code in any language is fine. Below is some Python code that represents the graph above in a defaultdict(list) object.
from enum import Enum
from collections import defaultdict

class Type(Enum):
    A = 1
    B = 2
    C = 3
    D = 4
    E = 5
    F = 6
    G = 7

paths = defaultdict(list)
paths[Type.A].append(Type.D)
paths[Type.D].append(Type.G)
paths[Type.D].append(Type.E)
paths[Type.B].append(Type.E)
paths[Type.E].append(Type.F)
paths[Type.C].append(Type.F)
Thanks in advance.
Thanks to @VincentvanderWeele for the suggestion:
Example cost of all nodes from A, B
  A B C D E F G
  _____________
A 0 X X 1 2 3 2
B X 1 X X 2 2 X
As an optimisation, when working out the 2nd and subsequent start nodes you can skip any node that the previous start nodes cannot reach, e.g.
  A B C D E F G
  _____________
A 0 X X 1 2 3 2
B X X X X 2 2 X
    ^
Possible closest nodes:
E = 2 + 2 = 4
F = 2 + 3 = 5
Result is E since it has the lowest cost
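Here is a minimal Python sketch of this approach (mine, not from the original answer): breadth-first search from every start node, intersect the reachable sets, and take the node with the smallest summed cost. It uses a plain dict of strings mirroring the question's graph; the same logic works on the defaultdict of Type values above.

from collections import deque

# Same edges as the question's graph: A->D, D->G, D->E, B->E, E->F, C->F.
paths = {"A": ["D"], "B": ["E"], "C": ["F"], "D": ["G", "E"],
         "E": ["F"], "F": [], "G": []}

def bfs_costs(start):
    """Return {node: number of edges from start} for every reachable node."""
    costs = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in paths[node]:
            if nxt not in costs:
                costs[nxt] = costs[node] + 1
                queue.append(nxt)
    return costs

def closest_common(starts):
    """Node reachable from every start with the lowest total cost, or None."""
    all_costs = [bfs_costs(s) for s in starts]
    common = set(all_costs[0]).intersection(*all_costs[1:])
    if not common:
        return None   # no node is reachable from every start point
    return min(common, key=lambda node: sum(c[node] for c in all_costs))

print(closest_common(["A", "B"]))       # E
print(closest_common(["A", "B", "C"]))  # F
print(closest_common(["G", "E"]))       # None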

Grouping connected pairs of values

I have a list containing unique pairs of values x and y; for example:
x y
-- --
1 A
2 A
3 A
4 B
5 A
5 C
6 D
7 D
8 C
8 E
9 B
9 F
10 C
10 G
I want to divide this list of pairs as follows:
Group 1
1 A
2 A
3 A
5 A
5 C
8 C
10 C
8 E
10 G
Group 2
4 B
9 B
9 F
Group 3
6 D
7 D
Group 1 contains
all pairs where y = 'A' (1-A, 2-A, 3-A, 5-A)
any additional pairs where x = any of the x's above (5-C)
any additional pairs where y = any of the y's above (8-C, 10-C)
any additional pairs where x = any of the x's above (8-E, 10-G)
The pairs in Group 2 can't be reached in such a manner from any pairs in Group 1, nor can the pairs in Group 3 be reached from either Group 1 or Group 2.
As suggested in Group 1, the chain of connections can be arbitrarily long.
I'm exploring solutions using Perl, but any sort of algorithm, including pseudocode, would be fine. For simplicity, assume that all of the data can fit in data structures in memory.
[UPDATE] Because I need to apply this approach to 5.3 billion pairs, scalability is important to me.
Pick a starting point. Find all points reachable from that, removing from the master list. Repeat for all added points, until no more can be reached. Move to the next group, starting with another remaining point. Continue until you have no more remaining points.
pool = [(1 A), (2 A), (3 A), (4 B), ... (10 G)]
group_list = []
group = []
pos = 0
while pool is not empty
    group = [ pool[0] ]               // start with the next available point
    remove pool[0] from pool          // so it is not added to the group twice
    pos = -1
    while pos+1 < size(group)         // while there are new points in the group
        pos += 1
        group_point = group[pos]      // grab the next available point
        for point in pool             // find all remaining points reachable from it
            if point and group_point have a coordinate in common
                remove point from pool
                add point to group
    // we've reached closure with that starting point
    add group to group_list
return group_list
You can think of the letters and numbers as nodes of a graph, and the pairs as edges. Divide this graph into connected components in linear time.
The connected component with 'A' forms group 1. The other connected components form the other groups.
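If you want a concrete sketch of that idea, here is one in Python (mine, with the question's sample data typed in by hand) using a union-find / disjoint-set structure. It runs in near-linear time in the number of pairs, though at 5.3 billion pairs you would want the same approach out of core or in a lower-level language. It prints the same three groups, possibly in a different order within each group.

def find(parent, a):
    # Path-halving find: walk to the root, shortening the chain as we go.
    while parent[a] != a:
        parent[a] = parent[parent[a]]
        a = parent[a]
    return a

def group_pairs(pairs):
    parent = {}
    for x, y in pairs:
        for node in (("x", x), ("y", y)):   # keep the x and y namespaces separate
            parent.setdefault(node, node)
        rx, ry = find(parent, ("x", x)), find(parent, ("y", y))
        if rx != ry:
            parent[rx] = ry                 # union the two components
    groups = {}
    for x, y in pairs:
        root = find(parent, ("x", x))
        groups.setdefault(root, []).append((x, y))
    return list(groups.values())

pairs = [(1, "A"), (2, "A"), (3, "A"), (4, "B"), (5, "A"), (5, "C"), (6, "D"),
         (7, "D"), (8, "C"), (8, "E"), (9, "B"), (9, "F"), (10, "C"), (10, "G")]
for group in group_pairs(pairs):
    print(group)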

Illustrate the left-most derivation on a token stream

I am trying to understand left-most derivation in the context of the LL parsing algorithm. This link explains it from the generative perspective, i.e. it shows how to follow a left-most derivation to generate a specific token sequence from a set of rules.
But I am thinking about the opposite direction: given a token stream and a set of grammar rules, how do we find the proper steps to apply the rules in a left-most derivation?
Let's continue to use the following grammar from the aforementioned link (N → N D | D, where D derives a single digit).
And the given token sequence is: 1 2 3
One way is this:
1 2 3
-> D D D
-> N D D (rewrite the *left-most* D to N according to the rule N->D.)
-> N D (rewrite the *left-most* N D to N according to the rule N->N D.)
-> N (same as above.)
But there are other ways to apply the grammar rules:
1 2 3 -> D D D -> N D D -> N N D -> N N N
OR
1 2 3 -> D D D -> N D D -> N N D -> N N
But only the first derivation ends up in a single non-terminal.
As the token sequence length increases, there can be many more ways. I think that to infer the proper derivation steps, two prerequisites are needed:
a starting/root rule
the token sequence
Given these two, what is the algorithm to find the derivation steps? Do we have to make the final result a single non-terminal?
The general process of LL parsing consists of repeatedly:
Predict the production for the top grammar symbol on the stack, if that symbol is a non-terminal, and replace that symbol with the right-hand side of the production.
Match the top grammar symbol on the stack with the next input symbol, discarding both of them.
The match action is unproblematic but the prediction might require an oracle. However, for the purposes of this explanation, the mechanism by which the prediction is made is irrelevant, provided that it works. For example, it might be that for some small integer k, every possible sequence of k input symbols is only consistent with at most one possible production, in which case you could use a look-up table. In that case, we say that the grammar is LL(k). But you could use any mechanism, including magic. It is only necessary that the prediction always be accurate.
At any step in this algorithm, the partially-derived string is the consumed input followed by the stack. Initially there is no consumed input and the stack consists solely of the start symbol, so the partially-derived string is just the start symbol (with 0 derivations applied). Since the consumed input consists solely of terminals and the algorithm only ever modifies the top (first) element of the stack, it is clear that the series of partially-derived strings constitutes a leftmost derivation.
If the parse is successful, the entire input will be consumed and the stack will be empty, so the parse results in a leftmost derivation of the input from the start symbol.
Here's the complete parse for your example:
Consumed             Unconsumed   Partial      Production
Input      Stack     input        derivation   or other action
--------   -------   ----------   ----------   ---------------
           N         1 2 3        N            N → N D
           N D       1 2 3        N D          N → N D
           N D D     1 2 3        N D D        N → D
           D D D     1 2 3        D D D        D → 1
           1 D D     1 2 3        1 D D        -- match --
1          D D       2 3          1 D D        D → 2
1          2 D       2 3          1 2 D        -- match --
1 2        D         3            1 2 D        D → 3
1 2        3         3            1 2 3        -- match --
1 2 3      --        --           1 2 3        -- success --
If you read the last two columns, you can see the derivation process starting from N and ending with 1 2 3. In this example, the prediction can only be made using magic, because the rule N → N D is not LL(k) for any k; using the right-recursive rule N → D N instead would allow an LL(2) decision procedure (for example, "use N → D N if there are at least two unconsumed input tokens; otherwise use N → D").
The chart you are trying to produce, which starts with 1 2 3 and ends with N, is a bottom-up parse. Bottom-up parses using the LR algorithm correspond to rightmost derivations, but the derivation needs to be read backwards, since it ends with the start symbol.
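To see the procedure described above in running code, here is a small Python sketch (my own, not from the answer) for the grammar N → N D | D, D → digit. The predict function plays the role of the "magic" oracle by peeking at how many input tokens the N on top of the stack still has to cover; it is not a real LL(k) lookup table.

# Grammar: N -> N D | D ; D -> a digit.  predict() is the oracle: the grammar is
# left-recursive, so it cheats by inspecting the remaining input directly.
def predict(top, stack_below, unconsumed):
    if top == "N":
        # Terminals N still has to produce = remaining input minus the D's under it.
        remaining = len(unconsumed) - len(stack_below)
        return ["N", "D"] if remaining > 1 else ["D"]
    if top == "D":
        return [unconsumed[0]]            # D -> the next digit
    raise ValueError("no production for " + top)

def ll_parse(tokens):
    consumed, stack = [], ["N"]           # the stack top is the first element
    print("derivation:", " ".join(stack))
    while stack:
        top = stack[0]
        if top.isdigit():                 # terminal on top: match it
            assert top == tokens[len(consumed)]
            consumed.append(stack.pop(0))
        else:                             # nonterminal on top: predict and expand
            rhs = predict(top, stack[1:], tokens[len(consumed):])
            stack[0:1] = rhs
            print("derivation:", " ".join(consumed + stack))
    assert len(consumed) == len(tokens)

ll_parse(["1", "2", "3"])
# Prints N, N D, N D D, D D D, 1 D D, 1 2 D, 1 2 3 --
# the same leftmost derivation as the "Partial derivation" column above.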

calculating odds overtime during the day

I have a little problem to solve, and going by trial and error I'm not able to resolve it.
I have five variables:
a = 100
b = 7
c = 0
d = 24 * 60 * 60 // total seconds in one day
e = NN // seconds in the day so far, it's variable
I use this to reduce #a# (100) over time during the day, depending on #e#. That's pretty easy and it's currently done.
Now I need to introduce #b# and #c#, where #b# never changes; #c# does, and can be 0 to 6, always starting from zero and increasing toward 7 over the day (it's a random increment at any time).
Here is where I struggle, so to explain myself a bit more:
Currently, when e = d/2 (halfway through the day), the result is (a - 50).
With the changes, if #c# is zero the result should be the same (50 in my example).
But when #c# isn't zero, the result should be 50 + someMagicNumber. Why? Because #c# is closer to reaching the limit (#b#).
I'm not great at math, and writing this may help clear my mind :); if anyone understands what I'm trying to do and has any idea, I would appreciate it; also, excuse my English.
ps: once #c# reaches 7 the whole thing is done and this calculation isn't performed any more, so #c# will always be < #b#.
It appears that the expression you're trying to evaluate is a * (c + e/d). You say that a and d are constants (cannot vary) and c and e are variables. By the way, d is 86400 (the value of 24*60*60).
Let's just consider the cases where c is zero, first. If c is 0 and e is 0 then your result is 0. If c is 0 and e is 1, the result is 100/d, which is 1/864. If c is 0 and e is 2, the result is 2/864, and so forth. Each time we increase e by 1, the result increases by 1/864.
If c is 0 and e is d - 1, the result is 100 - 1/864. This is the last second of the first day of your week. For the next case, instead of setting e == d, you should say c is 1 and e is 0. And your result should increase by 1/864 again, so the result is 100 exactly.
So by going from c==0, e==0 to the case c==1, e==0, you increased your result by 100. And each time you add 1 to c you will increase your result by 100 again.
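To check that expression numerically, here are a few sample evaluations in Python (variable names as in the question):

a, d = 100, 24 * 60 * 60      # constants from the question

def result(c, e):
    # The expression proposed above: a * (c + e / d)
    return a * (c + e / d)

print(result(0, 0))        # 0.0
print(result(0, d // 2))   # 50.0  (halfway through the day, c still 0)
print(result(1, 0))        # 100.0 (c has advanced by one)
print(result(3, d // 2))   # 350.0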

CodeGolf: Brothers

I just finished participating in the 2009 ACM ICPC Programming Contest in the Latin American Finals. These questions were for Brazil, Bolivia, Chile, etc.
My team and I could only finish two questions out of the eleven (not bad, I think, for a first try).
Here's one we could finish. I'm curious to see any variations on the code. The question in full (these questions can also be found on the official ICPC website, available to everyone):
In the land of ACM ruled a great king who became obsessed with order. The kingdom had a rectangular form, and the king divided the territory into a grid of small rectangular counties. Before dying, the king distributed the counties among his sons.
The king was unaware of the rivalries between his sons: the first heir hated the second but not the rest, the second hated the third but not the rest, and so on... Finally, the last heir hated the first heir, but not the other heirs.
As soon as the king died, the strange rivalry among the king's sons sparked off a generalized war in the kingdom. Attacks only took place between pairs of adjacent counties (adjacent counties are those that share one vertical or horizontal border). A county X attacked an adjacent county Y whenever X hated Y. The attacked county was always conquered. All attacks were carried out simultaneously, and a set of simultaneous attacks was called a battle. After a certain number of battles, the surviving sons made a truce and never battled again.
For example, if the king had three sons, named 0, 1 and 2, the figure below shows what happens in the first battle for a given initial land distribution:
INPUT
The input contains several test cases. The first line of a test case contains four integers, N, R, C and K.
N - The number of heirs (2 <= N <= 100)
R and C - The dimensions of the land. (2 <= R,C <= 100)
K - Number of battles that are going to take place. (1 <= K <= 100)
Heirs are identified by sequential integers starting from zero. Each of the next R lines contains C integers HeirIdentificationNumber (saying what heir owns this land) separated by single spaces. This is to layout the initial land.
The last test case is a line separated by four zeroes separated by single spaces. (To exit the program so to speak)
Output
For each test case your program must print R lines with C integers each, separated by single spaces in the same format as the input, representing the land distribution after all battles.
Sample Input:    Sample Output:
3 4 4 3          2 2 2 0
0 1 2 0          2 1 0 1
1 0 2 0          2 2 2 0
0 1 2 0          0 2 0 0
0 1 2 2
Another example:
Sample Input:    Sample Output:
4 2 3 4          1 0 3
1 0 3            2 1 2
2 1 2
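Before the golfed versions, here is an ungolfed Python reference sketch of the simulation (mine, not a contest submission); it reproduces the sample output above:

import sys

def battle(grid, n):
    """One simultaneous round: county y is conquered if a neighbour owns (y-1) mod n."""
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            attacker = (grid[r][c] - 1) % n
            neighbours = [grid[rr][cc]
                          for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                          if 0 <= rr < rows and 0 <= cc < cols]
            if attacker in neighbours:
                new[r][c] = attacker
    return new

def main():
    tokens = sys.stdin.read().split()
    pos = 0
    while True:
        n, r, c, k = map(int, tokens[pos:pos + 4])
        pos += 4
        if n == 0:                       # terminating "0 0 0 0" line
            break
        grid = [[int(tokens[pos + i * c + j]) for j in range(c)] for i in range(r)]
        pos += r * c
        for _ in range(k):
            grid = battle(grid, n)
        for row in grid:
            print(" ".join(map(str, row)))

if __name__ == "__main__":
    main()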
Perl, 233 char
{$_=<>;($~,$R,$C,$K)=split;if($~){@A=map{$_=<>;split}1..$R;$x=0,
@A=map{$r=0;for$d(-$C,$C,1,-1){$r|=($y=$x+$d)>=0&$y<@A&1==($_-$A[$y])%$~
if($p=(1+$x)%$C)>1||1-$d-2*$p}$x++;($_-$r)%$~}@A
while$K--;print"@a\n"while@a=splice@A,0,$C;redo}}
The map is held in a one-dimensional array. This is less elegant than the two-dimensional solution, but it is also shorter. Contains the idiom @A=map{...}@A where all the fighting goes on inside the braces.
Python (420 characters)
I haven't played with code golf puzzles in a while, so I'm sure I missed a few things:
import sys
H,R,C,B=map(int,raw_input().split())
M=(1,0),(0,1),(-1,0),(0,-1)
l=[map(int,r.split())for r in sys.stdin]
n=[r[:]for r in l[:]]
def D(r,c):
 x=l[r][c]
 a=[l[r+mr][c+mc]for mr,mc in M if 0<=r+mr<R and 0<=c+mc<C]
 if x==0and H-1in a:n[r][c]=H-1
 elif x-1in a:n[r][c]=x-1
 else:n[r][c]=x
G=range
for i in G(B):
 for r in G(R):
  for c in G(C):D(r,c)
 l=[r[:] for r in n[:]]
for r in l:print' '.join(map(str,r))
Lua, 291 Characters
g=loadstring("return io.read('*n')")repeat n=g()r=g()c=g()k=g()l={}c=c+1 for
i=0,k do w={}for x=1,r*c do a=l[x]and(l[x]+n-1)%n w[x]=i==0 and x%c~=0 and
g()or(l[x-1]==a or l[x+1]==a or l[x+c]==a or l[x-c]==a)and a or
l[x]io.write(i~=k and""or x%c==0 and"\n"or w[x].." ")end l=w end until n==0
F#, 675 chars
let R()=System.Console.ReadLine().Split([|' '|])|>Array.map int
let B(a:int[][]) r c g=
    let n=Array.init r (fun i->Array.copy a.[i])
    for i in 1..r-2 do for j in 1..c-2 do
        let e=a.[i].[j]-1
        let e=if -1=e then g else e
        if a.[i-1].[j]=e||a.[i+1].[j]=e||a.[i].[j-1]=e||a.[i].[j+1]=e then
            n.[i].[j]<-e
    n
let mutable n,r,c,k=0,0,0,0
while(n,r,c,k)<>(0,2,2,0)do
    let i=R()
    n<-i.[0]
    r<-i.[1]+2
    c<-i.[2]+2
    k<-i.[3]
    let mutable a=Array.init r (fun i->
        if i=0||i=r-1 then Array.create c -2 else[|yield -2;yield!R();yield -2|])
    for j in 1..k do a<-B a r c (n-1)
    for i in 1..r-2 do
        for j in 1..c-2 do
            printf "%d" a.[i].[j]
        printfn ""
Make the array big enough to put an extra border of "-2" around the outside; this way we can look left/up/right/down without worrying about out-of-bounds exceptions.
B() is the battle function; it clones the array-of-arrays and computes the next layout. For each square, see if up/down/left/right is the guy who hates you (enemy 'e'), if so, he takes you over.
The main while loop just reads input, runs k iterations of battle, and prints output as per the spec.
Input:
3 4 4 3
0 1 2 0
1 0 2 0
0 1 2 0
0 1 2 2
4 2 3 4
1 0 3
2 1 2
0 0 0 0
Output:
2220
2101
2220
0200
103
212
Python 2.6, 383 376 Characters
This code is inspired by Steve Losh's answer:
import sys
A=range
l=lambda:map(int,raw_input().split())
def x(N,R,C,K):
 if not N:return
 m=[l()for _ in A(R)];n=[r[:]for r in m]
 def u(r,c):z=m[r][c];n[r][c]=(z-((z-1)%N in[m[r+s][c+d]for s,d in(-1,0),(1,0),(0,-1),(0,1)if 0<=r+s<R and 0<=c+d<C]))%N
 for i in A(K):[u(r,c)for r in A(R)for c in A(C)];m=[r[:]for r in n]
 for r in m:print' '.join(map(str,r))
x(*l())
x(*l())
Haskell (GHC 6.8.2), 570 446 415 413 388 Characters
Minimized:
import Monad
import Array
import List
f=map
d=getLine>>=return.f read.words
h m k=k//(f(\(a@(i,j),e)->(a,maybe e id(find(==mod(e-1)m)$f(k!)$filter(inRange$bounds k)[(i-1,j),(i+1,j),(i,j-1),(i,j+1)])))$assocs k)
main=do[n,r,c,k]<-d;when(n>0)$do g<-mapM(const d)[1..r];mapM_(\i->putStrLn$unwords$take c$drop(i*c)$f show$elems$(iterate(h n)$listArray((1,1),(r,c))$concat g)!!k)[0..r-1];main
The code above is based on the (hopefully readable) version below. Perhaps the most significant difference with sth's answer is that this code uses Data.Array.IArray instead of nested lists.
import Control.Monad
import Data.Array.IArray
import Data.List

type Index = (Int, Int)
type Heir = Int
type Kingdom = Array Index Heir

-- Given the dimensions of a kingdom and a county, return its neighbors.
neighbors :: (Index, Index) -> Index -> [Index]
neighbors dim (i, j) =
  filter (inRange dim) [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]

-- Given the first non-Heir and a Kingdom, calculate the next iteration.
iter :: Heir -> Kingdom -> Kingdom
iter m k = k // (
  map (\(i, e) -> (i, maybe e id (find (== mod (e - 1) m) $
                     map (k !) $ neighbors (bounds k) i))) $
  assocs k)

-- Read a line of integers from stdin.
readLine :: IO [Int]
readLine = getLine >>= return . map read . words

-- Print the given kingdom, assuming the specified number of rows and columns.
printKingdom :: Int -> Int -> Kingdom -> IO ()
printKingdom r c k =
  mapM_ (\i -> putStrLn $ unwords $ take c $ drop (i * c) $ map show $ elems k)
        [0..r-1]

main :: IO ()
main = do
  [n, r, c, k] <- readLine                 -- read number of heirs, rows, columns and iters
  when (n > 0) $ do                        -- observe that 0 heirs implies [0, 0, 0, 0]
    g <- sequence $ replicate r readLine   -- read initial state of the kingdom
    printKingdom r c $                     -- print kingdom after k iterations
      (iterate (iter n) $ listArray ((1, 1), (r, c)) $ concat g) !! k
    main                                   -- handle next test case
AWK - 245
A bit late, but nonetheless... Data in a 1-D array. Using a 2-D array the solution is about 30 chars longer.
NR<2{N=$1;R=$2;C=$3;K=$4;M=0}NR>1{for(i=0;i++<NF;)X[M++]=$i}END{for(k=0;k++<K;){
for(i=0;i<M;){Y[i++]=X[i-(i%C>0)]-(b=(N-1+X[i])%N)&&X[i+((i+1)%C>0)]-b&&X[i-C]-b
&&X[i+C]-b?X[i]:b}for(i in Y)X[i]=Y[i]}for(i=0;i<M;)printf"%s%d",i%C?" ":"\n",
X[i++]}
