I'm doing some theoretical examples with different page replacement algorithms, in order to get a better understanding for when I actually write the code. I'm kind of confused about this example.
Given below is a physical memory with 4 tiles (4 sections?). The following pages are visited one after the other:
R = 1, 2, 3, 2, 4, 5, 3, 6, 1, 4, 2, 3, 1, 4
Run the optimal page replacement algorithm on R with 4 tiles.
I know that when a page needs to be swapped in, the operating system swaps out the page whose next use will occur farthest in the future. In practice I'll have:
Time    1  2  3  4  5  6  7  8  9 10 11 12 13 14
Page    1  2  3  2  4  5  3  6  1  4  2  3  1  4
Tile 1  1  1  1
Tile 2     2  2
Tile 3        3
Tile 4
I'm not sure what happens at time 4, because we get page 2, but that's already present in memory. Normally, if it were another number like 6, it would go in Tile 4, but I'm lost in this case.
At time t=4, page 2 is already present, so there is no need to do anything. You can just skip it and move to the next time interval.
If it were another number like 6 and a free slot were available, you would move it there; otherwise, you would find the page that won't be used for the longest time in the future and swap it out.
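As a rough sketch of the rule described above, the following Python simulation runs the optimal policy on the reference string R with 4 tiles. The function name `optimal_replacement` and its return shape are assumptions made for illustration; they do not come from the original exercise.

```python
def optimal_replacement(refs, num_frames):
    """Simulate optimal page replacement.

    Returns (history, faults), where history[t] is the list of resident
    pages after the t-th reference (0-based) and faults is the fault count.
    """
    frames = []
    faults = 0
    history = []
    for t, page in enumerate(refs):
        if page not in frames:
            # Page fault: the page has to be brought in.
            faults += 1
            if len(frames) < num_frames:
                # A free tile is available, so nothing is evicted.
                frames.append(page)
            else:
                future = refs[t + 1:]

                def next_use(p):
                    # Pages never referenced again count as infinitely far away.
                    return future.index(p) if p in future else float("inf")

                # Evict the resident page whose next use is farthest away.
                victim = max(frames, key=next_use)
                frames[frames.index(victim)] = page
        # On a hit (page already resident) the tiles are left unchanged.
        history.append(list(frames))
    return history, faults


R = [1, 2, 3, 2, 4, 5, 3, 6, 1, 4, 2, 3, 1, 4]
history, faults = optimal_replacement(R, 4)
for time, (page, tiles) in enumerate(zip(R, history), start=1):
    print(time, page, tiles)
print("page faults:", faults)
```

On this input the sketch reports 7 page faults, and the tiles at time 4 are identical to those at time 3, because page 2 is a hit.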
I have some input data like this.
unique ID  Q1  Q2  Q3
1          1   1   2
2          1   1   2
3          1   0   3
4          2   0   1
5          3   1   2
6          4   1   3
And my target is to extract some data which satisfy the following conditions:
total count: 4
Q1=1 count: 2
Q1=2 count: 1
Q2=1 count: 1~3
Q3=1 count: 1
In this case, both data sets with ids [1, 2, 4, 5] and [2, 3, 4, 5] are acceptable answers.
In reality, I may have 6000+ rows of data and up to 12 count constraints like the ones above. Each count may vary from 1 to 50.
I've written a solution that first groups all ids by each condition, then uses depth-first search to exhaustively try all possible combinations between the groups. (I believe this is a brute-force solution...)
However, I always run out of memory and time before I get a possible answer.
My questions are:
1. What is the lowest possible time complexity for this problem? (I believe this is a kind of subset sum problem, but I am not sure.)
2. How can I solve this problem without brute force? I'm considering dynamic programming or a decision tree. However, I believe I would probably run out of memory with either of these. Or can I solve it using each data row's probabilities/entropy (I would appreciate more details on this)?
My brute-force sample code is not worth reading, so I'll skip posting it.
I am looking for a sorting algorithm to help me in my work. My objective is the following: after receiving an input of this kind:
5 4
1 2
2 3
3 4
4 5
The first number on the first line tells me how many ids I have, and the second number tells me how many connections. The following lines list the connections; each one says that the first id comes before the second one, for example: 1 comes before 2, 2 comes before 3, and so on. And if an impossible situation occurs:
3 3
1 2
2 3
3 1
or
2 2
1 2
2 1
I want to be able to send an error message.
Is there an algorithm that already does this? Or can you give me some guidelines on how to start my work? I do not want your code, just some help/tips/advice. Thanks in advance for your time.
From your description, I think you are probably looking for topological sorting.
This assumes that an 'impossible situation' occurs when one connection says that A comes before B while another connection, or a chain of connections, says that B comes before A. In graph terms the connections then form a cycle, which a topological sort can detect and report as an error.
Link for topological sort:
Topological Sorting
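For readers who do want to see the idea spelled out, here is a minimal Python sketch of one common way to do a topological sort (Kahn's algorithm), with the cycle check used to report the impossible cases. The function name `topological_order` and the choice to raise `ValueError` are assumptions made for illustration, and the input is assumed to be already parsed into a list of (before, after) pairs.

```python
from collections import deque


def topological_order(num_ids, connections):
    """Return ids 1..num_ids in an order that respects every (before, after) pair.

    Raises ValueError when the connections contradict each other (a cycle).
    """
    successors = {i: [] for i in range(1, num_ids + 1)}
    indegree = {i: 0 for i in range(1, num_ids + 1)}
    for before, after in connections:
        successors[before].append(after)
        indegree[after] += 1

    # Start with every id that nothing has to precede.
    ready = deque(i for i in successors if indegree[i] == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    # Ids stuck with a nonzero indegree are part of (or behind) a cycle.
    if len(order) != num_ids:
        raise ValueError("impossible: the connections contain a cycle")
    return order


print(topological_order(5, [(1, 2), (2, 3), (3, 4), (4, 5)]))
try:
    topological_order(3, [(1, 2), (2, 3), (3, 1)])
except ValueError as err:
    print("error:", err)
```

The first call prints a valid order for the 5-id example, and the second reports the error for the 1, 2, 3 cycle.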
Suppose you have a column like
[8 8 8 8 8 1 4 4 4 1 1]'
What code could I write to find the numbers that are not repeated consecutively (non-contiguous)? In this case, what code would I have to write to find row 6? This is for big data.
--Dwight
I need to calculate a sequence of numbers (similar to Sudoku) to match teams to play each other.
I need to create a matrix for 8 and 9 teams and can't figure out the formula. I have to believe this is really simple, but I have no idea what to search for to find it.
Here is a working version for 7 teams:
team   | 1 2 3 4 5 6 7
======================
week 1 |   7 6 5 4 3 2
week 2 | 7   5 6 3 4 1
week 3 | 6 5   7 2 1 4
week 4 | 5 6 7   1 2 3
week 5 | 4 3 2 1   7 6
week 6 | 3 4 1 2 7   5
week 7 | 2 1 4 3 6 5
So for the first week, team 1 doesn't play (no available partner), team 2 plays team 7, team 3 plays team 6, etc.
For week 2, team 1 plays team 7, etc.
No team may play the same team twice. The event continues for as many weeks as we have teams, so 8 teams would play for 8 weeks.
Each team should play every other team once and only once. They can't play themselves (hence the blank entry in each row).
Note that the upper right triangle is a mirror of the bottom left triangle, but that still didn't help me determine the formula.
My guess is that if I spent enough hours, I could figure out the formula. But since this has to have been done a few million times by people over the ages, I am guessing that it's a well known algorithm and I just need to find someone who knows the name (so I can look it up) or can tell me what it is so I can create this for a friend who needs it.
Thanks!
The best answer so far is from Dennis Meng (I can't comment, so I have to use an answer). That link pointed me to a question where the answer worked, sort of. I don't have an algorithm yet, but the methodology worked adequately. I have my rows and columns. It doesn't provide me with a "mirror" image the way the example does. But it does give me a unique team for each week. I am hoping that will be enough.
I just used excel to lay it out as that was faster than trying to figure out the logic, write the code, and get a nice formatted result - especially since I only seem to need to do it once.
But if it turns out I need to do it again, I will write a simple application and post it here.
Of course, it would be great if I could get the routine that generated the above matrix....
Of course, that also leads me to another issue. How can I mark Dennis' comment as the answer???? He deserves the credit (unless someone chimes in with the mirror solution....)
Oh well, thanks Dennis!
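The problem described in this thread is usually called round-robin tournament scheduling. Below is a minimal Python sketch of one common construction based on modular arithmetic, assuming teams are numbered 1 to n. The function name `round_robin` is hypothetical, this is not necessarily the method from the linked answer, and it does not reproduce the exact "mirror" matrix shown in the question. For an even number of teams it yields n-1 rounds with no byes rather than n rounds.

```python
def round_robin(num_teams):
    """Return a list of rounds; each round maps team -> opponent (None = bye).

    Teams are numbered 1..num_teams. With an odd count there are num_teams
    rounds and one bye per round; with an even count there are
    num_teams - 1 rounds and no byes.
    """
    # Work modulo an odd number m; an even team count keeps the last team aside.
    m = num_teams if num_teams % 2 == 1 else num_teams - 1
    rounds = []
    for r in range(m):
        opponents = {}
        for i in range(m):
            # In round r, teams i and j meet when i + j == r (mod m).
            j = (r - i) % m
            if j != i:
                opponents[i + 1] = j + 1
            elif num_teams % 2 == 0:
                # The team that would sit out plays the extra (last) team.
                opponents[i + 1] = num_teams
                opponents[num_teams] = i + 1
            else:
                opponents[i + 1] = None
        rounds.append(opponents)
    return rounds


for week, opponents in enumerate(round_robin(9), start=1):
    print("week", week, [opponents[team] for team in range(1, 10)])
```

Because m is odd, exactly one team per round satisfies 2i == r (mod m), so each round has a single bye (or a single game against the extra team), and every pair of teams meets in exactly one round.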
In k fold we have this:
you divide the data into k subsets of
(approximately) equal size. You train the net k times, each time leaving
out one of the subsets from training, but using only the omitted subset to
compute whatever error criterion interests you. If k equals the sample
size, this is called "leave-one-out" cross-validation. "Leave-v-out" is a
more elaborate and expensive version of cross-validation that involves
leaving out all possible subsets of v cases.
What do the terms training and testing mean? I can't understand them.
Would you please point me to some references where I can learn this algorithm with an example?
Train classifier on folds: 2 3 4 5 6 7 8 9 10; Test against fold: 1
Train classifier on folds: 1 3 4 5 6 7 8 9 10; Test against fold: 2
Train classifier on folds: 1 2 4 5 6 7 8 9 10; Test against fold: 3
Train classifier on folds: 1 2 3 5 6 7 8 9 10; Test against fold: 4
Train classifier on folds: 1 2 3 4 6 7 8 9 10; Test against fold: 5
Train classifier on folds: 1 2 3 4 5 7 8 9 10; Test against fold: 6
Train classifier on folds: 1 2 3 4 5 6 8 9 10; Test against fold: 7
Train classifier on folds: 1 2 3 4 5 6 7 9 10; Test against fold: 8
Train classifier on folds: 1 2 3 4 5 6 7 8 10; Test against fold: 9
Train classifier on folds: 1 2 3 4 5 6 7 8 9; Test against fold: 10
In short:
Training is the process of providing feedback to the algorithm in order to adjust the predictive power of the classifier(s) it produces.
Testing is the process of determining the realistic accuracy of the classifier(s) which were produced by the algorithm. During testing, the classifier(s) are given never-before-seen instances of data to do a final confirmation that the classifier's accuracy is not drastically different from that during training.
However, you're missing a key step in the middle: the validation (which is what you're referring to in the 10-fold/k-fold cross validation).
Validation is (usually) performed after each training step and it is performed in order to help determine if the classifier is being overfitted. The validation step does not provide any feedback to the algorithm in order to adjust the classifier, but it helps determine if overfitting is occurring and it signals when the training should be terminated.
Think about the process in the following manner:
1. Train on the training data set.
2. Validate on the validation data set.
   if (change in validation accuracy > 0)
       3. repeat steps 1 and 2
   else
       3. stop training
4. Test on the testing data set.
In the k-fold method, you divide the data into k segments; k-1 of them are used for training, while one is left out and used for testing. This is done k times: the first time, the first segment is used for testing and the remaining ones for training; then the second segment is used for testing and the remaining ones for training; and so on. Your 10-fold example shows exactly this pattern.
Now about what training is and what testing is:
Training in classification is the part where a classification model is created using some algorithm. Popular algorithms for building such models include ID3 and C4.5.
Testing means to evaluate the classification model by running the model over the test data, and then creating a confusion matrix and then calculating the accuracy and error rate of the model.
In the k-fold method, k models are created (as is clear from the description above) and the most accurate model for classification is then selected.
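To make the splitting step concrete, here is a small Python sketch that only produces the train/test index sets for each fold; training and scoring a classifier on them is left out. The function name `k_fold_splits` is hypothetical, and the sketch assumes the samples are taken in their original order (no shuffling).

```python
def k_fold_splits(num_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(num_samples))
    # Spread the samples so that fold sizes differ by at most one.
    fold_sizes = [num_samples // k + (1 if f < num_samples % k else 0)
                  for f in range(k)]
    start = 0
    for size in fold_sizes:
        # The current segment is held out for testing...
        test = indices[start:start + size]
        # ...and every other segment is used for training.
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


for fold, (train, test) in enumerate(k_fold_splits(10, 5), start=1):
    print("fold", fold, "train:", train, "test:", test)
```

Each sample lands in the test set of exactly one fold, which matches the "Train classifier on folds ...; Test against fold ..." listing above.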