Okay, this is like the 5th time I have had to ask this question, and still nobody has been able to give me an answer or solution. But here we go again ...
I want to run a very simple little MATLAB program. All it does is RANDOMLY display images from a directory. Here is my code:
files = dir(fullfile(matlabroot,'toolbox','semjudge',bpic,'*.png'));
nFiles = numel(files);
combos = nchoosek(1:nFiles, 2);
index = combos(randperm(size(combos, 1)), :);
picture1 = files(index(nRep,1)).name;
picture2 = files(index(nRep,2)).name;
image1 = fullfile(matlabroot,'toolbox','semjudge',bpic,picture1);
image2 = fullfile(matlabroot,'toolbox','semjudge',bpic,picture2);
subplot(1,2,1); imshow(image1);
subplot(1,2,2); imshow(image2);
I have tried several different iterations of this, including replacing "nchoosek" with "randsample."
But it doesn't work! Every time I run the program, the script runs the same image files in the same order. Why is it doing this? It's like it randomized the image files the first time I ran it, but now it only runs them in that order, instead of randomizing them every time the script is run.
Can somebody please help me with this?
The pseudo-random number generator starts off from a specific seed. The "random" numbers provided are deterministic. You need to change the seed to change these numbers.
The benefit of this is that even if you use pseudo-randomness in your algorithm, you can always replay a run by using the same seed again.
Reference: http://www.mathworks.de/help/techdoc/ref/rng.html
As an elaboration of @ypnos's answer, you probably want to add a line like this:
rng('shuffle');
to the start of your code. That will seed the random number generator with a value based on the current time, so you should get a different sequence of random numbers on each run.
I'm trying to calculate the Kolmogorov-Smirnov statistic in R. I have the following sample, which clearly comes from a random variable that follows a long-tailed distribution.
Download link
https://drive.google.com/file/d/1hIgqikX7p343zdyc-Goq34THUpsZA63n/view?usp=sharing
As you may know, the Kolmogorov-Smirnov statistic requires the calculation of the empirical cumulative distribution function and the presumed cumulative distribution function. For both calculations I take the following approach: first, I create a vector with the same length as the sample, and then I set each component of the vector to the empirical cdf (or presumed cdf) of the corresponding observation of the sample.
For the sake of illustration, I'll show you the code I wrote in order to calculate the empirical cdf.
I'm assuming that the data has been read and stored in a dataframe called data.
ecdf = vector("numeric", length(data$logueos))
for (i in 1:length(data$logueos)) {
ecdf[i] = sum (data$logueos <= data$logueos[i])/length(data$logueos)
}
The code I wrote for the calculation of the presumed cdf is analogous to the preceding one; the only difference is that I set each component of the pcdf vector equal to the formula $P(X \le t)$, where t is the corresponding observation of the sample, according to the distribution that I'm assuming.
The problem is that this 'for' loop never ends. If I force it to end by clicking RStudio's stop button it works: it makes the vector store what I want it to store. But, if I press Ctrl+Shift+k in order to render my notebook and preview it, the load gets stuck when trying to execute the first chunk encountered that contains one of those loops.
First of all, your loop is not endless. It will finish, eventually.
You start by initializing a vector with as many elements as there are observations (1,245,888, which is a lot of iterations). This vector is full of zeros.
What your loop does is replace each zero, one at a time, with the result of sum (data$logueos <= data$logueos[i])/length(data$logueos). Each iteration compares one observation against the whole sample, so the total work grows with the square of the sample size. When you stop the execution, you will see that the first values of your vector are between 0 and 1 while the last values are still 0 (because the loop hasn't reached them yet).
So you will have to wait longer.
To speed up the execution, you could parallelize the loop. A standard loop runs its iterations sequentially, one by one; a parallel loop runs several at once (for example, four at a time, depending on your computer's capabilities). Here you'll find some information about it: https://nceas.github.io/oss-lessons/parallel-computing-in-r/parallel-computing-in-r.html
Then, my proposal to you:
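if(!require(doParallel)){install.packages("doParallel")}; require(doParallel)
registerDoParallel(detectCores() - 1)
x = data$logueos
n = length(x)
# %dopar% runs the iterations on the registered workers; .combine = c
# gathers the per-iteration results into one numeric vector
ecdf = foreach (i=1:n, .combine = c) %dopar% {
sum (x <= x[i])/n
}
The first line downloads (if necessary) and loads the doParallel library, which in turn loads the foreach and parallel libraries that you need for parallelization.
detectCores() - 1 uses all the processors that your computer has except one (to avoid freezing your machine) for computing this loop. You'll see that it is going to be faster!
The registerDoParallel function is what tells foreach how many cores to use.
Note that the loop uses %dopar% rather than %do% (which would run sequentially), and that the result is taken from the value returned by foreach, because assignments to ecdf[i] made inside parallel workers do not reach the vector in your session.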
I have a large, rather complicated procedural content generation lua project. One thing I want to be able to do, for debugging purposes, is use a random seed so that I can re-run the system & get the same results.
To that end, I print out the seed at the start of a run. The problem is, I still get completely different results each time I run it. Assuming the seed doesn't change anywhere else, this shouldn't be possible, right?
My question is, what other ways are there to influence the output of lua's math.random()? I've searched through all the code in the project, and there's only one place where I call math.randomseed(), and I do that before I do anything else. I don't use the time or date for any calculations, so that wouldn't be influencing the results... What else could I be missing?
Updated on 2/22/16: monkey patching math.random & math.randomseed has often (but not always) output the same sequence of random numbers. But the results are still not the same, so I guess the real question is now: what behavior in Lua is nondeterministic and could result in different output when the same code is run repeatedly? Noting where it diverges, when it does, is helping me narrow it down, but I still haven't found it. (This code does NOT use coroutines, so I don't think it's a threading / race condition issue.)
randomseed is using srandom/srand function, which "sets its argument as the seed for a new sequence of pseudo-random integers to be returned by random()".
I can offer several possible explanations:
you think you call randomseed, but you do not (random will initialize the sequence for you in this case).
you think you call randomseed once, but you call it multiple times (or some other part of the code calls randomseed as well, possibly at different times in your sequence).
some other part of the code calls random (some number of times), which generates different results for your part of the code.
there is nothing wrong with the generated sequence, but you are misinterpreting the results.
your version of Lua has a bug in srandom/random processing.
there is something wrong with srandom or random function in your system.
Having some information about your version of Lua and your system (in addition to the small example demonstrating the issue) would help in figuring out what's causing this.
Updated on 2016/2/22: It should be fairly easy to check; monkeypatch both math.randomseed and math.random and log all the calls and the values returned by the functions for two subsequent runs. Compare the results. If the results differ, you should be able to isolate why they differ and reproduce on a smaller example. You can also look at where the functions are called from using debug.traceback.
Correct, as stated in the documentation, 'equal seeds produce equal sequences of numbers.'
Immediately after setting the seed to a known constant value, print the result of a call to math.random. If this varies across runs, you know something is seriously wrong (corrupt library download, whack install, gamma ray hit your drive, etc).
Assuming that the first value matches across runs, add another output midway through the code. From there, you can use a binary search to zero in on where things go wrong (I.E. first half or second half of the code block in question).
While you can & should use some intuition to find the error as you go, keep in mind that if intuition alone was enough, you would have already found it, thus a bit of systematic elimination is warranted.
Revision to cover comment regarding array order:
If possible, use debugging tools. This SO post on detecting when the value of a Lua variable changes might help.
In the absence of tools, here's one way to roll your own for this problem:
A full debugging dump of any sizable array quickly becomes a mess that makes it tough to spot changes. Instead, I'd use a few extra variables & a test function to keep things concise.
Make two deep copies of the array. Let's call them debug01 & debug02 & call the original array original. Next, deliberately swap the order of two elements in debug02.
Next, build a function to compare two arrays & test if their elements match up & return / print the index of the first mismatch if they do not. Immediately after initializing the arrays, test them to ensure:
original & debug01 match
original & debug02 do not match
original & debug02 mismatch where you changed them
I cannot stress enough the insanity of using an unverified (and thus, potentially bugged) test function to track down bugs.
Once you've verified the function works, you can again use a binary search to zero in on where things go off the rails. As before, balance the use of a systematic search with your intuition.
I am running Octave 3.8.1. Even if I set the seed for the random number generator, poissrnd always produces a different number. Consider the following code, for example:
for i=1:2
rand('state',1); randn('state',1);
poissrnd(10)
end
Running it in MATLAB produces the same number in both iterations. Running it in Octave always produces a different number.
How can I correctly set a seed to poissrnd?
Thank you
Ok, I found the solution. You have to use randp('state',1). Therefore, the script
for i=1:2
randp('state',1);
poissrnd(10)
end
would always produce the same numbers.
Suppose that I have these three variables in MATLAB: OldGrayLevels, NewGrayLevels, and OldHistogram.
I want to extract the distinct values in NewGrayLevels and, for each distinct value, sum the rows of OldHistogram that are in the same rows as that value.
For example you see in NewGrayLevels that the six first rows are equal to zero. It means that 0 in the NewGrayLevels has taken its value from (0 1 2 3 4 5) of OldGrayLevels. So the corresponding rows in OldHistogram should be summed.
So 0+2+12+38+113+163=328 would be the frequency of the gray level 0 in the equalized histogram and so on.
Those who are familiar with image processing know that it's part of the histogram equalization algorithm.
Note that I don't want to use built-in function "histeq" available in image processing toolbox and I want to implement it myself.
I know how to write the algorithm with for loops. I'm wondering whether there is a faster way that doesn't use for loops.
The code using for loops:
for k=0:255
Condition = NewGrayLevels==k;
ConditionMultiplied = Condition.*OldHistogram;
NewHistogram(k+1,1) = sum(ConditionMultiplied);
end
I'm afraid this code will get slow for big, high-resolution images. The variables that I have uploaded are for a small image downloaded from the internet, but my code may be used for satellite images.
I know you say you don't want to use histeq, but it might be worth your time to look at the MATLAB source file to see how the developers wrote it and copy the parts of their code that you would like to implement. Just do edit('histeq') or edit('histeq.m'), I forget which.
Usually the MATLAB code is vectorized where possible and runs pretty quick. This could save you from having to reinvent the entire wheel, just the parts you want to change.
I can't think of a way to implement this without a for loop somewhere, but one optimisation you could make would be to use indexing instead of multiplication:
for k=0:255
Condition = NewGrayLevels==k; % These act as logical indices to OldHistogram
NewHistogram(k+1,1) = sum(OldHistogram(Condition)); % Removes a vector multiplication, some additions, and an index-to-double conversion
end
Edit:
On rereading your initial post, I think that the way to do this without a for loop is to use accumarray (I find this a difficult function to understand, so read the documentation and search online and on here for examples to do so):
NewHistogram = accumarray(1+NewGrayLevels,OldHistogram);
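This should work so long as NewGrayLevels and OldHistogram have the same number of elements. The result has max(NewGrayLevels)+1 rows (the +1 is because you are starting at zero); if you need exactly 256 rows, pass the output size as a third argument: accumarray(1+NewGrayLevels,OldHistogram,[256 1]).
Well, I understood that there's no need to write the code that @Hugh Nolan suggested. See the explanation here: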
%The green lines are because after writing the code, I understood that
%there's no need to calculate the equalized histogram in
%"HistogramEqualization" function and after gaining the equalized image
%matrix you can pass it to the "ExtractHistogram" function
% (which there's no loops in it) to acquire the
%equalized histogram.
%But I didn't delete those lines of code because I had tried a lot to
%understand the algorithm and write them.
For more information and studying the code, please see my next question.
I have some code which delivers things based on weighted random selection. Things with more weight are more likely to be randomly chosen. Now, being a good rubyist, I of course want to cover all this code with tests. And I want to test that things are getting fetched according to the correct probabilities.
So how do I test this? Creating tests for something that should be random makes it very hard to compare actual vs expected. A few ideas I have, and why they won't work great:
Stub Kernel.rand in my tests to return fixed values. This is cool, but rand() gets called multiple times and I'm not sure I can rig this with enough control to test what I need to.
Fetch a random item a HUGE number of times and compare the actual ratio vs the expected ratio. But unless I can run it an infinite number of times, this will never be perfect and could intermittently fail if I get some bad luck in the RNG.
Use a consistent random seed. This makes the RNG repeatable but it still doesn't give me any verification that item A will happen 80% of the time (for example).
So what kind of approach can I use to write test coverage for random probabilities?
I think you should separate your goals. One is to stub Kernel.rand as you mention. With rspec for example, you can do something like this:
test_values = [1, 2, 3]
Kernel.stub!(:rand).and_return( *test_values )
Note that this stub won't work unless you call rand with Kernel as the receiver. If you just call "rand" then the current "self" will receive the message, and you'll actually get a random number instead of the test_values.
The second goal is to do something like a field test where you actually generate random numbers. You'd then use some kind of tolerance to ensure you get close to the desired percentage. This is never going to be perfect though, and will probably need a human to evaluate the results. But it still is useful to do because you might realize that another random number generator might be better, like reading from /dev/random. Also, it's good to have this kind of test because let's say you decide to migrate to a new kind of platform whose system libraries aren't as good at generating randomness, or there's some bug in a certain version. The test could be a warning sign.
It really depends on your goals. Do you only want to test your weighting algorithm, or also the randomness?
It's best to stub Kernel.rand to return fixed values.
Kernel.rand is not your code. You should assume it works, rather than trying to write tests that test it rather than your code. And using a fixed set of values that you've chosen and explicitly coded in is better than adding a dependency on what rand produces for a specific seed.
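A minimal sketch of testing with explicitly chosen values. It assumes a hypothetical weighted_pick method that receives its random source as an argument; neither weighted_pick nor the FixedRolls test double comes from the question, they only illustrate the idea:

```ruby
# Hypothetical picker: `weights` maps item => weight; `rng` is anything
# that responds to #rand and returns a float in [0, 1).
def weighted_pick(weights, rng)
  target = rng.rand * weights.values.sum
  weights.each do |item, weight|
    return item if target < weight
    target -= weight
  end
  weights.keys.last # guard against floating-point rounding at the top end
end

# Test double that replays hand-picked values instead of random ones.
class FixedRolls
  def initialize(*values)
    @values = values
  end

  def rand
    @values.shift
  end
end

weights = { a: 80, b: 20 }
# Values chosen to sit on both sides of the 80/20 boundary.
rolls = FixedRolls.new(0.0, 0.79, 0.8, 0.99)
picks = Array.new(4) { weighted_pick(weights, rolls) }
p picks # => [:a, :a, :b, :b]
```

Because the values are written out in the test, it checks the boundaries of the weighting logic without depending on what any particular seed happens to produce.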
If you wanna go down the consistent seed route, look at Kernel#srand:
http://www.ruby-doc.org/core/classes/Kernel.html#M001387
To quote the docs (emphasis added):
Seeds the pseudorandom number generator to the value of number. If number is omitted or zero, seeds the generator using a combination of the time, the process id, and a sequence number. (This is also the behavior if Kernel::rand is called without previously calling srand, but without the sequence.) By setting the seed to a known value, scripts can be made deterministic during testing. The previous seed value is returned. Also see Kernel::rand.
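As a small illustration of the quoted behavior (the seed 1234 and the variable names are arbitrary choices for this sketch), reseeding with the same value replays the same sequence, and srand hands back the seed that was in effect before the call:

```ruby
# Seed the global generator, draw a few numbers, then reseed and draw again.
srand(1234)
first_run = Array.new(5) { rand(100) }

previous_seed = srand(1234) # returns the seed that was in effect before
second_run = Array.new(5) { rand(100) }

puts first_run == second_run # same seed, same sequence
puts previous_seed == 1234
```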
For testing, stub Kernel.rand with the following simple but perfectly reasonable LCPRNG:
@@q = 0
def r
  @@q = 1_103_515_245 * @@q + 12_345 & 0xffff_ffff
  (@@q >> 2) / 0x3fff_ffff.to_f
end
You might want to skip the division and use the integer result directly if your code is compatible, as all bits of the result would then be repeatable instead of just "most of them". This isolates your test from "improvements" to Kernel.rand and should allow you to test your distribution curve.
My suggestion: Combine #2 and #3. Set a random seed, then run your tests a very large number of times.
I do not like #1, because it means your test is super-tightly coupled to your implementation. If you change how you are using the output of rand(), the test will break, even if the result is correct. The point of a unit test is that you can refactor the method and rely on the test to verify that it still works.
Option #3, by itself, has the same problem as #1. If you change how you use rand(), you will get different results.
Option #2 is the only way to have a true black box solution that does not rely on knowing your internals. If you run it a sufficiently high number of times, the chance of random failure is negligible. (You can dig up a stats teacher to help you calculate "sufficiently high," or you can just pick a really big number.)
But if you're hyper-picky and "negligible" isn't good enough, a combination of #2 and #3 will ensure that once the test starts passing, it will keep passing. Even that negligible risk of failure only crops up when you touch the code under test; as long as you leave the code alone, you are guaranteed that the test will always work correctly.
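The combination of #2 and #3 could be sketched roughly as follows. The pick method, the 80/20 weights, the seed, the trial count, and the tolerance are all assumptions made for this illustration, not values from the question:

```ruby
# Hypothetical weighted picker, used only for illustration.
def pick(weights, rng)
  target = rng.rand * weights.values.sum
  weights.each do |item, weight|
    return item if target < weight
    target -= weight
  end
  weights.keys.last
end

weights   = { a: 80, b: 20 }
seed      = 12_345  # arbitrary fixed seed, so every run draws the same numbers
trials    = 100_000
tolerance = 0.01    # several standard deviations wide for this many trials

# Count how often each item comes up for a given seed.
simulate = lambda do |s|
  rng = Random.new(s)
  counts = Hash.new(0)
  trials.times { counts[pick(weights, rng)] += 1 }
  counts
end

counts = simulate.call(seed)
repeat = simulate.call(seed)

ratio_a = counts[:a].fdiv(trials)
within_tolerance = (ratio_a - 0.8).abs < tolerance

puts "ratio of :a = #{ratio_a}"
puts "within tolerance: #{within_tolerance}"
puts "repeatable: #{counts == repeat}"
```

The tolerance check is the black-box part (#2); the fixed seed (#3) is what keeps a passing test passing until the code under test changes.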
When I need predictable results from something that is derived from a random number, I usually want control of the RNG, and the easiest way to get it is to make the RNG injectable. Although overriding/stubbing rand can be done, Ruby provides a fine way to pass your code an RNG that is seeded with some value:
def compute_random_based_value(input_value, random: Random.new)
# ....
end
and then inject a Random object I make on the spot in the test, with a known seed:
rng = Random.new(782199) # Scientific dice roll
compute_random_based_value(your_input, random: rng)
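To make the pattern concrete, here is a runnable sketch with a made-up method body (the jitter logic is purely hypothetical; only the signature follows the snippet above). Two Random objects created with the same seed give the same result:

```ruby
# Hypothetical body for illustration: adds a random offset in 0..9 to the input.
def compute_random_based_value(input_value, random: Random.new)
  input_value + random.rand(10)
end

a = compute_random_based_value(100, random: Random.new(782199))
b = compute_random_based_value(100, random: Random.new(782199))

puts a == b # same seed, same result
```

In production code the caller simply omits the random: argument and gets a freshly seeded generator.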