Parallel loops and Random produce odd results

I just started playing with the Task Parallel Library, and ran into interesting issues; I have a general idea of what is going on, but would like to hear comments from people more competent than me to help understand what is happening. My apologies for the somewhat lengthy code.
I started with a non-parallel simulation of a random walk:
var random = new Random();
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
var simulations = new List<int>();
for (var run = 0; run < 20; run++)
{
    var position = 0;
    for (var step = 0; step < 10000000; step++)
    {
        if (random.Next(0, 2) == 0)
        {
            position--;
        }
        else
        {
            position++;
        }
    }
    Console.WriteLine(string.Format("Terminated run {0} at position {1}.", run, position));
    simulations.Add(position);
}
Console.WriteLine(string.Format("Average position: {0} .", simulations.Average()));
stopwatch.Stop();
Console.WriteLine(string.Format("Time elapsed: {0}", stopwatch.ElapsedMilliseconds));
Console.ReadLine();
I then wrote my first attempt at a parallel loop:
var localRandom = new Random();
stopwatch.Reset();
stopwatch.Start();
var parallelSimulations = new List<int>();
Parallel.For(0, 20, run =>
{
    var position = 0;
    for (var step = 0; step < 10000000; step++)
    {
        if (localRandom.Next(0, 2) == 0)
        {
            position--;
        }
        else
        {
            position++;
        }
    }
    Console.WriteLine(string.Format("Terminated run {0} at position {1}.", run, position));
    parallelSimulations.Add(position);
});
Console.WriteLine(string.Format("Average position: {0} .", parallelSimulations.Average()));
stopwatch.Stop();
Console.WriteLine(string.Format("Time elapsed: {0}", stopwatch.ElapsedMilliseconds));
Console.ReadLine();
When I ran it on a virtual machine set to use 1 core only, I observed a similar duration, but the runs are no longer processed in order - no surprise.
When I ran it on a dual-core machine, things went odd. I saw no improvement in time, and observed some very weird results for each run. Most runs end up with results of -1,000,000 (or very close), which indicates that Random.Next is returning 0 almost all the time.
When I make the random local to each loop, everything works just fine, and I get the expected duration improvement:
Parallel.For(0, 20, run =>
{
    var localRandom = new Random();
    var position = 0;
My guess is that the problem has to do with the fact that the Random object is shared between the loops and has some state. The lack of improvement in duration in the "failing parallel" version is, I assume, due to the fact that the calls to Random are not processed in parallel (even though I see that the parallel version uses both cores, whereas the original doesn't). The piece I really don't get is why the simulation results are what they are.
One separate worry I have is that if I use Random instances local to each loop, I may run into the problem of having multiple loops starting with the same seed (the issue you get when you generate multiple Randoms too close in time, resulting in identical sequences).
Any insight in what is going on would be very valuable to me!
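(For reference, one common mitigation for that seeding worry is to draw each per-loop seed from a single, locked seeding Random instead of relying on the clock; a rough sketch:)
    var seeder = new Random();
    var seedLock = new object();

    Parallel.For(0, 20, run =>
    {
        int seed;
        lock (seedLock)
        {
            seed = seeder.Next();   // distinct seeds even for runs that start at the same instant
        }
        var localRandom = new Random(seed);
        // ... run the walk with localRandom as before ...
    });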

Neither of these approaches will give you really good random numbers.
This blog post covers a number of approaches for getting better random numbers with Random:
Link
These may be fine for many day-to-day applications.
However, if you use the same random number generator on multiple threads, even with different seeds, you will still impact the quality of your random numbers. This is because the sequences of pseudo-random numbers you generate may overlap.
This video explains why in a bit more detail:
http://software.intel.com/en-us/videos/tim-mattson-use-and-abuse-of-random-numbers/
If you want truly random numbers, you need to use the crypto random number generator System.Security.Cryptography.RNGCryptoServiceProvider. It is thread-safe.
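For reference, a minimal sketch of pulling a coin flip from RNGCryptoServiceProvider (the CryptoCoinFlip/NextStep names are just illustrative; GetBytes is the actual API call):
    using System;
    using System.Security.Cryptography;

    static class CryptoCoinFlip
    {
        // The provider can be shared; GetBytes fills the buffer with
        // cryptographically strong random bytes.
        private static readonly RNGCryptoServiceProvider Rng =
            new RNGCryptoServiceProvider();

        // Returns -1 or +1, i.e. one step of the random walk above.
        public static int NextStep()
        {
            var buffer = new byte[1];
            Rng.GetBytes(buffer);
            return (buffer[0] & 1) == 0 ? -1 : +1;
        }
    }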

The Random class is not thread-safe; if you use one instance from multiple threads, its internal state can get corrupted and it can start returning nothing but zeros.
You should create a separate Random instance on each thread, and make sure that they don't end up using the same seed (e.g., Environment.TickCount * Thread.CurrentThread.ManagedThreadId).
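A minimal sketch of that idea, assuming .NET 4's ThreadLocal<T> so each worker thread lazily gets its own seeded instance:
    // Requires System.Threading and System.Threading.Tasks.
    var threadRandom = new ThreadLocal<Random>(() =>
        new Random(Environment.TickCount * Thread.CurrentThread.ManagedThreadId));

    Parallel.For(0, 20, run =>
    {
        var random = threadRandom.Value;   // each thread sees only its own Random
        var position = 0;
        for (var step = 0; step < 10000000; step++)
        {
            position += random.Next(0, 2) == 0 ? -1 : 1;
        }
        Console.WriteLine("Terminated run {0} at position {1}.", run, position);
    });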

One core problem:
random.Next is not thread-safe.
Two ramifications:
Quality of the randomness is destroyed by race conditions.
False sharing destroys scalability on multicores.
Several possible solutions:
Make random.Next thread-safe (as sketched below): solves the quality issue but not scalability.
Use multiple PRNGs: solves the scalability issue but may degrade quality.
...
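To make the first option above concrete, here is a minimal sketch that wraps the shared Random in a lock: the corruption goes away, but every caller is serialized, so it will not scale across cores.
    private static readonly Random SharedRandom = new Random();
    private static readonly object RandomLock = new object();

    private static int NextSafe(int minValue, int maxValue)
    {
        // All threads funnel through this lock, so the internal state of
        // SharedRandom is never corrupted, at the cost of contention.
        lock (RandomLock)
        {
            return SharedRandom.Next(minValue, maxValue);
        }
    }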

Related

Performance drop loop vs iterator

I am using Kotlin in combination with LWJGL. So far I had the following code that ran several thousand times per second:
// val textureMap = HashMap<Int, Texture>()
fun bind() {
    var index = 0
    for (entry in textureMap) {
        glActiveTexture(GL_TEXTURE0 + index)
        entry.value.bind()
        program.setInt(entry.key, index)
        ++index
    }
}
While this ran absolutely fast and, as expected, consumed virtually none of my frame time, I had to replace it because it created an Iterator on every call. That eventually led to tens of thousands of those objects getting garbage collected, halting my program for a few milliseconds, which is of course not usable in my application.
So I went ahead and changed it to the following code:
// textures = ArrayList<Texture>()
// indices = ArrayList<Int>()
fun bind() {
    var index = 0
    while (index < textures.size) {
        val uniform = indices[index]
        val texture = textures[index]
        glActiveTexture(GL_TEXTURE0 + index)
        texture.bind()
        program.setInt(uniform, index)
        ++index
    }
}
Now for some reason I am noticing a massive drop in performance, namely the function now uses several seconds per frame. Using jvisualvm I was able to determine that all that time is spent in glActiveTexture in the native part as well as the native function in program.setInt(...). I am absolutely stumped why this is the case, especially after comparing the byte code of the two.
This is the decompiled class file for the first (fast) version:
public final void bind()
{
    int index = 0;
    Map localMap = (Map)this.textureMap;
    for (Map.Entry entry : localMap.entrySet())
    {
        GL13.glActiveTexture(33984 + index);
        ((Texture)entry.getValue()).bind();
        Program tmp66_63 = this.program;
        if (tmp66_63 == null) {
            Intrinsics.throwUninitializedPropertyAccessException("program");
        }
        tmp66_63.setInt(((Number)entry.getKey()).intValue(), index);
        index++;
    }
}
And this is the decompiled class file for the slow version:
public final void bind()
{
    int index = 0;
    while (index < this.textures.size())
    {
        Integer uniform = (Integer)this.indices.get(index);
        Texture texture = (Texture)this.textures.get(index);
        GL13.glActiveTexture(33984 + index);
        texture.bind();
        Program tmp52_49 = this.program;
        if (tmp52_49 == null) {
            Intrinsics.throwUninitializedPropertyAccessException("program");
        }
        Integer tmp62_61 = uniform;
        Intrinsics.checkExpressionValueIsNotNull(tmp62_61, "uniform");
        tmp52_49.setInt(tmp62_61.intValue(), index);
        index++;
    }
}
I am extremely confused about what is going on here. In both versions the call to glActiveTexture is GL_TEXTURE0 + <an int value>, yet one takes so much more time than the other.
Does anyone have an idea what I am missing here?
Basically my entire question can be removed. I should have debugged and not only profiled. The problem was the code that populated the lists: it didn't remove the old values, so the lists grew larger and larger and the loop simply ran many more times as time went on.
In case anyone was wondering how I fixed my original allocation problem: I essentially created two collections, one containing the uniforms and one mapping them to textures. I can then iterate over the uniforms by index and look up the respective texture, so no pointless Iterator objects are created, but I also don't end up with any duplicates :)

Timer Efficiency

I'm working on an AS3 project and for one of the effects I use timers to switch the colors then stop. The function is below.
//global variable
private var valueAnimationTimer:Timer = new Timer(50);
//constructor
valueAnimationTimer.addEventListener(TimerEvent.TIMER, scrollUp );
//function
private function scrollUp(e:TimerEvent):void
{
    var i:int = e.currentTarget.currentCount as int;
    if (i < 10)
    {
        if (colored) {
            if (i % 2 == 0) {
                ChangeColor(ico, flickerColor);
            }
            else {
                ico.transform.colorTransform = new ColorTransform();
            }
        }
        tfValue.y -= 7.5;
    }
    else
    {
        RemoveFilters(ico);
        tfValue.y = ico.height / 2;
        e.currentTarget.reset();
        RemoveSprite(tfValue);
        colored = false;
    }
}
Each character (object) has its own version of this function, and it happens at different times (like when it is injured or poisoned). The listener is added once in the constructor and is only removed when the character dies and is removed from the stage. The issue here is that after the timer is used on at least 3 characters, the frame rate begins to drop. Every time the function is called, the frame rate drops lower and lower.
What I don't understand is: if the timer is stopped, and the listeners are only added once so they don't pile up, then why does the frame rate begin to decline after the listener is actually used? It doesn't run forever, only for a small amount of time, but it happens again and again. When the frame rate drops, the entire program begins to lag badly and eventually freezes. I have no idea what is causing this.
Also be aware that in the Timer constructor, the first number is the delay in MILLISECONDS and the second is the repeat count:
var fl_TimerInstance:Timer = new Timer(240000, 1);
So the example above is a 4-minute timer that repeats once.
I bring this up because yours is set to 50 milliseconds, which is very quick.

FMOD Ex dropping sounds, eventually going silent

I'm attempting to port an old open-source FMOD 3 game (Candy Crisis) to the latest version of FMOD Ex 4 on OS X. Its sound needs are very simple: it plays WAVs, sometimes changing their frequency or speaker mix, and also plays MOD tracker music, sometimes changing the speed. I'm finding that the game works fine at first, but over the course of a few minutes it starts truncating sounds early, then the music loses channels and eventually stops, then over time all sound ceases. I can cause the problem to reproduce more quickly if I lower the number of channels available to FMOD.
I can get the truncated/missing sounds issue to occur even if I never play a music file, but music definitely seems to make things worse. I have also tried commenting out the code which adjusts the sound frequency and speaker mix, and that was not the issue.
I am calling update() every frame.
Here's the entirety of my interactions with FMOD to play WAVs:
void InitSound( void )
{
    FMOD_RESULT result = FMOD::System_Create(&g_fmod);
    FMOD_ERRCHECK(result);
    unsigned int version;
    result = g_fmod->getVersion(&version);
    FMOD_ERRCHECK(result);
    if (version < FMOD_VERSION)
    {
        printf("Error! You are using an old version of FMOD %08x. This program requires %08x\n", version, FMOD_VERSION);
        abort();
    }
    result = g_fmod->init(8 /* was originally 64, but 8 repros the issue faster */, FMOD_INIT_NORMAL, 0);
    FMOD_ERRCHECK(result);
    for (int index=0; index<kNumSounds; index++)
    {
        result = g_fmod->createSound(QuickResourceName("snd", index+128, ".wav"), FMOD_DEFAULT, 0, &s_sound[index]);
        FMOD_ERRCHECK(result);
    }
}

void PlayMono( short which )
{
    if (soundOn)
    {
        FMOD_RESULT result = g_fmod->playSound(FMOD_CHANNEL_FREE, s_sound[which], false, NULL);
        FMOD_ERRCHECK(result);
    }
}

void PlayStereoFrequency( short player, short which, short freq )
{
    if (soundOn)
    {
        FMOD::Channel* channel = NULL;
        FMOD_RESULT result = g_fmod->playSound(FMOD_CHANNEL_FREE, s_sound[which], true, &channel);
        FMOD_ERRCHECK(result);
        result = channel->setSpeakerMix(player, 1.0f - player, 0, 0, 0, 0, 0, 0);
        FMOD_ERRCHECK(result);
        float channelFrequency;
        result = s_sound[which]->getDefaults(&channelFrequency, NULL, NULL, NULL);
        FMOD_ERRCHECK(result);
        result = channel->setFrequency((channelFrequency * (16 + freq)) / 16);
        FMOD_ERRCHECK(result);
        result = channel->setPaused(false);
        FMOD_ERRCHECK(result);
    }
}

void UpdateSound()
{
    g_fmod->update();
}
And here's how I play MODs.
void ChooseMusic( short which )
{
    if( musicSelection >= 0 && musicSelection <= k_songs )
    {
        s_musicChannel->stop();
        s_musicChannel = NULL;
        s_musicModule->release();
        s_musicModule = NULL;
        musicSelection = -1;
    }
    if (which >= 0 && which <= k_songs)
    {
        FMOD_RESULT result = g_fmod->createSound(QuickResourceName("mod", which+128, ""), FMOD_DEFAULT, 0, &s_musicModule);
        FMOD_ERRCHECK(result);
        result = g_fmod->playSound(FMOD_CHANNEL_FREE, s_musicModule, true, &s_musicChannel);
        FMOD_ERRCHECK(result);
        EnableMusic(musicOn);
        s_musicModule->setLoopCount(-1);
        s_musicChannel->setPaused(false);
        musicSelection = which;
        s_musicPaused = 0;
    }
}
If someone wants to experiment with this, let me know and I'll upload the project somewhere. My gut feeling is that FMOD is busted but I'd love to be proven wrong.
Sounds like your music needs to be set as higher priority than your other sounds. Remember, lower numbers are more important. I think you can just set the priority on the channel.
Every time I play the following WAV, FMOD loses one channel permanently. I am able to reproduce this channel-losing behavior in the "playsound" example if I replace the existing jaguar.wav with my file.
https://drive.google.com/file/d/0B1eDRY8sV_a9SXMyNktXbWZOYWs/view?usp=sharing
I contacted Firelight and got this response. Apparently WAVs can include a looping command! I had no idea.
Hello John,
I've taken a look at the two files you have provided. Both files end
with a 2 sample infinite loop region.
FMOD 4 (and FMOD 5 for that matter) will see the loop region in the
file and automatically enable FMOD_LOOP_NORMAL if you haven't
specified any loop mode. Assuming you want one-shot behavior just pass
in FMOD_LOOP_OFF when you create the sound.
Kind regards, Mathew Block | Senior Platform Engineer
Technically this behavior contradicts the documented behavior of FMOD_DEFAULT (which is specified to imply FMOD_LOOP_OFF) so they are planning to improve the documentation here.
Based on the wave sample you supplied, FMOD is behaving correctly as it appears you've figured out. The sample has a loop that is honored by FMOD and the last samples are simply repeated forever. While useless, this is correct and the variance in the samples is so slight as to not be audible. While not part of the original spec for wave format, extended information was added later to support meta data such as author, title, comments and multiple loop points.
Your best bet is to examine all your source assets for those that contain loop information. Simply playing all sounds without loop information is probably not the best workaround; some loops may be intentional, and those that are will have code that stops them. Typically, in a game, the entire waveform is looped when looping is desired. You can then write or use a tool that will strip the loop information. If you do write your own tool, I'd recommend resampling the audio to the native output sampling rate of the hardware. You'd need to ensure your resampler was sample accurate (no time shift) and did not introduce noise.
Historically, some game systems had a section at the end of the sound with silence and a loop point set on this region. The short reason for this was to reduce popping that might occur at the end of a sound in a hardware audio channel.
Curiously, the last 16 samples of your .wav look like garbage, and I'm wondering if the .wav assets you're using were converted from a source meant for a game console and that's where the bogus loop information came from as well.
This would have been a comment but my lowly rep does not allow it.

1-2-3 star level awards algorithm

What I am trying to do is to have individual star counts per level based on player performance (1-2-3 star awards). This will be based on what region the player reaches. I know how to award the stars, but keeping track of it all is giving me problems. First, let's say a player plays level 2 and receives 1 star for their performance. Then at a later time, s/he returns to the level and gets 2 stars. I would like the star count for that specific scene to update to two stars, while only adding 1 star (the one extra s/he got this time) to the totalStarCount.
My initial plan was to have variables:
OldStarCount
NewStarCount
TotalStarCount
Then, when a player reaches, say, region1 and is awarded one star, NewStarCount would be set to one, and then:
TotalStarCount = TotalStarCount + (NewStarCount - OldStarCount);
Then update OldStarCount = NewStarCount;
Set NewStarCount = 0;
Move On to next Scene;
Am I approaching this the correct way? Any help would be greatly appreciated.
You could have something like this:
int result = 0;
int totalStars = 0;
int[] starCounts = new int[NumberOfRegions];
...
currentRegion = 42;
result = play(currentRegion);
if (result > starCounts[currentRegion]) {
    totalStars += result - starCounts[currentRegion];
    starCounts[currentRegion] = result;
}
This is just an example of what you could do. There are obvious scalability issues with this (what happens when you want to add new regions, etc), but you get the gist.
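If the number of regions isn't fixed up front, one way around that scalability issue is a dictionary keyed by region id; a rough sketch (StarTracker and the member names are just illustrative):
    using System.Collections.Generic;

    // Rough sketch: track the best star result seen so far for each region.
    class StarTracker
    {
        private readonly Dictionary<int, int> bestStars = new Dictionary<int, int>();

        public int TotalStars { get; private set; }

        public void RecordResult(int region, int starsEarned)
        {
            int previous;
            bestStars.TryGetValue(region, out previous);   // stays 0 for a region not seen before

            if (starsEarned > previous)
            {
                TotalStars += starsEarned - previous;      // only add the improvement
                bestStars[region] = starsEarned;
            }
        }
    }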

Write code to make CPU usage display a sine wave

Write code in your favorite language
and let Windows Task Manager represent
a sine wave in CPU Usage History.
This is a technical interview quiz from Microsoft China.
I think it's a good question. In particular, it's worth seeing how a candidate understands the problem and figures out a solution.
Edit: It's a good point if the solution can also handle multi-core (CPU) cases.
A thread time slice in Windows is 40ms, iirc, so that might be a good number to use as the 100% mark.
unsigned const TIME_SLICE = 40;
float const PI = 3.14159265358979323846f;

while(true)
{
    for(unsigned x=0; x!=360; ++x)
    {
        float t = sin(static_cast<float>(x)/180*PI)*0.5f + 0.5f;
        DWORD busy_time = static_cast<DWORD>(t*TIME_SLICE);
        DWORD wait_start = GetTickCount();
        while(GetTickCount() - wait_start < busy_time)
        {
        }
        Sleep(TIME_SLICE - busy_time);
    }
}
This would give a period of about 14 seconds. Obviously this assumes there is no other significant cpu usage in the system, and that you are only running it on a single CPU. Neither of these is really that common in reality.
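Regarding the multi-core point raised in the question: a rough C# sketch of the same busy/sleep idea, run on one thread per logical core so the combined CPU graph follows the sine. This is untested against a real Task Manager; without pinning thread affinity the per-core graphs may be noisy, but the total load should still track the curve.
    using System;
    using System.Threading;

    class SineLoad
    {
        static void Main()
        {
            const int timeSlice = 40;   // ms, same 100% mark as above

            for (int core = 0; core < Environment.ProcessorCount; core++)
            {
                new Thread(() =>
                {
                    while (true)
                    {
                        for (int x = 0; x < 360; x++)
                        {
                            // Duty cycle between 0 and 1 following a sine wave.
                            double load = Math.Sin(x * Math.PI / 180) * 0.5 + 0.5;
                            int busy = (int)(load * timeSlice);

                            int start = Environment.TickCount;
                            while (Environment.TickCount - start < busy)
                            {
                                // spin: keep this core busy
                            }
                            Thread.Sleep(timeSlice - busy);
                        }
                    }
                }).Start();
            }
        }
    }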
Here's a slightly modified version of #flodin's solution in Python:
#!/usr/bin/env python
import itertools, math, time, sys

time_period = float(sys.argv[1]) if len(sys.argv) > 1 else 30    # seconds
time_slice = float(sys.argv[2]) if len(sys.argv) > 2 else 0.04   # seconds

N = int(time_period / time_slice)
for i in itertools.cycle(range(N)):
    busy_time = time_slice / 2 * (math.sin(2*math.pi*i/N) + 1)
    t = time.perf_counter() + busy_time
    while t > time.perf_counter():
        pass
    time.sleep(time_slice - busy_time)
A CPU-curve can be fine-tuned using time_period and time_slice parameters.
OK, I have a different, probably BETTER solution than my first answer.
Instead of trying to manipulate the CPU, hook into the Task Manager app and force it to draw what you want instead of the CPU results: take over the GDI object that plots the graph, etc. Sort of "cheating", but they didn't say you had to manipulate the CPU.
Or even hook the call from Task Manager that gets the CPU %, returning a sine result instead.
With the literally hundreds (thousands?) of threads a PC runs today, the only way I can think of to even come close would be to poll CPU usage as fast as possible and, if the usage % is below where it should be on the curve, fire off a short method that just churns numbers. That will at least bring the typical low usage UP where needed, but I can't think of a good way to LOWER it without somehow taking control of other threads and doing something such as forcing their priority lower.
Something like this:
while(true)
{
    for(int i=0; i<360; i++)
    {
        // some code to convert i into radians if needed
        ...
        Thread.Sleep(Math.Sin(i)*something_that_makes_it_noticeable_number_of_ms+something_that_makes_it_non_negative)
        // some work to make the cpu busy, may be increased to a bigger number to see the influence on the cpu.
        for(j=0;j<100;j++);
    }
}
