Can persistent communication be used in a for loop?

I want to replace MPI_Isend() and MPI_Irecv() with persistent communication to reduce the run time of my parallel program.
However, the variables being communicated change during the iteration, so I altered the code in the communication function as follows.
My questions are:
Is this still persistent communication after the modification?
If it is not, why did the run time of the program go down while the result of the computation is still correct?
int main()
{
for()
{
//calculation process 1 will change value of q1,q2,b1,b2,b3,b4
....
//communication process
void comm()
{
/*if(rightBoundary==virtualWall)
{
rightProcessNum=processNum+decompositionY*decompositionZ;
for(int k=1;k<nz+1;k++)
for(int j=1;j<ny+1;j++)
{
MPI_Isend(&q1[n][k][j][nx],1,MPI_DOUBLE,rightProcessNum,20,MPI_COMM_WORLD,&req1[0]);
MPI_Isend(&b1 [k][j][nx],1,MPI_DOUBLE,rightProcessNum,21,MPI_COMM_WORLD,&req1[1]);
MPI_Isend(&b2 [k][j][nx],1,MPI_DOUBLE,rightProcessNum,22,MPI_COMM_WORLD,&req1[2]);
MPI_Isend(&b3 [k][j][nx],1,MPI_DOUBLE,rightProcessNum,23,MPI_COMM_WORLD,&req1[3]);
MPI_Isend(&b4 [k][j][nx],1,MPI_DOUBLE,rightProcessNum,24,MPI_COMM_WORLD,&req1[4]);
MPI_Isend(&q2[n][k][j][nx],1,MPI_DOUBLE,rightProcessNum,25,MPI_COMM_WORLD,&req1[5]);
MPI_Waitall(6,req1,sta1);
}
}
if(leftBoundary==virtualWall)
{
leftProcessNum=processNum-decompositionY*decompositionZ;
for(int k=1;k<nz+1;k++)
for(int j=1;j<ny+1;j++)
{
MPI_Irecv(&q1[n][k][j][0],1,MPI_DOUBLE,leftProcessNum,20,MPI_COMM_WORLD,&req2[0]);
MPI_Irecv(&b1 [k][j][0],1,MPI_DOUBLE,leftProcessNum,21,MPI_COMM_WORLD,&req2[1]);
MPI_Irecv(&b2 [k][j][0],1,MPI_DOUBLE,leftProcessNum,22,MPI_COMM_WORLD,&req2[2]);
MPI_Irecv(&b3 [k][j][0],1,MPI_DOUBLE,leftProcessNum,23,MPI_COMM_WORLD,&req2[3]);
MPI_Irecv(&b4 [k][j][0],1,MPI_DOUBLE,leftProcessNum,24,MPI_COMM_WORLD,&req2[4]);
MPI_Irecv(&q2[n][k][j][0],1,MPI_DOUBLE,leftProcessNum,25,MPI_COMM_WORLD,&req2[5]);
MPI_Waitall(6,req2,sta2);
}
}*/
if(rightBoundary==virtualWall)
{
rightProcessNum=processNum+decompositionY*decompositionZ;
for(int k=1;k<nz+1;k++)
{
for(int j=1;j<ny+1;j++)
{
MPI_Send_init(&q1[n][k][j][nx],1,MPI_DOUBLE,rightProcessNum,20,MPI_COMM_WORLD,&req1[0]);
MPI_Send_init(&b1 [k][j][nx],1,MPI_DOUBLE,rightProcessNum,21,MPI_COMM_WORLD,&req1[1]);
MPI_Send_init(&b2 [k][j][nx],1,MPI_DOUBLE,rightProcessNum,22,MPI_COMM_WORLD,&req1[2]);
MPI_Send_init(&b3 [k][j][nx],1,MPI_DOUBLE,rightProcessNum,23,MPI_COMM_WORLD,&req1[3]);
MPI_Send_init(&b4 [k][j][nx],1,MPI_DOUBLE,rightProcessNum,24,MPI_COMM_WORLD,&req1[4]);
MPI_Send_init(&q2[n][k][j][nx],1,MPI_DOUBLE,rightProcessNum,25,MPI_COMM_WORLD,&req1[5]);
MPI_Startall(6,req1);
}
}
MPI_Waitall(6,req1,sta1);
MPI_Request_free(req1);
}
if(leftBoundary==virtualWall)
{
leftProcessNum=processNum-decompositionY*decompositionZ;
for(int k=1;k<nz+1;k++)
{
for(int j=1;j<ny+1;j++)
{
MPI_Recv_init(&q1[n][k][j][0],1,MPI_DOUBLE,leftProcessNum,20,MPI_COMM_WORLD,&req2[0]);
MPI_Recv_init(&b1 [k][j][0],1,MPI_DOUBLE,leftProcessNum,21,MPI_COMM_WORLD,&req2[1]);
MPI_Recv_init(&b2 [k][j][0],1,MPI_DOUBLE,leftProcessNum,22,MPI_COMM_WORLD,&req2[2]);
MPI_Recv_init(&b3 [k][j][0],1,MPI_DOUBLE,leftProcessNum,23,MPI_COMM_WORLD,&req2[3]);
MPI_Recv_init(&b4 [k][j][0],1,MPI_DOUBLE,leftProcessNum,24,MPI_COMM_WORLD,&req2[4]);
MPI_Recv_init(&q2[n][k][j][0],1,MPI_DOUBLE,leftProcessNum,25,MPI_COMM_WORLD,&req2[5]);
MPI_Startall(6,req2);
}
}
MPI_Waitall(6,req2,sta2);
MPI_Request_free(req2);
}
}
//calculation process 2
....
}
}
I appreciate your time and suggestions on this problem.

Related

Akka.Net BecomeStacked/UnbecomeStacked behavior issue

I have a problem with the behavior switch model.
I have a simple receive actor with 2 behaviors: Ready & DoJob.
The Ready one contains a message handler plus one instruction I need to be evaluated at each behavior switch (cpt++).
Below is the code of the actor:
public class BecomeUnbecome : ReceiveActor
{
private int cpt=0;
public BecomeUnbecome()
{
this.Become(this.Ready);
}
public void Ready()
{
cpt++;
Receive<BeginWork>(msg =>
{
Console.WriteLine($"Go and work!");
BecomeStacked(this.DoJob);
});
}
public void DoJob()
{
Receive<Work>(msg =>
{
Console.WriteLine("Start working...");
Console.WriteLine($"Counter: {cpt}\nWork done\n");
UnbecomeStacked();
});
}
}
The main code is:
int counter = 0;
while (counter < 10)
{
actor.Tell(new BeginWork());
actor.Tell(new Work());
counter++;
}
Running the program shows that cpt++ in Ready() is evaluated only once, right after the call to Become in the constructor.
I cannot find a reasonable workaround for that.
Does anyone have any ideas?

Why do I get two events from Particle.publish?

I am using code like this on a Particle Electron to report pulse counts from a flow meter on my kegerator to the Particle cloud:
void meterInterrupt(void) {
detachInterrupt(pin);
ticks++;
cloudPending = 1;
attachInterrupt(pin, meterInterrupt, FALLING);
}
void publishStatus() {
if (!cloudPending) {
return;
}
cloudPending = 0;
getStatus(&statusMessage);
// status message contains number of ticks since last publish
bool published = Particle.publish("Ticks", statusMessage, PRIVATE);
if (published) {
resetMeters();
lastPublish = millis();
}
}
void loop() {
if ((millis() - lastPublish) >= 1000) {
publishStatus();
}
}
When I curl the event log into my terminal, I see two events for the first publish like so:
event: Ticks
data: {"data":"ticks:1","ttl":60,"published_at":"2018-07-03T22:35:01.008Z","coreid":"420052000351353337353037"}
event: hook-sent/Ticks
data: {"data":"","ttl":60,"published_at":"2018-07-03T22:35:01.130Z","coreid":"particle-internal"}
event: Ticks
data: {"data":"ticks:46","ttl":60,"published_at":"2018-07-03T22:35:01.193Z","coreid":"420052000351353337353037"}
event: hook-sent/Ticks
data: {"data":"","ttl":60,"published_at":"2018-07-03T22:35:01.303Z","coreid":"particle-internal"}
I don't see how this could happen. Why didn't it just report "ticks:47"? What am I missing?
UPDATE:
I did some further testing and noticed that Particle.publish is returning false the first time when it is actually completing successfully. Is this a timeout issue? The time difference between these publishes is only about 200ms.
OK, this is at least a partial answer.
It appears that Particle.publish is asynchronous. It returns the promise of an answer that starts out as false and only eventually becomes true when/if the action actually completes. If I wait some arbitrary amount of time (say delay(10)) after Particle.publish and before checking the return code, the return value indicates the actual success or failure of the publish. My code cannot work this way, because the ticks counted while I wait are deleted when I reset the meters. WITH_ACK gives me the same behavior.
I will have to modify my code so that no ticks are lost during the long-running Particle.publish. I am thinking that each statusMessage should go onto a list until it is acked by the server.
FINAL ANSWER:
I modified the code to close the window during which I can receive ticks that would then be wiped out when I reset the counters. I do this by capturing the ticks into an array and then resetting the tick counter (meter). I am using a library called PublishQueueAsyncRK (kudos to rickkas7, this library is great!) so I can just fire and forget. Check it out on GitHub.
void publishStatus() {
unsigned int counters[NUM_METERS];
unsigned int pending = 0; // must start at zero before the += below
for (int i = 0; i < NUM_METERS; i++) {
meter_t *meter = &meters[i];
counters[i] = meter->ticks;
pending += counters[i];
resetMeter(i);
}
if (pending) {
String statusReport;
for (int i = 0; i < NUM_METERS; i++) {
statusReport.concat(String::format("%i:%u|", i+1, counters[i]));
}
publishReport(statusReport);
lastPublished = millis();
}
}
void publishReport(String report) {
if (report != "") {
publishQueue.publish("PourLittleTicks", report, PRIVATE);
}
}
void loop() {
if ((millis() - lastPublished) >= PUBLISH_INTERVAL) {
publishStatus();
}
}
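The capture-then-reset step above still reads the counter and resets it as two separate operations, so if resetMeter simply zeroes the counter, a pulse arriving between the two could still be lost. Below is a minimal sketch of doing both in one indivisible step. It assumes a toolchain where std::atomic is available and safe to use from the interrupt handler, which has not been verified for the Electron. The names ticks, onPulse and takeTicks are hypothetical and are not part of the code above.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for meter->ticks; not the poster's actual variable.
std::atomic<unsigned int> ticks{0};

// What the interrupt handler would do for each pulse.
void onPulse() {
    ticks.fetch_add(1);
}

// Read the current count and reset it to zero in a single atomic operation,
// so no pulse can land between the read and the reset.
unsigned int takeTicks() {
    return ticks.exchange(0);
}
```

With this shape, publishStatus would call takeTicks() once per meter and publish whatever it returns. Every pulse is counted either in this report or in the next one.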

Correct way of synchronization between a method and a stop functionality

I have a function (let's call it function A) that zero or more threads can access at the same time (no shared resources). At any given time, the user can stop the process. The stop functionality needs to make sure that no threads are still executing function A, so that a graceful shutdown can be performed. Is there a native procedure to do so?
What I was going to do is InterlockedIncrement an integer every time function A is called (with a corresponding InterlockedDecrement on that integer when function A exits). When an InterlockedDecrement takes place, it checks the value of the integer; if it is zero, an event is set to signalled. If the value is not zero, the event is set to nonsignalled.
This makes sense in my mind, but I'm curious whether there is a more native structure or functionality suited to this.
I still have to think about the fact that the "stop" function may get starved (in the sense that the integer may never reach zero). A side note: when the stop event takes place, further InterlockedIncrement calls shall be refused, to reduce that starvation.
What you need and want to implement is called run-down protection. Unfortunately it is supported only in kernel mode, but it is not hard to implement yourself in user mode too.
The simplest implementation is the following:
HANDLE ghStopEvent;
LONG gLockCount = 1;
BOOLEAN bStop = FALSE;
void unlock()
{
if (!InterlockedDecrement(&gLockCount)) SetEvent(ghStopEvent);
}
BOOL lock()
{
LONG Value = gLockCount, NewValue;
for ( ; !bStop && Value; Value = NewValue)
{
NewValue = InterlockedCompareExchange(&gLockCount, Value + 1, Value);
if (NewValue == Value) return TRUE;
}
return FALSE;
}
void funcA();
void UseA()
{
if (lock())
{
funcA();
unlock();
}
}
and when you want to begin rundown, call once
bStop = TRUE; unlock();
As you can see, the lock function is an interlocked increment of gLockCount by 1, but only if it is not 0.
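For readers outside Win32, the same "increment only if not zero" idea can be sketched with std::atomic. This is an illustrative counterpart to lock()/unlock() above, not a drop-in replacement. The class name RundownGate and its members are hypothetical, and a plain flag stands in for the event that SetEvent would signal.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical portable counterpart of lock()/unlock(); names are illustrative.
class RundownGate {
    std::atomic<long> count_{1};        // starts at 1: the reference held until rundown begins
    std::atomic<bool> stop_{false};     // set once rundown has been requested
    std::atomic<bool> drained_{false};  // stands in for the signalled stop event
public:
    // Increment the counter, but only while it is non-zero and no stop was requested.
    bool lock() {
        long value = count_.load();
        while (!stop_.load() && value != 0) {
            // On failure compare_exchange_weak reloads `value`, so the loop retries with fresh data.
            if (count_.compare_exchange_weak(value, value + 1)) return true;
        }
        return false;
    }
    // Drop one reference; whoever drops the last one marks the gate as drained.
    void unlock() {
        if (count_.fetch_sub(1) == 1) drained_.store(true);
    }
    // Must be called exactly once: refuses new entries and releases the initial reference.
    void begin_rundown() {
        stop_.store(true);
        unlock();
    }
    bool drained() const { return drained_.load(); }
};
```

A caller would wrap funcA() in lock()/unlock() exactly as UseA() does above. The stopping thread would call begin_rundown() and then wait until drained() becomes true. A real implementation would block on a condition variable or an event here instead of polling.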
In kernel mode you can instead call
EX_RUNDOWN_REF gRunRef;
void UseA()
{
if (ExAcquireRundownProtection(&gRunRef))
{
funcA();
ExReleaseRundownProtection(&gRunRef);
}
}
and in place of the final unlock, call ExWaitForRundownProtectionRelease.
A somewhat more complex and scalable implementation of rundown protection:
#define RUNDOWN_INIT_VALUE 0x80000000
#define RUNDOWN_COMPLETE_VALUE 0
class __declspec(novtable) RUNDOWN_REF
{
LONG _LockCount;
protected:
virtual void RundownCompleted() = 0;
public:
BOOL IsRundownBegin()
{
return 0 <= _LockCount;
}
void Reinit()
{
if (InterlockedCompareExchange(&_LockCount, RUNDOWN_INIT_VALUE, RUNDOWN_COMPLETE_VALUE) != RUNDOWN_COMPLETE_VALUE)
{
__debugbreak();
}
}
RUNDOWN_REF()
{
_LockCount = RUNDOWN_INIT_VALUE;
}
BOOL AcquireRundownProtection()
{
LONG Value = _LockCount, NewValue;
for ( ; Value < 0; Value = NewValue)
{
NewValue = InterlockedCompareExchange(&_LockCount, Value + 1, Value);
if (NewValue == Value) return TRUE;
}
return FALSE;
}
void ReleaseRundownProtection()
{
if (RUNDOWN_COMPLETE_VALUE == InterlockedDecrement(&_LockCount))
{
RundownCompleted();
}
}
void BeginRundown()
{
if (AcquireRundownProtection())
{
_interlockedbittestandreset(&_LockCount, 31);
ReleaseRundownProtection();
}
}
};
and use it like:
class MY_RUNDOWN_REF : public RUNDOWN_REF
{
HANDLE _hEvent;
virtual void RundownCompleted()
{
SetEvent(_hEvent);
}
// ...
} gRunRef;
void UseA()
{
if (gRunRef.AcquireRundownProtection())
{
funcA();
gRunRef.ReleaseRundownProtection();
}
}
and when you want to stop:
gRunRef.BeginRundown();// can safely be called multiple times
// wait on gRunRef._hEvent here
Interestingly, the kernel has one more API for this, remove locks (older: it dates from Windows 2000, whereas rundown protection dates from XP). It does almost the same thing and differs only in internal implementation and usage. With remove locks the code looks like this:
IO_REMOVE_LOCK gLock;
void UseA()
{
if (0 <= IoAcquireRemoveLock(&gLock, 0))
{
funcA();
IoReleaseRemoveLock(&gLock, 0);
}
}
and when we want to stop, call
IoAcquireRemoveLock(&gLock, 0);
IoReleaseRemoveLockAndWait(&gLock, 0);
My first code snippet is close to the remove-lock implementation, while the second is close to the rundown-protection implementation, but in essence both do the same thing.

alBufferData() sets AL_INVALID_OPERATION when using buffer ID obtained from alSourceUnqueueBuffers()

I am trying to stream audio data from disk using OpenAL's buffer queueing mechanism. I load and enqueue 4 buffers, start the source playing, and check in at regular intervals to refresh the queue. Everything looks like it's going splendidly, up until the first time I try to load data into a recycled buffer I got from alSourceUnqueueBuffers(). In this situation, alBufferData() always sets AL_INVALID_OPERATION, which, according to the official v1.1 spec, it doesn't seem like it should be able to do.
I have searched extensively on Google and StackOverflow, and can't seem to find any reason why this would happen. The closest thing I found was someone with a possibly-related issue in an archived forum post, but details are few and responses are null. There was also this SO question with slightly different circumstances, but the only answer's suggestion does not help.
Possibly helpful: I know my context and device are configured correctly, because loading small wav files completely into a single buffer and playing them works fine. Through experimentation, I've also found that queueing 2 buffers, starting the source playing, and immediately loading and enqueueing the other two buffers throws no errors; it's only when I've unqueued a processed buffer that I run into trouble.
The relevant code:
static constexpr int MAX_BUFFER_COUNT = 4;
#define alCall(funcCall) {funcCall; SoundyOutport::CheckError(__FILE__, __LINE__, #funcCall) ? abort() : ((void)0); }
bool SoundyOutport::CheckError(const string &pFile, int pLine, const string &pfunc)
{
ALenum tErrCode = alGetError();
if(tErrCode != 0)
{
auto tMsg = alGetString(tErrCode);
Log::e(ro::TAG) << tMsg << " at " << pFile << "(" << pLine << "):\n"
<< "\tAL call " << pfunc << " failed." << end;
return true;
}
return false;
}
void SoundyOutport::EnqueueBuffer(const float* pData, int pFrames)
{
static int called = 0;
++called;
ALint tState;
alCall(alGetSourcei(mSourceId, AL_SOURCE_TYPE, &tState));
if(tState == AL_STATIC)
{
Stop();
// alCall(alSourcei(mSourceId, AL_BUFFER, NULL));
}
ALuint tBufId = AL_NONE;
int tQueuedBuffers = QueuedUpBuffers();
int tReady = ProcessedBuffers();
if(tQueuedBuffers < MAX_BUFFER_COUNT)
{
tBufId = mBufferIds[tQueuedBuffers];
}
else if(tReady > 0)
{
// the fifth time through, this code gets hit
alCall(alSourceUnqueueBuffers(mSourceId, 1, &tBufId));
// debug code: make sure these values go down by one
tQueuedBuffers = QueuedUpBuffers();
tReady = ProcessedBuffers();
}
else
{
return; // no update needed yet.
}
void* tConverted = convert(pData, pFrames);
// the fifth time through, we get AL_INVALID_OPERATION, and call abort()
alCall(alBufferData(tBufId, mFormat, tConverted, pFrames * mBitdepth/8, mSampleRate));
alCall(alSourceQueueBuffers(mSourceId, 1, &mBufferId));
if(mBitdepth == BITDEPTH_8)
{
delete (uint8_t*)tConverted;
}
else // if(mBitdepth == BITDEPTH_16)
{
delete (uint16_t*)tConverted;
}
}
void SoundyOutport::PlayBufferedStream()
{
if(!StreamingMode() || !QueuedUpBuffers())
{
Log::w(ro::TAG) << "Attempted to play an unbuffered stream" << end;
return;
}
alCall(alSourcei(mSourceId, AL_LOOPING, AL_FALSE)); // never loop streams
alCall(alSourcePlay(mSourceId));
}
int SoundyOutport::QueuedUpBuffers()
{
int tCount = 0;
alCall(alGetSourcei(mSourceId, AL_BUFFERS_QUEUED, &tCount));
return tCount;
}
int SoundyOutport::ProcessedBuffers()
{
int tCount = 0;
alCall(alGetSourcei(mSourceId, AL_BUFFERS_PROCESSED, &tCount));
return tCount;
}
void SoundyOutport::Stop()
{
if(Playing())
{
alCall(alSourceStop(mSourceId));
}
int tBuffers;
alCall(alGetSourcei(mSourceId, AL_BUFFERS_QUEUED, &tBuffers));
if(tBuffers)
{
ALuint tDummy[tBuffers];
alCall(alSourceUnqueueBuffers(mSourceId, tBuffers, tDummy));
}
alCall(alSourcei(mSourceId, AL_BUFFER, AL_NONE));
}
bool SoundyOutport::Playing()
{
ALint tPlaying;
alCall(alGetSourcei(mSourceId, AL_SOURCE_STATE, &tPlaying));
return tPlaying == AL_PLAYING;
}
bool SoundyOutport::StreamingMode()
{
ALint tState;
alCall(alGetSourcei(mSourceId, AL_SOURCE_TYPE, &tState));
return tState == AL_STREAMING;
}
bool SoundyOutport::StaticMode()
{
ALint tState;
alCall(alGetSourcei(mSourceId, AL_SOURCE_TYPE, &tState));
return tState == AL_STATIC;
}
And here's an annotated screen cap of what I see in my debugger when I hit the error:
I've tried a bunch of little tweaks and variations, and the result is always the same. I've wasted too many days trying to fix this. Please help :)
This error occurs when you try to fill a buffer with data while the buffer is still queued on the source.
Also, this code is wrong:
if(tQueuedBuffers < MAX_BUFFER_COUNT)
{
tBufId = mBufferIds[tQueuedBuffers];
}
else if(tReady > 0)
{
// the fifth time through, this code gets hit
alCall(alSourceUnqueueBuffers(mSourceId, 1, &tBufId));
// debug code: make sure these values go down by one
tQueuedBuffers = QueuedUpBuffers();
tReady = ProcessedBuffers();
}
else
{
return; // no update needed yet.
}
You can fill a buffer with data only if it has been unqueued from the source. But your first if block picks a tBufId that is still queued on the source. Rewrite the code like so:
if(tReady > 0)
{
// the fifth time through, this code gets hit
alCall(alSourceUnqueueBuffers(mSourceId, 1, &tBufId));
// debug code: make sure these values go down by one
tQueuedBuffers = QueuedUpBuffers();
tReady = ProcessedBuffers();
}
else
{
return; // no update needed yet.
}
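One way to keep track of which buffers are safe to fill is to record explicitly which IDs are free and which are queued, rather than inferring it from the queued count. The sketch below models only that bookkeeping and makes no OpenAL calls. BufferPool, acquire and enqueue are hypothetical names, and the comments mark where alSourceUnqueueBuffers and alSourceQueueBuffers would go in real code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical bookkeeping model; it does not talk to OpenAL.
class BufferPool {
    std::vector<std::uint32_t> free_;   // never queued yet, or unqueued: safe to fill
    std::deque<std::uint32_t> queued_;  // attached to the source, oldest first
public:
    explicit BufferPool(const std::vector<std::uint32_t>& ids) : free_(ids) {}

    // Pick a buffer that may be filled. `processed` stands in for AL_BUFFERS_PROCESSED.
    // Returns false when nothing can be filled yet.
    bool acquire(int processed, std::uint32_t& id) {
        if (!free_.empty()) {                       // initial fill: buffer was never queued
            id = free_.back();
            free_.pop_back();
            return true;
        }
        if (processed > 0 && !queued_.empty()) {    // recycle: alSourceUnqueueBuffers goes here
            id = queued_.front();
            queued_.pop_front();
            return true;
        }
        return false;
    }

    // Record that `id` was handed back to the source (alSourceQueueBuffers goes here).
    void enqueue(std::uint32_t id) { queued_.push_back(id); }

    std::size_t queuedCount() const { return queued_.size(); }
};
```

The design point is that a buffer ID is only ever handed to the fill step from the free list or straight after an unqueue, so a buffer still attached to the source is never filled.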

Removing From ArrayList, In Loop Based On Its Size, But Breaking After Remove Still Gives OutOfBounds

Alright, so I remove an object from an ArrayList and then break, but I still get an out-of-bounds exception. I'm kind of confused; could someone help me? I've tried to isolate the problem, but I still can't figure it out.
Here is the error I get:
Exception in thread "Thread-2" java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at java.util.ArrayList.rangeCheck(ArrayList.java:635)
at java.util.ArrayList.get(ArrayList.java:411)
at GameFunctions.closestTarget(GameFunctions.java:37)
at GameFunctions.act(GameFunctions.java:147)
at GamePanel$1.run(GamePanel.java:50)
Here are the two methods causing me problems:
public void act(ArrayList<Germs> g, ArrayList<WhiteBloodCell> p, String actor)
{
if(actor.equals("Germs"))
{
for(int i=0;i<g.size();i++)
{
if(g.get(i) instanceof SPneumoniae)
{
g.get(i).move();
g.get(i).testBounds();
if(checkSize(g, i))
{
System.out.println("broken");
break;
}
}
}
}
else if(actor.equals("WhiteBloodCells"))
{
for(int i=0;i<p.size();i++)
{
p.get(i).setTarget(closestTarget(g, p.get(i)));
p.get(i).move();
}
}
}
And here is the method called that's removing the object:
public boolean checkSize(ArrayList<Germs> g, int i)
{
if(g.get(i).getRadius() > 30)
{
g.get(i).setRadius(30);
}
else if(g.get(i).getRadius() <= 0)
{
g.remove(i);
return true;
}
return false;
}
It looks like the error is due to the ArrayList g being empty.
Check out this area of your code:
else if(actor.equals("WhiteBloodCells"))
{
for(int i=0;i<p.size();i++)
{
p.get(i).setTarget(closestTarget(g, p.get(i)));
p.get(i).move();
}
}
See if that gives you any leads.
Edit: the bug IS coming from the closestTarget function, based on the stack trace listed.
