Here my first steps with Oracle Advanced Queueing...
Szenario: I have a running application where many, many multiple independ processes report back to a central controller to handle the next steps. Simplified the processes are started via cron or via callback of a just finished process.The callbacks are from remote hosts via http -> php -> DB, basicly one http-call after the process has finished on the remote host.
The complete controller logic was written in pl/sql with a singleton concept in mind, so only one process should execute the controller logic at the same time. In fact in 99% of all calls this is not necessary, but that's not the kind of thing I could change at the moment (nor the architecture in general).
To ensure this there is actually a bad mutex implementation, pseudo-code
$mutex = false;
while( not $mutex )
{
$mutex = getMutex();
if( $mutex )
executeController();
else
sleep(5);
}
Wherein the mutex is a one field table having the values 0 (=> "free") or 1 ( => "busy" )
The result of this "beautiful" contstruction is log-file full of "Hey! Got no mutex! Waiting...". And the more processes wait, the longer they wait with no control of who's next. Sometimes the load gets so heavy that the apache first forks and finally dies...
Solution
So my first "operation" would be to replace the mutex with Oracle Advanced Queueing with the controller as single-consumer. Benefits: No more "busy waiting" within the apache layer, strict first come first serve.
( Because all the DB-Actions take place in the same oracle-schema, this could be achieved with standard-objects, pl/sql-methods as well. But why reinvent the wheel, if there are dbms-packages?)
As far as I read using the listen-feature (polling the queued items) in this context is far better than the registration-feaure (scheduling an action when a message arrives).
Basicly everything works fine, i managed to:
create the message type
create the queue-table
create the queue
start the queue
add USER as subscriber
create a procedure for enqueueing
create a procedure for processing & dequeueing
create a procedure for listening to the queue and calling the "process & dequeue"-function when a message arrives.
Of course the listener shall be active 24/7, so i specified no "wait" time. In general depending on the time of the day he will get "something to do" at least every few minutes, more likely every few seconds, sometimes more.
Now here is my problem (if it actually is a problem), i just wrote it according to the examples i found so far:
CREATE OR REPLACE PROCEDURE demo_aq_listener IS
qlist dbms_aq.aq$_agent_list_t;
agent_w_msg sys.aq$_agent;
BEGIN
qlist(0) := sys.aq$_agent(USER, 'demo_aq_queue', NULL);
LOOP
dbms_aq.listen(agent_list => qlist, agent => agent_w_msg);
DEMO_AQ_DEQUEUE();--process & dequeue
END LOOP;
END;
/
Calling the procedure basically does what i expect: It stays "up" and prosseces the queued messages.
But is this the way to do this? What does it do if there are no queued messages? "Sleeping" within the dbms_aq.listen-routine or "Looping as fast as it can", so that I just have implemented another way of "busy waiting"? Might there be a timeout (maybe on oss-level or elsewhere) i just didn't reach?
Here is the complete code with queue-definition etc.: demo_dbms_aq_with_listener.sql
UPDATE
Through further testing i just realized that it seems, that i got a far greater lack of understanding then i hoped :(
On "execution level" don't using the listener at all and just looping the dequeue function has the same effect: It waits for the first/next message
CREATE OR REPLACE PROCEDURE demo_aq_listener IS
BEGIN
LOOP
DEMO_AQ_DEQUEUE();
END LOOP;
END;
/
At least this is easier to test, calling only
BEGIN
DEMO_AQ_DEQUEUE();
END;
/
Also just waits for the first message. Which leaves me totally confused wether I need the listener at all and if what i'am doing does make any sense at all :(
Conclusion
I don't need the listener at all, because i have a single consumer who can treat all messages in the same way.
But the key/core Question stays the same: Is it ok to keep DBMS_AQ.DEQUEUE on "maybe active waiting" in a loop knowing it'll get messages all day long in short intervalls?
(you'll find DEMO_AQ_DEQUEUE() in linked sql-file above)
Better late than never, everything's fine, it is idle waiting:
1) Whilst the DEQUEUE is in sleep mode (WAIT FOREVER), I can see the session is waiting on the event - "Streams AQ: waiting for messages in the queue", that is an IDLE wait class and not actually consuming ANY resources, correct ?
Correct. It's similar to waiting on a row lock on a table. You just "sit there"
https://asktom.oracle.com/pls/apex/asktom.search?tag=writing-a-stand-alone-application-to-continuously-monitor-a-database-queue-aq
Related
Is there a special "wait for event" function that can wait for 3 queues at the same time at device side so it doesn't wait for all queues serially from host side?
Is there a checkpoint command to send into a command queue such that it must wait for other command queues to hit same(vertically) barrier/checkpoint to wait and continue from device side so no host-side round-trip is needed?
For now, I tried two different versions:
clWaitForEvents(3, evt_);
and
int evtStatus0 = 0;
clGetEventInfo(evt_[0], CL_EVENT_COMMAND_EXECUTION_STATUS,
sizeof(cl_int), &evtStatus0, NULL);
while (evtStatus0 > 0)
{
clGetEventInfo(evt_[0], CL_EVENT_COMMAND_EXECUTION_STATUS,
sizeof(cl_int), &evtStatus0, NULL);
Sleep(0);
}
int evtStatus1 = 0;
clGetEventInfo(evt_[1], CL_EVENT_COMMAND_EXECUTION_STATUS,
sizeof(cl_int), &evtStatus1, NULL);
while (evtStatus1 > 0)
{
clGetEventInfo(evt_[1], CL_EVENT_COMMAND_EXECUTION_STATUS,
sizeof(cl_int), &evtStatus1, NULL);
Sleep(0);
}
int evtStatus2 = 0;
clGetEventInfo(evt_[2], CL_EVENT_COMMAND_EXECUTION_STATUS,
sizeof(cl_int), &evtStatus2, NULL);
while (evtStatus2 > 0)
{
clGetEventInfo(evt_[2], CL_EVENT_COMMAND_EXECUTION_STATUS,
sizeof(cl_int), &evtStatus2, NULL);
Sleep(0);
}
second one is a bit faster(I saw it from someone else) and both are executed after 3 flush commands.
Looking at CodeXL profiler results, first one waits longer between finish points and some operations don't even seem to be overlapping. Second one shows 3 finish points are all within 3 milliseconds so it is faster and longer parts are overlapped(read+write+compute at the same time).
If there is a way to achieve this with only 1 wait command from host side, there must a "flush" version of it too but I couldn't find.
Is there any way to achieve below picture instead of adding flushes between each pipeline step?
queue1 write checkpoint write checkpoint write
queue2 - compute checkpoint compute checkpoint compute
queue3 - checkpoint read checkpoint read
all checkpoints have to be vertically synchronized and all these actions must not start until a signal is given. Such as:
queue1.ndwrite(...);
queue1.ndcheckpoint(...);
queue1.ndwrite(...);
queue1.ndcheckpoint(...);
queue1.ndwrite(...);
queue2.ndrangekernel(...);
queue2.ndcheckpoint(...);
queue2.ndrangekernel(...);
queue2.ndcheckpoint(...);
queue2.ndrangekernel(...);
queue3.ndread(...);
queue3.ndcheckpoint(...);
queue3.ndread(...);
queue3.ndcheckpoint(...);
queue3.ndread(...);
queue1.flush()
queue2.flush()
queue3.flush()
queue1.finish()
queue2.finish()
queue3.finish()
checkpoints are all handled in device side and only 3 finish commands are needed from host side(even better,only 1 finish for all queues?)
How I bind 3 queues to 3 events with "clWaitForEvents(3, evt_);" for now is:
hCommandQueue->commandQueue.enqueueBarrierWithWaitList(NULL, &evt[0]);
hCommandQueue2->commandQueue.enqueueBarrierWithWaitList(NULL, &evt[1]);
hCommandQueue3->commandQueue.enqueueBarrierWithWaitList(NULL, &evt[2]);
if this "enqueue barrier" can talk with other queues, how could I achieve that? Do I need to keep host-side events alive until all queues are finished or can I delete them or re-use them later? From the documentation, it seems like first barrier's event can be put to second queue and second one's barrier event can be put to third one along with first one's event so maybe it is like:
hCommandQueue->commandQueue.enqueueBarrierWithWaitList(NULL, &evt[0]);
hCommandQueue2->commandQueue.enqueueBarrierWithWaitList(evt_0, &evt[1]);
hCommandQueue3->commandQueue.enqueueBarrierWithWaitList(evt_0_and_1, &evt[2]);
in the end wait for only evt[2] maybe or using only 1 same event for all:
hCommandQueue->commandQueue.enqueueBarrierWithWaitList(sameEvt, &evt[0]);
hCommandQueue2->commandQueue.enqueueBarrierWithWaitList(sameEvt, &evt[1]);
hCommandQueue3->commandQueue.enqueueBarrierWithWaitList(sameEvt, &evt[2]);
where to get sameEvt object?
anyone tried this? Should I start all queues with a barrier so they dont start until I raise some event from host side or lazy-executions of "enqueue" is %100 trustable to "not to start until I flush/finish" them? How do I raise an event from host to device(sameEvt doesn't have a "raise" function, is it clCreateUserEvent?)?
All 3 queues are in-order type and are in same context. Out-of-order type is not supported by all graphics cards. C++ bindings are being used.
Also there are enqueueWaitList(is this deprecated?) and clEnqueueMarker but I don't know how to use them and documentation doesn't have any example in Khronos' website.
You asked too many questions and expressed too many variants to provide you with the only solution, so I will try to answer in general that you can figure out the most suitable solution.
If the queues are bind to the same context (possibly to different devices within the same context) than it is possible to synchronize them through the events. I.e. you can obtain an event from a command submitted to one queue and use this event to synchronize a command submitted to another queue, e.g.
queue1.enqueue(comm1, /*dependency*/ NULL, /*result event*/ &e1);
queue2.enqueue(comm2, /*dependency*/ &e1, /*result event*/ NULL);
In this example, comm2 will wait for comm1 completion.
If you need to enqueue commands first but no to allow them to be executed you can create user event (clCreateUserEvent) and signal it manually (clSetUserEventStatus). The implementation is allowed to process command as soon as they enqueued (the driver is not required to wait for the flush).
The barrier seems overkill for your purpose because it waits for all commands previously submitted to the queue. You can really use clEnqueueMarker that can be used to wait for all events and provide one event to be used for other commands.
As far as I know you can retain the event at any moment if you do not need it more. The implementation should prolong the event life-time if it is required for internal purposes.
I do not know what is enqueueWaitList.
Off-topic: if you need non-trivial dependencies between calculations you may want to consider TBB flow graph and opencl_node. The opencl_node uses events for syncronization and avoids "host-device" synchronizations if possible. However, it can be tricky to use multiple queues for the same device.
As far as I know, Intel HD Graphics 530 supports out-of-order queues (at least host-side).
You are making it much harder than it needs to be. On the write queue take an event. Use that as a condition for the compute on the compute queue, and take another event. Use that as a condition on the read on the read queue. There is no reason to force any other synchronization. Note: My interpretation of the spec is that you must clFlush on a queue that you took an event from before using that event as a condition on another queue.
This is Oracle 11.2.0.3.
We've got a problem where we use Oracle's JMS over OracleAQ. This works fine except we started noticing the queue getting filled with 1000s, then millions of messages over time. Some of these are in the PROCESSED state, but most are READY. We traced down this behavior to "zombie" or dead subscribers to the topic. When a Java process is terminated and doesn't get the chance to unregister itself, it leaves the subscriber record in the queue and ORacle doesn't seem to detect that it is dead. So much so that MONTHS later, a new message sent into our multi-subscriber queue will then get multiplied by the # of subscribers, which it thinks is much higher than it actually is. (We first noticed this when we reached the maximum subscriber limit.)
We've got the qmon processes running - I even tried increasing the minimum # of processes to no effect. The queue clean-up happens really nicely as long as there are no dead subscribers in the queue.
Anyone see this before, and hopefully found a solution?
Ok, So I could not have a better solution than this:
1) Create your subscriber with a name and keep track of the subscriber's name.
2) Make sure that you have a shutdown hook to application to execute below procedure, which will unsubscribe and de-register the subscriber.
3) In case of unexpected shutdown/crash, when un-subscription could not be done, there must be a cleanup task to execute below code:
DECLARE
aqAgent SYS.AQ$_AGENT;
BEGIN
for idx in (select consumer_name from
DBA_QUEUE_SUBSCRIBERS a where a.queue_name = '<Your Oracle AQ Name>') loop
aqAgent := SYS.AQ$_AGENT(idx.consumer_name, NULL, NULL);
DBMS_AQADM.REMOVE_SUBSCRIBER('<Your Oracle AQ Name>', aqAgent);
end loop;
END;
This will make sure that your system remains full-proof.
Using MS Visual Studio 2008 C++ for Windows 32 (XP brand), I try to construct a POP3 client managed from a modeless dialog box.
Te first step is create a persistent object -say pop3- with all that Boost.asio stuff to do asynchronous connections, in the WM_INITDIALOG message of the dialog-box-procedure. Some like:
case WM_INITDIALOG:
return (iniPop3Dlg (hDlg, lParam));
Here we assume that iniPop3Dlg() create the pop3 heap object -say pointed out by pop3p-. Then connect with the remote server, and a session is initiated with the client’s id and password (USER and PASS commands). Here we assume that the server is in TRANSACTION state.
Then, in response to some user input, the dialog-box-procedure, call the appropriate function. Say:
case IDS_TOTAL: // get how many emails in the server
total (pop3p);
return FALSE;
case IDS_DETAIL: // get date, sender and subject for each email in the server
detail (pop3p);
return FALSE;
Note that total() uses the POP3’s STAT command to get how many emails in the server, while detail() uses two commands consecutively; first STAT to get the total and then a loop with the GET command to retrieve the content of each message.
As an aside: detail() and total() share the same subroutines -the STAT handle routine-, and when finished, both leaves the session as-is. That is, without closing the connection; the socket remains opened an the server in TRANSACTION state.
When any option is selected by the first time, the things run as expected, obtaining the desired results. But when making the second chance, the connection hangs.
A closer inspection show that the first time that the statement
socket_.get_io_service().run();
Is used, never ends.
Note that all asynchronous write and read routines uses the same io_service, and each routine uses socket_.get_io_service().reset() prior to any run()
Not also that all R/W operations also uses the same timer, who is reseted to zero wait after each operation is completed:
dTimer_.expires_from_now (boost::posix_time::seconds(0));
I suspect that the problem is in the io_service or in the timer, and the fact that subsequent executions occurs in a different load of the routine.
As a first approach to my problem, I hope that someone would bring some light in it, prior to a more detailed exposition of the -very few and simple- routines involved.
Have you looked at the asio examples and studied them? There are several asynchronous examples that should help you understand the basic control flow. Pay particular importance to the main event loop started by invoking io_service::run, it's important to understand control is not expected to return to the caller until the io_service has no more remaining work to do.
I'm working in a service whose main loop looks like this:
while (fServer.ServerState = ssStarted) and (Self.Terminated = false) do
begin
Self.ServiceThread.ProcessRequests(false);
ProcessFiles;
Sleep(3000);
end;
ProcessRequests is a lot like Application.ProcessMessages. I can't pass true to it because if I do then it blocks until a message is received from Windows, and ProcessFiles won't run, and it has to run continually. The Sleep is there to keep the CPU usage down.
This works just fine until I try to shut down the service from Windows's service management list. When I hit Stop, it sends a message and expects to get a response almost immediately, and if it's in the middle of that Sleep command, Windows will give me an error that the service didn't respond to the Stop command.
So what I need is to say "Sleep for 3000 or until you receive a message, whichever comes first." I'm sure there's an API for that, but I'm not sure what it is. Does anyone know?
This kind of stuff is hard to get right, so I usually start at the API documentation at MSDN.
The WaitForSingleObject documention specifically directs to MsgWaitForMultipleObjects for these kinds of situations:
Use caution when calling the wait
functions and code that directly or
indirectly creates windows. If a
thread creates any windows, it must
process messages. Message broadcasts
are sent to all windows in the system.
A thread that uses a wait function
with no time-out interval may cause
the system to become deadlocked. Two
examples of code that indirectly
creates windows are DDE and the
CoInitialize function. Therefore, if
you have a thread that creates
windows, use MsgWaitForMultipleObjects
or MsgWaitForMultipleObjectsEx, rather
than WaitForSingleObject.
In MsgWaitForMultipleObjects, you have a dwWakeMask parameter specifying on which queued messages to return, and a table describing the masks you can use.
Edit because of comment by Warren P:
If your main loop can be continued because of a ReadFileEx, WriteFileEx or QueueUserAPC, then you can use SleepEx.
--jeroen
MsgWaitForMultipleObjects() is the way to go, ie:
while (fServer.ServerState = ssStarted) and (not Self.Terminated) do
begin
ProcessFiles;
if MsgWaitForMultipleObjects(0, nil, FALSE, 3000, QS_ALLINPUT) = WAIT_OBJECT_0 then
Self.ServiceThread.ProcessRequests(false);
end;
If you want to call ProcessFiles() at 3 second intervals regardless of any messages arriving, then you can use a waitable timer for that, ie:
var
iDue: TLargeInteger;
hTimer: array[0..0] of THandle;
begin
iDue := -30000000; // 3 second relative interval, specified in nanoseconds
hTimer[0] := CreateWaitableTimer(nil, False, nil);
SetWaitableTimer(hTimer[0], iDue, 0, nil, nil, False);
while (fServer.ServerState = ssStarted) and (not Self.Terminated) do
begin
// using a timeout interval so the loop conditions can still be checked periodically
case MsgWaitForMultipleObjects(1, hTimer, False, 1000, QS_ALLINPUT) of
WAIT_OBJECT_0:
begin
ProcessFiles;
SetWaitableTimer(hTimer[0], iDue, 0, nil, nil, False);
end;
WAIT_OBJECT_0+1: Self.ServiceThread.ProcessRequests(false);
end;
end;
CancelWaitableTimer(hTimer[0]);
CloseHandle(hTimer[0]);
end;
Use a timer to run ProcessFiles instead of hacking it into main application loop. Then ProcessFiles will run in the interval you want and the messages will be processed correctly, not taking 100 % CPU.
I used a TTimer in a multithreaded application with strange results, so now i use Events.
while (fServer.ServerState = ssStarted) and (Self.Terminated = false) do
begin
Self.ServiceThread.ProcessRequests(false);
ProcessFiles;
if ExitEvent.WaitFor(3000) <> wrTimeout then
Exit;
end;
You create the event with
ExitEvent := TEvent.Create(nil, False, False, '');
Now the last thing is to fire the event in case of service stop. I think the Stop event of the service is the right place to put this.
ExitEvent.SetEvent;
I use this code for an cleanup thread in my DB connections pooling system, but it should work well in your case too.
You don't need to sleep for 3 full seconds to keep the CPU usage low. Even something like Sleep(500) should keep your usage pretty low (if there are no messages waiting to process it should blow through the loop pretty quick and hit the sleep again. If your loop takes a few ms to run it still means your thread is spending the vast majority of time in sleep.
That being said, your code may benefit from some refactoring. You say you don't want ProcessRequests to block waiting for a message? The only other thing in that loop is ProcessFiles. If that is dependent on the message being processed then why can't it block? And if it's not dependent on the message being processed then can it be split onto another thread? (the previous suggestion of firing ProcessFiles via a timer is an excellent suggestion on how to do this).
Use an TEvent that you signal when the thread should wake up. Then block on the tevent (using waitformultiple as Jeroen says if you have multiple events to wait on)
Is it not possible to move ProcessFiles to a seperate thread? In your MainThread you just wait for messages and when the service is being terminated you terminate the ProcessFiles thread.
I wrote a multi-threaded windows application where thread:
A – is a windows form that handles user interaction and process the data from B.
B – occasionally generates data and passes it two A.
A thread safe queue is used to pass the data from thread B to A. The enqueue and dequeue functions are guarded using a windows critical section objects.
If the queue is empty when the enqueue function is called, the function will use PostMessage to tell A that there is data in the queue. The function checks to make sure the call to PostMessage is executed successfully and repeatedly calls PostMessage if it is not successful (PostMessage has yet to fail).
This worked well for quite some time until one specific computer started to lose the occasional message. By lose I mean that, PostMessage returns successfully in B but A never receives the message. This causes the software to appear frozen.
I have already come up with a couple acceptable workarounds. I am interesting in knowing why windows is loosing these messages and why this is only happening on the one computer.
Here is the relevant portions of the code.
// Only called by B
procedure TSharedQueue.Enqueue(AItem: TSQItem);
var
B: boolean;
begin
EnterCriticalSection(FQueueLock);
if FCount > 0 then
begin
FLast.FNext := AItem;
FLast := AItem;
end
else
begin
FFirst := AItem;
FLast := AItem;
end;
if (FCount = 0) or (FCount mod 10 = 0) then // just in case a message is lost
repeat
B := PostMessage(FConsumer, SQ_HAS_DATA, 0, 0);
if not B then
Sleep(1000); // this line of code has never been reached
until B;
Inc(FCount);
LeaveCriticalSection(FQueueLock);
end;
// Only called by A
function TSharedQueue.Dequeue: TSQItem;
begin
EnterCriticalSection(FQueueLock);
if FCount > 0 then
begin
Result := FFirst;
FFirst := FFirst.FNext;
Result.FNext := nil;
Dec(FCount);
end
else
Result := nil;
LeaveCriticalSection(FQueueLock);
end;
// procedure called when SQ_HAS_DATA is received
procedure TfrmMonitor.SQHasData(var AMessage: TMessage);
var
Item: TSQItem;
begin
while FMessageQueue.Count > 0 do
begin
Item := FMessageQueue.Dequeue;
// use the Item somehow
end;
end;
Is FCount also protected by FQueueLock? If not, then your problem lies with FCount being incremented after the posted message is already processed.
Here's what might be happening:
B enters critical section
B calls PostMessage
A receives the message but doesn't do anything since FCount is 0
B increments FCount
B leaves critical section
A sits there like a duck
A quick remedy would be to increment FCount before calling PostMessage.
Keep in mind that things can happen quicker than one would expect (i.e. the message posted with PostMessage being caught and processed by another thread before you have a chance to increment FCount a few lines later), especially when you're in a true multi-threaded environment (multiple CPUs). That's why I asked earlier if the "problem machine" had multiple CPUs/cores.
An easy way to troubleshoot problems like these is to scaffold the code with additonal logging to log every time you enter a method, enter/leave a critical section etc. Then you can analyze the log to see the true order of events.
On a separate note, a nice little optimization that can be done in a producer/consumer scenario like this is to use two queues instead of one. When the consumer wakes up to process the full queue, you swap the full queue with an empty one and just lock/process the full queue while the new empty queue can be populated without the two threads trying to lock each other's queues. You'd still need some locking in the swapping of the two queues though.
If the queue is empty when the enqueue
function is called, the function will
use PostMessage to tell A that there
is data in the queue.
Are you locking the message queue before checking the queue size and issuing the PostMessage? You may be experiencing a race condition where you check the queue and find it non-empty when in fact A is processing the very last message and is about to go idle.
To see if you're in fact experiencing a race condition and not a problem with PostMessage, you could switch to using an event. The worker thread (A) would wait on the event instead of waiting for a message. B would simply set that event instead of posting a message.
This worked well for quite some time
until one specific computer started to
lose the occasional message.
By any chance, does the number of CPUs or cores that this specific computer have different than the others where you see no problem? Sometimes when you switch from a single-CPU machine to a machine with more than one physical CPU/core, new race conditions or deadlocks may arise.
Could there be a second instance unknowingly running and eating the messages, marking them as handled?