I'm trying to understand how to use the kFSEventStreamEventFlagEventIdsWrapped event flag with FSEvents.
According to the documentation, the flag is sent to registered instances when the event id counter wraps around, thus rendering previous event ids obsolete.
Now let's imagine the following scenario:
I register for FSEvents in my application;
When done processing FSEvents (my application quits for instance), I save the last event id encountered while processing events to be able to replay changes from that id;
While my application is not running, the event id counter wraps around.
My question is: How am I supposed to know the counter wrapped around? (Thus requiring me to re-scan the whole directory structure.)
I now have an answer directly from Apple.
The scenario was wrong to begin with. When saving the last event id processed, you must also save with it the UUID of the event stream. An event id is valid only for a given event stream, which is identified by its UUID (see FSEventsCopyUUIDForDevice()).
Whenever the event id counter wraps around, a new event stream UUID is generated. So if you relaunch the app after the event id counter wrapped around, your stored last event id won’t be valid anymore, and you’ll know it as the event stream UUID won’t be the same.
If you encounter the kFSEventStreamEventFlagEventIdsWrapped flag, it means the counter wrapped around while your app was open. However, there’s nothing particular to be done. You should just be sure to get the new event stream UUID if you want to save the last event id.
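A minimal sketch of that bookkeeping, written in Python for brevity rather than against the FSEvents C API. It assumes the caller has already obtained the current stream UUID as a string (for example via FSEventsCopyUUIDForDevice()). The helper names and the dict layout are hypothetical; the constant mirrors kFSEventStreamEventIdSinceNow.

```python
# Mirrors kFSEventStreamEventIdSinceNow from FSEvents.h.
EVENT_ID_SINCE_NOW = 0xFFFFFFFFFFFFFFFF


def make_checkpoint(stream_uuid, last_event_id):
    """Bundle the last processed event id with the UUID it belongs to."""
    return {"uuid": stream_uuid, "last_event_id": last_event_id}


def resume_point(saved, current_uuid):
    """Return (since_when, needs_full_rescan) for the next stream.

    The saved id is reused only when the saved UUID matches the current one;
    otherwise the id is stale and the directory tree must be rescanned.
    """
    if saved is None or saved.get("uuid") != current_uuid:
        return EVENT_ID_SINCE_NOW, True
    return saved["last_event_id"], False
```

The checkpoint would be written to disk on quit and passed back to `resume_point` at the next launch.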
EDIT:
Event IDs do not wrap.
Here is why: suppose your machine generates 1 filesystem event per millisecond. That is ms_per_year = 31536000000 filesystem events per year (taking a 365-day year), so it would take more than 500 million years before the counter wraps around the 64-bit boundary.
>>> ms_per_year = 1000*60*60*24*365
>>> d64 = 2**64
>>> d64 // ms_per_year
584942417
If kFSEventStreamEventFlagEventIdsWrapped is set, it means the 64-bit event ID counter wrapped around. As a result, previously-issued event ID's are no longer valid arguments for the sinceWhen parameter of the FSEventStreamCreate() functions.[1]
The next time you create a stream, you should pass kFSEventStreamEventIdSinceNow as the FSEventStreamEventId, and you must rescan all directories.
I have a requirement to wait until a window is closed, buffering late and out-of-order events for the duration of the window, as described at https://kafka.apache.org/21/documentation/streams/developer-guide/dsl-api.html#window-final-results
My understanding of this feature is that once a window is created, it works like wall-clock processing. For example, with a 1-hour window, the window starts ticking when the first event arrives, closes exactly one hour later, and all the events buffered so far are forwarded downstream. However, I need to be able to hold this window open even longer, conditionally and for as long as required, e.g. based on state or information in an external system such as a database.
To be precise, my requirement for event forwarding is: (a window of 1 hour if the external state record says it is good) or (hold for as long as required until the external record says it is good, then resume tracking until the event has accumulated a full 1 hour, disregarding the time when the external system was not good).
To elaborate on this second condition: if my window duration is 1 hour and my event starts at 00:00, and the external system goes down at 00:30 and is back to normal at 00:45, the window should extend until 01:15.
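The arithmetic in this example can be written as a small helper. This is only a sketch in plain Python, not Kafka Streams code; `extended_close` and the representation of outages as (down, up) pairs in minutes are assumptions made for illustration.

```python
def extended_close(start, duration, outages):
    """Return the time at which `duration` minutes of 'good' time have elapsed.

    start and duration are in minutes. outages is a sorted list of
    non-overlapping (down, up) pairs; time inside an outage does not count.
    """
    close = start + duration
    for down, up in outages:
        if up <= start:
            continue  # outage ended before the window started
        if down >= close:
            break     # outage begins after the window has already closed
        # Push the close time out by the part of the outage inside the window.
        close += up - max(down, start)
    return close
```

With a window starting at minute 0, a duration of 60 and one outage from minute 30 to minute 45, the helper returns 75, i.e. 01:15.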
Is it possible to pause and resume the forwarding of events conditionally, based on the requirement above?
Do I have to use a transformation/processor and a value store to manually track the first processing time of my event, and conditionally forward buffered events in a punctuator?
I appreciate any workarounds and suggestions for this requirement.
> the window works like wall clock processing
No. Kafka Streams works on event time: the timestamps returned by the TimestampExtractor (by default, the embedded record timestamp) are used to advance time.
> To be precise my requirement for event forwarding is (windows of 1 hour if external state record says it is good)
This would need a custom solution IMHO.
> or (hold for as long as required until external record says it's good and resume tracking of the event until the event make it fully 1hr, disregarding the time when external system is not good)
Not 100% sure I understand this part.
> Is it possible to pause and resume the forwarding of events conditionally based on my requirement above ?
No.
> Do I have to use transformation / processor and use value store manually to track the first processing time of my event and conditionally forwarding buffered events in punctuator ?
I think this might be required.
Check out this blog post, which explains in detail how suppress() works and when it emits based on observed event time: https://www.confluent.io/blog/kafka-streams-take-on-watermarks-and-triggers
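To illustrate the event-time point, here is a toy simulation in plain Python. It is not the Kafka Streams API and does not reproduce suppress() exactly; the class name, the `grace` parameter and the record layout are assumptions. It only shows that a window's result is released once the observed stream time (the largest timestamp seen so far) reaches the window end plus grace, regardless of how much wall-clock time has passed.

```python
class FinalResultBuffer:
    """Counts records per tumbling window and emits a window once it closes."""

    def __init__(self, window_size, grace=0):
        self.window_size = window_size
        self.grace = grace
        self.stream_time = 0   # advances only when records arrive
        self.counts = {}       # window start -> record count

    def process(self, timestamp):
        """Add one record; return [(window_start, count)] for closed windows."""
        self.stream_time = max(self.stream_time, timestamp)
        start = timestamp - timestamp % self.window_size
        if start + self.window_size + self.grace > self.stream_time:
            self.counts[start] = self.counts.get(start, 0) + 1
        # Otherwise the record's window is already closed and it is dropped.
        closed = sorted(
            s for s in self.counts
            if s + self.window_size + self.grace <= self.stream_time
        )
        return [(s, self.counts.pop(s)) for s in closed]
```

Feeding timestamps 5, 50, 40 returns nothing; only a record with a timestamp of 60 or later closes the first window, however long that takes to arrive.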
I'm building a Kafka Streams application that generates change events by comparing every new calculated object with the last known object.
So for every message on the input topic, I update an object in a state store and every once in a while (using punctuate), I apply a calculation on this object and compare the result with the previous calculation result (coming from another state store).
To make sure this operation is consistent, I do the following after the punctuate triggers:
1. Write a tuple to the state store.
2. Compare the two values, create change events, and context.forward them, so the events go to the results topic.
3. Replace the tuple with the new_value and write it to the state store.
I use this tuple for scenarios where the application crashes or rebalances, so I can always send out the correct set of events before continuing.
Now, I noticed the resulting events are not always consistent, especially if the application frequently rebalances. It looks like in rare cases the Kafka Streams application emits events to the results topic, but the changelog topic is not up to date yet. In other words, I produced something to the results topic, but my changelog topic is not at the same state yet.
So, when I do a stateStore.put() and the method call returns successfully, are there any guarantees when it will be on the changelog topic?
Can I enforce a changelog flush? When I do context.commit(), when will that flush+commit happen?
To get complete consistency, you will need to enable processing.guarantee="exactly_once"; otherwise, if an error occurs, you might get inconsistent results.
If you want to stay with "at_least_once", you might want to use a single store, and update the store after processing is done (i.e., after calling forward()). This minimizes the time window in which inconsistencies can occur.
And yes, if you call context.commit(), then before the input topic offsets are committed, all stores will be flushed to disk, and all pending producer writes will also be flushed.
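A sketch of the single-store ordering suggested above, in plain Python rather than the Kafka Streams Processor API. The dict stands in for the state store, the `forward` callable stands in for context.forward(), and `diff` and `on_punctuate` are hypothetical names. It only illustrates the order of operations: forward first, update the store last.

```python
def diff(old, new):
    """Return change events (key, old_value, new_value) for differing keys."""
    keys = sorted(set(old) | set(new))
    return [(k, old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)]


def on_punctuate(store, key, new_result, forward):
    """Emit change events for `key`, then record the new result in the store."""
    previous = store.get(key, {})
    for event in diff(previous, new_result):
        forward(event)        # 1. send the events downstream first
    store[key] = new_result   # 2. update the single store last
```

In this simplified model, a failure between the two steps leaves the old value in the store, so the same events are computed and sent again after a restart instead of being skipped.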
If a user is publishing to a TokBox session and, for any reason, that same user logs in on a different device or re-opens the session in another browser window, I want to stop the second one from publishing.
Luckily, on the metadata for the streams, I am saving the user id, so when there is a list of streams it's easy to see if an existing stream belongs to the user that is logged in.
When a publisher gets initialized the following happens:
1. Listen for session.on("streamCreated"); when it fires, subscribe to the new streams.
2. Start publishing.
The problem is, when the session gets initialized, there is no way to inspect the current streams of the session to see if this user is already publishing. We don't know what the streams are until the on("streamCreated") callback fires.
I have a hunch that there is an easy solution that I am missing. Any ideas?
I assume that when you said you save the user ID on the stream metadata, that means when you initialize the Publisher, you set the "name" property. That's a great technique.
My idea is slightly a hack, but it's the best I can come up with right now. I would solve this problem by essentially breaking up the subscription of streams into 2 phases:
1. all streams created before this client connection
2. all streams created after
During #1 I would check each stream's "name" property to see if it belongs to the user at this client connection. If it does, then you know they are entering the session twice and you can set a flag (let's call it "userRejoining"). In order to know that #1 is complete, I would set a timer (this is why I call it a hack) for a reasonable amount of time, such as 1 second, each time a "streamCreated" event arrives, and remove any previous timer. When the timer finally fires without a new stream having arrived, #1 is over.
Then, if the "userRejoining" flag is not set, the Publisher is initialized and published to the session.
During #2, you just subscribe to any stream that is created.
The downside is that you've now delayed your user experience of publishing by ~1 second everywhere. In larger group scenarios this could be a deal breaker, but in smaller (1:1) types of sessions this should be acceptable. I hope this explanation is clear, and if not I can try to write some sample code for you.
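The two-phase logic above can be sketched as follows. This is plain Python modelling only the timer and flag logic, not the OpenTok JavaScript API; `JoinGate`, its method names, and passing the clock in explicitly are assumptions made so the sketch is self-contained.

```python
class JoinGate:
    QUIET_PERIOD = 1.0  # seconds without a new stream before phase 1 ends

    def __init__(self, my_user_id, now):
        self.my_user_id = my_user_id
        self.deadline = now + self.QUIET_PERIOD
        self.user_rejoining = False
        self.phase_one_done = False

    def _tick(self, now):
        # Phase 1 ends once the quiet period has elapsed with no new stream.
        if not self.phase_one_done and now >= self.deadline:
            self.phase_one_done = True

    def on_stream_created(self, stream_name, now):
        """Call for every streamCreated event; stream_name carries the user id."""
        self._tick(now)
        if not self.phase_one_done:
            if stream_name == self.my_user_id:
                self.user_rejoining = True
            self.deadline = now + self.QUIET_PERIOD  # restart the timer

    def should_publish(self, now):
        """Return None while phase 1 is running, then True or False."""
        self._tick(now)
        if not self.phase_one_done:
            return None
        return not self.user_rejoining
```

In a real client the deadline would be a setTimeout that is cleared and re-armed in the streamCreated handler; the explicit `now` argument just keeps the sketch deterministic.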
Using event structures in LabVIEW can get confusing, especially when mixing them with a mostly synchronous workflow. My question is: when an event structure exists in one frame of a sequence, how can I force it to ignore events (e.g. mousedown on a particular button) that were triggered while the workflow was in another frame of the sequence?
Currently, the event structures only process the events at the correct frame in the sequence, but if one was triggered while the workflow is in the previous frame, it processes those too and I want it to ignore any events that weren't triggered in the frame that the event structure exists within.
http://puu.sh/hwnoO/acdd4c011d.png
Here's part of my workflow. If the mousedown is triggered while the left part is executing, I want the event structure to ignore those events once the sequence reaches it.
Instead of placing the event structure inside your main program sequence, put it in a separate loop and have it pass the details of each event to the main sequence by means of a queue. Then you can discard the details of the events you don't want by flushing the queue at the appropriate point.
Alternatively you could use a boolean control to determine whether the event loop sends event details to the queue or discards them, and toggle the boolean with a local variable from the main sequence.
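LabVIEW diagrams cannot be shown as text, so the following is only an analogy in Python of the first, queue-based approach: a separate event loop enqueues event details, and the main sequence flushes the queue just before the step where events should start to count. The names are hypothetical.

```python
import queue


def flush(q):
    """Discard everything currently in the queue; return how many were dropped."""
    dropped = 0
    while True:
        try:
            q.get_nowait()
        except queue.Empty:
            return dropped
        dropped += 1


events = queue.Queue()
events.put("mousedown")   # arrived while an earlier frame was running
events.put("mousedown")
stale = flush(events)     # main sequence reaches the frame that handles events
events.put("mousedown")   # only this one is seen by the handler
```

The boolean-gate alternative moves the same decision to the producer side: the event loop checks the flag and skips the enqueue while the flag is off.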
You can register for events dynamically. Registration is the point in time at which the event structure starts enqueueing events, and in your case this happens when the VI the event structure is in enters run mode (meaning it's executing or one of its callers is). You can change it so that you register using the Register for Events node and then you would only get events from that point on. When you unregister you will stop getting events.
There's a very good presentation by Jack Dunaway going into some details about events here.
You can find the code for it here.
In LabVIEW 2013 and later there are additional options for controlling the events queue, but I won't go into them here.
http://puu.sh/hwsBE/fe50dee671.png
I couldn't figure out how to flush the event queue for built-in event types like mousedown, but I managed to get around that by creating a static reference to the VI and setting the cursor to busy during the previous sequence, disabling clicking. Then when the sequence for the event structure is reached, I unset the cursor from busy, which re-enables clicking.
I have a class that implements a file-monitoring service to detect when a file I am interested in has been changed by something other than my application. I use the standard technique of opening the file (with the O_EVTONLY flag) and binding the file descriptor to a Grand Central Dispatch source of type DISPATCH_SOURCE_TYPE_VNODE. When I get an event, I notify my main thread with NSNotificationCenter's postNotificationName:object:userInfo: which calls an observer in my app delegate. So far so good. It works great. But, in general, if the triggering event is an attributes change (i.e. the DISPATCH_VNODE_ATTRIB flag is set on return from dispatch_source_get_data()) then I usually get two closely-spaced events. The behaviour is easily exhibited if I touch(1) the object I am monitoring. I hypothesise this is due to the file's mtime and atime being set non-atomically although I can't verify this. This can lead to spurious notifications being sent to my observer and this raises the possibility of race conditions etc.
What is the best way of dealing with this? I thought of storing a timestamp for the last event received and only sending a notification if the current event is later than this timestamp by some amount (a few tens of milliseconds?) Does this sound like a reasonable solution?
You can't ever escape the "race condition" in this situation, because the notification of your GCD event source in your process is not synchronous with the other process's modification of the underlying file. So, no matter what, you must always be tolerant of the possibility that the change you're being notified for could already be "gone."
As for coalescing, do whatever makes sense for your app. There are two obvious strategies. You can act immediately on a received event, and then drop subsequent events received in some time window on the floor, or you can delay every event for some time period during which you will drop other events for the same file on the floor. It really just depends on what's more important, acting quickly, or having a higher likelihood of a quiescent state (knowing that you can never be sure things are quiescent.)
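Both strategies can be sketched as small state machines. The following is plain Python rather than GCD code, with timestamps passed in explicitly; the class names and the 50 ms window are assumptions for illustration.

```python
class LeadingEdge:
    """Act on the first event, then drop events for `window` seconds."""

    def __init__(self, window=0.05):
        self.window = window
        self.last_fired = None

    def on_event(self, now):
        """Return True if this event should be acted on."""
        if self.last_fired is not None and now - self.last_fired < self.window:
            return False
        self.last_fired = now
        return True


class TrailingEdge:
    """Delay each burst: fire once, `window` seconds after its first event."""

    def __init__(self, window=0.05):
        self.window = window
        self.fire_at = None

    def on_event(self, now):
        if self.fire_at is None:
            self.fire_at = now + self.window  # first event of a new burst
        # Later events in the same burst are dropped.

    def poll(self, now):
        """Return True exactly once when the pending notification is due."""
        if self.fire_at is not None and now >= self.fire_at:
            self.fire_at = None
            return True
        return False
```

`LeadingEdge` favours acting quickly; `TrailingEdge` favours a higher chance that the file has settled by the time the notification goes out.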
The only thing I would add is to suggest that you do all your coalescence before dispatching anything to the main thread. The main thread has things like tracking loops, etc that will make it harder to get time-based coalescing right in certain cases.