Get created/modified/deleted files by a specific process from an event tracing (ETW) session - windows

I've been searching for a solution to get all created/modified and deleted files by a specific process from an event trace (ETW) session (I will process data from an etl file not from a real-time session).
The simplest approach appeared to be taking the FileCreate and FileDelete events from the FileIo_Name class and mapping them to the corresponding DiskIo_TypeGroup1 events. However, this doesn't work for me: I don't receive any DiskIo_TypeGroup1 events for the FileDelete events, so I cannot get the process ID. Also, not all FileCreate events have an associated DiskIo_TypeGroup1 event (I think this happens for files that were created empty or were only opened).
Note: I need the DiskIo_TypeGroup1 mapping because FileIo_Name events don't have the ThreadId and ProcessId members populated; they are set to (ULONG)-1. Also, I cannot decide which files were just opened and which were modified without knowing the "file write size". DiskIo_TypeGroup1 events don't have the ThreadId and ProcessId members populated either (in the event header, on newer OSes), but they do have the IssuingThreadId structure member, from which I can obtain the ProcessId by mapping to Thread_TypeGroup1 class events.
So I investigated how the FileIo_Create class can help me, and noticed that I can get the CreateOptions member, which can carry the following flags: FILE_SUPERSEDE, FILE_CREATE, FILE_OPEN, FILE_OPEN_IF, FILE_OVERWRITE, FILE_OVERWRITE_IF. But the initial problem still persists. How can I check whether a file was created from scratch rather than just opened (e.g. in the case of FILE_SUPERSEDE)?
Maybe I can use the FileIo_ReadWrite class to get the Write event, in the same way as I would use the DiskIo_TypeGroup1 class. If something was written to a file, can I assume that the file was either created or modified?
To find the deleted files, I think the FileIo_Info class and its Delete event are the solution. I guess I can receive Delete events and map them to FileIo_Name to get the file names.
Note: The FileIo_Create, FileIo_Info and FileIo_ReadWrite events contain the process ID.
Are my suppositions right? What would be the best solution for my problem?

I will share my implemented solution as follows:
Created Files:
I stored every FileIo_Create event as a pending create operation and waited for the associated FileIo_OpEnd event, whose ExtraInfo structure member tells whether the file was opened, created, overwritten, or superseded.
Modified Files:
I marked a file as dirty on every Write event from FileIo_ReadWrite and on every SetInfo event from FileIo_Info whose InfoClass is FileEndOfFileInformation or FileValidDataLengthInformation. Finally, on the Cleanup event from FileIo_SimpleOp, I checked whether the file was marked as dirty and, if so, stored it as modified.
Deleted files:
I marked a file as deleted if it was opened with the FILE_DELETE_ON_CLOSE flag in CreateOptions (from FileIo_Create) or if a Delete event from FileIo_Info appeared. Finally, on the Cleanup event from FileIo_SimpleOp, I stored the file as deleted.
The process ID and file name were obtained from the FileIo_Create events, more precisely from the OpenPath structure member and the ProcessId event header member.
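The bookkeeping described above can be sketched roughly as follows. This is only an illustration in Python, not the original implementation: it assumes the ETL events have already been decoded into plain dictionaries, and the "type" key and the FileTracker class are hypothetical names. The field names (IrpPtr, FileObject, CreateOptions, OpenPath, ExtraInfo, NtStatus, InfoClass) follow the MOF classes mentioned in the post. Treating an overwritten or superseded file as modified, and not reporting a freshly created file as modified, are assumptions of this sketch.

```python
# Result codes reported in FileIo_OpEnd.ExtraInfo after a create
# (the IO_STATUS_BLOCK.Information values from the Windows headers).
FILE_SUPERSEDED, FILE_OPENED, FILE_CREATED, FILE_OVERWRITTEN = 0, 1, 2, 3
FILE_DELETE_ON_CLOSE = 0x00001000   # bit in CreateOptions
# FileEndOfFileInformation (20) and FileValidDataLengthInformation (39).
DIRTYING_INFO_CLASSES = {20, 39}


class FileTracker:
    """Tracks files created/modified/deleted by one process (hypothetical helper)."""

    def __init__(self, pid):
        self.pid = pid
        self.pending = {}   # IrpPtr -> Create event still waiting for its OpEnd
        self.open = {}      # FileObject -> {"path", "dirty", "delete"}
        self.created, self.modified, self.deleted = set(), set(), set()

    def feed(self, ev):
        kind = ev["type"]   # "type" is an assumed key added by the decoder
        if kind == "Create":
            if ev["ProcessId"] == self.pid:
                self.pending[ev["IrpPtr"]] = ev
        elif kind == "OpEnd":
            create = self.pending.pop(ev["IrpPtr"], None)
            # Simplification: only STATUS_SUCCESS (0) counts as a successful open.
            if create is None or ev["NtStatus"] != 0:
                return
            result = ev["ExtraInfo"]
            if result == FILE_CREATED:
                self.created.add(create["OpenPath"])
            self.open[create["FileObject"]] = {
                "path": create["OpenPath"],
                # Assumption: overwrite/supersede counts as a modification.
                "dirty": result in (FILE_SUPERSEDED, FILE_OVERWRITTEN),
                "delete": bool(create["CreateOptions"] & FILE_DELETE_ON_CLOSE),
            }
        elif kind == "Cleanup":
            state = self.open.pop(ev["FileObject"], None)
            if state is None:
                return
            if state["delete"]:
                self.deleted.add(state["path"])
            elif state["dirty"] and state["path"] not in self.created:
                self.modified.add(state["path"])
        else:
            state = self.open.get(ev.get("FileObject"))
            if state is None:
                return
            if kind == "Write" or (
                kind == "SetInfo" and ev["InfoClass"] in DIRTYING_INFO_CLASSES
            ):
                state["dirty"] = True
            elif kind == "Delete":
                state["delete"] = True
```

Feeding every decoded event to feed() in trace order and reading the created, modified and deleted sets at the end gives the three lists. Events for other processes are ignored because their Create never enters the pending table.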

Related

Do I need to store last state of object in separate table in Event Sourcing

I'm still learning event sourcing and there is something I don't understand.
When I get a command to change an object, do I first recreate that object from the event store, then change it and save the event, or should I have a separate table that holds the last state?
What is the usual practice here?
The first rule of optimization: Don't.
For handling commands, all of the information that you need to have is stored in your event history; simply loading the history and recomputing any state you need will get the job done.
In the case where you need low latency in your command handler, AND recomputing the state you need from the event history is too slow to meet your service level targets, then you might look into saving a "snapshot", and using that to speed up the load of your data.
Current consensus is that snapshots should be saved separately from the event history (i.e. a snapshot is not another kind of event), as though the snapshot were another "read model".
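As a rough illustration of the two approaches, here is a minimal Python sketch. The Account aggregate, the event tuples and the function names are all made up for the example. The command handler rebuilds state by replaying the history, and an optional snapshot (kept outside the event list) only shortens the replay.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    balance: int = 0
    version: int = 0   # number of events already folded into this state


def apply(state, event):
    """Fold one event, e.g. ("deposited", 100), into the state."""
    kind, amount = event
    delta = amount if kind == "deposited" else -amount
    return Account(state.balance + delta, state.version + 1)


def load(events, snapshot=None):
    """Recompute current state; a snapshot just lets us skip old events."""
    state = snapshot if snapshot is not None else Account()
    for event in events[state.version:]:
        state = apply(state, event)
    return state


def handle_withdraw(events, amount, snapshot=None):
    """Command handler: load state, check the rule, append a new event."""
    state = load(events, snapshot)
    if amount > state.balance:
        raise ValueError("insufficient funds")
    events.append(("withdrawn", amount))
```

Calling load(events) and load(events, snapshot) should always return the same state. The snapshot is purely a cache and can be thrown away and rebuilt from the history at any time.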

Running Laravel Jobs/Handling Events in Order Received

I have an endpoint: /webhook/events
I'll receive events containing order information from a 3rd party at this endpoint in rapid succession, like the ones below:
Order: Update (4s to process)
Order: Delete (3s to process)
What happens is that when I receive these events, I check to see if the order exists. If the order exists in our DB and it's been soft-deleted, I restore it. In the 3rd-party application, an order can be deleted and restored so I have to reflect this functionality.
My problem is that I may get both of the events above within milliseconds of each other. In that case the Delete event finishes first, followed by the Update event, so the order is soft-deleted and then restored.
I need to make sure that the events run consecutively instead of racing, but I'm not sure how.

How to retrieve EVERY mail message (including Deleted, Archived, and Recoverable items) for an O365 user with Graph API?

We are working on an application in the compliance/monitoring space where we are monitoring the activity of an individual. Because of this, we want to pull EVERYTHING in a user's Office 365 mailbox - if it has text the user wrote or received, we want it if it is there, even if it was deleted, purged, etc.
We are using the Graph API and have an existing implementation that uses the standard "messages" GET command:
GET https://graph.microsoft.com/v1.0/me/messages
We are making use of the GraphApiClient (Microsoft.Graph v1.9.0), so the code actually looks like this:
IUserMessagesCollectionPage pageOfMessages = _graphClient.Users[userId].Messages.Request(options).Top(batchSize).Expand("attachments").GetAsync().Result;
However, at the very least this does not return any items from any of the RecoverableItems folders. After looking into it, I am now suspicious that there might be other folders that are not returned by this command either. There is quite the list of Well-known folder names and I'm not sure what others might not be included in a generic Messages request.
Based on this post, I know you can request the messages in the missing folders by WellKnownFolderName one at a time like this:
GET https://graph.microsoft.com/v1.0/me/MailFolders/RecoverableItemsDeletions/messages
It even works with the GraphApiClient:
IMailFolderMessagesCollectionPage pageOfMessages = _graphClient.Users[userId].MailFolders["RecoverableItemsDeletions"].Messages.Request(options).Top(batchSize).Expand("attachments").GetAsync().Result;
The problems with this are:
I don't know how to build a comprehensive list of every folder that has messages for the user
Some of the folders (like RecoverableItemsDeletions and ArchiveRecoverableItemsDeletions, for example) can contain duplicates so I would need to use a dictionary to get rid of the duplicates
It would be a lot more expensive to first build a list of relevant folders and then request their contents and their children's contents one request at a time.
At scale, a folder-by-folder implementation could be subject to throttling (if we are monitoring enough users with big enough mailboxes)
Does anyone know the best way to do this? Thanks!

Is there a way to uniquely identify a picture attached to an Outlook ContactItem?

It is my understanding that an Outlook contact's "avatar" image is stored as an Attachment object in the Attachments collection (ref).
Now suppose, as an example, that I want to keep my own (separate) contact database updated whenever the user's Outlook contacts change, so I'm registered for a PropertyChange event on the ContactItem. Is there any way to determine whether or not the picture attached to a ContactItem has changed, or do I need to call SaveAsFile() on the ContactPicture.jpg Attachment every time that I get a change notification, just on the off chance that it may have been updated?
There is no CRC of any kind for the attachment data, so you won't know whether the actual binary data has changed. You can read the Attachment.Size property; if it is different from what you had before, the data has changed for sure.
You can also read the PR_CREATION_TIME and PR_LAST_MODIFICATION_TIME properties using Attachment.PropertyAccessor.GetProperty, but these properties are not required and can stay the same even if the data has changed.

How to use kFSEventStreamEventFlagEventIdsWrapped with FSEvents?

I'm trying to understand how to use the kFSEventStreamEventFlagEventIdsWrapped event flag with FSEvents.
According to the documentation, the flag is sent to registered instances when the event ID counter wraps around, thus rendering previous event IDs obsolete.
Now let's imagine the following scenario:
I register for FSEvents in my application;
When done processing FSEvents (my application quits for instance), I save the last event id encountered while processing events to be able to replay changes from that id;
While my application is not running, the event id counter wraps around.
My question is: How am I supposed to know the counter wrapped around? (Thus requiring me to re-scan the whole directory structure.)
I now have an answer directly from Apple.
The scenario was wrong to begin with. When saving the last event ID you processed, you must also save the UUID of the event stream with it. An event ID is valid only for a given event stream, which is identified by its UUID (see FSEventsCopyUUIDForDevice()).
Whenever the event id counter wraps around, a new event stream UUID is generated. So if you relaunch the app after the event id counter wrapped around, your stored last event id won’t be valid anymore, and you’ll know it as the event stream UUID won’t be the same.
If you encounter the kFSEventStreamEventFlagEventIdsWrapped flag, it means the counter wrapped around while your app was open. However, there’s nothing particular to be done. You should just be sure to get the new event stream UUID if you want to save the last event id.
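The save-and-compare logic can be sketched like this. It is a language-neutral illustration written in Python, not FSEvents code: the function names and the JSON checkpoint file are assumptions, and the current UUID is expected to be obtained elsewhere (from FSEventsCopyUUIDForDevice() in a real application) and passed in as a string.

```python
import json


def save_checkpoint(path, stream_uuid, last_event_id):
    """Persist the last processed event ID together with the stream UUID."""
    with open(path, "w") as f:
        json.dump({"uuid": stream_uuid, "event_id": last_event_id}, f)


def resume_event_id(path, current_uuid):
    """Return the saved event ID if it is still valid, else None.

    None means the checkpoint is missing, unreadable, or belongs to a
    different event stream, so the caller has to rescan the directories.
    """
    try:
        with open(path) as f:
            saved = json.load(f)
    except (FileNotFoundError, ValueError):
        return None
    if saved.get("uuid") != current_uuid:
        return None
    return saved.get("event_id")
```

On launch, a non-None result would be used as the starting event ID for the new stream. A None result means starting from "now" and doing a full rescan.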
EDIT:
Event IDs do not wrap.
Here is why: Suppose your machine generates 1 filesystem event per millisecond. That is ms_per_year = 31536000000 filesystem events per year, so it would take more than 500 million years for the counter to wrap around the 64-bit boundary.
>>> ms_per_year = 1000*60*60*24*365
>>> d64 = 2**64
>>> d64 // ms_per_year
584942417
If kFSEventStreamEventFlagEventIdsWrapped is set, it means the 64-bit event ID counter wrapped around. As a result, previously-issued event ID's are no longer valid arguments for the sinceWhen parameter of the FSEventStreamCreate() functions.[1]
The next time, you should pass kFSEventStreamEventIdSinceNow as the FSEventStreamEventId, and you must rescan all directories.
