I have a web app that allows users to edit several configurations. I want to make users aware that they have made changes locally but not committed them since their last load. When a user changes the configuration back, the "modified" flag should disappear.
What's a good approach to achieve this? Currently, I am thinking about keeping an original copy of the configurations and comparing the current configuration against it each time the user makes a change, but I am worried about performance, since the configuration data is fairly large. (The app client runs in browsers.)
Please help, thanks.
I don't think performance will be an issue, but if it were, I'd also create a "dirty field" map: an associative array (a hashmap/object) with field names as keys. Of course, all field names must be unique. This map is initially empty.
When you edit any field, catch the onChange or onBlur event and compare just this field with the saved one. If they differ, put it in the dirty map, like field_map['field1'] = true;. If they are equal, remove the key from the dirty map.
So, if your dirty map is not empty, you will know changes were made, and you will also know exactly which fields have changed.
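A rough sketch of that in TypeScript; loadOriginalConfig and setModifiedFlag are placeholders for however your app loads the saved configuration and toggles the UI indicator, not real APIs:

declare function loadOriginalConfig(): Record<string, string>; // placeholder: your saved config
declare function setModifiedFlag(on: boolean): void;           // placeholder: your UI indicator

const savedValues: Record<string, string> = loadOriginalConfig();
const dirtyFields: Record<string, boolean> = {};

// Wire this into each field's onChange/onBlur handler.
function onFieldChange(name: string, currentValue: string): void {
  if (currentValue !== savedValues[name]) {
    dirtyFields[name] = true;   // differs from the saved copy
  } else {
    delete dirtyFields[name];   // changed back, so no longer dirty
  }
  // The "modified" flag is on whenever any field is dirty.
  setModifiedFlag(Object.keys(dirtyFields).length > 0);
}

This way each edit costs one field comparison and one map update, instead of a full diff of the whole configuration.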
I'm trying to understand how Event Sourcing changes the data architecture of a service. I've been doing a lot of research, but I can't seem to understand how data is supposed to be properly stored with event sourcing.
Let's say I have a service that keeps track of vehicles transporting packages. The current non-relational structure for the data model is that each document represents a vehicle, with many fields for origin location, destination location, types of packages, number of packages, status of the vehicle, etc. Normally this document gets queried for information to be read by the front end. When the user makes changes, the appropriate updates are made to this document.
With event sourcing, it seems that a snapshot of every event is stored, but there seem to be a few ways to interpret that:
The first is that multiple versions of the document I described exist, with a new snapshot created every time a change is made: each event would create a new, altered version of the document. This is the easiest way for me to wrap my head around it, but I believe it to be incorrect.
Another interpretation I have is that each event stores SPECIFIC information about what's been altered in the document. When the vehicle status changes from On Road to Available, for example, an event specifically for vehicle status changes is triggered. Let's say it's called VehicleStatusUpdatedEvent, and contains the Vehicle ID number, the new status, and the timestamp for this event. So this event is stored and is published to a messaging queue. When picked up from the queue, the appropriate changes are made to the current version of the document. I can understand this, but I think I still have some misconceptions here. My understanding is that event sourcing allows us to have a snapshot of data upon each change, so we can know what it looks like at any point. What I just described would keep a log of changes, but still only have one version of the file, as the events only contain specific pieces of the whole file.
Can someone describe how the data flow and architecture work with event sourcing? Using the vehicle data example I provided might help me frame it better. I feel that I am close to understanding this, but I am missing some fundamental pieces that I can't seem to understand by searching online.
The current non-relational structure for the data model is that each document represents a vehicle
OK, let's start from there.
In the data model you've described, storage of a document destroys the earlier copy.
Now imagine that instead we were storing the document in a git repository. Then saving the document would also save metadata, and that metadata would include a pointer to the previous version of the document.
Of course, we've probably got a lot of duplication in that case. So instead of storing the complete document every time, we'll store a patch document (think JSON Patch), plus metadata pointing to the previous patch.
Take that same idea again, but instead of storing generic patch documents, we use domain-specific messages that describe what is going on in terms of the model.
That's what the data model of an event-sourced entity looks like: a list of domain-specific descriptions of document transformations.
When you need to reconstitute the current state, you start with a state you know (which could be the "null" state of the document before anything happened to it), and replay onto that document all of the patches (events) that have occurred since.
If you want to do a temporal query, the game is the same: you replay the events up to the point in time that you are interested in.
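To make that concrete, here is a minimal TypeScript sketch; the event shapes loosely mirror the VehicleStatusUpdatedEvent from the question, and everything else is made up for illustration:

// Hypothetical domain events for the vehicle example.
type VehicleEvent =
  | { type: "VehicleRegistered"; vehicleId: string; origin: string; destination: string; at: Date }
  | { type: "VehicleStatusUpdated"; vehicleId: string; status: string; at: Date };

interface VehicleState {
  vehicleId?: string;
  origin?: string;
  destination?: string;
  status?: string;
}

// Apply one event (one "patch") to a known state.
function apply(state: VehicleState, event: VehicleEvent): VehicleState {
  switch (event.type) {
    case "VehicleRegistered":
      return {
        vehicleId: event.vehicleId,
        origin: event.origin,
        destination: event.destination,
        status: "Available",
      };
    case "VehicleStatusUpdated":
      return { ...state, status: event.status };
  }
}

// Current state: replay everything. Temporal query: pass the point in time
// you are interested in, and only earlier events are replayed.
function replay(events: VehicleEvent[], asOf?: Date): VehicleState {
  return events
    .filter((e) => asOf === undefined || e.at.getTime() <= asOf.getTime())
    .reduce(apply, {} as VehicleState);
}

Only the ordered list of VehicleEvent values is stored; replay(allEvents) gives the current document, and replay(allEvents, someDate) answers the temporal query.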
So essentially, when referring to an older version, you reconstruct the document using the events, correct?
Yes, that's exactly right.
So is there still a "current status" document or is that considered bad practice?
"It depends". In the general case, there is no current status document; only the write-ordered list of events is "real", and everything else is derived from that.
Conversations about event sourcing often lead to consideration of dedicated message stores for managing persistence of those ordered lists, and it is common that the message stores do not also support document storage. So trying to keep a "current version" around would require commits to two different stores.
At this point, designers typically either decide that "recent version" is good enough, in which case they build eventually consistent representations of documents outside of the transaction boundary... OR they decide current version is important, and look into storage solutions that support storing the current version in the same transaction as the events (ex: using an RDBMS).
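For the second option, a minimal sketch of what "same transaction" means; the Db/Tx interfaces and table names here are assumptions standing in for any RDBMS driver that supports transactions, not a real library API:

// Hypothetical client types.
interface Tx { query(sql: string, params: unknown[]): Promise<void>; }
interface Db { transaction(work: (tx: Tx) => Promise<void>): Promise<void>; }

// Append the event AND update the current-version row atomically, so readers
// never see the event log and the "current document" out of sync.
async function saveVehicleEvent(db: Db, vehicleId: string, event: object, newState: object): Promise<void> {
  await db.transaction(async (tx) => {
    await tx.query(
      "INSERT INTO vehicle_events (vehicle_id, payload) VALUES ($1, $2)",
      [vehicleId, JSON.stringify(event)],
    );
    await tx.query(
      "UPDATE vehicle_current SET state = $2 WHERE vehicle_id = $1",
      [vehicleId, JSON.stringify(newState)],
    );
  });
}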
What is the procedure used to generate the snapshot you want using the events?
If you want to generate a snapshot, then you'll normally end up using a pattern called a projection to iterate over the events and either fold or reduce them to create the document.
Roughly, you have a function somewhere that looks like
document-with-meta-data = projection(event-history-with-metadata)
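Reusing the hypothetical apply() and types from the earlier vehicle sketch, that projection is just a left fold of the event history into a document:

function projection(history: VehicleEvent[]): VehicleState {
  return history.reduce(apply, {} as VehicleState); // fold the ordered events
}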
I'd like to store some of my data in relatively big files (a few GBs per file). I'd like to use event sourcing and save events related to these files, e.g. FileCreated: title, description, timestamp, author, personal, encryptionkey, etc. After a while some of the files won't be needed any longer, and they take up a lot of space, so in order to free up space I need to delete them. Doing so is problematic, because I will have the history in the event storage but not the file in the filesystem. Is there any way to keep integrity and somehow delete both? Or is there a best practice for this problem?
Since I did not get an answer, I'll try to answer this myself.
It is possible to remove an event from the history: you create a new event storage and filter out the events for the aggregate id you want to get rid of. After you are done, you switch to the new event storage and remove the old one. You probably need to replay projections as well, so it is very similar to a full migration and takes a lot of time. In my case that is not a problem, since I'd need to do this only once a year or so. Another problem with storing this data in the event storage is that I either stream it from there or duplicate it in order to serve it. The latter is not always a good solution, because sometimes it takes too much time to copy, and to serve the data you need to stream it anyway, otherwise you will run out of memory very fast. So the event storage should support streaming attachments.
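A rough sketch of that migration; EventStore and its readAll/append operations are assumptions for whatever store is in use, with readAll streaming rather than loading everything at once:

interface StoredEvent { aggregateId: string; payload: unknown; }

// Hypothetical store operations, not a real library API.
interface EventStore {
  readAll(): AsyncIterable<StoredEvent>;
  append(event: StoredEvent): Promise<void>;
}

// Copy every event into the new store, skipping the aggregate being purged.
// Afterwards: switch the app to newStore, replay projections, drop the old store.
async function purgeAggregate(oldStore: EventStore, newStore: EventStore, purgedId: string): Promise<void> {
  for await (const event of oldStore.readAll()) {
    if (event.aggregateId !== purgedId) {
      await newStore.append(event);
    }
  }
}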
Another solution is to keep the relatively big data in the files and display something like "404 not found" or "file was removed because of this and that"; I see this frequently. In this case it is OK to keep the event in the storage, and you can, for example, add a ContentRemoved event where you record the cause. Another option is to hide the removed file so it won't be listed by the app; this is usual too, I guess. This solution has drawbacks as well. Migration is more complex with this approach, because you need to move both the event storage and the files. And if you remove a file by accident, you cannot undo it later unless you have the file in a backup. This can be mitigated by delaying the actual file removal by a few days, so you can undo it if you change your mind, or by adding a trash, so files are deleted only when the trash is emptied.
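A sketch of that tombstone idea in TypeScript; the event shapes and the cause field are made up for illustration:

// FileCreated keeps its metadata forever; ContentRemoved is a tombstone
// recording why the blob itself was deleted.
type FileEvent =
  | { type: "FileCreated"; fileId: string; title: string; author: string }
  | { type: "ContentRemoved"; fileId: string; cause: string };

interface FileView { fileId?: string; title?: string; removed?: boolean; removalCause?: string; }

// In the projection, the tombstone flips the view to "removed", so the app
// can hide the file or answer "404 / removed because <cause>".
function applyFileEvent(view: FileView, event: FileEvent): FileView {
  switch (event.type) {
    case "FileCreated":
      return { fileId: event.fileId, title: event.title, removed: false };
    case "ContentRemoved":
      return { ...view, removed: true, removalCause: event.cause };
  }
}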
I think both solutions are worth considering, and which one is better suited probably depends on the actual project.
The documentation for NSDocument states:
Subclasses can override this method and use it to restore any information that would be needed to restore the document’s window to its current state. For example, you could use this method to record references to the data currently managed by the document and displayed by the window. (Do not store the actual data itself. Store only references to the data so that you can load it later from disk.) You must store enough data to reconfigure the document and its window to their current state during a subsequent launch of the app.
What does "Do not store the actual data itself." actually mean? Is this a hard and fast rule? Or is it more of a guideline?
In particular, I'm wondering about the case of documents with unsaved changes in them. Is it "permissible" to store the unsaved changes (which may be everything if this is a new document)? Or, do I need to save the data off in a file somewhere... and if so, where is the preferred location?
I don't want to restore a bunch of identical (blank) documents if I had multiple unsaved new documents when the application was shut down.
Thanks for any hints on the proper way to handle this.
Never mind. It hit me in the shower this morning (where I make most of my tech breakthroughs).
I am pretty sure now that the key is to get autosaving working with my application.
We have an NSTextView and some data saved about its contents in a Core Data managed object context. Everything works great while the managed object context stays in memory. However, when we save it, we get very weird fetch request behavior.
For example, we run a fetch request that asks for all elements with a textLocation less than or equal to 15. The first object in the array we get back has a textLocation of 16.
I know I can't get a definitive answer here, as the code is fairly complex. But does anyone know what this issue smells of?
My thought is that we are somehow not getting the proper MOC synced with the NSTextView after saving. What could change that would break this?
Thanks.
For example, we run a fetch request that asks for all elements with a textLocation less than or equal to 15. The first object in the array we get back has a textLocation of 16.
Really, the only ways to get that are (in reverse order of likelihood):
1. You've messed up the definition of the attribute, such that you think you are saving one type of numerical info but are actually saving another.
2. You've mangled the predicate so that it actually looks for values of 16 or greater. (You can test predicates against an array of dictionaries whose keys have the same names as your Core Data entities.)
3. There's an error in the conversion between a number and a string for purposes of display in the UI or logging.
I would start with (3) myself, because it seems the most common, and until you confirm you don't have a display problem, you can't diagnose the other problems.
I finally managed to work out what was going on. I was setting textLocation using the setPrimitiveValue... just because I didn't want notifications to fire off. Turns out that's a really bad idea, because Core Data didn't know the value had changed. It still thought the value was 15 instead of 16.
Let this be a lesson: never bypass KVO unless you're INSIDE the managed object and you know what you're doing!
I am creating a new entity in CRM 4.0 to track our projects. One field is a Project Code, and I'd like to have a way to ensure that this field contains a unique value.
I understand that this is not a key, and it won't be used as a key, but for human readability/tracking purposes, it would be nice if I could tell the user that the code he just entered has already been used.
I am thinking that a webservice/javascript call will be necessary, but I wanted to see if anyone else has tackled this issue already.
Depends on how foolproof you want it to be.
The web service call is pretty lightweight, but if two people save a record at the same time, it's not going to detect it at that time, and dupe codes will happen.
A custom plugin would definitely detect dupe codes, but you don't get any feedback until after the user attempts to save, and there's still a small chance of repeat codes when two users enter records at the same time.
The completely bulletproof way we've used is a plugin that checks a custom database table, which we lock so that only one plugin instance gets through at a time.
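For the first, lightweight option, a rough sketch of the client-side check; the endpoint URL is an assumption for whatever web service you expose, and as noted above it catches duplicates early but cannot stop two simultaneous saves:

// Call this from the Project Code field's onChange handler.
async function warnIfDuplicateCode(projectCode: string): Promise<void> {
  const resp = await fetch("/api/projects/code-exists?code=" + encodeURIComponent(projectCode)); // hypothetical endpoint
  const { exists } = await resp.json();
  if (exists) {
    alert('Project code "' + projectCode + '" is already in use.');
  }
}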