Design a share, re-share functionality for a website, avoiding duplication - algorithm

This is an interesting interview question that I found somewhere. To elaborate more:
You are expected to design classes and data structures for some website such as facebook or linkedin where your activity can be shared and re-shared. Design should be such that it avoids redundancy and duplication.
While thinking of this problem I was stuck on "link vs copy" problem as discussed here
But since the problem states that duplication should be avoided I decided to go "link" way. This makes sharing/re-sharing easier but deleting very difficult. i.e. if the original user deletes their post all the shares should be deleted. (programmatically speaking all the objects on the pointing to the particular activity should be made null. And this is the difficult part here, i.e. to find all the pointing objects)

Wouldn't it be better to keep the shares? The original user deletes
their post, fine, it's gone. But everyone who has linked to it should
not suddenly have it disappear on them.
This could be done the way Unix handles hard links. "Deleting" just
means removing one link to an object -- an inode, in Unix terms. You
don't remove the object itself until the link count is zero.

It's not obvious from the original specification that deletion should work as you describe. It might be desired that when the original user deletes the item, it is not deleted elsewhere; in that case you don't necessarily need to track all references, just keep a reference count on each post, and remove it from the database only when the count hits zero.
If you do want the behavior you describe, it may be achievable by simply removing broken links as and when you encounter them, again relieving you of the need to track each reference. The cost of tracking and updating every reference to every post is replaced with the comparable cost of one failed lookup for each referring page. The latter case is simpler to implement, though, and the cost doesn't hit your server all at once.
In real life, I would implement all references as bidirectional anyway, because it's likely to be needed sooner or later as you add features. For example, a "like" counter seems pretty simple, but to prevent duplicate votes you need to keep track of who has liked each item, and then if you want to remove their "like" when they delete their profile, you need to keep a list of each user's outbound "likes" too.
It takes a lot of database activity to implement something like Facebook...

Related

How are conflicts handled while synchronizing data in consumer-facing applications?

Let us imagine we have an object D, containing some data. This is modified differently across two different locations, giving rise to data objects D1 and D2. Depending upon the contents, D1 and D2 may be in conflict with each other when being merged back as part of a synchronization process.
Systems such as version control systems simply point out that the two data objects are in conflict with each other and leave it upon the user to manually resolve the conflict.
However, let us now imagine a consumer-facing application, such as a note-taking application that synchronizes contents online. In this case, no user will want to manually resolve conflicts that may have arisen due to the user typing out two versions of the same note with different contents. Discarding the older object for the newer object isn't possible either, since there may be valuable content in the older object that the user wants.
How should I go about resolving such conflicts in a consumer-facing application?
Well, if you don't want manual conflict resolution, then you will have to automatically merge changes from both updates.
There is no way that works well for all applications. When you have a requirement like this, you have to carefully design the application so that automatic merging makes sense.
There are a few common approaches, and you can do one or all of them in various combinations:
1) Merge updates really fast. Think google docs -- updates are merged in real time as people edit. Operational Transformation (https://en.wikipedia.org/wiki/Operational_transformation) is a good way to understand exactly how to do that kind of merging, but it doesn't have to be as complicated as that doc. The reason this works well is that updates are small and you can tell if someone is messing with your stuff before you put a lot of work into it. Politeness fixes conflicts -- one of you will wait until the other is done with that stuff.
2) Locking. If you click the edit button on a note, make lock it so that nobody else can edit it until you're done, etc. This is old-school, and not nearly as slick as (1), but it can work in situations where you can't merge fast enough to do (1).
3) Design your data model and interface to make merged versions as nice as possible. If anyone can add notes, but a note can only be edited by its owner, then no problem, for example. Or maybe you can only edit my stuff if you ask permission first and I give it to you. As things get more complicated than that, this becomes increasingly difficult. It's not usually possible to do this well if you're not willing to make sacrifices in application functionality. You've got one thing on your side, though: It's rude to mess with someone else's work, so a lot of the things you can do look like you did them just to enforce good behavior, and users will thank you for them if you did it with finesse.

How to keep your distributed cache clean?

In a N-Tier architecture, what would be the best patterns to use so that you can keep your cache clean?
I know it's easy to just set an absolute/sliding timeout, but is there a better mechanism available to allow you to mark your cache as dirty after you update the underlying persistence.
The difficulty I"m trying to wrap my head around is that Cache are usually stored as KVP. But a query is usually a fair bit more complex than that. So how can the gateway service tell the cache store that for such and such query, it needs to refetch from persistence.
I also can't afford to hand-code the cache update per query. I'm looking for a more systematic approach.
Is this just a pipe dream, or is there some way to do this elegantly?
Link/Guide/Post appreciated.
I have worked with AppFabric and I think tried to do what you are asking about. I was working on an auction site and I wanted to pro-actively invalidate items in the cache.
For example, we had listings (things for sale) and they would be present all over the cache (AppFabric). The data that represented a listing was in 10 different places. What I initially wanted was a way to say, "Ok, my listing has changed. Let me go find everywhere it exists in cache, and then update." (I think you say "mark as dirty" in your question)
I found doing this was incredibly difficult. There are tags in AppFabric that I tried to use, so I would mark a given object (or collection of objects) with a tag and that would let me query the cache and remove items. In other words, if an object had a LISTING tag, I would find it and invalidate it.
Eventually I settled on a two-pronged attack.
For 95% of the data I let it expire. It was a happy day when I decided this because everything got much easier to develop. I had to make some concessions in the UI etc., but it was well worth it.
For the last 5% of the data I resolved to only ever store it once. For example, a bid on a listing. Whenever a new bid came in, we'd pro-actively invalidate that object, and then everything that needed that information would be updated as well.

Implementation of Unsaved Changes Detection

It seems like three ways to approach detecting unsaved changes in a text/image/data file might be to:
Update a boolean flag every time the user makes a change or saves, which would result in a lot of unnecessary updates.
Keep a cached copy of the original file and diff the two every time a save operation needs to be checked.
Keep a stack of all past operations and push/pop operations as needed, resulting in a lot of extra memory usage.
In general, how do commercial applications detect whether unsaved changes exist and what are the advantages/disadvantages of each approach? I ran into this issue while writing a custom application that has special saving behavior and would like to know if there is a known best practice.
As long as you need an undo/redo system, you need that stack of past operations. To detect in wich state the document is, an item of the stack is set to be the 'saved state'. Current stack node is not that item, the document is changed.
You can see an example of this in Qt QUndoStack( http://doc.qt.nokia.com/stable/qundostack.html ) and its isClean() and setClean()
For proposition 1, updating a boolean is not something problematic and take little time.
This depends on the features you want and the size/format of the files, I guess.
The first option is the simplest, and it gives you just what you want with minial overhead.
The second option has the advantage that you can detect when changes have been manually reverted, so that there is no real change after all (although that probably doesn't happen all too often). On the other hand it is much more costly to make a diff just to check if anything was modified. You probably don't want to do that everytime the user presses a key.
The third option gives the ability to provide an undo-history. You could limit the number of items in that history by grouping changes together that were made consecutively (without moving the cursor in between), or something like that.

UX question: is better to have "serious delete" or have "trash"

I am developing an application that allows for a user to manage some individual data points. One of the things that my users will want to do is "delete" but what should that mean?
For a web application is it better to present a user with the option to have serious delete or to use a "trash" system?
Under "serious delete" (would love to know if there is a better name for this...) you click "delete" and then the user is warned "this is final and tragic action. Once you do this you will not be able to get -insert data point name here- back, even if you are crying..."
Then if they click delete... well it truly is gone forever.
Under the "trash" model, you never trust that the user really wants to delete... instead you remove the data point from the "main display" and put into a bucket called "the trash". This gets it out of the users way, which is what they usually want, but they can get it back if they make a mistake. Obviously this is the way most operating systems have gone.
The advantages of "serious delete" are:
Easy to implement
Easy to explain to users
The disadvantages of "serious delete" are:
it can be tragically final
sometimes, cats walk on keyboards
The advantages of the "trash" system are:
user is safe from themselves
bulk methods like "delete a bunch at once" make more sense
saves support headaches
The disadvantages of the "trash" system are":
For sensitive data, you create an illusion of destruction users think something is gone, but it is not.
Lots of subtle distinctions make implementation more difficult
Do you "eventually" delete the contents of the trash?
My question is which one is the right design pattern for modern web applications? How does an "archive" function work into this? That is how gmail works. Give enough discussion to justify your answer... Would love to be pointed towards some relevant research.
-FT
Here is a relevant article you might like. It's more focused on business scenarios, but it can apply anywhere.
Don't Delete, Just Don't.
In my personal idiosyncratic view, every irreversible action is a serious design error. These “Are you REALLY, ABSOLUTELY, POSITIVELY sure?” message boxes are pure crap because the user is very quickly conditioned to just click “OK” and be done with it. In fact, I have lost important data despite such dialogs on several occasions.
In other words: these dialog boxes do not add a working fail-safe barrier, just a usability barrier.
Even trash systems have this problem (they just postpone the moment); the best solution from a usability point of view is an infinite history. Of course, implementing this may carry forbidding costs (e.g. in terms of memory usage).
Permanently deleting sensitive data can (and should!) by all means be implemented as a much more complicated action: it is rarely needed. The only issue is to make it clear to the user that data is normally not lost. Using a history instead of the trash can may help here: the trash can could be misunderstood as being a permanent delete while a (visible) history of actions doesn’t give this illusion of safety.
I think you pretty well summarized the pros and cons for the trash model:
Pro: Overall, better for the user.
Con: Overall, easier for the developer.
For a usability specialist like myself, it’s a no brainer. As far as I’m concerned developers are supposed to work hard to make life easier for users. Frankly, until web apps have Undo capabilities like trash, I don’t think we can consider them usable. Time that web apps catch up to 1984.
A few other details:
Another advantage of the trash model is the capacity to eliminate the need for confirmation message in most cases. The vast majority of the time, the user is deleting something intentionally, so putting an extra step in with a confirmation message is usually adding work for them. Worse, they get in the habit of smacking OK quickly, which generalizes to other confirmations they really better read. See http://alistapart.com/articles/neveruseawarning.
Part of the beauty of the trash metaphor is that it suggests to user that deleting is not permanent. Like a physical trash can, objects can be retrieved. If you feature the trash can obviously enough on your page (so users realize they can open it), and include an image of the trash can as an icon next to the Delete menu item, that should be enough to indicate that deleting doesn’t destroy, but rather moves things to the trash.
It’s probably acceptable to destroy the oldest deletions in the trash if there are performance considerations. That is once again consistent with the trash can metaphor: users can anticipate that sooner or later “someone” is going to empty the trash. I don’t think users will care if the trash never empties. If they need to truly destroy something sensitive, you can provide an explicit procedure for it (e.g., deleting something that’s already in the trash).
My opinion is that there really is no reason to hard-delete anything that could be of importance to someone. In my experience, the pain of having to do a tape backup because a user realized they deleted something last week that they needed is easily resolved by using an 'IsActive' or 'IsDeleted' flag on records that a user could soft-delete.
Totally context-dependent.
If it's an easily reversible action, just delete it, you might not even need a confirmation
If you might be losing lots of data, a "temporary deleted items storage" might well be in place
I don't see that the presentation of the application (Web App or not) is a major driver for this decision. I would be more concerned with the value of the data, its ease of recreation and the costs of keeping and housekeeping it.
My feeling is that when your app is acting as a respository of valuable data then delete-via-trash is preferable.

UI hints that prevent user errors [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 4 years ago.
Improve this question
What UI/GUI guidelines should be followed that subtly (or not so subtly) direct users so they don't shoot themselves in the foot.
For instance, you might want to give power users the ability to "clean" a database of infrequently used records, but you don't want a new user to try out that option if they've just spent hours entering new records - they may lose them all because they're 'infrequently used'. Please don't address this specific issue - it's just here to clarify the question.
While one could code a bunch of business logic in place to prevent some issues, you can't account for everything a user might do.
What are some common techniques, tips, and tricks that prevent improper usage?
ie, How should I design the interface to alert users that a function or action is to be taken with care
What should I design in that limits risk and exposure if a poor action is taken?
-Adam
Everything can be undone. Don't erase - deactivate. Back up before every destructive operation, and give the user a way to restore.
That's the path. It's hard to follow it all the way, but it's what you're aiming for.
Make it possible to undo dangerous actions.
If it's a reasonably big application or system, require separate admin access for dangerous operations as well.
Don't Make Me Think
And you can, in fact you HAVE to account for everything they might do. Because you (as the designer) as the one who gives them the ability to do all those things.
Before putting ANY item on a gui as yourself "Can this be misused?" and if it can, you might want to go with a lower level of customizability.
Example hierarchy
Button - Can be clicked.
T/F Radio Button (mandatory) - Only two options.
Combo Box - Many options, possibly "no option". more confusing.
Text field - Myriad of wildly inconsistent options. More confusing for user, more dangerous for coder.
Basically, if the user doesn't need extra options, then don't give them extra options. You'll only confuse them.
This is an old article, but it's still a great one:
Microsoft Inductive User Interface Guidelines
Never rely on anything that says "Are you sure?" The user is ALWAYS sure and that's if they even bothered to read it before dismissing.
Partition your users and have fine-grained permissions.
Define some power-user permissions that enable the "more dangerous" operations.
Power user permission is not given out lightly -- only to actual power users -- and revoked readily.
I'm of the school of thought that, in case of inclarities or ambiguousity, the user is rarely wrong, and the UI is always to blame. So, when you say "punch the user in the face", tagged with "pebkac", I'm thinking that you would do good with a slap in the face.
Unfortunately, I'm unable to give any good UX-advice, since I'm a mere programmer, and therefore more or less by definition disqualify as a good UI-designer. I'd like just to point out the possibility, that you actually could be the one who needs to get a clue, and try to be more humble towards the users.
Edit for Adam:
The little I know about UX is how little I know about it. It's an entire career path. I know for a fact that there's very little anyone can learn by asking a single make-me-good-at-this question at Stack Overflow. It's like me asking "help me write better code", with the body text formulated as a story of how my colleagues ridicule me of my code.
We, programmers, are engineers. We like order and reason and logical decisions. But the average user is not a programmer, not an engineer and, in many cases, not interested in computers themselves the very least bit.
I'm glad that people are giving you nuggets of good advice, and I'm glad that you, contrary to my first impression (I'm sorry about that), are eager to take those bits and understand the needs of the user.
But the point remains: You need to buy books (Don't Make Me Think is a great place to start, as already recommended). You need to watch how people use your software. You need to observe where they stumble, and jig things around until your UIs seem natural.
I'm sorry I still can't give you an answer. Because I don't have it. And even if I would have it, I would probably have to charge you 50EUR an hour, for years into the future.
Make the results of the user's action visible and offer a way to undo those changes.
When the changes are visible, then the user gets feedback of whether the results were what he intended to do, and if they are not, then the possibility to undo will let the user to try again to reach his goal. If possible, make the results of the action visible before the user invokes the action (for example, when dragging some element, show what would happen if the user would release the mouse button, for example visualize addition of the element where it will be moved to and visualize the removal of the element from where it was moved from).
There are a couple of types of undo. The most simple is a single-step undo (as in Notepad), but it is often not enough. Better is a multi-step undo (as in Word), which covers most of the cases, but does not allow undoing a specific action without undoing all the actions that have been done after it. That can be solved by object-specific undo, for example in a form with many fields (or cells in a grid like in Excel), right-clicking the field would show a list of previous values in that field. For deleted data you could have a store of deleted data, from where the user can restore things after deleting them (for example if the user deletes a slide in Powerpoint). And finally you could have a full version history of every change, for example as Local History works in IntelliJ IDEA - make a history entry every time the file is saved (and save everything automatically after a couple seconds of inactivity).
Confirmation dialogs don't help. The user might read it the first time, but soon after that clicking "OK" in the dialog becomes an automated process, and the user will press Enter before the dialog even shows up. Then the confirmation dialog has become just a source of unnecessary mechanical work. The user is always sure about doing some action, even when he is wrong - otherwise he would not have done that action.
Well there are a few different ways that I can/do go about these types of things.
User documentation - first and foremost give them some documentation to work with, and make the systems easy for them to use. Just general usability and descriptive names/actions for everything.
Provide confirmation screens with warnings. Full disclosure of what the action is going to do, with the warnings inside of a yellow box. It draws attention to it and helps prevent the need for the other items.
Have a roll-back plan. For large risky operations you can either simply set "deleted" flags, or offload the data to a temporary "recycle bin" of sorts should they accidentally remove/modify data that was unintended.
Require multiple approvals, for data purge operations especially go to a two-tiered approach, requiring approval from separate users.
These are just a few of the ideas that I have.
Two things immediately come to mind.
The first is the notion of progressive disclosure, i.e., only show users what they need in order to accomplish the task at hand. How many UIs have we seen that have hundreds of controls on a single dialog? Divide the controls into their respective tasks and only allow the user to do a single task at a time. An Advanced button on a dialog is one way to implement this, and this concept has the added benefit of separating the power users from the run-of-the-mill users. Run-of-the-mill users are less likely to attempt a task that is likely to be beyond their skill level.
The second is to leverage the wizard concept for complicated tasks. I know wizards have fallen out of style, but if a task is truly complicated, users usually appreciate having their hands held the first few times. A good example of this is the WinZip wizard interface. If you've never zipped a file before, this wizard uses a logical progression to walk you through the process. And then, once you've grown comfortable with it, you can switch to the classic interface to zip files more quickly.
Of course to do all of this requires a committment not only by the developers, but by management. And that, sadly, is where many of these usability battles are lost.

Resources