In StarTeam, how can I find out when a file was deleted, and by whom?

We have a small team running StarTeam. A constant source of frustration and problems is the handling of deleted files. StarTeam obviously keeps track of deleted files internally, but it does not seem to be possible to get any information about a file deletion.
So far, my only way to find the timing of a delete is to perform a manual binary search using the 'compare' views. Is there any better way? (A query on 'delete time' never seems to pick up any files.)

The Audit tab (just to the right of File, ChangeRequest, etc.) is probably your best bet if you're just looking for who deleted what and when. The Audit tab also provides information about when items and folders were created, shared, or moved, as well as when View labels are attached/detached. Whenever someone has files unexpectedly appear or disappear, I direct them to the Audit tab first.
There is a server-side configuration setting for the length of time the audit data is retained (30 days by default, I believe). Since it is not retained forever, it isn't a good option for historical data. The number of audits can be quite large in active views.
If you're looking for something more than that or older than your audit retention time, go with Bubbafat's suggestion of the SDK and getDeletedTime/getDeletedUserID.

Comparing views (or rolling back a view to see the item again) is the only way I know how to do this in StarTeam without writing code.
If you are willing to write a little code, the StarTeam API provides the Item.getDeletedTime and Item.getDeletedUserId methods (I believe these showed up in 2006).

Related

GetOpenFileName() custom filtering [duplicate]

Vista introduced an interface, IFileDialog::SetFilter, which allows me to set up a filter that will be called for every potential filename to see if it should be shown to the user.
Microsoft removed that in Windows 7, and didn't support it in XP.
I am trying to customize our Open File dialog so that I can control which files are displayed to the end user. These files are marked internally with a product code - there isn't anything in the filename itself to filter on (hence file-extension filters are no use here; I need to actually interrogate each file to see whether it falls within the extra filter parameters that our users specified).
I would guess that Microsoft removed the SetFilter interface because too often it was too slow. I can imagine all sorts of similar ideas to this one which don't scale well for networks and cloud storage and what have you.
However, I need to know if there is an alternative interface that accomplishes the same goal, or if I really am restricted to only looking at the file extension for filtering purposes in my File dialogs?
Follow-up:
After looking further into CDN_INCLUDEITEM, which requires the pre-Vista version of OPENFILENAME, I have found that it is the most useless API imaginable: it only filters NON-filesystem objects. In other words, you can't use it to filter files. Or folders. The very things one would want to filter 99.99% of the time in a file open or save dialog. Unbelievable!
There is a very old article by Paul DiLascia which offers the technique of removing each offending filename from the list view control each time the list view is updated.
However, I know from bitter experience that the list view can update over time. If you're looking at a large folder (many items) or the connection is a bit slow (heavily loaded server and/or large number of files), then the files are added to the dialog piecemeal. So one would have to filter out offending filenames repeatedly.
In fact, our current customized file open dialog uses a timer to look at the view's list of filenames periodically, to see if any files of a given pattern exist, in order to enable another control. Otherwise it's possible to check for the existence of these files, find none, and a moment later the view updates to have more filenames, with no events sent to your dialog to indicate that the view has changed. In fact, my experience from writing and maintaining code for the common controls file dialogs over the years has been that Microsoft is not very clueful when it comes to how to write such a thing. Events are incomplete, sent at not-useful times, repeated when not necessary, and whole classes of useful notifications don't exist.
Sadly, I think I might have to give up on this idea, unless someone has a thought as to how I might keep up with the view spontaneously changing while the user is trying to interact with it (i.e. it would be awkward to delete entries from the list view and disturb the user's visual position, highlighted files, scroll position, etc.).
You need to initialise the callbacks for your CFileDialog, and then process the CDN_INCLUDEITEM notification code to include or exclude items.
You can also check this great article; the author uses some other approaches in addition to callbacks.
As you have already discovered, starting in Windows 7 it is no longer possible to filter out files from being displayed based on content, only file extension. You can, however, validate that the user's selected file(s) are acceptable to you before allowing the dialog to close, and if they are not then display a message box to the user and keep the dialog open. That is the best you will be able to do unless you create your own custom dialog.
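For the record, here is a minimal sketch of that validate-before-close approach with the legacy OPENFILENAME API: a hook procedure watches for CDN_FILEOK, and rejecting that notification keeps the dialog open. IsAcceptableFile is a hypothetical stand-in for the product-code check described in the question.

    #include <windows.h>
    #include <commdlg.h>

    // Hypothetical stand-in: open the file and inspect its product code.
    static bool IsAcceptableFile(const wchar_t* path)
    {
        // ... read the file header and compare against the user's filter ...
        return true; // placeholder
    }

    static UINT_PTR CALLBACK OfnHook(HWND hDlg, UINT msg, WPARAM, LPARAM lParam)
    {
        if (msg == WM_NOTIFY)
        {
            const OFNOTIFYW* notify = reinterpret_cast<OFNOTIFYW*>(lParam);
            if (notify->hdr.code == CDN_FILEOK)
            {
                // hDlg is the hook's child dialog; the real dialog is its parent.
                wchar_t path[MAX_PATH] = L"";
                SendMessageW(GetParent(hDlg), CDM_GETFILEPATH,
                             MAX_PATH, reinterpret_cast<LPARAM>(path));
                if (!IsAcceptableFile(path))
                {
                    MessageBoxW(hDlg, L"This file cannot be opened here.",
                                L"Open", MB_OK | MB_ICONWARNING);
                    // Nonzero DWLP_MSGRESULT plus a nonzero return value
                    // rejects the OK and keeps the dialog open.
                    SetWindowLongPtrW(hDlg, DWLP_MSGRESULT, 1);
                    return 1;
                }
            }
        }
        return 0;
    }

    bool ShowFilteredOpenDialog(HWND owner, wchar_t* buffer, DWORD bufferChars)
    {
        buffer[0] = L'\0';
        OPENFILENAMEW ofn = { sizeof(ofn) };
        ofn.hwndOwner = owner;
        ofn.lpstrFile = buffer;
        ofn.nMaxFile  = bufferChars;
        ofn.Flags     = OFN_EXPLORER | OFN_ENABLEHOOK | OFN_FILEMUSTEXIST;
        ofn.lpfnHook  = OfnHook;
        return GetOpenFileNameW(&ofn) != FALSE;
    }

Note that supplying a hook forces the pre-Vista style of dialog, which may be an acceptable trade-off given the constraints above.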

Event sourcing - delete event related files

I'd like to store some of my data in relatively big files (a few GBs per file). I'd like to use event sourcing and save events related to these files, e.g. FileCreated: title, description, timestamp, author, personal, encryptionkey, etc. After a while some of the files won't be needed any longer, and they take up a lot of space, so in order to free up space I need to delete them. Doing so is problematic, because I will have the history in the event storage, but not the file in the filesystem. Is there any way to keep integrity and somehow delete both? Or is there a best practice for this problem?
Since I did not get an answer, I'll try to answer this myself.
It is possible to remove an event from the history: you create a new event storage and filter out the events for the aggregate id you want to get rid of. After you are done, you can switch to the new event storage and remove the old one. You probably need to replay projections as well, so it is very similar to a whole migration and takes a lot of time. In my case that is not a problem if I need to do it only once a year or so. Another problem with storing this data in the event storage is that I either stream it from there or duplicate it in order to serve it. The latter is not always a good solution, because sometimes copying takes too much time, and in order to save the data you need to stream it anyway, otherwise you will run out of memory very fast. So the event storage should support streaming attachments.
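A minimal sketch of that rewrite, assuming an append-only log with one tab-separated event per line and the aggregate id in the first field (an illustrative toy format, not a real event-store API):

    #include <filesystem>
    #include <fstream>
    #include <string>

    // Copy the event log line by line, dropping every event that belongs
    // to the aggregate being purged.
    void purgeAggregate(const std::filesystem::path& oldLog,
                        const std::filesystem::path& newLog,
                        const std::string& aggregateId)
    {
        std::ifstream in(oldLog);
        std::ofstream out(newLog);
        std::string line;
        while (std::getline(in, line))
            if (line.substr(0, line.find('\t')) != aggregateId)
                out << line << '\n';
        // Afterwards: replay projections from newLog, then swap it in
        // for oldLog and delete the old file.
    }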
Another solution is to keep the relatively big data in the files and display something like "404 not found", or "file was removed because of this and that". I see this frequently. In this case it is OK to keep the event in the storage, and you can, for example, add a ContentRemoved event where you record the cause. Another option is to hide the removed file so it won't be listed by the app; that is common too, I guess. This solution has drawbacks as well. Migration is more complex with this approach, because you need to move both the event storage and the files. And if you remove a file by accident, you cannot undo it later unless you have the file in a backup. This can be fixed by delaying the actual file removal by a few days, so you can undo it if you change your mind. Another option is to provide a trash, so files are deleted only by emptying the trash.
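And a sketch of that keep-the-history variant: append a ContentRemoved event, move the blob into a trash folder instead of deleting it, and delete for real only when the trash is emptied. EventStore, removeContent and emptyTrash are all illustrative names, using the same toy log format as above:

    #include <chrono>
    #include <filesystem>
    #include <fstream>
    #include <string>

    namespace fs = std::filesystem;

    struct EventStore
    {
        fs::path logPath;

        // Append-only log; one tab-separated event per line for simplicity.
        void append(const std::string& aggregateId, const std::string& type,
                    const std::string& cause)
        {
            std::ofstream log(logPath, std::ios::app);
            log << aggregateId << '\t' << type << '\t' << cause << '\n';
        }
    };

    // Record ContentRemoved in the history and move the blob to the trash
    // instead of deleting it, so the removal can be undone for a while.
    void removeContent(EventStore& store, const fs::path& file,
                       const std::string& aggregateId, const std::string& cause,
                       const fs::path& trashDir)
    {
        store.append(aggregateId, "ContentRemoved", cause);
        fs::create_directories(trashDir);
        fs::rename(file, trashDir / file.filename());
    }

    // Physically delete trashed blobs older than maxAge ("emptying the trash").
    void emptyTrash(const fs::path& trashDir, std::chrono::hours maxAge)
    {
        const auto now = fs::file_time_type::clock::now();
        for (const auto& entry : fs::directory_iterator(trashDir))
            if (now - entry.last_write_time() > maxAge)
                fs::remove(entry.path());
    }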
I think both solutions are worth considering, and which one is better suited probably depends on the actual project.

How are conflicts handled while synchronizing data in consumer-facing applications?

Let us imagine we have an object D, containing some data. This is modified differently across two different locations, giving rise to data objects D1 and D2. Depending upon the contents, D1 and D2 may be in conflict with each other when being merged back as part of a synchronization process.
Systems such as version control systems simply point out that the two data objects are in conflict with each other and leave it upon the user to manually resolve the conflict.
However, let us now imagine a consumer-facing application, such as a note-taking application that synchronizes contents online. In this case, no user will want to manually resolve conflicts that may have arisen due to the user typing out two versions of the same note with different contents. Discarding the older object for the newer object isn't possible either, since there may be valuable content in the older object that the user wants.
How should I go about resolving such conflicts in a consumer-facing application?
Well, if you don't want manual conflict resolution, then you will have to automatically merge changes from both updates.
There is no way that works well for all applications. When you have a requirement like this, you have to carefully design the application so that automatic merging makes sense.
There are a few common approaches, and you can do one or all of them in various combinations:
1) Merge updates really fast. Think Google Docs -- updates are merged in real time as people edit. Operational Transformation (https://en.wikipedia.org/wiki/Operational_transformation) is a good way to understand exactly how to do that kind of merging, but it doesn't have to be as complicated as that doc (a toy version of the transform step is sketched after this list). The reason this works well is that updates are small and you can tell if someone is messing with your stuff before you put a lot of work into it. Politeness fixes conflicts -- one of you will wait until the other is done with that stuff.
2) Locking. If you click the edit button on a note, lock it so that nobody else can edit it until you're done, etc. This is old-school, and not nearly as slick as (1), but it can work in situations where you can't merge fast enough to do (1).
3) Design your data model and interface to make merged versions as nice as possible. If anyone can add notes, but a note can only be edited by its owner, then there's no problem, for example. Or maybe you can only edit my stuff if you ask permission first and I grant it. As things get more complicated than that, this becomes increasingly difficult, and it's not usually possible to do it well unless you're willing to make sacrifices in application functionality. You've got one thing on your side, though: it's rude to mess with someone else's work, so a lot of the restrictions you impose look like they exist just to enforce good behavior, and users will thank you for them if you do it with finesse.
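As an illustration of the transform step behind option (1), here is a toy Operational Transformation for two concurrent inserts into the same note. Real systems also handle deletes and operation composition; the site field is the usual deterministic tie-break for equal positions (all names here are illustrative):

    #include <cstddef>
    #include <string>

    // One edit: insert `text` at `pos` in the shared note.
    struct Insert
    {
        std::size_t pos;
        std::string text;
        int site; // replica id, used only to break ties deterministically
    };

    void apply(std::string& doc, const Insert& op)
    {
        doc.insert(op.pos, op.text);
    }

    // Rewrite opB so it can be applied after opA while preserving its
    // intent: if opA inserted at or before opB's position (ties broken
    // by site id), opB's position shifts right by the inserted length.
    Insert transform(Insert opB, const Insert& opA)
    {
        if (opA.pos < opB.pos || (opA.pos == opB.pos && opA.site < opB.site))
            opB.pos += opA.text.size();
        return opB;
    }

    // Replica 1 applies opA, then transform(opB, opA); replica 2 applies
    // opB, then transform(opA, opB). Both converge to the same document.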

Design a share, re-share functionality for a website, avoiding duplication

This is an interesting interview question that I found somewhere. To elaborate more:
You are expected to design classes and data structures for some website such as facebook or linkedin where your activity can be shared and re-shared. Design should be such that it avoids redundancy and duplication.
While thinking of this problem I was stuck on "link vs copy" problem as discussed here
But since the problem states that duplication should be avoided, I decided to go the "link" way. This makes sharing/re-sharing easier but deleting very difficult, i.e. if the original user deletes their post, all the shares should be deleted too (programmatically speaking, all the objects pointing to the particular activity should be made null - and this is the difficult part here, i.e. finding all the pointing objects).
Wouldn't it be better to keep the shares? The original user deletes their post, fine, it's gone. But everyone who has linked to it should not suddenly have it disappear on them.
This could be done the way Unix handles hard links. "Deleting" just means removing one link to an object -- an inode, in Unix terms. You don't remove the object itself until the link count is zero.
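A minimal in-memory sketch of that hard-link idea, using std::shared_ptr's reference count as the link count (in a real site this would be a refcount column in the database; Activity and Share are illustrative):

    #include <memory>
    #include <string>
    #include <vector>

    // The activity is stored once; every share (including the original
    // post) is just a link to it, and the object goes away only when the
    // last link does -- exactly like an inode's link count.
    struct Activity
    {
        std::string author;
        std::string content;
    };

    struct Share
    {
        std::string owner;                  // whose timeline this entry is on
        std::shared_ptr<Activity> activity; // one "link" to the activity
    };

    int main()
    {
        std::vector<Share> timeline;
        auto post = std::make_shared<Activity>(Activity{"alice", "hello"});
        timeline.push_back({"alice", post}); // the original post
        timeline.push_back({"bob", post});   // a re-share: a link, not a copy
        post.reset();                        // drop the temporary handle

        // "Alice deletes her post": remove her link only. Bob's share
        // still resolves; the Activity is destroyed with the last Share.
        timeline.erase(timeline.begin());
    }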
It's not obvious from the original specification that deletion should work as you describe. It might be desired that when the original user deletes the item, it is not deleted elsewhere; in that case you don't necessarily need to track all references, just keep a reference count on each post, and remove it from the database only when the count hits zero.
If you do want the behavior you describe, it may be achievable by simply removing broken links as and when you encounter them, again relieving you of the need to track each reference. The cost of tracking and updating every reference to every post is replaced with the comparable cost of one failed lookup for each referring page. The latter case is simpler to implement, though, and the cost doesn't hit your server all at once.
In real life, I would implement all references as bidirectional anyway, because it's likely to be needed sooner or later as you add features. For example, a "like" counter seems pretty simple, but to prevent duplicate votes you need to keep track of who has liked each item, and then if you want to remove their "like" when they delete their profile, you need to keep a list of each user's outbound "likes" too.
It takes a lot of database activity to implement something like Facebook...

Implementation of Unsaved Changes Detection

It seems like three ways to approach detecting unsaved changes in a text/image/data file might be to:
Update a boolean flag every time the user makes a change or saves, which would result in a lot of unnecessary updates.
Keep a cached copy of the original file and diff the two every time a save operation needs to be checked.
Keep a stack of all past operations and push/pop operations as needed, resulting in a lot of extra memory usage.
In general, how do commercial applications detect whether unsaved changes exist and what are the advantages/disadvantages of each approach? I ran into this issue while writing a custom application that has special saving behavior and would like to know if there is a known best practice.
As long as you need an undo/redo system, you need that stack of past operations. To detect which state the document is in, one item of the stack is marked as the 'saved state'. If the current stack node is not that item, the document has changed.
You can see an example of this in Qt's QUndoStack (http://doc.qt.nokia.com/stable/qundostack.html) and its isClean() and setClean() methods.
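A minimal sketch of that clean-state mechanism with Qt (QUndoStack lives in QtWidgets in Qt 5; AppendCommand is an illustrative command, not part of Qt):

    #include <QString>
    #include <QUndoCommand>
    #include <QUndoStack>

    // A trivial undoable edit: append text to a document string.
    class AppendCommand : public QUndoCommand
    {
    public:
        AppendCommand(QString* doc, const QString& text)
            : m_doc(doc), m_text(text) { setText("append"); }

        void redo() override { m_doc->append(m_text); }
        void undo() override { m_doc->chop(m_text.size()); }

    private:
        QString* m_doc;
        QString  m_text;
    };

    bool hasUnsavedChanges(const QUndoStack& stack)
    {
        return !stack.isClean();
    }

    // On every edit:  stack.push(new AppendCommand(&doc, typed));
    // On every save:  stack.setClean();
    // The cleanChanged(bool) signal can drive the window's "modified" marker.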
As for proposition 1, updating a boolean is not problematic and takes very little time.
This depends on the features you want and the size/format of the files, I guess.
The first option is the simplest, and it gives you just what you want with minimal overhead.
The second option has the advantage that you can detect when changes have been manually reverted, so that there is no real change after all (although that probably doesn't happen all too often). On the other hand, it is much more costly to make a diff just to check whether anything was modified; you probably don't want to do that every time the user presses a key. (A cheap hash-based variant is sketched below.)
The third option gives you the ability to provide an undo history. You could limit the number of items in that history by grouping changes that were made consecutively (without moving the cursor in between), or something like that.
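The cheap variant of the second option mentioned above: store a hash of the document at save time instead of a full copy. It still detects "edited back to the original" (modulo hash collisions) without the cost of a diff or a cached file. SaveState is an illustrative name:

    #include <functional>
    #include <string>

    // Remember only a hash of the last-saved contents; compare on demand.
    struct SaveState
    {
        std::size_t savedHash = std::hash<std::string>{}(std::string());

        void markSaved(const std::string& doc)
        {
            savedHash = std::hash<std::string>{}(doc);
        }

        bool hasUnsavedChanges(const std::string& doc) const
        {
            return std::hash<std::string>{}(doc) != savedHash;
        }
    };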
