Subversion very slow - performance

I am working on a project where the branches folder contains at least 300 different branches (copies of the trunk) that will no longer be used. Since SVN is running more and more slowly, I wonder whether deleting those branches will make Subversion operate faster?
Other people on my team say that since the source code will still be on the server, it won't change anything (so the branches stay undeleted).
But I read something about Subversion before (I don't remember where) saying that HEAD is managed a little differently than previous revisions, which could increase the speed of the repository.
Which one of these holds true?

Subversion performance is more related to the load on your server than to the size of the repository. Check disk space and CPU usage, and look into the performance of the web server (or svnserve on Windows).
If you delete the branches, earlier revisions will still contain them, so the data will not actually be removed. The only way to truly remove content is to dump the repository (svnadmin dump) and then use svndumpfilter to remove the branches in question from the dumped content. The resulting content can be loaded into a new repository without the removed branches, and the revision numbers can even be renumbered.
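A rough sketch of that process (the repository paths and branch names here are placeholders, not from the original question):

# dump the whole repository on the server
svnadmin dump /path/to/repo > repo.dump
# strip the unwanted branches; --drop-empty-revs and --renumber-revs clean up revisions that become empty
svndumpfilter exclude branches/dead-branch-1 branches/dead-branch-2 --drop-empty-revs --renumber-revs < repo.dump > filtered.dump
# load the filtered dump into a fresh repository
svnadmin create /path/to/new-repo
svnadmin load /path/to/new-repo < filtered.dump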
I am not aware of the HEAD being handled differently in terms of performance. However, copies of the HEAD (or anything else) are cheap, lightweight copies, and should not affect performance.
Can you provide any additional information on which specific operations are slowing down?

SVN - Steps to get all the files from a repository?

We have an existing repository on the network, accessed via HTTP.
Should I first import these files to my local machine? I tried importing directories, files, etc., but everything is empty in my local folders. It says "success", but nothing ever shows up!
It doesn't make sense to create a repository on my side, but all the tutorials seem to say that; I think they're assuming you're starting from nothing.
My experience with Tortoise SVN has mostly been negative. Typically whatever I think I should do turns out to be incorrect, and I end up having to undo, and redo, or lose my work. Once I even managed to corrupt the main repository and it had to be restored from backup.
I absolutely cannot damage this existing repository!
If you're used to CVS or some older version control systems, note that SVN uses the same terms differently. In those systems, "checkout" often means locking a file in exclusive mode.
In SVN, checkout makes a local working copy, automatically tracks revisions, and helps you merge changes from multiple sources. You don't need to lock a file unless it's a graphics file or some other binary where merging doesn't make sense.
So in TortoiseSVN, you can checkout, and edit the files. The icons on the files will change to indicate their status.
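For example, a first session against an existing repository might look roughly like this from the command line (the URL and local path are made up; TortoiseSVN's Checkout, Update, and Commit menu items map to the same operations):

# make a local working copy of the existing repository - do NOT import
svn checkout http://server/svn/project/trunk C:\work\project
# ...edit files...
svn status                      # see what you have changed
svn update                      # pull in other people's changes
svn commit -m "describe the change"
# only needed for binaries where merging makes no sense
svn lock images/logo.png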
SVN is easy in comparison to git, where the same terms are again redefined and significantly augmented!

What is the "git stash" equivalent for Serena Dimensions?

I have made some changes. I cannot use those changes now. I need to discard them for now and go back to them later when the star alignment is more favorable (e.g. when our Cobol guy has enough time to get to his half of the work).
Short of using Eclipse → Synchronize with team and manually copy pasting the contents to a scratch directory so I can do the merging later, is there any way to "stash" changes for later?
There is no git stash equivalent in Serena Dimensions. The poor man's way is to store your changes temporarily in a different folder, or in a file with a different name that is not added to the source-controlled solution, and to switch back and forth as needed.
Another alternative is to use streams in order to keep your changes under source control without affecting production code; a typical scenario is to have Integration and Main streams. But it depends on your access level to the Dimensions database you are using and on your project's needs.
A git repo can be maintained locally to get this and other git functionality on your own computer (or even for a small team with shared folders or a git server), since it does not interfere with Dimensions as long as you don't store the git metadata in the Dimensions-managed code and vice versa. This is not a straightforward solution and requires that you know how to set up a git repo and take some care when delivering to the Dimensions server, but it works and is really helpful if you are familiar with the git workflow.
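A minimal sketch of that setup, assuming the Dimensions work area is a folder called workarea (the folder and metadata names are assumptions; adjust to your environment):

cd workarea
git init                                  # the repo lives only in .git/, Dimensions never sees it
echo ".metadata/" >> .gitignore           # keep any work-area/IDE metadata out of git (folder name is an assumption)
git add .
git commit -m "baseline of the Dimensions work area"
# later, a git-stash-like workflow on top of the Dimensions files:
git stash                                 # park the half-finished changes
git stash pop                             # bring them back when the other half of the work is ready

Just remember not to deliver the .git folder to the Dimensions server.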
Dimensions is not as friendly as git for this kind of usage, but it is far more robust for larger and more controlled projects.
Git and Dimensions work on different methodologies. After checking out a file, Dimensions only allows you to either commit a new version or discard it. As indicated above, you can still use streams or individual branches for your development work and merge/deliver the changes at a later point in time, without affecting others' work.

Git 2.2.x updates timestamps of old pack files for no good reason

Git 2.2.0 and 2.2.1 seem to modify the timestamps of old .git/objects/pack/pack-*.pack files occasionally, for no good reason.
It just changes the timestamp; the contents are identical.
Debugging this is difficult as it seems to make changes only fairly rarely.
I have never seen anything like this in any Git version before 2.2.0. What is happening, and can I fix it somehow? Because of the pointless timestamp updates, my incremental backups are suddenly picking up large amounts of changes.
Git keeps more information on disk than is absolutely necessary to record everything in the repository. The extra information is kept to accelerate certain operations and/or to avoid having to rewrite files. The algorithm that decides when to delete some of these unnecessary files uses the modification time of the pack files as part of the decision (see find_lru_pack), so mtime acts as part of a cache-like mechanism in git. The modification time of pack files is bumped without modifying the file contents (see the freshen_file function) in order to keep this caching correct and avoid evicting files that are likely to be used again.
If you modify freshen_file in sha1_file to be a no-op, then mtimes should never be modified. However, this leaves you open to potential data loss if a new commit with the same data as an existing object is written just as a garbage collection happens (thanks to the comment below for pointing this out).
Another approach would be to not back up the git repo itself (with its pack files), but to back up bundles (example commands follow this list):
first, you create an incremental bundle or a full bundle of your repo;
second, once created, a bundle is a single file, very easy to back up or copy around (less error-prone than an rsync of multiple files, with their potential date issues);
the process is easily scriptable (my script does incremental or full backups).
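A minimal sketch of that approach (the paths and the tag name are made up):

# full backup: one file containing every ref
git bundle create /backup/myrepo-full.bundle --all
# incremental backup: only what is new since the last backup point
git bundle create /backup/myrepo-incr.bundle lastbackup..master
git tag -f lastbackup master              # remember where this backup ended
# restoring / checking a bundle
git bundle verify /backup/myrepo-full.bundle
git clone /backup/myrepo-full.bundle restored-repo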

"out of memory - terminating application" error when performing an svn merge

We are seeing the following error when trying to perform a command-line svn merge with Subversion 1.6.9 under 32 bit Windows XP.
Out of memory - terminating application.
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
Inspecting Windows Task Manager around this time reveals that the peak memory usage of the svn.exe process is in excess of 1.8 GB.
As an aside, we get the same result when trying to perform the merge using TortoiseSVN.
We are trying to perform the merge from the root level of our repository. The total file size (on a developer machine) of the repository is around 3GB.
This is the first time that we are attempting a root-level merge. Are we hitting an internal svn limit?
Edit
After some trial-and-error investigation I've found that this problem seems to be caused by one specific folder in our repository. This folder contains 1,500 SQL scripts. Performing a merge on just this folder results in the same out of memory error (although it takes longer to blow up).
We were able to fix this issue, though we still don't understand the precise nature of the cause.
As stated in the Edit to my post, we tracked the issue down to a single folder that contained around 1,500 SQL scripts. This folder also had an svn:externals property referencing a single file.
We performed the following steps (rough command-line equivalents are sketched below):
deleted this svn:externals property and did an svn commit
deleted the working copy of the folder (there seems to be an issue whereby, if you remove an svn:externals property that referenced a single file, the file that was pulled into the folder by the external does not get removed on a subsequent svn update)
performed an svn update
When we next attempted an svn merge the command completed successfully.
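For reference, the steps above correspond roughly to these commands (the folder name is a placeholder; on Windows you would delete the folder in Explorer instead of using rm):

cd working-copy/sql-scripts
svn propdel svn:externals .               # remove the externals definition
svn commit -m "remove single-file svn:externals"
cd ..
rm -rf sql-scripts                        # throw away the working copy of the folder
svn update                                # re-creates the folder without the external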
Given the memory usage you're seeing (~1.8GB) prior to termination and the fact you are on 32-bit Windows, which has a 2GB memory limit per process, I'd recommend attempting the merge on a 64-bit machine.
If you don't have a 64-bit machine available, try breaking it up into smaller merges (unless you're reintegrating a branch, then I'm not sure how you'd split that up).
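If you do split it up, one way is to merge revision ranges in chunks rather than everything at once (purely a sketch; the URL and revision numbers are made up):

cd trunk-working-copy
svn merge -r 1000:1500 http://server/svn/repo/branches/feature .
svn commit -m "merge r1000-r1500 from feature"
svn merge -r 1500:2000 http://server/svn/repo/branches/feature .
svn commit -m "merge r1500-r2000 from feature"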
I've dealt with 3GB working copies and lots of merging (always from the root of the branch) - I've never hit a memory issue, but I've also been 64-bit for a long time. It's conceivable that merging a branch with a lot of changes could require a lot of memory, but I'm just speculating.
I suggest posting to the subversion mailing list, and a quick search tells me you already have :) I suspect they will confirm that you need more memory to do a big merge given the size of the repository, but it's also possible something else is going on.
Separate suggestion: search the mailing list for similar problems.
I found a thread about too many mergeinfo properties causing increased memory usage. If you've never merged from the root before, I assume you have a lot of mergeinfos.
UPDATED: It's important to understand what svn:mergeinfo properties are; exercise caution in removing them without some understanding. In Richard's case, the repository never had merges committed from the root of the branches, which means the svn:mergeinfo at the root probably did not contain anything, so removing it all will remove svn's knowledge of what was previously merged. This matters when doing full branch merges (e.g. svn merge url/to/src/branch, where no revision is specified), and may cause Subversion to try to re-merge revisions that were previously merged. Cherry-pick merges (i.e. specifying revisions x, y, z) should not be affected. Even if it's all removed, it isn't the end of the world; you'll just get svn 1.4-style behavior for whole-branch merges that involve branches from before that point in time.
That said, I've had to clean up extraneous subtree mergeinfos multiple times before - just not from the root.
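To see where the mergeinfo actually lives before deciding what to remove (the working-copy path and subtree are placeholders):

# list every path in the working copy that carries svn:mergeinfo
svn propget svn:mergeinfo -R .
# if a subtree's mergeinfo turns out to be extraneous, remove just that one
svn propdel svn:mergeinfo path/to/subtree
svn commit -m "remove extraneous subtree mergeinfo"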
Mergeinfo recommended reading:
Subversion 1.5 Mergeinfo - Understanding the Internals
Where Did That Mergeinfo Come From?
Are you running a 64-bit build of Subversion on 64-bit Windows?
If not, please bear in mind the memory restrictions on 32-bit Windows. A 32-bit process is restricted to at most 3 GB of address space (2 GB by default), so that may be the limit you are hitting.
By the way, the latest TortoiseSVN is 1.6.13.

Getting an infinite "undo stack" without committing to the repository?

Like many programmers, I'm prone to periodic fits of "inspiration" wherein I will suddenly See The Light and perform major surgery on my code. Typically, this works out well, but there are times when I discover later that — due to lack of sleep/caffeine or simply an imperfect understanding of the problem — I've done something very foolish.
When this happens, the next step is to reverse the damage. Most easily, this means the undo stack in my editor… unless I closed the file at some point. Version control is next, but if I made changes between my most recent commit (I habitually don't commit code which breaks the build) and the moment of inspiration, they are lost. They weren't in the repository, so the code never existed.
I'd like to set up my work environment in such a way that I needn't worry about this, but I've never come up with a completely satisfactory solution. Ideally:
A new, recoverable version would be created every time I save a file.
Those "auto-saved" versions won't clutter the main repository. (The vast majority of them would be completely useless; I hit Ctrl-S several times a minute.)
The "auto-saved" versions must reside locally so that I can browse through them very quickly. A repository with a 3-second turnaround simply won't do when trying to scan quickly through hundreds of revisions.
Options I've considered:
Just commit to the main repository before making a big change, even if the code may be broken. Cons: when "inspired", I generally don't have the presence of mind for this; breaks the build.
A locally-hosted Subversion repository with auto-versioning enabled, mounted as a "Web Folder". Cons: doesn't play well with working copies of other repositories; mounting proper WebDAV folders in Windows is painful at best.
As with the previous method, but using a branch in the main repository instead and merging to trunk whenever I would normally manually commit. Cons: not all hosted repositories can have auto-versioning enabled; doesn't meet points 2 and 3 above; can't safely reverse-merge from trunk to branch.
Switch to a DVCS and "combine" all my little commits when pushing. Cons: I don't know the first thing about DVCSes; sometimes Subversion is the only tool available; I don't know how to meet point 1 above.
Store working copy on a versioned file system. Cons: do these exist for Windows? If so, Google has failed to show me the way.
Does anyone know of a tool or combination of tools that will let me get what I want? Or have I set myself up with contradictory requirements? (Which I rather strongly suspect.)
Update: After more closely examining the tools I already use (sigh), it turns out that my text editor has a very nice multi-backup feature which meets my needs almost perfectly. It not only has an option for storing all backups in a "hidden" folder (which can then be added to global ignores for VCSes), but allows browsing and even diffing against backups right in the editor.
Problem solved. Thanks for the advice, folks!
Distributed Version Control (Mercurial, Git, etc.).
The gist of the story is that there are no checkouts, only clones of a repository.
Your commits are visible only to you until you push them back to the main repository.
Want to do radical experimental change? Clone the repository, do tons of commits on your computer. If it works out, push it back; if not, then just rollback or trash the repo.
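A minimal sketch of that workflow with git (the repository URL is made up):

git clone ssh://server/project.git
cd project
# hack away, committing as often as you like; nothing leaves your machine yet
git commit -a -m "wild experimental change, step 1"
git commit -a -m "wild experimental change, step 2"
# it worked out: publish it
git push origin master
# it did not work out: throw the experiment away
git reset --hard origin/master            # or simply delete the clone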
Most editors store the last version of your file before the save to a backup file. You could customize that process to append a revision number instead of the normal tilde. You'd then have a copy of the file every time you saved. If that would eat up too much disk space, you could opt for creating diffs for each change and customizing your editor to sequentially apply patches until you get to the revision you want.
If you use Windows Vista, 7, or Windows Server 2003 or newer, you could use Shadow Copy. Basically, the Properties window for your files will have a new "Previous Versions" tab that keeps track of previous versions of the file.
The service should automatically generate snapshots, but just to be safe you can run the following command right after your moment of "inspiration" (note that the /for argument takes a whole volume, not a folder):
vssadmin create shadow /for=C:
It has definitely saved my ass quite a few times.
Shadow Copy
I think it is time to switch editors. Emacs has a variable version-control, which determines whether Emacs will automatically create multiple backups for a file when saving it, naming them foo.~1~, foo.~2~ etc. Additional variables determine how many backup copies to keep.
