Check in - Check out process/version control for PSDs and Image files - image

The title may not be so clear but the issue I am facing is this:
Our designers are working on large Photoshop files across the network, which causes a number of network traffic and file corruption issues that I am trying to overcome.
The way I want to do this is to have the designers copy the files to their machines (Mac OS X) and work on them locally. But the problem then is that they may forget to copy them back up, or another designer may start work on the version stored on the network.
What I need is a system where the designer checks out the files or folders from the server which locks those files so no other user can copy them until they are checked back in. We do not need to store revisions for the files.
My initial idea was to use SVN or, preferably, Git and somehow force a lock on checkout. Does this sound feasible, or is there a better system?

How big are the files on average? I'm not sure about Git, as I haven't used it, but SVN should be OK. If you do go with SVN, I would trial checking out over HTTP/HTTPS versus a network path to the repo, as you may get a speed advantage from one or the other. When we VPN to our repo at work, it is literally 100 times faster over HTTP than checking out using a network \\path to the repo.

SVN is a good option, but you will have revisions (this is the whole point of SVN). SVN doesn't lock files by default, but you can configure it to do so. See http://svnbook.red-bean.com/nightly/en/svn-book.html#svn.advanced.locking
I don't know git very well, but since it's not a centralized VCS, I'm pretty sure it isn't the right tool for your situation.
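For the Photoshop case, one way to get lock-before-edit behaviour in SVN is the svn:needs-lock property, which keeps a file read-only in the working copy until someone takes the lock with svn lock. A minimal sketch of a client-side configuration that applies the property automatically to newly added files, assuming the files end in .psd (the pattern is only an illustration; adjust it to your file types):

```ini
# ~/.subversion/config on each designer's machine
[miscellany]
enable-auto-props = yes

[auto-props]
# Any newly added .psd file gets svn:needs-lock set
*.psd = svn:needs-lock=*
```

Auto-props only affect files added after the setting is in place; files already in the repository would need svn propset svn:needs-lock '*' <file> followed by a commit. A designer then runs svn lock <file> before editing, and committing the file releases the lock by default.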

Related

How do I stop OneDrive from downloading git.exe on Windows?

I have used Git on Windows for a while, but recently changed the setting and got this.
On almost every command for Git Bash (also on PowerShell and Github Desktop) I get
git.exe is being downloaded on OneDrive
(translation may not be exactly the same)
The setting that changed recently is moving my repos to a OneDrive folder in order to have them synced between two sessions: that is work desktop and remote virtual machine.
I can see that this may not be ideal, but it really works for me since I have the same settings on both sessions, and I'm not really used to doing many commit-push-pull cycles. Not the main topic here, but feel free to comment.
(Edit): Upon reading the solution, there are other ways to set up this syncing that don't mess with the internals of Git. Look for those instead. Thanks.
In any case, the strange thing is that the notifications happen only on the Remote Virtual Machine, but not on the desktop.
I have seen notifications about some files in the repos, which I attribute to OneDrive being nosy about every file I move. But I've also seen files I don't know about, and there's always git.exe attached to the notification.
For the first case, I have tried turning down the notifications for OneDrive. Some might say Microsoft has a history of not letting users configure their notifications, so I'm still looking.
Thanks.
Most file syncing tools like OneDrive and Dropbox operate by syncing data file by file. This is a great approach if you're working on a single word-processing document or spreadsheet. However, it's not as great when you're working with a Git repository.
When changing between branches or making a commit, Git changes and creates a lot of files all at once. In order to be synced correctly, all of the created files must be written in a similar order: all the blobs must be written, then the trees, then the commits, and then the refs can be updated. If you do this out of order, your repository can be corrupted, since you can have branches that refer to objects that don't exist (or objects that refer to other objects that don't exist).
In addition, these tools can end up deleting files you wanted to have in your working tree or recreating files you didn't. So overall, you don't want to sync any Git repository using one of these tools.
You can write a bundle file with git bundle and sync that, or you can use rsync to sync a repository provided it's idle (not being modified) when you do. Note that if you sync a working tree, Git will need to refresh all files when you sync it across to the new machine, and also Git doesn't try to defend against untrusted users who have access to the working tree.
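The bundle approach can be sketched as follows. This is only an illustration in a throwaway temp directory: the paths, file names and commit identity are made up, and in practice the bundle file would live in the synced folder while the repositories stay outside it.

```shell
set -eu
tmp=$(mktemp -d)

# A small repository standing in for your real one
git init -q "$tmp/work"
cd "$tmp/work"
echo "hello" > notes.txt
git add notes.txt
git -c user.name="Example" -c user.email="example@example.invalid" \
    commit -q -m "Initial commit"

# Pack every ref and its history into one file; this single file is what gets synced
git bundle create "$tmp/repo.bundle" --all

# Check that the bundle is complete and valid
git bundle verify "$tmp/repo.bundle"

# On the other machine: clone from the bundle (git fetch works against it as well)
git clone -q "$tmp/repo.bundle" "$tmp/restored"

original_head=$(git rev-parse HEAD)
restored_head=$(git -C "$tmp/restored" rev-parse HEAD)
echo "original: $original_head"
echo "restored: $restored_head"
```

Because the sync tool only ever sees one file, it cannot upload objects and refs in the wrong order.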
It's also not a good idea to sync your Git installation itself via OneDrive, which is what it sounds like might be happening. Instead, install Git for Windows on each machine independently and don't try to sync it across. OneDrive should have configuration options that let you control what's synced.

SVN - Steps to get all the files from a repository?

We have an existing repository on the network, accessed via HTTP.
Should I first import these files to my local machine? I tried importing directories, files, etc., everything is empty in my local folders. It says "success", but nothing ever shows up!
It doesn't make sense to create a repository on my side. But all the tutorials seem to say that, but then I think they're assuming you're starting from nothing.
My experience with Tortoise SVN has mostly been negative. Typically whatever I think I should do turns out to be incorrect, and I end up having to undo, and redo, or lose my work. Once I even managed to corrupt the main repository and it had to be restored from backup.
I absolutely cannot damage this existing repository!
If you're used to CVS or some older version control systems, note that SVN uses the same terms differently. In those, checkout often means lock in exclusive mode.
In SVN checkout will make a copy and automatically manage the revisions and help you merge from multiple sources. You don't need to lock a file, unless it's graphical or some other binary where merging doesn't make sense.
So in TortoiseSVN, you can checkout, and edit the files. The icons on the files will change to indicate their status.
SVN is easy in comparison to git, where the same terms are again redefined and significantly augmented!

Mercurial: is it possible to compress .hg folder to several large BLOBs?

Issue: cloning mercurial repository over network takes too much time (~ 12 minutes). We suspect it is because .hg directory contains a lot of files (> 15 000).
We also have git repository which is even larger, but clone performance is quite good - around 1 minute. Looks like it's because .git folder which is transferred over network has only several files (usually < 30).
Question: does Mercurial support "repository compressing to single blob" and if it does how to enable it?
Thanks
UPDATE
Mercurial version: 1.8.3
Access method: SAMBA share (\\server\path\to\repo)
Mercurial is installed on Linux box, accessed from Windows machines (by Windows domain login)
Mercurial uses some kind of compression to send data over the network (see http://hgbook.red-bean.com/read/behind-the-scenes.html#id358828), but by using Samba, you totally bypass this mechanism. Mercurial thinks the remote repository is on a local filesystem, and the mechanism used is different.
The linked documentation clearly says that the data is compressed as a whole before being sent:
This combination of algorithm and compression of the entire stream
(instead of a revision at a time) substantially reduces the number of
bytes to be transferred, yielding better network performance over most
kinds of network.
So you won't have the problem of 15,000 files if you use a "real" network protocol.
BTW, I strongly recommend against using something like Samba to share your repository. This is really asking for various kinds of problems:
lock problems when multiple people attempt to access the repository at the same time
file permission problems
file stats problems
problems with symlink management if used
You can find information about publishing repositories on the wiki: PublishingRepositories (where you can see that Samba is not recommended at all).
And to answer the question: AFAIK, there's no way to compress the Mercurial metadata or otherwise reduce the number of files. But if the repository is published correctly, this won't be a problem anymore.
You could compress it to a blob by creating a bundle:
hg bundle --all \\server\therepo.bundle
hg clone \\server\therepo.bundle
hg log -R therepo.bundle
You do need to re-create or update the bundle periodically, but creating the bundle is fast and could be done in a post-changeset hook on the server, or nightly. Any remaining changesets can be fetched by pulling from the repo after cloning from the bundle, provided you set [paths] correctly in .hg/hgrc.
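The [paths] setup mentioned above might look like the following sketch. The URL is a placeholder, not the asker's real server, and it assumes the repository is also reachable over SSH or HTTP rather than only through the share:

```ini
# .hg/hgrc inside the clone that was made from the bundle
[paths]
# Placeholder URL: point this at the real repository, not at the bundle file
default = ssh://user@server/path/to/repo
```

With default pointing at the live repository, a plain hg pull -u fetches whatever changesets were added after the bundle was created.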
So, to answer your question about several blobs, you could create a bundle every X changesets, and have the clients clone/unbundle each of those. (However, having a single one updated regularly + a normal pull for any remaining changesets seems easier...)
However, since you're running Linux on the server anyway, I suggest running hg-ssh or hgweb.cgi. That's what we do, and it works well for us (with Windows clients).

DropBox as Version Control and Offsite Backup

After reading Michael Lopp's book "Being Geek," I started using Dropbox as a means of synchronizing files between my home computer and work computer. It's been fantastic, it really makes it painless to keep track of the latest version of files you're working on.
My question has to do with people's experience with this tool, especially programmers who may have used it to develop larger projects.
Right now, I see 3 main uses of Dropbox:
1. synchronize files between home and work computers
2. version control (you have to log into the dropbox site to access previous versions)
3. off-site backup
Right now I'm using it as my main backup tool, which I'm not sure is a good idea. But right now I have a local (working) copy of my entire project "checked out" on each computer (my home laptop and my work computer), and additionally, my entire project is kept on the dropbox site. So I'm thinking, if anything happens to one of my computers, or both, I'll still have that off-site backup available and I'll simply have to reinstall dropbox to access all my files.
Does anyone have experience with doing this? Has anyone done a major file recovery using dropbox? Or is this even widely used? Thanks for your feedback in advance.
Using Dropbox to maintain a set of files and their associated metadata, when those files are tracked in a VCS, is always a bit tricky because of potential corruption: if one part of the repository metadata isn't synchronized correctly, you can end up with a non-working repo.
That is why I always use with DropBox:
a DVCS (like Git): I can work directly in a working tree within a DropBox repo or I can clone said repo anywhere else outside the DropBox if I need to,
a single bundle file to which I can push at any time the changes from my local repo, wherever that repo might be.
That way, the only file that really needs to be in sync in DropBox is that unique bundle file (representing a bare repo as one file).
See "Git with DropBox" for more.

Concurrency in a GIT repo on a network shared folder

I want to have a bare git repository stored on a (Windows) network share. I use Linux, and have the said network share mounted with CIFS. My colleague uses Windows XP, and has the network share automounted (from Active Directory, somehow) as a network drive.
I wonder if I can use the repo from both computers, without concurrency problems.
I've already tested, and on my end I can clone ok, but I'm afraid of what might happen if we both access the same repo (push/pull), at the same time.
In the git FAQ there is a reference about using network file systems (and some problems with SMBFS), but I am not sure whether any file locking is done by the network/server/Windows/Linux. I'm quite sure there isn't.
So, has anyone used a git repo on a network share, without a server, and without problems?
Thank you,
Alex
PS: I want to avoid using an http server (or the git-daemon), because I do not have access to the server with the shares. Also, I know we can just push/pull from one to another, but we are required to have the code/repo on the share for back-up reasons.
Update:
My worries are not about the possibility of a network failure. Even so, we would have the required branches locally, and we'll be able to compile our sources.
But, we usually commit quite often, and need to rebase/merge often. From my point of view, the best option would be to have a central repo on the share (so the backups are assured), and we would both clone from that one, and use it to rebase.
But, due to the fact we are doing this often, I am afraid about file/repo corruption, if it happens that we both push/pull at the same time. Normally, we could yell at each other each time we access the remote repo :), but it would be better to have it secured by the computers/network.
And, it is possible that GIT has an internal mechanism to do this (since someone can push to one of your repos, while you work on it), but I haven't found anything conclusive yet.
Update 2:
The repo on the share drive would be a bare repo, not containing a working copy.
Git requires minimal file locking, and file locking is, I believe, the main cause of problems when using this kind of shared resource over a network file system. The reason Git can get away with this is that most of the files in a Git repo (all the ones that form the object database) are named as a digest of their content, and are immutable once created. So the problem of two clients trying to use the same file for different content doesn't come up there.
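To make the "named as a digest of their content" point concrete, here is a small sketch that computes a blob's name by hand and compares it with what Git reports. It assumes a repository using the default SHA-1 object format and a system with GNU sha1sum (on macOS, shasum would be the rough equivalent); the sample content is arbitrary.

```shell
set -eu
content='hello'

# Size in bytes of the content, including the trailing newline
size=$(printf '%s\n' "$content" | wc -c | tr -d ' ')

# A blob's name is the SHA-1 of the header "blob <size>" + NUL byte, followed by the content
manual_id=$( { printf 'blob %s\0' "$size"; printf '%s\n' "$content"; } \
    | sha1sum | cut -d' ' -f1 )

# Ask Git for the same thing (no repository is needed, nothing is written)
git_id=$(printf '%s\n' "$content" | git hash-object --stdin)

echo "manual: $manual_id"
echo "git:    $git_id"
```

Two clients that write the same content produce the same file name and the same bytes, so it does not matter which write wins.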
The other part of the object database is trickier-- the refs are stored in files under the "refs" directory (or in "packed-refs") and these do change: although the refs/* files are small and always rewritten rather than being edited. In this case, Git writes the new ref to a temporary ".lock" file and then renames it over the target file. If the filesystem respects O_EXCL semantics, that's safe. Even if not, the worst that could happen would be a race overwriting a ref file. Although this would be annoying to encounter, it should not cause corruption as such: it just might be the case that you push to the shared repo, and that push looks like it succeeded whereas in fact someone else's did. But this could be sorted out simply by pulling (merging in the other guy's commits) and pushing again.
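The lock-then-rename pattern described above can be sketched in plain shell. This is not Git's actual code, just the same idea: update_ref and the file names are hypothetical, and the exclusive create relies on the shell's noclobber option, which behaves well on a local filesystem but, as noted, may not on every network filesystem.

```shell
set -eu
repo=$(mktemp -d)
ref="$repo/main"              # hypothetical stand-in for a file under refs/heads/

# update_ref FILE VALUE: write VALUE to FILE.lock, then rename it over FILE
update_ref() {
    lock="$1.lock"
    # With noclobber (set -C), '>' fails if the lock file already exists
    if ( set -C; printf '%s\n' "$2" > "$lock" ) 2>/dev/null; then
        mv "$lock" "$1"       # the rename replaces the old ref in one step
        return 0
    fi
    echo "ref is locked by another writer: $lock" >&2
    return 1
}

# First writer succeeds
update_ref "$ref" "1111111"

# Simulate a second writer that still holds the lock
: > "$ref.lock"
if update_ref "$ref" "2222222"; then second=updated; else second=refused; fi
rm "$ref.lock"

echo "ref now contains: $(cat "$ref")"
echo "second update was: $second"
```

A writer that cannot create the lock file backs off instead of overwriting, which is why the worst case on a filesystem without reliable exclusive create is a lost ref update rather than a damaged object.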
In summary, I don't think that repo corruption is too much of a problem here--- it's true that things can go a bit wrong due to locking problems, but the design of the Git repo will minimise the damage.
(Disclaimer: this all sounds good in theory, but I've not done any concurrent hammering of a repo to test it out, and only share them over NFS not CIFS)
Why bother? Git is designed to be distributed. Just have a repository on each machine and use the publish and pull mechanism to propagate your changes between them.
For backup purposes, run a nightly task to copy your repository to the share.
Or, create one repository each on the share and do your work from them but use them as distributed repositories from which you can pull changesets from each other. If you use this method, then performance of doing builds and so on will be decreased since you will be constantly accessing over the network.
Or, have distributed repositories on your own computers, and run a periodic task to push your commits to the repositories on the share.
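The last option could look roughly like this. It is a sketch in a temp directory: the share path, the remote name backup and the project contents are all made up, and only the two git push lines are what a scheduled job (cron, Task Scheduler) would actually run.

```shell
set -eu
tmp=$(mktemp -d)
share="$tmp/share"            # hypothetical stand-in for the mounted network share
mkdir -p "$share"

# One-time setup: a bare repository on the share, used only as a push target
git init -q --bare "$share/project.git"

# Your normal working repository on the local disk
git init -q "$tmp/project"
cd "$tmp/project"
echo "int main(void) { return 0; }" > main.c
git add main.c
git -c user.name="Example" -c user.email="example@example.invalid" \
    commit -q -m "Initial commit"
git remote add backup "$share/project.git"

# The periodic task: push all branches and tags to the share
git push -q backup --all
git push -q backup --tags

branch=$(git symbolic-ref --short HEAD)
local_head=$(git rev-parse HEAD)
backup_head=$(git -C "$share/project.git" rev-parse "refs/heads/$branch")
echo "local:  $local_head"
echo "backup: $backup_head"
```

All day-to-day work then happens on the local disk, and the share is only touched during the push.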
It sounds as if you'd rather use a centralized version control system, so that the backup requirement is satisfied.
Perhaps with xxx2git in between for you to work locally.

Resources