Is it OK to copy a Git folder in Windows to manage multiple branches?

I'm new to git and have a git repository that I use with GitKraken.
In this repository I have multiple branches, and can move from branch to branch in order to make modifications where necessary.
I am now in a situation where I'll be making some large modifications to one branch that I do not want to commit, but in the meantime I would like to make some minor modifications to another branch.
I'm used to working with TFS, where I can just check out a branch to another folder.
I've tried to just copy the folder and my first impression is that this should work.
But I have seen remarks online that say I should clone the repository instead.
The Git version is lower than 2.5, so I can't use git worktree.
Is it ok to just copy the folder or can this have an unexpected effect?

Yes, if you copy the whole folder from the root of the checkout, including the hidden .git folder, then you can make changes to each working copy independently. Each contains its own copy of the repository objects, and they will behave exactly as if you had run two separate clones.
As discussed in the comments, though, copying isn't necessarily the best tool for this: it would be easier (and more disk-space-efficient) to commit your large changes to a local branch so that you can then switch and make other changes. There's no real downside to this; if you do want to remove that temporary commit later, that's easily done as well.
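The temporary-commit approach could look roughly like this. It is only a sketch run against a throwaway repository in a temporary directory; the branch names big-feature and small-fix and the file names are made up for illustration.

```shell
set -e
work=$(mktemp -d)                       # throwaway area for the demonstration
cd "$work"
git init -q demo
cd demo
git config user.name "Demo User"        # placeholder identity for the demo commits
git config user.email "demo@example.com"
git symbolic-ref HEAD refs/heads/big-feature   # name the first branch before committing
echo "v1" > app.txt
git add app.txt
git commit -q -m "base"
git branch small-fix

echo "half-finished rewrite" >> app.txt
git commit -q -a -m "WIP: temporary commit"    # park the large change on its own branch

git checkout -q small-fix                      # the working tree is clean, so switching is safe
echo "typo fixed" > fix.txt
git add fix.txt
git commit -q -m "minor fix"

git checkout -q big-feature
git reset -q HEAD~1     # drop the temporary commit but keep its changes in the working tree
```

After the final reset, the branch is back at the "base" commit and the unfinished edit is an ordinary uncommitted modification again.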
However, if you are going to do this, then you probably want to:
run git repack -ad first, so that there are fewer files in the objects tree to copy;
consider using git clone --reference instead, which might be slightly more disk-space-efficient;
or, if you want a clean working copy, create a new working copy folder, copy only the hidden .git folder into it, and then run git reset --hard to check out all of the files there too.
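In a shell such as Git Bash, those three options might look like the sketch below. The repository is a stand-in built in a temporary directory, and the folder names original, copy-a, copy-b and copy-c are placeholders rather than anything from the question.

```shell
set -e
work=$(mktemp -d)                 # throwaway area; stands in for your projects folder
cd "$work"

# Build a small stand-in repository (you would already have one).
git init -q original
cd original
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "hello" > file.txt
git add file.txt
git commit -q -m "first commit"

# Option 1: pack the objects so there are fewer files to copy,
# then copy the whole folder, hidden .git directory included.
git repack -a -d -q
cd "$work"
cp -R original copy-a

# Option 2: clone, borrowing objects from the original instead of copying them.
git clone -q --reference "$work/original" "$work/original" copy-b

# Option 3: copy only .git into an empty folder, then check the files out.
mkdir copy-c
cp -R original/.git copy-c/.git
(cd copy-c && git reset -q --hard)
```

One thing to keep in mind with the --reference variant: the new clone borrows objects from the original, so it depends on the original folder staying where it is.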

You may want to see if git stashing will work for you. I don't recommend copying to a new folder. Mostly because I don't know if it's even possible and I've never seen that as a recommendation. Cloning should also work but it sounds like you are interested in shelving/stashing vs. committing your changes in branch1 before checking out branch2.
https://git-scm.com/book/en/v1/Git-Tools-Stashing
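As a rough illustration of the stash workflow, assuming a throwaway repository with two branches named branch1 and branch2 (the file names are invented for the demo):

```shell
set -e
work=$(mktemp -d)                       # throwaway area for the demonstration
cd "$work"
git init -q demo
cd demo
git config user.name "Demo User"        # placeholder identity for the demo commits
git config user.email "demo@example.com"
git symbolic-ref HEAD refs/heads/branch1
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "base"
git branch branch2

echo "large unfinished change" >> notes.txt   # work in progress on branch1
git stash -q                                  # shelve it; the working tree is clean again

git checkout -q branch2
echo "small fix" > fix.txt
git add fix.txt
git commit -q -m "minor fix on branch2"

git checkout -q branch1
git stash pop -q                              # bring the unfinished change back
```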

Related

How to use Git commands to add multiple separate folders to GitHub such that each folder can be updated separately

I am new to this whole GitHub and Git thing. I recently learned the basics of Git (adding, pushing, pulling, cloning, etc.). My intro to Java professor asked me to make a GitHub repository for all my class homework. She told me to organize it in such a way that there are separate folders for each homework and each homework folder contains multiple source files.
So I set up my files like this:
Java (main folder) -> Hw1 + Hw2 + Hw3 etc. How would I do this using Git? All of these folders should be in both my local and GitHub repositories, and I should be able to make changes to them separately.
Thank You in advance. I am stuck.
Let's start with some basics. You already understand that your computer uses a tree-structured file system: that is, a directory (or folder—the terms are now interchangeable) holds files and/or more directories/folders, which in turn hold more files and/or folders, etc. Windows natively uses a backwards slash \ to separate the various components, so that you might have:
java\hw1\main.java
java\hw1\sub.java
java\hw2\main.java
and so on. Windows can use forward slashes (some commands may use them for other purposes, but they do work in file names), and all non-Windows OSes tend to use forward slashes, which are easier to type. Git also uses forward slashes so that's what I'll do here.
(Aside: Windows and macOS by default use "case insensitive but case preserving" rules, so that if you create a file named readme.txt, you can later open it using the name ReadMe.txt or README.TXT but it remains named in all-lowercase. Git, by contrast, is usually case-sensitive and thinks that readme.txt, ReadMe.txt, and README.TXT are three different file names. This causes endless grief on such systems[1] and sometimes the best, or at least easiest, way to avoid all problems here is to completely avoid uppercase letters everywhere. To the extent that you can use java instead of Java, hw1 instead of Hw1, and so on, I would encourage you to do so.)
When you ask Git to create a new, empty repository using git init,[2] Git creates a hidden folder named .git. This hidden folder will contain all of Git's files: here, Git will store its main two databases. We'll talk about those in just a moment. The place where Git creates .git is whatever your current working directory is, so if you are in java/hw1 and run git init, Git creates java/hw1/.git. If you are in java and run git init, Git creates java/.git.
Note that java/.git and java/hw1/.git are different folder path names, and therefore you can create two repositories. You do not want to do this, but that's what you did. (I base this claim on this comment.) We'll come back to "how to fix this" soon.
[1] In particular, someone using Linux can literally create three different files that differ only in case, stuff all three into a commit in a Git repository, and leave you with a problem when you go to check out this commit on a Windows system. If you're used to the system mapping from typed-in-lowercase to the matching case, and you ask an editor to create java/hw1/thing.java on Linux, it might actually create a java and hw1 right next to your existing Java and Hw1. Since those are different directories they can store different files with the exact same names as those in Java/Hw1/, including name-case. Git will happily store all these files, and Windows often cannot extract such a commit properly.
[2] Note that git init will first check to see if you're already in some existing repository. In this case, rather than creating a new repository, Git will "reinitialize" the existing repository. In most cases "reinitializing" like this has no effect at all.
The main thing to know about Git and a Git repository
A Git repository—or what I sometimes call the repository proper—consists mainly of two databases. One is usually much bigger. It contains commits and other supporting Git objects. These objects all have hash IDs (or more formally, object IDs or OIDs) that Git must have in order to retrieve the objects from the database. This could force humans to memorize Git commit hash IDs, but that's a bad plan: hash IDs are very large, very random-looking, and impossible for humans to remember in general.
For this reason, a Git repository contains that second, usually much smaller, database. In this database, Git stores names: branch names, tag names, remote-tracking names, and many other kinds of names. These names are for you (and other humans) to use. Each name stores one hash ID, but that's enough to make everything work. So you'll use a branch name, like main or master. This name holds the hash ID of the latest commit, which allows Git to retrieve that commit.
Each commit stores two things:
A commit stores a full snapshot of every file (that Git knew about, that is) at the time you, or whoever, made that commit. The files inside the commit are stored in a special, read-only, Git-only, compressed and de-duplicated form, that only Git can read, and literally nothing can write. (This uses some of those "supporting objects" I mentioned; the files are actually stored in the objects database as "Git objects".) Because nothing but Git can use these files, the files in a commit are useless on their own. We'll see in just a moment how we work with these files.
Meanwhile, that same commit that's storing a snapshot, also stores some metadata, or information about the commit itself: who made it (you, probably), and when, for instance. To make "branches"—a poorly-defined word in Git (see What exactly do we mean by "branch"?)—work, the commit's metadata contains the hash ID of the previous commit.
This "contains previous commit's hash ID" is how Git stores history: the branch name, e.g., main, lets Git find the last commit you made, and then by reading that commit, Git can find the hash ID of the second-to-last commit. For instance, suppose the hash ID of the last commit is H (it's actually some big ugly hexadecimal number so we're just using H to stand in for it). Then we say that the name main points to commit H. But commit H contains the hash ID of an earlier, or parent, commit: let's call that one G. We say that H points to G, and we can draw that:
<-G <-H <--main
Since G is a commit, it has one of these points-to pointers sticking out of it, too. By reading commit G's metadata, Git can find the raw hash ID of its parent; let's call that commit F:
... <-F <-G <-H <--main
So main points to H, which points to G, which points to F, which points to ... well, this goes on until we get back to the very first commit ever—commit A perhaps—which, being first, can't point backwards and therefore simply doesn't.
What this means is that instead of one hash ID, each commit stores, in its metadata, a list of previous-commit hash IDs. The list can be empty, and is for that first commit. It can also have more than one hash ID, but we won't cover this case here. Most commits in most repositories are "ordinary" commits and have exactly one parent, though.
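To see these backward pointers for yourself, something like the following should work. It is a sketch against a throwaway repository with just two commits; the repository name history-demo and the file contents are invented for the demo.

```shell
set -e
work=$(mktemp -d)                       # throwaway area for the demonstration
cd "$work"
git init -q history-demo
cd history-demo
git config user.name "Demo User"        # placeholder identity for the demo commits
git config user.email "demo@example.com"
git symbolic-ref HEAD refs/heads/main

echo one > file.txt && git add file.txt && git commit -q -m "first commit"
echo two > file.txt && git add file.txt && git commit -q -m "second commit"

tip=$(git rev-parse main)       # the hash ID stored under the name main
parent=$(git rev-parse main^)   # the hash ID of that commit's parent

git cat-file -p "$tip"          # raw commit: tree, parent, author, committer, message
git log --format='%h parents: %p' main   # walk backwards; the first commit lists no parent
```

The "parent" line printed by git cat-file is the stored hash ID that lets Git step from the last commit to the one before it.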
Your "working tree"
A repository, then, stores names—branch names for instance—that help Git find commits for us (we only have to remember the branch names), and stores commits that then store files. But the stuff in the commits (along with the actual commits themselves) is all completely read-only. Git must do this to make the hashing scheme work. What good are stored files if we can't write on them? Moreover, only Git can read them, so what good are they if we can't even read them?
This is where your working tree comes in. Most Git repositories have a working tree.[3] The working tree of a repository is, quite simply, where you do your work. And, as we saw earlier, if you use git init in some directory to create a new, totally-empty repository and then make an initial commit:[4]
mkdir new
cd new
echo example > README.txt
git init
git add README.txt
git commit
you will wind up with a hidden .git folder here in the new/ folder we just made (mkdir new) and entered (cd new). The working tree for this Git repository in new/.git is new/, and the file we created—README.txt—in that working directory is now also stored in the first (and so far only) commit in that repository.
If we now modify the one file, and/or add a new file, and use git add and git commit appropriately, we'll get a second commit that stores (forever[5]) the new versions of that file. That second commit has, as its parent commit, the first commit, which stores (forever) the earlier version with just the one file in it.
The second commit is now our current commit, and is now the last commit on the main or master branch (whatever its name is).
Git allows us to check out any commit we have stored in the repository. When we do that, Git will erase from our working tree the files that go with the current commit. It will, instead, install into our working tree the files that go with the newly selected commit—which then becomes the current commit.
In this way, we can "go back in time", any time we like, to any older version, stored as a commit in the big database. All we have to do is find its commit hash ID (for which git log comes in handy, for instance). That's not what we'll focus on right now though.
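A small sketch of "going back in time", reusing the new/ example from above with a second commit added. The commit messages and the revised file contents are invented for the demo.

```shell
set -e
work=$(mktemp -d)                       # throwaway area for the demonstration
cd "$work"
mkdir new
cd new
git init -q
git config user.name "Demo User"        # placeholder identity for the demo commits
git config user.email "demo@example.com"
git symbolic-ref HEAD refs/heads/main

echo example > README.txt
git add README.txt
git commit -q -m "first commit"
echo "example, revised" > README.txt
git commit -q -a -m "second commit"

git log --oneline                 # find the hash ID of the commit you want
older=$(git rev-parse main~1)     # here: the commit just before the latest one
git checkout -q "$older"          # Git swaps the working tree files for that snapshot
seen=$(cat README.txt)            # the old contents are back in the working tree
git checkout -q main              # return to the latest commit
```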
[3] The exception here is a so-called bare repository. We won't cover these here.
[4] These are Unix-shell-style commands as I don't use Windows myself, but this should work in git-bash, which is just a port of bash to Windows for use with Git. You can do all this in PowerShell or even CMD.EXE instead, but some command details might change.
[5] Well, forever, or as long as the commit itself continues to exist. If we remove the commit, we remove its snapshot. This is actually kind of hard to do! However, if we remove the repository proper, we destroy the two databases, which removes all commits, and this is pretty easy to do.
"Nested" repositories: the thing you didn't want, but made
Given that the computer—the host operating system, which is in your case Windows, but this is also true of macOS and Linux—demands and uses a tree-structured file system, we can set up a structure like this:
java
.git
<various Git repository control files and databases>
hw1
.git
<various Git repository control files and databases>
main.java
hw2
.git
<various Git repository control files and databases>
main.java
and so on. Here we have one repository per hw directory plus one overall containing repository in the java directory.
But here's the problem: Git literally cannot store a Git repository inside a Git commit.[6] Instead of doing so, the "outer" repository (in this case the one in java/.git, whose working tree is the java/* files) will store what Git calls a submodule using what Git calls a gitlink. To store a submodule correctly, you must use git submodule add, not git add; git add creates or updates only the gitlink, which is sort of half a submodule.
If someone does want submodules (but you don't), this git submodule add method is how to make them. The result is that when you clone the java repository, you get files, plus the magic gitlinks, that Git will need in order to run additional git clone commands, one for each submodule. This way, the person who clones the java repository can run git submodule update --init to run a bunch more git clone commands. But again, that's not what you want.
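You can check whether the outer repository has recorded a gitlink rather than ordinary files. The sketch below rebuilds the nested layout in a temporary directory (the file contents are invented) and inspects the index of the outer repository.

```shell
set -e
work=$(mktemp -d)                       # throwaway area for the demonstration
cd "$work"
mkdir -p java/hw1

# Inner repository: java/hw1/.git, with one commit.
cd java/hw1
git init -q
git config user.name "Demo User"        # placeholder identity for the demo commit
git config user.email "demo@example.com"
echo 'class Main {}' > main.java
git add main.java
git commit -q -m "hw1"

# Outer repository: java/.git.
cd ..
git init -q
git add hw1 2>/dev/null      # Git warns here that it is adding an embedded repository
git ls-files --stage hw1     # mode 160000 marks a gitlink, not the files inside hw1
```

If the listing shows mode 160000 for hw1 instead of an entry such as hw1/main.java, the outer repository is not storing the homework files at all.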
[6] There are some tricks to get around this problem if you really need to do it, but it's not a good idea in general. The recent safe.directory stuff is an outgrowth of a security issue that resulted in a CVE when someone discovered such a trick. The tricks that Git allows involve renaming the .git directory; the ones it doesn't allow, or accidentally allowed in the past, result in CVEs. 😀
Fixing the mess
The observations we should make at this point are these:
Git stores commits. It doesn't store files (though commits do store files). It stores commits.
What you want is a single repository with multiple commits, where the first commit (or maybe second; see below) contains a file named hw1/main.java,[8] but no files named hw2/whatever.
What you have now are multiple repositories: one, a superproject, with submodules (or half-submodules) named hw1, hw2, and so on, and then more repositories that get cloned into hw1, hw2, and so on, each containing a main.java and whatever other files.
Now, if we assume (or you verify) that you do not need to save any of the commits in any of these repositories so far, what we can do is simply delete all the .git folders and their contents.
That is, on a Unix-like shell, we would run:
cd java
ls # make sure we're in the right place
rm -rf .git # remove this working tree's Git repository
rm -rf hw1/.git # remove the Git repository in hw1/
rm -rf hw2/.git # and so on ...
Note that we're using the OS's remove command, with the "remove everything without asking" options, on the hidden Git folders. Git has no opportunity to stop us: we're totally bypassing Git here. All of Git's files, including the two big databases, get completely removed. This is likely to be irrecoverable (depending on your OS and whether you're using the OS's "remove irrecoverably" command, or its "move to trash so I can get it back if I change my mind" command, and also depending on whether you have good backups, e.g., macOS Time Machine).
We now have only all of the working trees, with no .git folders: there are no repositories left. But all of the files are still there because the checked-out files were, and still are, in the working trees.
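Before creating the new repository, it may be worth confirming that no .git folder survived. A minimal sketch, using a stand-in java tree built in a temporary directory:

```shell
set -e
work=$(mktemp -d)                       # throwaway area for the demonstration
cd "$work"
mkdir -p java/hw1 java/hw2
echo 'class Main {}' > java/hw1/main.java
echo 'class Main {}' > java/hw2/main.java
(cd java && git init -q)                # recreate the nested-repository mess
(cd java/hw1 && git init -q)
(cd java/hw2 && git init -q)

cd java
find . -name .git -type d -prune        # before: lists the three .git folders
rm -rf .git hw1/.git hw2/.git
leftover=$(find . -name .git -type d -prune)   # after: should print nothing
echo "remaining .git folders: ${leftover:-none}"
```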
Now we create one new, totally-empty repository in the java directory, that we're still in:
git init
[Git prints message: Initialized empty Git repository in ...]
We now have our initial, totally-empty repository. I like to create a first commit that contains just a README.txt (and maybe one or two similar files):
echo repository for "insert class name here" > README.txt
git add README.txt
git commit -m "Initial commit"
We're now ready to "complete" homework assignment #1:
git add hw1
git commit
(write a good, proper commit message in editor)
By running git add hw1 when there's no Git repository inside hw1, we add all the files that are in hw1 (including any files in any subdirectories inside hw1).
The git commit command commits what's been stored so far, as updated by our git add. So when we commit the addition of hw1, we get README.txt—which we didn't change, so this commit literally re-uses the previous version of the file—plus all the hw1/* files.
We can now "complete" homework assignment #2 with git add hw2 and committing, and so on. We end up with a single repository in the java/.git directory, containing multiple commits: an initial one with the README file and subsequent ones with each homework assignment added. There is just the one branch name and it holds the hash ID of the last commit.
Pushing this to GitHub
Your last problem here is that if you have already created a GitHub repository and put some commits in it, your existing GitHub repository is going to be reluctant to lose those commits. You have several options:
You can keep those commits, if you really want to.
You can tell GitHub to completely delete that repository, then create a new one with the same name.
Or, you can use git push --force from your laptop (or other computer) that has your new repository, so as to command the Git software on GitHub to go ahead and lose the old commits from the old repository.
The general idea here, with the last option, is that we (and Git) find commits by starting from some branch name like master or main. That gives us the hash ID of the last commit, and from there, we have Git work backwards.
Suppose we command (not just ask) some GitHub repository to take a new chain of commits. That is, they had:
A <-B <-C <--main
We now make a totally new (empty) repository, and put in two commits: an initial commit D and a second commit E, neither of which have the same hash ID as any of those three commits in the original repository:
D <-E <--main
We run git remote add origin url to set things up so that we can git push to GitHub. If we run:
git push origin main
our Git will send commits D and E to GitHub, then politely ask if they can add commits D-E to their repository. But that would give them:
A <-B <-C
D <-E <-- main
which, they notice, will mean they no longer have any name by which to find commit C, which means they'll "lose" all three hash IDs. So they will say No! If I do that I'll lose my access to some of my commits!
Your Git software reports this as ! [rejected] main -> main (non-fast-forward), or as (fetch first) when your local repository has never seen their commits. Either way, it means they are saying they could lose commits. But that's exactly what you want: you want them to lose A-B-C; those commits are no good! So you can use git push --force origin main, which sends D-E again but this time commands them to make their main point to E.
You have to have permission (GitHub adds a whole set of permissions that base Git fails to provide), but if you own this GitHub repository, you probably will already have the right permissions.[9] So they'll obey: they will make their branch name main point to commit E, and "forget" commits A-B-C.[10]
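The rejection and the forced update can be tried out safely without touching GitHub. In this sketch a local bare repository named hosted.git stands in for the GitHub repository, and new_repo is a made-up helper that builds a repository with one commit per message.

```shell
set -e
work=$(mktemp -d)                        # throwaway area for the demonstration
cd "$work"
git init -q --bare hosted.git            # local stand-in for the repository on GitHub

# Hypothetical helper: create repository $1 on branch main,
# with one commit per remaining argument, and point origin at hosted.git.
new_repo() {
  dir=$1; shift
  git init -q "$dir"
  git -C "$dir" config user.name "Demo User"
  git -C "$dir" config user.email "demo@example.com"
  git -C "$dir" symbolic-ref HEAD refs/heads/main
  for msg in "$@"; do
    echo "$msg" > "$dir/file.txt"
    git -C "$dir" add file.txt
    git -C "$dir" commit -q -m "$msg"
  done
  git -C "$dir" remote add origin "$work/hosted.git"
}

new_repo old A B C
git -C old push -q origin main           # the hosted copy now has A <-B <-C

new_repo new D E
if git -C new push -q origin main 2>/dev/null; then
  plain=accepted
else
  plain=rejected                         # the polite request is refused
fi
echo "plain push: $plain"

git -C new push -q --force origin main   # command the hosted copy to move main to E
```

After the forced push, main in hosted.git holds the same hash ID as main in the new repository, and nothing in hosted.git names commit C any more.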
[8] Note that while your OS demands folders with subfolders and files, Git just stores "files with long names that have slashes in them". Git understands the folder-y requirements your OS makes, and can turn hw1/main.java into "file main.java in folder hw1". It will automatically save the OS's hw1/main.java (a file named main.java in a folder named hw1) as the Git file named hw1/main.java.
Normally, you don't need to worry about this whole mess. The time when you do have to worry about it is when you want to store an empty folder in Git, because Git literally can't do that. Git only stores files. There are some tricks for this though: see How can I add a blank directory to a Git repository?.
[9] If you own the repository, the only way you wouldn't have permissions is if you logged on to GitHub and told them to deny permission to yourself. To fix that, log on to GitHub again and tell them to give permission back to yourself.
[10] "Normal" Git setups really do eventually forget (or lose) commits this way. GitHub, however, has its software set up to retain all commits forever. So if you send a bad commit to GitHub, and for whatever reason, you really need it removed, you must contact GitHub support and get them to scrub it off their systems.

Checked out a repo from remote but when I do a git status a file shows up as modified — how to fix?

I am using Windows and Git, and I had modified a file. No matter how many times I did a git add and commit, the file kept showing up as modified, and I could not, for example, do a git pull --rebase. I assumed I had done something wrong and screwed up the local Git repo, so I decided to clone the repo from GitHub into a completely new directory. To my surprise, even in this new directory tree, when I do a git status the same file shows up as modified. It is as if it is somehow modified in the GitHub (remote) repo, which does not make sense to me. Moreover, the version of the file in the cloned local repo does not have the latest version of the code that I can see when I look at the code on GitHub.
How can I fix this? I am concerned that someone else cloning the code will end up with the same problem. (Apparently only I am seeing this problem. I did not somehow manage to corrupt the GitHub repo, which leads me to believe this is a Git/Windows issue.)
As for what I think I did wrong: when I modified a file and did a git add, I misspelled the directory path by using a lowercase letter instead of an uppercase one. After that, adding one file resulted in the other, properly spelled path showing up as modified, and vice versa. I don't know if a symlink got created on Windows; the file contents are identical. But one would think cloning (via Eclipse) into a completely new directory tree would make this a non-issue.
I looked through the replies, and it seems the basic problem is Windows' case insensitivity, which caused some (to me) weird behavior. In particular, I simply could not delete one of the folders: they were "entangled." So the simple solution was to delete the folder and its contents from Unix, which is case-sensitive. Then I checked out a fresh repo and the problems appear to be completely resolved.
You mentioned in a comment that you discovered one commit containing two problematic files: one named Login/Login.tsx and one named login/Login.tsx. This comment is on a related question; see my answer there for a discussion of Git's method of naming files in its index, vs what your OS requires in your working tree.
Your solution—use a Unix or Linux machine, where you get a case-sensitive file system, to repair the situation—is probably the easiest and best way to deal with this. If you can establish a case-sensitive file system on your own machine, that also allows easy dealing with this (see my answer to another related question for a macOS-specific way to make a case-sensitive file system).
Given that what you wanted was simply to delete one of the spellings, though, git rm should allow you to do that. In particular git rm --cached login/Login.tsx would drop login/Login.tsx from Git's index, without affecting Login/Login.tsx. This could leave your working tree with an existing login folder, though.
It's important (at all times, really, but especially when working within a situation like this) to realize that Git itself doesn't actually need or use your working tree to make new commits. Each commit contains a full snapshot of every file that Git knows about. These files exist as "copies" in Git's index.[1] Hence there are actually three copies of each file:
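A sketch of that git rm --cached step follows. To avoid depending on the file system, it places both spellings directly into Git's index of a throwaway repository with git update-index; the file content and commit messages are invented for the demo.

```shell
set -e
work=$(mktemp -d)                       # throwaway area for the demonstration
cd "$work"
git init -q case-demo
cd case-demo
git config user.name "Demo User"        # placeholder identity for the demo commits
git config user.email "demo@example.com"

# Store one blob, then register it in the index under both spellings.
blob=$(echo 'export const Login = () => null;' | git hash-object -w --stdin)
git update-index --add --cacheinfo 100644,"$blob",Login/Login.tsx
git update-index --add --cacheinfo 100644,"$blob",login/Login.tsx
git commit -q -m "commit containing both spellings"

git rm -q --cached login/Login.tsx      # drop only the lowercase entry from the index
git commit -q -m "remove stray login/Login.tsx"
git ls-files                            # lists what the index still holds
```

Note that no working tree file is touched anywhere in this sequence, which is the point made in the next paragraph: commits are built from the index.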
A frozen version of each file appears in the current commit (whatever that commit's hash ID is).
A "copy" (see footnote 1) of that version appears in Git's index. You can replace this copy with different content, and the read-only copy in the commit doesn't change. You can remove this copy entirely, and the read-only copy still doesn't change. Nothing in any existing commit can or will ever change. The index copy exists precisely so that you can replace it, or remove it, or whatever. In effect, the index—or staging area, if you prefer this term—acts as your proposed next commit. It's merely filled in from a commit.
Finally, there's a regular, ordinary, everyday file. This copy goes into your working tree or work-tree. To put this copy in place, Git must use your OS's file-manipulation facilities. That may require creating folders and files within the folders. If those are case-insensitive, and Git goes to create a Login folder when a login folder exists, or vice versa, the OS will say: nope, sorry, already exists. Git will do its best to accommodate the OS by using the "wrong" case anyway, and will create a file within that wrong-case folder—or perhaps destroy some other work-tree file that has the same name except for case, or whatever.
This last bit, where your work-tree files end up with the wrong names and/or in the wrong folders and/or end up overwriting similar files whose name differs in case somewhere, is a problem for you. It's not a problem for Git, though. Git just keeps using the index copies of each file. The next git commit you run uses whatever is in Git's index. The fact that your work-tree doesn't match is not a problem for Git. It's just a problem for you, because the normal everyday git add command means make the Git index entry for this file match the copy that's in my work-tree, and if that's the wrong copy, well, that's a problem.
In any case, once you have a correct commit in Git as your current commit, and extracted into Git's index, you can do whatever you like to your work-tree, including remove large swaths of it, or rename folders, or whatever. Get it set up however you like, then use git checkout or git restore to re-extract all or part of the current commit to your work-tree. Now that you've eliminated the name-case-issues in Git's commit and index, and cleaned up or removed any problematic files and/or folders in your work-tree, Git can create correct-case folders and/or files as needed. It's the process of getting the correct commit into Git that's painful, except on a case-sensitive file system.
[1] "Copies" is in quotes here because the files in Git's index (which Git also calls the staging area) are in a special Git-only format that de-duplicates content. When the copies that are in Git's index match the copies that are in some existing commit, Git is really just re-using the existing commit's files. Files with all-new content actually require a new internal blob object, which Git creates as needed; after that, the content will be de-duplicated as usual.

Git: don't update certain files on Windows

We work with a Git repository that has over 20,000 files.
My group maintains local versions of about 100 or so configuration and source files from this repository. The original acts as a sort of base that several groups modify and tweak to their own needs (some core things are not allowed to be changed, but the front end and some custom DB stuff differ between groups).
So we want to update to the latest version generally, but not have the git update overwrite the files that we keep local modifications for.
The machines we use are windows based. Currently the repository gets cloned to a windows server that then gets checked out/cloned to the development machines (which are also windows). The developers make changes as necessary and recommit to our local repo. The local repo updates against the master daily. We never commit back to the master.
So we want all the files that haven't been changed by our group to update, but any that have been changed (ever) won't get updated.
Is there a way to make this happen automatically, so that the Windows server just updates daily, ignoring the files we keep modifications for? And if we want to add a new file to this "don't update" list, it's just a right-click (or even a flat file list) away? I looked at .gitignore, but it seems to be for committing, not for updating.
Even better would be a way to automatically download the vanilla files but have them renamed automatically. For example, settings.conf is a file we generally want to keep our changes to, but if they modify the way entries in that file are handled or add extra options, it would be nice if it were downloaded as settings.conf.vanilla or something, so we can just run a diff of the .vanilla files against ours and see what we want to keep. This feature is not absolutely necessary, though, and seems unlikely.
If this cannot be accomplished on a windows machine (the software for windows doesn't support such features), please list some Linux options as well if available. We do have an option to use a Linux server for hosting the local git repo if needed.
Thanks.
It sounds like you're working with a third party code base that's under active development and you have your own customisations which you need to apply.
I think the answer you're looking for is rebase. You shouldn't need to write any external logic to achieve this, except for a job which regularly pulls in the third party changes and rebases your modifications on top of them.
This should also be more correct than simply ignoring the files you've modified, as you won't then accidentally ignore changes that the third party has made to those files (you may sometimes get a conflict, which could be frustrating, but better than silently missing an important change).
Assuming that your local repo is indeed simply a fork, maintain your changes on your own branch, and every time you update the remote repository, simply rebase your local branch on top of those changes:
git pull origin master
git checkout custom_branch
git rebase master
Edit
After you've done this, you'll end up with all the changes you made on your custom_branch sitting on top of master. You can then continue to make your customisations on your own branch, and development of the third party code can continue independently.
The next time you want to pull in the extra changes, you'll repeat the process:
Make sure you're on the master branch before pulling in changes to the third party code:
git checkout master
Pull in the changes:
git pull origin master
Change to your customised branch:
git checkout custom_branch
Move your changes on top of the third party changes:
git rebase master
This will then put all your own changes on top of master again. master itself won't be changed.
Remember that the structure of your repo just comes from a whole set of "hashes" which form a tree. Your branches are just like "post it" notes which are attached to a particular hash, and can be moved to another hash as your branch grows.
The rebase is like chopping off a branch and re-attaching it somewhere else. In this case, you're saying something like "chop off our changes and re-attach them on top of the main trunk".
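Here is the whole cycle as a self-contained sketch. The upstream and local repositories are stand-ins created in a temporary directory, and the file names core.txt and settings.conf are only examples; --ff-only is added to the pull as a precaution, since master is never committed to locally.

```shell
set -e
work=$(mktemp -d)                       # throwaway area for the demonstration
cd "$work"

# Stand-in for the third-party repository.
git init -q upstream
cd upstream
git config user.name "Upstream Dev"     # placeholder identities for the demo commits
git config user.email "upstream@example.com"
git symbolic-ref HEAD refs/heads/master
echo "core v1" > core.txt
echo "setting=default" > settings.conf
git add core.txt settings.conf
git commit -q -m "upstream: initial"

# Your group's clone, with customisations kept on custom_branch.
cd "$work"
git clone -q upstream local
cd local
git config user.name "Local Dev"
git config user.email "local@example.com"
git checkout -q -b custom_branch
echo "setting=ours" > settings.conf
git commit -q -a -m "local: our settings"

# The third party keeps developing.
cd "$work/upstream"
echo "core v2" > core.txt
git commit -q -a -m "upstream: core v2"

# The regular update job.
cd "$work/local"
git checkout -q master
git pull -q --ff-only origin master     # master only ever follows the third party
git checkout -q custom_branch
git rebase -q master                    # replay our changes on top of the new master
```

In this run the two sides touch different files, so the rebase completes silently; when both sides edit the same lines, git rebase stops and asks you to resolve the conflict.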
If you can install a visual tool like GitX, it will really help to see how the branch tags move around when you rebase. The command line is ideal for working with but I find something like GitX is invaluable for getting a handle on the structure of your repo.

Best practices for Xcode + Git for multi-developer projects

I can create a repo and use GitHub / BitBucket fine for my own projects. I have had problems when collaborating with other developers or trying to fork a project on GitHub.
I am aware of other answers like Best practices for git repositories on open source projects but there are OSX / Xcode specific problems I want to know how to solve.
.DS_Store files can be a pain. You can use .gitignore to keep them out, but what happens if they have already been included, or another developer adds them back in through a clumsy git command?
The .xcodeproj will have changes to the directory names and developer profiles for the other person. What's the best way to do merges or to avoid conflicts?
If I have forked or pulled from a GitHub project, how can I clean up these issues and also minimise merge conflicts for the maintainer?
If people have an example .gitignore created for Xcode, or scripts they use to initialise their repos then that would be great!
Put .DS_Store in .gitignore. Then, if you haven't already, add .gitignore to the repo. (You should not ignore .gitignore.) Now all developers will ignore .DS_Store files. If any were added to the repo erroneously before you put .DS_Store in .gitignore, you can now remove them (in a commit) and they should stay out.
The xcodeproj is a directory. The only file in this directory that must be in the repository is the project.pbxproj file. I generally ignore all of the others by putting these lines in my .gitignore:
*.xcuserstate
project.xcworkspace/
xcuserdata/
You should avoid putting absolute paths in your build settings. Use relative paths.
Your Debug and Release builds should use iPhone Developer as the code signing identity, so that Xcode will automatically select the local developer's profile. When you want to create an IPA (for distribution), Xcode will offer to re-sign it with a different identity, at which point you can choose your distribution profile if you need to.
If you're trying to use a project from github that has made these mistakes, you can try to get the maintainer to fix them, or you can make sure you don't touch the .DS_Store files and the code signing identities in the same commits that you want to send upstream.
For the second issue, regarding the .xcodeproj and merge conflicts:
Use a .gitattributes file to specify that all .pbxproj files are merged with the merge=union setting. This should mean that Git merges in the changes from both sides instead of reporting a conflict, taking the upstream changes first.
This article explains it in a bit more depth
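As a rough sketch of that setup, the snippet below writes the attribute into a scratch repository and asks Git which merge setting it would apply to a project file. The path MyApp.xcodeproj is a hypothetical example; in a real project the .gitattributes line would be committed at the root of the repository.

```shell
#!/bin/sh
set -e

# Scratch repository so the attribute lookup has somewhere to run.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Ask Git to union-merge every Xcode project file instead of leaving conflict markers.
printf '%s\n' '*.pbxproj merge=union' > .gitattributes

# Show which merge setting Git would pick (MyApp.xcodeproj is a hypothetical name).
git check-attr merge -- MyApp.xcodeproj/project.pbxproj
# prints: MyApp.xcodeproj/project.pbxproj: merge: union
```

A union merge keeps lines from both sides without checking that the result is still a valid project file, so it is worth opening the project in Xcode after such a merge.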
I'll try to answer these one by one:
I. You need to use git filter-branch only if you need to remove the files from your history completely. If those files do not contain anything sensitive (credit card information, say), then I think the following should be enough:
git rm --cached .DS_Store
git commit -m "{Your message}"
Then add this file to .gitignore and commit it.
This commits the removal of the file from the repository but keeps the file in your working directory. If you push it, though, and somebody else pulls this commit, their copy of the file may be removed, so you MUST communicate this.
By committing .gitignore you will prevent other developers from adding this file again.
If you're not a maintainer, then I don't think you should do anything except raise this issue with the maintainer.
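For a repository where .DS_Store files have crept into several folders, the same idea can be applied in one pass. This is a sketch in a scratch repository, with made-up file names; it relies on git rm accepting a quoted glob pattern, which matches at any directory depth.

```shell
#!/bin/sh
set -e

# Scratch repository with .DS_Store files committed by mistake (hypothetical layout).
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example"
git config user.email "example@example.invalid"
mkdir -p Sources
touch .DS_Store Sources/.DS_Store Sources/main.m
git add .
git commit -q -m "Initial commit (includes .DS_Store by mistake)"

# Stop tracking every .DS_Store, at any depth, but leave the files on disk.
git rm -q -r --cached -- '*.DS_Store'

# Ignore them from now on and record both changes in one commit.
echo ".DS_Store" >> .gitignore
git add .gitignore
git commit -q -m "Stop tracking .DS_Store files"

# Only .gitignore and the real source file remain tracked.
git ls-files
```

The quotes around '*.DS_Store' matter: they stop the shell from expanding the pattern, so Git itself matches it against every tracked path.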
II. I'm a strong believer that hidden files of any nature should, most of the time, not be put into the repository, for exactly that reason. Therefore I think that you should do the same thing with .xcodeproj as with .DS_Store: put it into .gitignore and commit that. .gitignore itself is the exception to the rule above.
III. If those files are properly ignored, then there will be no issues with them in future. If they are already in the repo and somebody wants to do such a cleanup, it should be done by the maintainer and communicated inside the team.
Hope that helps!
git filter-branch might help you to remove unwanted files (.DS_Store files) from your repository -- see e.g. https://help.github.com/articles/remove-sensitive-data
If a clumsy git commit has added files you should be able to replay the corrected changesets onto a clean repository.
You're right in the sense that if a .DS_Store has already been added, the .gitignore won't be of much help. However, I think this is still a good resource for you and others.
When I start a project, I normally look at this list to see if there is a good .gitignore already existing. More specifically for you, this one is the Objective-C .gitignore.
Hopefully those resources are of some use.
As a Mac user you should download a tool like SourceTree which supports Git Flow. Git Flow will help you establish some best practices around how your collaborators will commit code to the repo and at the very least make merge conflicts less frequent and more manageable. For a set of gitignore files for various project types you can go to GitHub and download one that is ready to go. For Xcode they have it listed as Objective-C.gitignore. That is a good starting place and it even covers CocoaPods. If you're using external libraries, your project should use CocoaPods so that you can isolate that code, keep it outside of your repo and avoid git submodules.
Now when you find a file has made it into your repo like .DS_Store just remove it, and move on. Make sure you add it to the .gitignore file that is checked into the project.
As for xcodeproj... there shouldn't be that much user-specific customization within the file, since the above-mentioned gitignore filters that out. If a scheme is to be shared, make sure you check Shared under Manage Schemes, and then check in the files in that subdirectory. You should be using automatic selection of certificates, so the only real choice is Developer or Distribution. You should also take advantage of the variables Xcode provides so that you avoid hardcoding complete paths. Plists are the example that comes to mind: you might have written /Users/me/MyProject/Resources/MyProject.plist, but should instead use $(SRCROOT)/Resources/MyProject.plist.

Make a SVN working folder identical to repository version

I basically want to do an SVN export as part of a scripted build process, but without having to get the entire repo from scratch every time, which is slow and eats bandwidth... not to mention that it will make testing the script a pain in the backside if it does this every time we tweak something or spot a typo in the scripts.
Is there an obvious way to do an export into an existing directory, so only files that are different are fetched, and non-repo files are deleted, basically giving a clean export but done in a smart way?
Windows is preferred, but I guess Cygwin is an option.
I think the only way to get this done is to check out a working copy, and update and revert that. Updating a working copy only fetches the changes.
svn export doesn't know what files are changed, and to compare files, you first have to fetch all of them. Also it would be hard to get files that were deleted or renamed out of your 'export' directory.
Checkout a working copy, then export from your working copy.
An svn update on the working copy will then be quick and light on bandwidth.
Then you can delete the original export and re-export from the working copy.
All the bandwidth hungry operations are optimized. The heavy handed delete and re-create is the same as it was before, but it's now all local, so should be much faster.
Also, you have the option to make changes in the exported working copy, but you might want to be careful with that and consider the impact of having conflicts occur during your svn update.
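The checkout-once, export-locally idea could be wrapped in a small helper along these lines. This is a minimal sketch, not a tested build script: refresh_export and its two arguments are hypothetical names, and it assumes the working copy was already created once with svn checkout.

```shell
#!/bin/sh
# refresh_export WC_DIR EXPORT_DIR   (hypothetical helper name)
# WC_DIR must already be a working copy, created once with: svn checkout URL WC_DIR
refresh_export() {
    wc_dir=$1
    export_dir=$2
    # Refuse to run with missing arguments, so rm -rf never sees an empty path.
    [ -n "$wc_dir" ] && [ -n "$export_dir" ] || return 2

    # Only the changes since the last update cross the network.
    svn update --quiet "$wc_dir" || return 1

    # Throw away the previous export, stale build output included.
    rm -rf -- "$export_dir"

    # Exporting from a working copy path is a local copy without the .svn metadata.
    svn export --quiet "$wc_dir" "$export_dir"
}

# Example call (paths are placeholders):
#   refresh_export /path/to/wc /path/to/build
```

Note that exporting from a working copy also carries over any local modifications in it, so the working copy should be left untouched between builds if a pristine export is the goal.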
I am not sure I understand your question correctly, so let me rephrase it. I think you want the local copy of the repo updated on a regular basis, but you also want the working copy pristine so that the resulting build is clean. Assuming this is your question, below is what I would suggest.
To my knowledge, svn export might not be the best option for this, because the purpose of svn export is to obtain an unversioned copy of the svn repo. As it is unversioned, the svn client would not know where it has to start the update from.
The best option I can think of is this. Check out a copy of the repo (local copy, LC) in one location. This LC should be updated during the build process. Make a copy of the LC in a different location and use that for performing the build. Below are the commands you would require:
1. svn update <arbitrary path> (run against the working copy)
2. cp -R <arbitrary path> <build path>
3. find <build path> -type d -name '.svn' -prune -exec rm -rf {} + (only if you would like to remove the hidden .svn folders; they are not really going to hurt the build process)
Some options for keeping the copy time out of the build time
If you would like to avoid the copy during the build, you can do the copy after each build instead, and svn update the copy just before building (assuming the .svn folders are retained).
On Linux, two folders can be kept in sync using rsync, so the build copy can be made to reflect the updates in the pristine copy.
On Windows, there are a few tools that achieve the same kind of sync. I have not used them, but here are the links so you can try them yourself.
http://lifehacker.com/326199/synchronize-folders-with-synctoy-20
http://www.techsupportalert.com/best-free-folder-synchronization-utility.htm
Another option is to use checkout and revert / update but also use something like the SharpSvn library to make a script that will delete non-source controlled files. This way build artifacts like compiled code will be removed and the versioned files will be returned to base state by the revert / update.
If you have a lot of directories and files, this scanning could be slow, but if you are confident about which directories will contain build artifacts, you can just scan those.
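The same cleanup can be approximated without SharpSvn by parsing the output of svn status. The sketch below is an alternative to the library approach, not an implementation of it: clean_unversioned is a hypothetical helper name, and it assumes the status layout of Subversion 1.6 or later, where the path starts at column 9.

```shell
#!/bin/sh
# clean_unversioned WC_DIR   (hypothetical helper name)
# Deletes everything `svn status --no-ignore` reports as unversioned (?) or
# ignored (I), then reverts local edits to versioned files.
clean_unversioned() {
    wc_dir=$1
    [ -n "$wc_dir" ] || return 2

    svn status --no-ignore "$wc_dir" |
    while IFS= read -r line; do
        case $line in
            '?'*|'I'*)
                # Status flags occupy the first columns; the path starts at
                # column 9 (assumes Subversion 1.6 or later).
                path=$(printf '%s\n' "$line" | cut -c9-)
                rm -rf -- "$path"
                ;;
        esac
    done

    # Put locally modified versioned files back to their base state.
    svn revert --quiet --recursive "$wc_dir"
}

# Example call (path is a placeholder):
#   clean_unversioned /path/to/wc && svn update /path/to/wc
```

To limit the scan to known artifact directories, the helper can be called on those subdirectories instead of the working copy root.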

Resources