Check out git repository that contains invalid filenames in Windows

I'm working on a shared project using git for version control. I'm on Windows while my partner is on Unix.
My partner has named some files with <file1>.txt. When I try to pull these files they are not accepted as the < and > are invalid characters for Windows. This is fine, I don't need to touch the files. However, they are added to my commit as deleted. So, if I push then I'll delete these files which I don't want to do.
I can't use git reset --hard as it finds an invalid path for each of these "deleted" files.
Is there a way to exclude these files from my commits? I've tried adding <file1> to my .git/info/exclude but that didn't work.

You would need to get your partner to rename the files to something that is also valid on Windows. After they have renamed them, this is what I'd do:
1. Back up any changes that you only have locally (both uncommitted AND committed but not pushed).
2. Run git reset --hard <commit>, where <commit> is any commit from before the files were added.
3. Run git pull to get all the way to the latest revision (where the files are renamed).
4. Restore your backed-up changes from step 1.
This gets you to the newer revision, in which the files no longer have names that are illegal on Windows, so the OS never creates them (or deletes them from under git) :)
P.S. I know this is an old question, but I've been getting this issue recently, so hopefully the solution I've arrived at can help others as well.
EDIT:
To avoid this happening again, your partner can add a pre-commit hook that stops them from committing files with names that would not be allowed on Windows. A sample hook usually ships as .git/hooks/pre-commit.sample. I've adapted the relevant part a little in the past and ended up with something like:
# Cross platform projects tend to avoid non-ASCII filenames; prevent
# them from being added to the repository. We exploit the fact that the
# printable range starts at the space character and ends with tilde.
if [ "$allownonascii" != "true" ] &&
    # Note that the use of brackets around a tr range is ok here, (it's
    # even required, for portability to Solaris 10's /usr/bin/tr), since
    # the square bracket bytes happen to fall in the designated range.
    test $(git diff --cached --name-only --diff-filter=A -z $against |
      LC_ALL=C tr -d '[ !#-)+-.0-9;=@-[]-{}~]\0' | wc -c) != 0
then
    cat <<\EOF
Error: Attempt to add a non-ASCII file name.
This can cause problems if you want to work with people on other platforms.
To be portable it is advisable to rename the file.
If you know what you are doing you can disable this check using:
  git config hooks.allownonascii true
EOF
    exit 1
fi
The '[ !#-)+-.0-9;=@-[]-{}~]\0' bit is the important part that I've changed a little. It defines all the allowed ranges of characters. The sample only disallows non-ASCII characters (which is what the comment at the top says), but there are also ASCII characters that are not allowed in file names on Windows (such as ? and :).
All the allowed characters are removed, and if there's anything left (wc -c != 0) the hook errors out. It can be a bit difficult to read, as you can't see any of the disallowed characters. It helps to have a list of the character ranges to look at when reading or editing it.
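To see what that tr expression lets through, the same deletion trick can be tried on a single name outside of any hook. This is only a sketch: the function name has_disallowed_chars is hypothetical, and it assumes the allowed set is the one quoted above with @-[ as the range covering the upper-case letters.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) when the name contains any
# byte outside the allowed set, i.e. a non-ASCII byte or one of
#   " * / : < > ? \ |
has_disallowed_chars() {
    # Delete every allowed character; whatever survives is disallowed.
    leftover=$(printf '%s' "$1" |
        LC_ALL=C tr -d '[ !#-)+-.0-9;=@-[]-{}~]')
    [ -n "$leftover" ]
}
```

For example, has_disallowed_chars '<file1>.txt' succeeds while has_disallowed_chars 'file1.txt' fails. Because / is not in the allowed set either, a path such as dir/file.txt is flagged as well, so pass only the final path component if that is not what you want.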

Ignoring doesn't help if the files are tracked already.
Use Sparse checkout to skip those files.
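A sparse checkout that keeps everything except the offending paths might look like the sketch below. It assumes the classic core.sparseCheckout mechanism, with gitignore-style patterns in the info/sparse-checkout file of the git directory. Both function names are hypothetical, and whether a particular Git for Windows build copes with these names has not been verified here.

```shell
#!/bin/sh
# Hypothetical helper: print sparse-checkout patterns that include
# everything ("/*") and then exclude each path given as an argument.
sparse_patterns_excluding() {
    printf '%s\n' '/*'
    for path in "$@"; do
        printf '!/%s\n' "$path"
    done
}

# Hypothetical wrapper: enable sparse checkout in the current repository,
# write the patterns, and re-populate the working tree from HEAD.
skip_paths_in_checkout() {
    git config core.sparseCheckout true &&
    sparse_patterns_excluding "$@" \
        > "$(git rev-parse --git-dir)/info/sparse-checkout" &&
    git read-tree -mu HEAD
}
```

Usage would be something like skip_paths_in_checkout '<file1>.txt'. The patterns are interpreted like .gitignore entries, so names containing glob characters such as * or [ would need escaping.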

Related

git branch command works fine as a cli command, but fails when run from loop or script using variables

In creating setup scripts, I have several git repos that I clone locally. This is done through a temporarily available proxy that may or may not be available later on, so I need to create all the remote branches from the remote repo as local branches that can be switched to. I have a method to extract the names of the remote branches that I want, which get stored as
[user]$ nvVar=$(git branch -r | grep -v '\->' | grep -Ev 'master|spdk\-1\.6' | cut -d'/' -f2)
This gives me a variable holding a list that can be iterated through, containing the branches I need to bring down.
[user]$ echo "$nvVar"
lightnvm
nvme-cuse
spdk
If I were doing all this manually, I would use commands like:
[user]$ git branch --track lightnvm origin/lightnvm
Branch lightnvm set up to track remote branch lightnvm from origin.
Which works fine...
But when I try to loop through the variable using shell expansion, I get a failure.
(FYI, if I put quotes around $nvVar, it doesn't iterate, and just tries running the whole string and fails. I have also tried to do this with an array, which also doesn't work, as well as using a while loop using the filtered output from git branch -r)
[user]$ for i in $nvVar; do git branch --track "${i}" "origin/${i}"; done
Which is supposed to produce the following git commands:
git branch --track lightnvm origin/lightnvm
git branch --track nvme-cuse origin/nvme-cuse
git branch --track spdk origin/spdk
Which seem to be identical to the same command typed in manually.. but instead, I get these errors:
fatal: 'lightnvm' is not a valid branch name.
fatal: 'nvme-cuse' is not a valid branch name.
fatal: 'spdk' is not a valid branch name.
Which makes no sense...
OS: RHEL 7.6
Git Version: 1.8.3.1
Bash Version: GNU bash, version 4.2.46(2)-release (x86_64-redhat-linux-gnu)
(Edit) Apparently I have some special characters being captured that are messing up the command.
There's a " ^[[m " being appended to the captured variable. I'm really not sure how to get rid of that without hard-coding the commands, which I had hoped to avoid.
Figured out a solution:
echo '#!/bin/bash' > gitShell
git branch -r | grep -v '\->' | grep -Ev 'master|spdk\-1\.6' | cut -d'/' -f2 | while read -r remote; do
    echo "git branch --track ${remote} origin/${remote}" >> gitShell
done
# cat -v renders the escape byte as the visible text "^[", which sed can then strip
cat -v gitShell | sed 's/\^\[\[m//g' > gitShell1
if /bin/bash -ex gitShell1; then
    echo 'Git repos branched'
    rm gitShell
    rm gitShell1
fi
I simply push the output to a file, then use cat -v to force the hidden characters to get displayed as normal characters, then filter them out with sed, and run the new script.
It's cumbersome, but it works. Apparently git returns "private unicode characters" in response to remote queries.
Thanks to @Cyrus for cluing me in to the fact that I had hidden characters in the original variable.
The git branch command is not meant for writing scripts. The problem is occurring because you have color-changing text strings embedded within the branch names. For instance, ESC [ 3 2 m branch ESC [ m spells out "switch to green, print the word branch, and stop printing in green". (The git branch command uses green by default for the current branch, which is not the interesting one here, but still emits various escape sequences for non-current-branch cases.)
You should be using git for-each-ref instead of git branch. This is what Git calls a plumbing command: one that is meant for writing scripts. Its output is meant to be easily machine-parsed and not contain any traps like color-changing escape sequences. It also obviates the need for some of the subsequent tricks, as it has %(...) directives that can be used to strip the desired number of prefixes from items.
(Alternatively, it's possible to use git branch but to disable color output, e.g., git -c color.branch=never branch. But git branch does not promise not to make arbitrary changes to its output in the future, while git for-each-ref does.)
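The for-each-ref approach could look roughly like the sketch below. It assumes the remote is named origin and that master and spdk-1.6 are the branches to skip, as in the question. The function names are hypothetical. It uses %(refname:short) plus a shell prefix strip rather than the newer strip directives, since those may not exist in a Git as old as 1.8.3.1.

```shell
#!/bin/sh
# Hypothetical filter: read names such as "origin/spdk" on stdin, drop the
# symbolic HEAD entry and the unwanted branches, and print bare branch names.
local_names() {
    while IFS= read -r ref; do
        name=${ref#origin/}
        case $name in
            # origin/HEAD is shortened to "origin" by some Git versions
            HEAD|origin|master|spdk-1.6) continue ;;
        esac
        printf '%s\n' "$name"
    done
}

# Hypothetical wrapper: create a local tracking branch for every remaining
# remote-tracking branch of origin.
track_all_remote_branches() {
    git for-each-ref --format='%(refname:short)' refs/remotes/origin |
    local_names |
    while IFS= read -r name; do
        git branch --track "$name" "origin/$name"
    done
}
```

Since for-each-ref never colors its output, no escape sequences end up in the branch names and no clean-up pass is needed.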
You might also consider attacking the original problem in a different way: create a "mirror clone", but then once the clone is done, rewrite the fetch refspec and remove the mirror configuration. The difference between a regular clone and a mirror clone is, in short, that a regular clone copies all¹ the commits and none² of the branches, but a mirror clone copies all of the commits and all of the branches and all the other references as well,³ and sets remote.<name>.mirror to true.
¹ Well, most of the commits, depending on which ones are reachable from which refs. If some objects are hidden or only findable via reflogs, you don't normally get those, but you don't normally care either, and in fact it's often desirable, e.g., after deleting an accidentally-committed 10 GB database.
² After copying the commits, a regular fetch turns branch names into remote-tracking names (origin/master for instance). The final git checkout step creates one new branch name, or if -n is given a tag name, doesn't.
³ As with the "all commits", this is a sort of polite fiction: the sender might hide certain refs, in which case you don't get those refs, and presumably don't get those commits and other objects either. On the other hand, optimizations to avoid repacking might accidentally send unneeded objects: that 10 GB database might come through even when you didn't want it. Fortunately reflogs aren't refs, so this generally shouldn't happen.

View NUMBER of local uncommitted files

I've learned I can use count to find the NUMBER of commits a branch is ahead/behind by, like so:
git rev-list --count HEAD..@{u}
But is there a way to do so for uncommitted files?
Just found out git status -suno shows how many files have been changed in a really concise way, so I could either count the lines of the output (with echo "$var" | wc -l) or just put a symbol to denote an arbitrary amount exist, or parse it in a weird way to see the number of deleted/added/modified.
However, do non "porcelain" and more directly-addressing commands exist to accomplish this task, as parsing commands such as these are seen as bad practice?
Also, I am using this to add to a git-bash prompt; I would normally just type in git status, but would like to have maximum convenience by just showing such.
Ironically, the --porcelain option of git status is meant to be parsed:
git status --porcelain -uno | wc -l
So while git status is porcelain, git status --porcelain does produce output suitable for consumption by porcelain scripts.
I tried to explain said option in "What does the term “porcelain” mean in Git?"
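For a prompt, the porcelain output can also be split into separate counts instead of one line total. A small sketch, assuming the version 1 porcelain format (two status characters, a space, then the path); the function names count_porcelain and git_dirty_counts are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: read `git status --porcelain` lines on stdin and
# print three numbers: staged, unstaged and untracked entries.
count_porcelain() {
    awk '
        { x = substr($0, 1, 1); y = substr($0, 2, 1) }
        x == "?" && y == "?" { untracked++; next }   # untracked file
        x == "!"             { next }                # ignored file
        x != " "             { staged++ }            # change in the index
        y != " "             { unstaged++ }          # change in the work tree
        END { printf "%d %d %d\n", staged, unstaged, untracked }
    '
}

# Hypothetical prompt fragment: prints e.g. "4 2 1", or "0 0 0" when the
# tree is clean or the directory is not a repository.
git_dirty_counts() {
    git status --porcelain 2>/dev/null | count_porcelain
}
```

A file that has both staged and unstaged changes is counted once in each of the first two numbers.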

Adding other useful info to a git archive filename automagically

Stumbled across this gem: "Export all commits into ZIP files or directories", whose initial answer met my needs for exporting commits from certain branches (like develop for example) into separate zip files, all done via a simple, yet clever, one-liner:
git rev-list --all --reverse | while read hash; do git archive --format zip --output ../myproject-commit$((i=i+1))-$hash.zip $hash; done
In my version I replaced the --all with --first-parent develop.
What I would like to do now is make the filenames more useful by including the commit date and commit author in the filename. I've Googled around a bit, grokked the git archive documentation, but do not seem to find any other 'parameters' I could use that are readily available like $hash.
I'm guessing I will need to expand the loop and call up the relevant bits individually, save them into bash variables and pass them on to the output option with something like ${author}, unless anyone else knows a cleaner, simpler way to do this, or can point me to documentation or other examples where I could pull the needed info from other parts of git? Thanks in advance for any insights.
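The "expand the loop" idea guessed at above could be sketched as follows. It assumes git log with a --format placeholder is an acceptable way to look up each commit's date and author, and that the author name should be sanitized before it goes into a file name. The function names slug and archive_each_commit are hypothetical, and myproject and develop are taken from the one-liner and its variant above.

```shell
#!/bin/bash
# Hypothetical helper: replace every character that is not a letter, digit,
# dot, underscore or hyphen, so the text is safe inside a file name.
slug() {
    printf '%s' "$1" | LC_ALL=C tr -c 'A-Za-z0-9._-' '_'
}

# Hypothetical loop: one zip per commit on develop, oldest first, named
# myproject-commit<N>-<date>-<author>-<hash>.zip in the parent directory.
archive_each_commit() {
    local i=0 hash date author
    git rev-list --first-parent --reverse develop |
    while read -r hash; do
        i=$((i + 1))
        date=$(git log -1 --format=%cd --date=short "$hash")    # YYYY-MM-DD
        author=$(git log -1 --format=%an "$hash")               # author name
        git archive --format zip \
            --output "../myproject-commit$i-$date-$(slug "$author")-$hash.zip" \
            "$hash"
    done
}
```

Here %cd is the committer date and %an the author name; %ad and %cn would give the author date and committer name instead.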

list files with git status

Background
I am well aware of how git status works, and even about git ls-files. Usually git status is all I need and want, it perfectly answers the question: "What is my status, and what files need my attention?"
However, I have been unable to find a quick command that answers the following question: "What files do I have, and what is their respective status?" So, I need a full listing of the directory (like ls -la) with a column that shows the status of each file/directory.
What I have tried
git status -s --ignored comes quite close to the output format that I want, but it just won't list the files that are unchanged between HEAD, index, and working directory. Also, it will recurse into directories.
git ls-files seems to be able to provide all the required info in scriptable form, but I've been unable to stop it from recursive listing the contents of all directories.
Obviously, I could hack something together that takes the output of these two commands and provides the view I would like to have. However, I would hate to reinvent the wheel if there is already some usable command out there.
Question
Is there some way of listing all files in a directory with their respective git status?
I want a full listing showing exactly the same files that ls would show.
Notes
This other question does not answer mine, because I definitely want an ls equivalent. Including unmodified, ignored, and untracked files, but excluding directory contents.
To restrict the paths Git inspects to just the current directory, use its Unix glob pathspecs. Since git status does a lot of checking against the index and against HEAD, use that, and to fill in the rest of the files ls would show you, use ls, just munge its output to have the same format as git status's output and take only the ones git status didn't already list.
( git status -s -- ':(glob)*'; ls -A --file-type | awk '{print " "$0}' ) \
| sort -t$'\n' -usk1.4
:(glob) tells Git the rest of the pathspec's a Unix glob, i.e. that * should match only one level, just like a (dotglob-enabled) shell wildcard¹.
The -t$'\n' tells sort that the field separator is a newline, i.e. it's all one big field, and -usk1.4 says uniquify, only take the first of a run, stable, preserve input order where it doesn't violate sort key order (which is a little slower so you have to ask for that specifically), k1.4 says the key starts at the first field, the fourth character in that field, with no end given so from there to the end.
¹ Why they decided to make pathspecs match neither like shell specs nor like gitignore specs by default, I might never bother learning, since I so much prefer ignorantly disapproving of their annoying choice.
Since the output of git status -s is enough, let's just wrap it in a bash routine. For unchanged files git status -s prints nothing, so we echo a suitable status code ourselves. Following the short-format specification, we can use two spaces for this purpose, or some other symbol. For directories, which are not tracked by Git anyway, I chose '__' as the status code:
for FILE in *
do
    if [[ -f $FILE ]]
    then
        STATUS=$(git status -s -- "$FILE")
        if [[ -z $STATUS ]]
        then
            # the first two characters are the two-letter status code (blank = unchanged)
            echo "   $FILE"
        else
            echo "$STATUS"
        fi
    fi
    if [[ -d $FILE ]]
    then
        # the first two characters are the status code chosen for directories
        echo "__ $FILE"
    fi
done
The script works in the same manner as ls. It can be written in one line using ; as well.

Git diff without comments

Let's say I added two lines to the file hello.rb.
# this is a comment
puts "hello world"
If I do git diff, it will show that I added two lines.
I don't want git to show any line which is a Ruby comment. I tried using git diff -G <regular expression>, but it didn't work for me. How do I do git diff so that it won't show any Ruby comments?
One possibility would be (ab)using git's textconv filters which are applied before the diff is created, so you can even transform binary formats to get a human-readable diff.
You can use any script that reads from a file and writes to stdout, such as this one which strips all lines starting with #:
#!/bin/sh
grep -v "^\s*#" "$1" || test $? = 1
(test $? = 1 corrects the exit code of grep, see https://stackoverflow.com/a/49627999/4085967 for details.)
For C, there is a sed script to remove comments, which I will use in the following example. Download the script:
cd ~/install
wget http://sed.sourceforge.net/grabbag/scripts/remccoms3.sed
chmod +x remccoms3.sed
Add it to your ~/.gitconfig:
[diff "strip-comments"]
textconv=~/install/remccoms3.sed
Add it to the repository's .gitattributes to enable it for certain filetypes:
*.cpp diff=strip-comments
*.c diff=strip-comments
*.h diff=strip-comments
The main downside is that this is always enabled by default; you can disable it for a single command with --no-textconv.
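For the Ruby case in the question, the same wiring can be done with the grep-based filter shown earlier. A minimal sketch: the driver name norbcomments and both function names are hypothetical, and the filter only drops whole-line comments, so a trailing comment after code on the same line is left alone.

```shell
#!/bin/sh
# Same idea as the grep script above, wrapped in a function for quick
# testing: print the file without lines whose first non-blank character is "#".
strip_hash_comments() {
    grep -v '^[[:space:]]*#' "$1" || test $? = 1
}

# Hypothetical setup for the current repository. $1 is the path to an
# executable script that does what strip_hash_comments does.
enable_ruby_comment_filter() {
    git config diff.norbcomments.textconv "$1" &&
    printf '%s\n' '*.rb diff=norbcomments' >> .gitattributes
}
```

With this in place, git diff on hello.rb from the question would show only the puts line as added, and git diff --no-textconv would show both lines again.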
Git cares a lot about the data that you give to it, and tries really hard not to lose any information. For Git, it doesn't make sense to treat some lines as if they haven't changed when they have.
I think that the only reasonable way to do this is to post-process Git output, as Dogbert had already shown in his comment.
