Why does git keep messing with my line endings? - windows

I'm on Windows and I have core.autocrlf disabled:
$ git config core.autocrlf; git config --global core.autocrlf
false
false
I would assume that git then does not touch any line endings, but some files in my repo keep causing issues.
I just cloned a fresh copy of the repo, and git is already telling me that there are unstaged changes. When I git diff -R, I can see that CRLF line endings have been added to a file:
diff --git b/nginx-1.11.1/contrib/geo2nginx.pl a/nginx-1.11.1/contrib/geo2nginx.pl
index bc8af46..29243ec 100644
--- b/nginx-1.11.1/contrib/geo2nginx.pl
+++ a/nginx-1.11.1/contrib/geo2nginx.pl
@@ -1,58 +1,58 @@
-#!/usr/bin/perl -w
-
-# (c) Andrei Nigmatulin, 2005
+#!/usr/bin/perl -w^M
+^M
+# (c) Andrei Nigmatulin, 2005^M
I don't understand where these line endings come from, but I'm also unable to "revert" this change. When I checkout the file again, it will still be modified afterwards:
$ git checkout -f nginx-1.11.1/contrib/geo2nginx.pl
$ git status
On branch dev
Your branch is up-to-date with 'origin/dev'.
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)
modified: nginx-1.11.1/contrib/geo2nginx.pl
no changes added to commit (use "git add" and/or "git commit -a")
This makes no sense to me, so I just run dos2unix on the file:
$ dos2unix nginx-1.11.1/contrib/geo2nginx.pl
dos2unix: converting file nginx-1.11.1/contrib/geo2nginx.pl to Unix format...
Now there surely shouldn't be any changes, right? But the file is still being shown as modified in git status and git diff will still report CRLF line endings in the working copy.
When I now stage and commit the file, the resulting file will have LF line endings, even though the diff showed CRLF line endings.
I don't have a global .gitattributes (git config --global core.attributesfile does not output anything). And the .gitattributes in the project has * text eol=lf set (full .gitattributes).
What is going on here and how can I resolve this?
Reproduce the issue
I can repro this issue with an open source project I'm maintaining:
$ git clone git@github.com:fairmanager/fm-log.git
Cloning into 'fm-log'...
remote: Counting objects: 790, done.
remote: Total 790 (delta 0), reused 0 (delta 0), pack-reused 790
Receiving objects: 100% (790/790), 201.71 KiB | 138.00 KiB/s, done.
Resolving deltas: 100% (418/418), done.
Checking connectivity... done.
$ cd fm-log/
$ git status
On branch master
Your branch is up-to-date with 'origin/master'.
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)
modified: .idea/dictionaries/OliverSalzburg.xml
modified: .idea/inspectionProfiles/Project_Default.xml
modified: .idea/inspectionProfiles/profiles_settings.xml
no changes added to commit (use "git add" and/or "git commit -a")

Interesting: with the newly added-to-question repo I tried cloning it and, on BSD, see the same thing:
$ git clone git@github.com:fairmanager/fm-log.git
Cloning into 'fm-log'...
remote: Counting objects: 790, done.
remote: Total 790 (delta 0), reused 0 (delta 0), pack-reused 790
Receiving objects: 100% (790/790), 201.71 KiB | 0 bytes/s, done.
Resolving deltas: 100% (418/418), done.
Checking connectivity... done.
$ cd fm-log/
$ git status
On branch master
Your branch is up-to-date with 'origin/master'.
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)
modified: .idea/dictionaries/OliverSalzburg.xml
modified: .idea/inspectionProfiles/Project_Default.xml
modified: .idea/inspectionProfiles/profiles_settings.xml
no changes added to commit (use "git add" and/or "git commit -a")
Trying to see what's going on:
$ git diff
warning: CRLF will be replaced by LF in .idea/dictionaries/OliverSalzburg.xml.
[snip]
It seems that HEAD:.idea/ files actually have carriage-returns in them (and the last line has no newline, hence the prompt right after the close angle bracket):
$ git show HEAD:.idea/dictionaries/OliverSalzburg.xml | vis
<component name="ProjectDictionaryState">\^M
<dictionary name="OliverSalzburg">\^M
<words>\^M
<w>colorizer</w>\^M
<w>multiline</w>\^M
</words>\^M
</dictionary>\^M
</component>$
The work-tree version likewise has carriage returns (no cut and paste but vis shows the same \^M line endings).
So what has happened in this case, at least, is that due to the .gitattributes setting of:
$ head -2 .gitattributes
# In general, use LF for text
* text eol=lf
Git will convert these files to LF-only during commit, vs the HEAD version that contains CR-LF endings. This is what git status is saying here.
Commenting out the * text eol=lf in .gitattributes makes the status go away (since the files won't be converted), though of course .gitattributes is now marked as modified. Interestingly, putting the attribute line back again, the status goes completely silent: it's necessary to force git checkout to replace the work-tree version to get the status back (e.g., manually rm -rf .idea and check out again, or git reset --hard).
(Presumably—I'm guessing a bit here—the index entry gets marked specially during git checkout when Git notices that the work-tree and HEAD version differ. This makes Git inspect the file closely. Modifying .gitattributes and running an internal diff via git status probably un-marks the index entries. This part is pure speculation meant to explain the weird behavior of git status...)

If the local setting git config core.autocrlf is indeed set to false, then those eol changes must come from a .gitattributes file (see the gitattributes man page).
Look for one in your local repo that includes eol rules (like * text=auto eol=crlf).
Or check whether you have a global gitattributes file:
git config --global core.attributesfile
Regarding fairmanager/fm-log, you can see that one of the "modified" files, .idea/dictionaries/OliverSalzburg.xml, was added in commit e6f823b with CRLF line endings.
Since the .gitattributes has a * text eol=lf rule, Git now expects that file to be stored with LF endings, so the CRLF blob in HEAD no longer matches what Git would commit from the working directory, hence the git diff.
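To see this kind of mismatch for yourself, here is a minimal sketch in a throwaway repository. The file name hello.xml and the temporary directory are made up for illustration, and it assumes a Git recent enough to support git ls-files --eol (2.8 or later, as far as I know).

```shell
set -e
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"
git config core.autocrlf false

# Commit a file with CRLF endings before any attributes exist.
printf 'one\r\ntwo\r\n' > hello.xml
git add hello.xml
git commit -q -m "add CRLF file"

# Now declare that all text must be LF, as in the question's repo.
printf '* text eol=lf\n' > .gitattributes

# i/ = endings stored in the index, w/ = endings in the work tree,
# attr/ = the attributes that apply to the path.
eol_report=$(git ls-files --eol -- hello.xml)
echo "$eol_report"

# Ask Git which text/eol attributes it resolves for the path.
attr_report=$(git check-attr text eol -- hello.xml)
echo "$attr_report"
```

A line that reports i/crlf together with attr/text eol=lf is a file whose committed blob contradicts the attributes, which is the situation described above.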

Related

How to avoid problem with clean/smudge that prevents git from removing changes with `git restore <file>`?

Changes mysteriously appeared in a file (lib/spo_api.rb) when I switched to master, and I could not get rid of the changes with git restore to switch back. The shell history says it all more explicitly, but skip down to the bottom if you already understood.
💻 git status
On branch master
Your branch is up to date with 'origin/master'.
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: lib/spo_api.rb
no changes added to commit (use "git add" and/or "git commit -a")
💻 git diff
diff --git a/lib/spo_api.rb b/lib/spo_api.rb
index 4af1ffb..54628c4 100644
--- a/lib/spo_api.rb
+++ b/lib/spo_api.rb
@@ -22,4 +22,4 @@ class SpoApi
{ success: res.success?, data: data }
end
-end
\ No newline at end of file
+end
💻 git restore lib/spo_api.rb
💻 git status
On branch master
Your branch is up to date with 'origin/master'.
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: lib/spo_api.rb
no changes added to commit (use "git add" and/or "git commit -a")
You can see that the first and last git status invocations are the same, and that git restore lib/spo_api.rb does nothing. 😱
I tracked it down to something in my .gitconfig:
[filter "bindingpry"]
clean = sed '/^\\s*-?binding.pry$/'d
smudge = cat
This is here to prevent accidentally leaving binding.pry (for debugging in Ruby) in a codebase, and it works perfectly as long as all files have a final newline, as they should. Of course I can't control the behavior of every developer who goes before me, or who works in the same repo concurrently with me, and while of course I have plans to clean up such files in repos I work in, it would also be nice if I didn't get shot in the foot by my own safety measures.
What would make the above filter work as I want it to? (I'm using zsh but I imagine 98% of what works in Bash will work for me too.)

Git check out a specific file type with LF line endings on Windows

On Windows, I would like to check out all Linux shell files (.sh) with LF line endings.
Line endings of other text-based files should be converted to CRLF (which is handled via the global core.autocrlf=true).
global .gitconfig
[core]
editor = 'C:/Tools/Notepad++/notepad++.exe' -multiInst -notabbar -nosession -noPlugin
autocrlf = true
.gitattributes in repository root folder
*.sh text eol=lf
I used git add --renormalize . when adding the .gitattributes file.
Unfortunately the .sh files still have CRLF after checkout.
Additional information: one of my team members changed his global core.autocrlf to false some commits ago, which caused the chaotic line endings, I guess.
With the above-mentioned steps I could at least fix the files of the local repository to have CRLF endings again.
steps tried:
delete files locally and check out again: no effect - all CRLF
delete files, push deletion, recreate files with LF: still CRLF after checkout
manually change line endings with Notepad++...
user@workstation MINGW64 /c/repos/project-source (bug_sh_files_eol_lf)
$ git status
On branch bug_sh_files_eol_lf
Your branch is up to date with 'origin/bug_sh_files_eol_lf'.
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: production_code/build_production_code.sh
modified: test_code/unit_tests/create_unit_test_xml.sh
no changes added to commit (use "git add" and/or "git commit -a")
user@workstation MINGW64 /c/repos/project-source (bug_sh_files_eol_lf)
$ git add . -u
warning: LF will be replaced by CRLF in production_code/build_production_code.sh.
The file will have its original line endings in your working directory
warning: LF will be replaced by CRLF in test_code/unit_tests/create_unit_test_xml.sh.
The file will have its original line endings in your working directory
user@workstation MINGW64 /c/repos/project-source (bug_sh_files_eol_lf)
$ git status
On branch bug_sh_files_eol_lf
Your branch is up to date with 'origin/bug_sh_files_eol_lf'.
nothing to commit, working tree clean
user@workstation MINGW64 /c/repos/project-source (bug_sh_files_eol_lf)
Never mind, I had a typo in my .gitattributes file name.
Nevertheless, the solution:
fix .gitattributes
# normalize all introduced text files to LF line endings (recognized by git)
* text=auto
# additionally declare text file types
*.sh text eol=lf
*.c text
*.h text
*.txt text
*.yml text
call git add --renormalize . to fix line endings of files with CRLF in repository
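The two steps above can be tried out in a scratch repository. This is only a sketch: build.sh and the temporary directory are hypothetical, and it assumes a Git that supports git add --renormalize and git ls-files --eol.

```shell
set -e
# Scratch repository; all names are made up for illustration.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"
git config core.autocrlf false

# Simulate a script that was committed with CRLF endings.
printf '#!/bin/sh\r\necho hi\r\n' > build.sh
git add build.sh
git commit -q -m "add script with CRLF endings"

# Step 1: the corrected .gitattributes (file name spelled exactly like this).
printf '* text=auto\n*.sh text eol=lf\n' > .gitattributes
git add .gitattributes

# Step 2: re-run the clean conversion on every tracked file.
git add --renormalize .
git commit -q -m "normalize line endings"
index_eol=$(git ls-files --eol -- build.sh)
echo "$index_eol"

# Force a fresh checkout so the work tree picks up the normalized blob.
rm build.sh
git checkout -- build.sh
worktree_eol=$(git ls-files --eol -- build.sh)
echo "$worktree_eol"
```

After the renormalize commit the index column should read i/lf, and after the fresh checkout the work-tree column should read w/lf for the .sh file.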

Why isn't git ignoring my subdirectory?

my .gitignore file
ext/templates_c
my git status call
D:\Development\online\site\newsite>git status
# On branch master
# Changes not staged for commit:
# (use "git add <file>..." to update what will be committed)
# (use "git checkout -- <file>..." to discard changes in working directory)
#
# modified: ext/pages/config.php
# modified: ext/templates_c/60a4cccd667e8b1e3a702b2a2c9108f056837adc.file.pages.html.php
# modified: ext/templates_c/fd38ffaa13c6f4c29772bec22cad5aebb1d4d7f6.file.form.html.php
#
no changes added to commit (use "git add" and/or "git commit -a")
Have I done something really stupid?
Why isn't git ignoring the files in ext/templates_c?
The fact that git status shows files in your "ignored" subdirectory as "modified" means that those files are already being tracked by git. Because of this, simply adding the directory to .gitignore is not sufficient to get those files ignored (although new files in that directory will properly be ignored). You need to do a git rm --cached <file> for each of the files in that directory that are currently tracked.
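As a rough illustration of that advice, here is a sketch in a throwaway repository. The file cache.php is hypothetical, and git rm -r --cached is used on the whole directory instead of naming each file.

```shell
set -e
# Throwaway repository; names are made up for illustration.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"

# The compiled templates were committed before .gitignore existed.
mkdir -p ext/templates_c
printf 'v1\n' > ext/templates_c/cache.php
git add ext
git commit -q -m "accidentally track compiled templates"

# Ignore the directory and drop it from the index only;
# --cached leaves the files on disk untouched.
printf 'ext/templates_c\n' > .gitignore
git add .gitignore
git rm -r -q --cached ext/templates_c
git commit -q -m "stop tracking compiled templates"

# Later changes in the directory no longer show up.
printf 'v2\n' > ext/templates_c/cache.php
status_after=$(git status --porcelain)
echo "status: [$status_after]"
```

Note that the removal is itself a commit, so other clones will see those files deleted from the repository when they pull.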
As I was writing this I discovered that
ext/templates_c
does not work but
ext\templates_c
does.
So you need to use windows slashes.

Gitignore doesn't work properly

I have the following directory structure
and here is the content of .gitignore
conf.php
The problem is, when I change the contents of conf.php and try to commit something, Git detects conf.php as a changed file again. It should be ignored.
I have committed it once. I want to ignore it from now on. What am I missing?
Update
I tried following:
Added conf.php to .gitignore
removed it from the cache with git rm --cached conf.php.
But now, when I rescan the project, it wants to stage conf.php as removed.
That's not what I want.
I want to keep it in the remote repo and ignore future changes (from now on) in the local repo.
Git will not ignore tracked files
Git will not ignore changes to already-committed files.
What Git ignore is for
Git ignore is for ignoring noise that's in your repo, for example:
$ git init
$ mkdir tmp
$ touch tmp/file1
$ touch tmp/file2
$ git status
# On branch master
#
# Initial commit
#
# Untracked files:
# (use "git add <file>..." to include in what will be committed)
#
# tmp/
nothing added to commit but untracked files present (use "git add" to track)
If the .gitignore file is modified to ignore the tmp directory:
$ echo "tmp" > .gitignore
$ git status
# On branch master
#
# Initial commit
#
# Untracked files:
# (use "git add <file>..." to include in what will be committed)
#
# .gitignore
nothing added to commit but untracked files present (use "git add" to track)
the tmp dir contents are no longer listed - but the .gitignore file IS listed (because I just created it and the file itself is not ignored).
Committing that, there are no changes listed:
$ git add .gitignore
$ git commit -v -m "addding git ignore file"
[master (root-commit) d7af571] addding git ignore file
1 file changed, 1 insertion(+)
create mode 100644 .gitignore
$ git status
# On branch master
nothing to commit (working directory clean)
Now any changes to the .gitignore file will show up as pending changes, any new files outside the tmp dir will show up as untracked files and any changes inside the tmp dir will be ignored.
Don't commit files that you don't want to track
If you've already added conf.php to your git repo, git is tracking the file. When it changes git will say that it has pending changes.
The solution to such things is to NOT commit files you don't want to track. What you can do instead is commit a default file, e.g.:
$ git mv conf.php conf.php.default
$ # Edit conf.php.default to be in an appropriate state if necessary
$ git commit -v -a -m "moving conf.php file"
$ cp conf.php.default conf.php
Now any changes you make to conf.php will not show up as unstaged changes (the file is untracked, and the .gitignore entry keeps it out of the untracked list).
If you want to add an empty config file to your project and then not track any additional changes, you can do this:
edit conf.php to be in the state you want it to stay
add and commit conf.php
run the following git command:
git update-index --assume-unchanged conf.php
You will no longer see conf.php as having pending changes
To start tracking changes again, run this command:
git update-index --no-assume-unchanged conf.php
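Here is a small sketch of that sequence in a scratch repository, so you can see what the flag does before using it on a real project. The file contents and the temporary directory are made up. As far as I know, the Git documentation describes assume-unchanged as a performance hint, so treat this as an illustration of the behavior, not a guarantee.

```shell
set -e
# Scratch repository; names and contents are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"

# Commit the config file in the state that should stay in the repo.
printf '<?php // defaults\n' > conf.php
git add conf.php
git commit -q -m "add default conf.php"

# Tell Git to stop checking the file for changes.
git update-index --assume-unchanged conf.php
printf '<?php // local secrets\n' > conf.php
status_hidden=$(git status --porcelain)
echo "while flagged: [$status_hidden]"

# ls-files -v marks assume-unchanged entries with a lowercase letter.
flag=$(git ls-files -v -- conf.php)
echo "$flag"

# Undo the flag: the local edit becomes visible again.
git update-index --no-assume-unchanged conf.php
status_visible=$(git status --porcelain)
echo "after unflagging: [$status_visible]"
```

The lowercase h from git ls-files -v is a handy way to find files you flagged earlier and forgot about.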
Just run git rm --cached conf.php
If you have more committed files to ignore:
git rm -r --cached .
git add .
Then commit and push your changes.

how do I clone files with colons in the filename

When I clone the repo using msysgit, all the files with spaces in the filename are not brought down, and then show as deleted in the status.
The filenames looks something like this: styles-ie (1:12:11 6:02 PM).css so it might actually be the colon or brackets?
How can I fetch those files to bring my local repo in line with the origin?
Good news.
Technically, the answer to "how do I clone files with colons in the filename" is to simply use "git clone". Luckily it is only the checkout that fails on Windows (even under msysgit) and there is a rather clean workaround for this shown below.
TL;DR
in Git Bash...
git clone {repo URL}
cd {repo dir}
git ls-tree -r master --name-only | grep -v ":" | xargs git reset HEAD
git commit -m "deleting all files with a colon in the name"
git restore .
... and then
download the Zip of the whole git repo
rename files with colons inside the Zip (without extracting them)
extract just those files you renamed
add those renamed files to your working directory
For insight into those few steps listed above, please keep reading....
I was able to work around this issue while working with a repo with colons in various filenames. The following worked for me:
Do a regular git clone.
$ git clone https://github.com/wdawson/dropwizard-auth-example.git
You should see the following error that notes that the clone succeeded, but the checkout failed.
Cloning into 'dropwizard-auth-example'...
remote: Enumerating objects: 322, done.
remote: Total 322 (delta 0), reused 0 (delta 0), pack-reused 322
Receiving objects: 100% (322/322), 15.00 MiB | 2.88 MiB/s, done.
Resolving deltas: 100% (72/72), done.
error: invalid path 'src/test/resources/revoker/example-ca/certs/root.localhost:9000.cert.pem'
fatal: unable to checkout working tree
warning: Clone succeeded, but checkout failed.
You can inspect what was checked out with 'git status'
and retry with 'git restore --source=HEAD :/'
Change directories to the new cloned repo
cd dropwizard-auth-example
Check that the git repo working directory is completely empty
ls
Run git-status to find that all the files are staged for deletion
$ git status
Output...
On branch master
Your branch is up to date with 'origin/master'.
Changes to be committed:
(use "git restore --staged <file>..." to unstage)
deleted: .gitignore
deleted: .travis.yml
deleted: LICENSE
deleted: NOTICE
deleted: README.md
deleted: conf.yml
...
Revert the staged deletion of only the files that do not contain a colon in the file name.
$ git ls-tree -r master --name-only | grep -v ":" | xargs git reset HEAD
Output...
Unstaged changes after reset:
D .gitignore
D .travis.yml
D LICENSE
D NOTICE
D README.md
D conf.yml
D java-cacerts.jks
D pom.xml
D src/main/java/wdawson/samples/dropwizard/UserInfoApplication.java
D src/main/java/wdawson/samples/dropwizard/api/UserInfo.java
D src/main/java/wdawson/samples/dropwizard/auth/OAuth2Authenticator.java
D src/main/java/wdawson/samples/dropwizard/auth/OAuth2Authorizer.java
D src/main/java/wdawson/samples/dropwizard/auth/Role.java
...
Run git status again to see that only the files that contain a colon in the file name are now staged for deletion. All other files are still showing as deleted, but not staged for commit. This is what we want at this stage.
$ git status
Output...
On branch master
Your branch is up to date with 'origin/master'.
Changes to be committed:
(use "git restore --staged <file>..." to unstage)
deleted: src/test/resources/revoker/example-ca/certs/root.localhost:9000.cert.pem
deleted: src/test/resources/revoker/example-ca/csr/root.localhost:9000.csr.pem
deleted: src/test/resources/revoker/example-ca/intermediate/certs/intermediate.localhost:9000.cert.pem
deleted: src/test/resources/revoker/example-ca/intermediate/csr/intermediate.localhost:9000.csr.pem
deleted: src/test/resources/revoker/example-ca/intermediate/private/intermediate.localhost:9000.key.pem
deleted: src/test/resources/revoker/example-ca/private/root.localhost:9000.key.pem
Changes not staged for commit:
(use "git add/rm <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
deleted: .gitignore
deleted: .travis.yml
deleted: LICENSE
deleted: NOTICE
deleted: README.md
deleted: conf.yml
deleted: java-cacerts.jks
deleted: pom.xml
Commit all the staged files. That is, commit the deletion of all the files that contain a colon in the file name.
git commit -m "deleting all files with a colon in the name"
Restore everything in the working directory.
$ git restore .
View all the files. What a beautiful sight.
$ ls
Output...
conf.yml java-cacerts.jks LICENSE NOTICE pom.xml README.md src
Once you've deleted the offending files from your working directory...
download a Zip of the whole GitHub repo
open it up in 7Zip... Don't unzip it ... just open it for editing (to rename files)
find the files that have a colon in the name
rename each file with a colon replacing the colon with an underscore...or whatever is appropriate
now you can extract those files you just renamed
copy them into the git working directory
PS: All of the above was done in GitBash on Windows 10 using git version 2.25.1.windows.1. Similar steps can be done via the GUI using TortoiseGit on Windows.
If you try doing:
touch "styles-ie (1:12:11 6:02 PM).css"
you will see that you cannot create it on Windows.
Basically, the repo has the file ( the blob and the tree entry ) but you cannot checkout on Windows as git would be unable to create such a file. No other way but to change the filename.
You can clone the repo on a linux environment, tar it up and copy it to windows, and untar it on windows with tools such as 7zip. 7zip will replace the colon with underscore, and preserve all the git information. As long as that file does not change, you'll be all set for a while. Those files tend not to change much anyway (for example, I have a cert file with a colon in the middle).
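If you control the repository, changing the filename can be done once, on a system where colons are legal in file names (Linux, macOS, or WSL), so that Windows checkouts work afterwards. The following is only a sketch: the cert file name is invented, the colon-to-underscore mapping is my own choice, and the loop assumes paths without newlines or other characters that Git would quote in its output.

```shell
set -e
# Scratch repository standing in for the real one; names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"

mkdir certs
printf 'dummy\n' > 'certs/root.localhost:9000.cert.pem'
git add certs
git commit -q -m "add file with a colon in its name"

# Rename every tracked path containing a colon, replacing ':' with '_'.
git -c core.quotePath=false ls-files | while IFS= read -r path; do
  case "$path" in
    *:*)
      safe=$(printf '%s' "$path" | tr ':' '_')
      git mv -- "$path" "$safe"
      ;;
  esac
done
git commit -q -m "rename files that cannot be checked out on Windows"

# Count tracked paths that still contain a colon.
remaining=$(git ls-files | grep -c ':' || true)
echo "$remaining"
```

Anything that refers to the old names (configs, scripts, documentation) has to be updated in the same commit, which is why this only makes sense when you can change the repository itself.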
In support of the answers "using WSL" or "using Linux environment":
Using WSL:
(Windows 11)
1. Enable virtualization:
in BIOS
in Windows ("Turn Windows features on or off" -> "Virtual Machine Platform"/"Windows Subsystem for Linux" -> check)
2. Download and install a Linux distribution (e.g. Ubuntu - latest):
in PowerShell:
wsl --install -d Ubuntu
3. Clone repo in WSL linux console
After WSL has been installed, run the "WSL" application to open a Linux console. In that console, clone the repository as you normally would**.
** In my case I logged in as root (>sudo su), created SSH keys, added the public SSH key to the GitHub repo, navigated to the required directory, and cloned the repo over SSH.
As a result, in the WSL console I'm able to see the files with ":".
In other file managers and consoles (File Explorer, PowerShell, cmd, git CLI), different symbols are displayed in place of the colons.
