Why does git diff fail to work on changes in subdirectories? - git-diff

When I do a git diff *c and the C files are in subdirectories, git outputs an error message.
$ git diff *c
fatal: ambiguous argument '*c': unknown revision or path not in the
working tree. Use '--' to separate paths from revisions, like
this: 'git <command> [<revision>...] -- [<file>...]'
What option is required to make git diff list changes in the subdirectories as well?

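The pattern has to be quoted, and the quotes are not optional: unquoted, the shell tries to expand *c itself before git ever sees it. Quoted, git receives the pattern and matches it against paths in subdirectories too: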
$ git diff -- '*.c'
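A minimal sketch of the quoted form at work, assuming a throwaway repository created with mktemp. The file names src/main.c and notes.txt are made up for illustration.

```shell
# Throwaway repository; the paths below are hypothetical.
repo=$(mktemp -d)
git -C "$repo" init -q
mkdir -p "$repo/src"
echo 'int main(void) { return 0; }' > "$repo/src/main.c"
echo 'notes' > "$repo/notes.txt"
git -C "$repo" add .

# Modify both tracked files so that git diff has something to report.
echo '/* changed */' >> "$repo/src/main.c"
echo 'changed' >> "$repo/notes.txt"

# Quoted pattern: the shell passes it through and git does the matching,
# including for files that live in subdirectories.
changed_c=$(git -C "$repo" diff --name-only -- '*.c')
echo "$changed_c"
```

Only the C file in the subdirectory should be listed; notes.txt is changed too but does not match the pattern.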

Related

Bash filter out files with a given extension

I've added a pre-commit hook to run Rubocop against any files that are being staged for commit. However, Rubocop errors out when you give it a .png or .svg file. I've added those file extensions to the exclude block in .rubocop.yml but because I'm actually explicitly supplying files by name that doesn't seem to do the trick.
Here's my pre-commit script:
#!/usr/bin/env bash
set -e
git diff --staged --diff-filter=d --name-only | xargs bundle exec rubocop
I think the approach is grabbing the list of files from that git command and looping through them, filtering out any that are .png or .svg, but honestly I don't know how to do that. Any suggestions on filtering the files by extension?
One idea might be to filter out those extensions using grep.
git diff --staged --diff-filter=d --name-only \
| grep --invert-match '\.\(png\|svg\)$' \
| xargs bundle exec rubocop
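To see what that filter keeps, here is a small sketch that needs no repository. The file list is hypothetical and stands in for the git diff output, and it uses grep -Ev, which is the extended-regex spelling of the same inverted match.

```shell
# Hypothetical list of staged files, standing in for the git diff output.
staged='app/models/user.rb
app/assets/logo.png
app/assets/icon.svg
lib/tasks/report.rake'

# -E enables extended regexes, -v inverts the match, and the trailing $
# anchors the extension to the end of the file name.
ruby_candidates=$(printf '%s\n' "$staged" | grep -Ev '\.(png|svg)$')
echo "$ruby_candidates"
# prints:
#   app/models/user.rb
#   lib/tasks/report.rake
```

One thing to watch in the hook itself: if every staged file is filtered out, GNU xargs still runs the command once with no arguments unless you pass -r (--no-run-if-empty).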
RuboCop has a flag, --only-recognized-file-types, which I think will do what you need.
Alternatively, let git leave those files out of the list with exclude pathspecs:
git diff --staged --diff-filter=d --name-only -- ':!*.png' ':!*.svg'
or for easier control you could check attributes and set up the matching just in .gitattributes.
Say git help glossary and hunt up pathspec, there's probably a quicker route to find the docs for it but that's the one I know.
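A sketch of the exclude pathspecs in a throwaway repository, with made-up file names. The leading . is my addition: as far as I know, some older Git versions reject a pathspec list that consists only of exclusions, so giving them something to subtract from is the cautious form.

```shell
# Throwaway repository with one Ruby file and two images (hypothetical names).
repo=$(mktemp -d)
git -C "$repo" init -q
echo 'puts 1' > "$repo/app.rb"
echo 'fake image' > "$repo/logo.png"
echo '<svg/>' > "$repo/icon.svg"
git -C "$repo" add .

# ':!pattern' excludes matching paths; '.' selects everything else.
staged_without_images=$(
    git -C "$repo" diff --staged --diff-filter=d --name-only -- . ':!*.png' ':!*.svg'
)
echo "$staged_without_images"
```

All three files are staged, but only app.rb should come out of the command, so nothing needs to be filtered before xargs.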

How to use git filter-branch on pattern of folders

I've committed a bunch of sensitive data to my local repo that has not been published yet.
The sensitive data is scattered across the project in different folders and I want to remove all these completely from git history.
All of the concerning folders have the same name, and are at the same level in the directory in different folders. Following is a sample of my folder structure:
root
folder1
./sensitiveData
folder2
./sensitiveData
folder3
./sensitiveData
Using the following command, I am able to delete the folders containing sensitive data one at a time:
git filter-branch -f --index-filter 'git rm -r --cached --ignore-unmatch javascript/folder1/.sensitiveData' --prune-empty HEAD
But I want to delete all the folders containing sensitive data in one go, because there are too many of them, and I would like to learn how this works.
But using the following command, nothing is rewritten and I am warned that 'refs/heads/master' is unchanged:
git filter-branch -f --index-filter 'git rm -r --cached --ignore-unmatch javascript/*/.sensitiveData' --prune-empty HEAD
As I see it, there are two strategies:
Either my pattern is somehow wrong and I need to change it.
Or I should do some looping with bash.
Option one seems more sensible if possible.
Your command, when you run it, is first evaluated by your shell. So with:
'git rm -r --cached --ignore-unmatch javascript/*/.sensitiveData'
the single quotes protect the entire thing from the shell, and pass it to git filter-branch as the --index-filter to be used later. The single quotes are gone at this point.
Here's the problem: filters given to git filter-branch get evaluated at filtering-time by another shell (technically, the shell that's running git filter-branch itself). This other shell evals the command:
eval $filter
So now this second shell re-interprets:
git rm -r --cached --ignore-unmatch javascript/*/.sensitiveData
It breaks up the arguments at spaces, expands the asterisk based on the current working directory, and invokes git rm -r --cached --ignore-unmatch on the result of the expansion.
If the expansion succeeds, one thing happens; if not, something else happens. Precisely what happens depends on the shell (bash can be configured to behave in several different ways; POSIX sh is more predictable).
The actual current working directory for an --index-filter is generally empty so the expansion will probably fail. This should, in most cases, pass the asterisk on unchanged to Git. Since the argument to git rm is (mostly / essentially) a pathspec, Git will now do its own expansion. This should have worked, so either the path itself is wrong, or the directory is not empty, or there's something odd about your shell so that the failed expansion didn't pass the literal text javascript/*/.sensitiveData to git rm.
You can take some variables out of this equation by using:
'git rm -r --cached --ignore-unmatch javascript/\*/.sensitiveData'
so that the second shell sees:
git rm -r --cached --ignore-unmatch javascript/\*/.sensitiveData
which will force the second shell to pass:
javascript/*/.sensitiveData
directly to git rm. Given that this probably should have worked anyway, though, it's of interest to check whether javascript/*/.sensitiveData would match the right files in the specific commit(s), which you can do kind of clumsily / manually using git ls-tree -r on those commits.
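The second round of evaluation can be watched without touching a repository. In this sketch printf stands in for git rm so the arguments are visible, and the two directories are temporary ones made up for the demonstration. It assumes POSIX sh or bash with default globbing options.

```shell
# The filter text as the second shell sees it, outer single quotes already gone.
# printf stands in for git rm so that the resulting arguments can be inspected.
filter='printf "%s\n" javascript/*/.sensitiveData'
escaped_filter='printf "%s\n" javascript/\*/.sensitiveData'

empty_dir=$(mktemp -d)
populated_dir=$(mktemp -d)
mkdir -p "$populated_dir/javascript/folder1/.sensitiveData"

# Empty directory: nothing matches, so the pattern is passed on literally
# (default behaviour of POSIX sh and of bash without nullglob/failglob).
in_empty=$(cd "$empty_dir" && eval "$filter")

# Matching directory present: the second shell expands the asterisk itself.
in_populated=$(cd "$populated_dir" && eval "$filter")

# Backslash-escaped asterisk: the second shell never expands it.
escaped=$(cd "$populated_dir" && eval "$escaped_filter")

printf '%s\n' "$in_empty" "$in_populated" "$escaped"
# prints:
#   javascript/*/.sensitiveData
#   javascript/folder1/.sensitiveData
#   javascript/*/.sensitiveData
```

The escaped form gives the same argument whatever the working directory contains, which is why it removes one variable from the problem.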
At the end, what solved my problem was a small bash script using the for in construct.
for name in javascript/*/.sensitiveData
do git filter-branch -f --index-filter "git rm -r --cached --ignore-unmatch $name" --prune-empty HEAD
done

Search only for certain files with grep

I found out today about
git rev-parse --show-toplevel && git ls-files
This searches the top directory of a git repository for all tracked files. Is there a way that I can make grep respect its output and only search through those files?
I tried
git rev-parse --show-toplevel && git ls-files | grep -r "something"
but my small tests showed that the piping wasn't actually working. It behaved the same as a regular grep command.
I also tried (just as an example)
grep -r "something" --include=`git ls-files`
but I think that only works with single files, since it wasn't showing all possible matches.
Your first assertion is incorrect, as the two commands are executed separately from one another (the second one only executed if the first one completes successfully).
I guess what you wanted in the first place was:
git ls-files "$(git rev-parse --show-toplevel)"
This passes the top-level directory as an argument to git ls-files.
To grep the list of files for something, you could use xargs:
git ls-files -z "$(git rev-parse --show-toplevel)" | xargs -0 grep 'something'
I've added the -z switch to ls-files and the corresponding -0 switch to xargs, so that both tools work with null-bytes in their input/output, which means that awkward characters in file names don't cause any problems.
I don't think that the -r switch is doing anything useful in grep, since the output of ls-files doesn't contain any directories (git only tracks files).
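Here is a sketch of that pipeline in a throwaway repository, to show that untracked files are left out. The file names are hypothetical, and grep -l is used only to keep the output to file names.

```shell
# Throwaway repository: one tracked file and one untracked file,
# both containing the search term (names are made up).
repo=$(mktemp -d)
git -C "$repo" init -q
mkdir -p "$repo/docs"
echo 'something tracked' > "$repo/docs/tracked.txt"
echo 'something untracked' > "$repo/untracked.txt"
git -C "$repo" add docs/tracked.txt

# ls-files emits NUL-separated tracked paths; xargs -0 hands them to grep.
matches=$(
    cd "$repo" &&
    git ls-files -z "$(git rev-parse --show-toplevel)" | xargs -0 grep -l 'something'
)
echo "$matches"
```

Only docs/tracked.txt should be reported, even though untracked.txt contains the same text.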

How to list only file names that have changed between two branches

I have two branches in my repository which I want to diff for some files.
I want to list only newly added migrations between those two branches.
Something like:
git diff branch1 branch2 | grep /db/migrate
How can I do it?
This command compares the current tips of the two branches:
git diff branch1..branch2 --name-only
If you want to compare from their last common ancestor, then:
git diff branch1...branch2 --name-only
Now you can grep for the files you want. From there it's easy to write a little shell script that diffs the two branches file by file:
git diff branch1...branch2 --name-only | grep /db/migrate |
while IFS= read -r filename
do
    # show the diff for one matching file at a time
    git diff branch1...branch2 -- "$filename"
done
Save the script as an executable file named git-command-name in a directory on your PATH, for example /usr/bin (you should parametrize the input, taking the branches as variables).
Git will recognise it as a command that you can call with:
git command-name branch1 branch2
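Since the question asks only for newly added migrations, a variant without grep may also be worth trying. It is a sketch that swaps in two git diff features: --diff-filter=A to keep added files only, and a pathspec to restrict the output to one directory. The repository, the commit identity and the file names below are made up so the example runs on its own.

```shell
# Throwaway repository with two branches; identity is passed inline so the
# sketch does not depend on any global git configuration.
repo=$(mktemp -d)
git -C "$repo" init -q
commit() {
    git -C "$repo" -c user.name=Example -c user.email=example@example.com \
        -c commit.gpgsign=false commit -q -m "$1"
}

echo 'readme' > "$repo/README"
git -C "$repo" add . && commit 'initial'
git -C "$repo" branch -m branch1        # name the starting branch branch1

git -C "$repo" checkout -q -b branch2
mkdir -p "$repo/db/migrate"
echo 'create table' > "$repo/db/migrate/001_create_users.rb"
echo 'more' >> "$repo/README"           # a modification, not an addition
git -C "$repo" add . && commit 'add migration'

# Added files only (A), since the common ancestor, limited to db/migrate.
new_migrations=$(
    git -C "$repo" diff --name-only --diff-filter=A branch1...branch2 -- db/migrate
)
echo "$new_migrations"
```

The modified README is skipped both by the filter and by the pathspec, leaving just the new migration file.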
It's even easier if you want to compare the current branch to another. While those familiar with git will think this obvious, I'm including this for people that are starting out with git.
git diff other_branch_name --name-only

git add hook / getting git to ignore file modes of new files

I'm working with a git repository on both windows and linux/mac. When I create new files on windows, or edit them in some text editors, the file mode is changed to 775. I can get git to ignore file mode changes with
git config core.filemode false
but I also want most new files to have mode 664 (not 775). The best I've come up with so far is the pre-commit hook
git diff --cached --name-only | egrep -v '\.(bat|py|sh|exe)' | xargs -d"\n" chmod 664
git diff --cached --name-only | egrep -v '\.(bat|py|sh|exe)' | xargs -d"\n" git add
but this does the wrong thing if I've added a new file, then edited it again before committing, and then committed without adding it. Is there a better way to do this, or something like a pre-add or post-add hook?
Edit: git diff --cached --name-only also gives me files that have been deleted, so what I really want is something like git diff --cached --name-only --diff-filter=ACMRTUXB
Instead of using chmod and re-adding, you can use git update-index --chmod=-x <files> to modify the index directly.
Take a look at git attributes. You can handle it there. Smudge/clean may be a way to deal with it.
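A small sketch of the update-index route in a throwaway repository, with a made-up file name. Git records only whether a file is executable, so the index mode is either 100644 or 100755 whatever the exact permissions on disk are.

```shell
# Throwaway repository with one file that was created as 775.
repo=$(mktemp -d)
git -C "$repo" init -q
echo 'data' > "$repo/notes.txt"
chmod 775 "$repo/notes.txt"
git -C "$repo" add notes.txt

# Clear the executable bit in the index only; the working tree is untouched,
# and later edits to the file are not staged as a side effect.
git -C "$repo" update-index --chmod=-x notes.txt

# The first field of ls-files -s is the mode recorded in the index.
index_mode=$(git -C "$repo" ls-files -s notes.txt | cut -d' ' -f1)
echo "$index_mode"
```

Because only the index entry's mode changes, this avoids the problem of a second git add picking up edits you had not meant to stage.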
