How to list only file names that have changed between two branches - git-diff

I have two branches in my repository which I want to diff for some files.
I want to list only newly added migrations between those two branches.
Something like:
git diff branch1 branch2 | grep /db/migrate
How can I do it?

This command compares the current tips of the two branches:
git diff branch1..branch2 --name-only
If you want to compare from their last common ancestor, then:
git diff branch1...branch2 --name-only
And now you can grep files that you want. From there it's easy to write a little shell script that diffs two branches, file by file.
git diff branch1...branch2 --name-only | grep /db/migrate |
while IFS= read -r filename
do
# show the diff of one matching file at a time
git diff branch1...branch2 -- "$filename"
done
Save the script as an executable file named git-command-name in a directory on your PATH, such as /usr/bin (you should parametrize the input, taking the branches as variables).
Git will recognise it as a command that you can call with:
git command-name branch1 branch2
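As a rough sketch of the parametrized version suggested above: the function below takes the two branches as arguments and, since the question asks for newly added migrations only, adds `--diff-filter=A` so that only added files are listed. The name `list_added_files` and the default path `db/migrate` are assumptions made for illustration; they are not part of the original answer.

```shell
# Sketch: list files that were added on <head> since it forked from <base>.
# list_added_files and the db/migrate default are hypothetical names.
list_added_files() {
    local base=$1 head=$2 path=${3:-db/migrate}
    # base...head diffs from the merge base of the two branches to head;
    # --diff-filter=A keeps only files that were added.
    git diff --name-only --diff-filter=A "$base...$head" -- "$path"
}

# Usage (from the repository root): list_added_files branch1 branch2
```

To call it as `git command-name branch1 branch2`, the file would also need a last line that invokes the function with `"$@"`. The path argument is a pathspec, so it is resolved relative to the directory you run the function from.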

It's even easier if you want to compare the current branch to another. While those familiar with git will think this obvious, I'm including this for people that are starting out with git.
git diff other_branch_name --name-only

Related

Count number of subdirectories inside each subdirectory of git repo using simple git commands or shell scripting

I have a repo with the following structure and I would like to count the number of subdirectories inside tier1, tier2 and tier3. I don't want to count subdirectories within the subdirectories. For example, I have folders named 1, 2, 3 inside tier1 and I want to see the count as 3. I don't want what's inside those 1, 2, 3 folders.
Git clone actions should be avoided, as we do not need a local clone of the whole repo plus all history information. A simple fetch will be enough. Are there any leaner ways to retrieve the directory information?
I am presently counting the number of subdirectories by entering each folder and running the following command:
ls | wc -l
You can filter your clones to skip the actual content, just get the structure. For the linux repo this is a ~2.5M download, a ~99% savings:
git clone --bare --depth 1 --filter=blob:none u://r/l checkit
git -C checkit ls-tree --name-only -d @:
lists the toplevel directories, then it's just a formatting question,
for d in $(git -C checkit ls-tree --name-only -d @:)
do printf '%7d %s\n' $(git -C checkit ls-tree -d @:$d|wc -l) $d
done
ls at best gives you the number of non-hidden entries directly inside the directory. If among them you have a plain file, or an entry containing spaces, or an entry whose name starts with a period, or an entry that is a directory but itself has subdirectories, your count will be wrong.
I would instead do a
shopt -s nullglob
for topdir in tier[1-3]
do
if [[ -d $topdir ]]
then
mapfile -t a < <(find "$topdir" -mindepth 1 -type d)
echo "Number of subdirectories below $topdir is ${#a[@]}"
fi
done
The purpose of setting nullglob is only to avoid an error message if no tier directory exists.
UPDATE: If you are not interested in subdirectories further down, you could replace the find with
shopt -s dotglob
a=("$topdir"/*/)
The dotglob ensures that you are not missing hidden subdirectories, and the final slash in the glob-pattern ensures that the expansion happens only for entries which denote a directory.
git ls-tree -d HEAD:<dir>
git ls-tree -d HEAD:<dir> | wc -l
You can replace HEAD with any commit-ish reference:
git ls-tree -d origin/master:<dir>
git ls-tree -d v2.0.3:<dir>
git ls-tree -d <sha>:<dir>
...
If you want the list of all (recursive) subdirectories, add the -r option.
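Putting the `git ls-tree -d` counting together for several directories might look like the sketch below. It assumes you run it inside the repository and that the directories exist at the given revision; `count_subdirs` is a hypothetical helper name.

```shell
# Sketch: count the immediate subdirectories of each given directory at a revision.
# count_subdirs is a hypothetical helper name.
count_subdirs() {
    local rev=$1 dir
    shift
    for dir in "$@"; do
        # ls-tree -d prints one line per directory entry directly under rev:dir
        printf '%7d %s\n' "$(git ls-tree -d "$rev:$dir" | wc -l)" "$dir"
    done
}

# Usage: count_subdirs HEAD tier1 tier2 tier3
```

Plain files inside a tier directory are not counted, and nothing below the first level is visited.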

Bash filter out files with a given extension

I've added a pre-commit hook to run Rubocop against any files that are being staged for commit. However, Rubocop errors out when you give it a .png or .svg file. I've added those file extensions to the exclude block in .rubocop.yml but because I'm actually explicitly supplying files by name that doesn't seem to do the trick.
Here's my pre-commit script:
#!/usr/bin/env bash
set -e
git diff --staged --diff-filter=d --name-only | xargs bundle exec rubocop
I think the approach is grabbing the list of files from that git command and looping through them, filtering out any that are .png or .svg, but honestly I don't know how to do that. Any suggestions on filtering the files by extension?
One idea might be to filter out those extensions using grep.
git diff --staged --diff-filter=d --name-only \
| grep --invert-match '\.\(png\|svg\)$' \
| xargs bundle exec rubocop
There is a flag --only-recognized-file-types which I think will do what you need.
git diff --staged --diff-filter=d --name-only -- ':!*.png' ':!*.svg'
or for easier control you could check attributes and set up the matching just in .gitattributes.
Say git help glossary and hunt up pathspec, there's probably a quicker route to find the docs for it but that's the one I know.
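If you would rather do the filtering in the hook script itself, as the question suggests, a small shell function can drop the unwanted extensions before the list reaches Rubocop. This is only a sketch: `drop_image_paths` is a hypothetical name, and it assumes file names contain no newlines.

```shell
# Sketch: drop .png/.svg paths from a newline-separated list on stdin.
# drop_image_paths is a hypothetical helper name.
drop_image_paths() {
    local path
    while IFS= read -r path; do
        case $path in
            *.png|*.svg) ;;                 # skip image files
            *) printf '%s\n' "$path" ;;     # pass everything else through
        esac
    done
}

# Usage inside the hook (xargs -r is a GNU extension that skips the command on empty input):
#   git diff --staged --diff-filter=d --name-only | drop_image_paths | xargs -r bundle exec rubocop
```

Adding another extension is then a matter of extending the `case` pattern.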

How to get only the changed part of the file to be processed in pre-commit hook?

I used the following shell script in a pre-commit hook to get only the modified lines of a cpp file that is about to be committed in git. But it gives me the entire file that has changed lines. How can I get only the changed lines of a file to process in the pre-commit check?
Here is the script I used:
changed_files=$(git diff-index --cached $against | \
grep -E '[MA] .*\.(c|cpp|cc|cxx)$' | cut -f 2)
git diff --cached should show you the staged changes only.
I guess what you are looking for is:
git diff --cached --name-status | grep -E '[MAD] *.*\.(c|cpp|cc|cxx)$' | cut -f2
Also, you can try adding the --name-only option instead of piping the output through cut.
git diff-index --cached --name-only $against | grep -E '.*\.(c|cpp|cc|cxx)$'
If said file is staged (i.e. has been git add), then you can use git diff --staged or git diff --cached (both are synonyms).
If you have more than one file staged, you can specify which one you want to look at with git diff --staged [path/filename].
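To go from the staged diff to just the changed lines, one possible approach is to ask for zero context lines with `-U0` and keep only the lines the diff marks as added. The sketch below assumes that "changed lines" means added or modified lines in the staged version; `staged_added_lines` is a hypothetical name.

```shell
# Sketch: print only the lines added in the staged version of a file.
# staged_added_lines is a hypothetical helper name.
staged_added_lines() {
    # -U0 drops context lines, so hunks hold only removed (-) and added (+) lines
    git diff --cached -U0 -- "$1" |
    awk '
        /^diff --git / { in_hunk = 0 }              # new file header: leave hunk mode
        /^@@/ { in_hunk = 1; next }                 # hunk header: content follows
        in_hunk && /^\+/ { print substr($0, 2) }    # added line, "+" marker stripped
    '
}

# Usage: staged_added_lines src/main.cpp
```

Tracking whether the parser is inside a hunk keeps the `+++ b/file` header from being mistaken for an added line.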

Search only for certain files with grep

I found out today about
git rev-parse --show-toplevel && git ls-files
Which searches the top directory of a git repository for all tracked files. Is there a way that I can make grep respect its output and only search through those files?
I tried
git rev-parse --show-toplevel && git ls-files | grep -r "something"
but my small tests showed that the piping wasn't actually working. It would behave the same as just a regular grep command.
I also tried (just as an example)
grep -r "something" --include=`git ls-files`
but I think that only works with single files, since it wasn't showing all possible matches
Your first assertion is incorrect, as the two commands are executed separately from one another (the second one only executed if the first one completes successfully).
I guess what you wanted in the first place was:
git ls-files "$(git rev-parse --show-toplevel)"
This passes the top-level directory as an argument to git ls-files.
To grep the list of files for something, you could use xargs:
git ls-files -z "$(git rev-parse --show-toplevel)" | xargs -0 grep 'something'
I've added the -z switch to ls-files and the corresponding -0 switch to xargs, so that both tools work with null-bytes in their input/output, which means that awkward characters in file names don't cause any problems.
I don't think that the -r switch is doing anything useful in grep, since the output of ls-files doesn't contain any directories (git only tracks files).
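As an alternative that the answers above do not mention, git has a built-in `git grep` that searches tracked files only, which avoids the pipe altogether. A minimal sketch, assuming it is run somewhere inside the work tree; `grep_tracked` is a hypothetical wrapper name.

```shell
# Sketch: search all tracked files in the repository for a pattern.
# grep_tracked is a hypothetical helper name.
grep_tracked() {
    # -e marks the next argument as the pattern;
    # ':/' is the pathspec for the top of the work tree
    git grep -n -e "$1" -- ':/'
}

# Usage: grep_tracked 'something'
```

Untracked files are skipped, even if they contain the pattern.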

git add hook / getting git to ignore file modes of new files

I'm working with a git repository on both windows and linux/mac. When I create new files on windows, or edit them in some text editors, the file mode is changed to 775. I can get git to ignore file mode changes with
git config core.filemode false
but I also want most new files to have mode 664 (not 775). The best I've come up with so far is the pre-commit hook
git diff --cached --name-only | egrep -v '\.(bat|py|sh|exe)' | xargs -d"\n" chmod 664
git diff --cached --name-only | egrep -v '\.(bat|py|sh|exe)' | xargs -d"\n" git add
but this does the wrong thing if I've added a new file, then edited it again before committing, and then committed without adding it. Is there a better way to do this, or something like a pre-add or post-add hook?
Edit: git diff --cached --name-only also gives me files that have been deleted, so what I really want is something like git diff --cached --name-only --diff-filter=ACMRTUXB
Instead of using chmod and readding, you can use git update-index --chmod=-x <files> to modify the index directly.
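A pre-commit hook built on that idea might look like the sketch below. It reuses the extension list from the question (anchored to the end of the name, which is my assumption about the intent) and relies on GNU `grep -z` and `xargs -r`; `clear_exec_bit_on_staged` is a hypothetical name.

```shell
# Sketch: clear the executable bit in the index for staged files that are not scripts.
# clear_exec_bit_on_staged is a hypothetical name; grep -z and xargs -r are GNU extensions.
clear_exec_bit_on_staged() {
    # -z keeps the file list NUL-separated, so odd file names survive the pipe
    git diff --cached --name-only -z --diff-filter=ACMR |
    grep -zEv '\.(bat|py|sh|exe)$' |
    xargs -0 -r git update-index --chmod=-x
}

# Usage: call clear_exec_bit_on_staged from .git/hooks/pre-commit
```

Because only the index entry is modified, the working tree file is not re-added, so later unstaged edits to the file stay out of the commit.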
Take a look at git attributes. You can handle it there. Smudge/clean may be a way to deal with it.
