I'm trying to avoid accidentally committing binaries into my repo. I considered a hook that detects file sizes above some threshold, but I think it will be more useful to fail the pre-commit hook any time my commit changes a file with an executable permission bit.
I know how to tackle this with Python/Ruby/other scripting languages, but ideally I can do it with just bash. Any ideas?
I ended up with this. It lists the filenames being committed relative to REPO_ROOT. It passes those to ls with -1 flag for one-per-line and -F flag that appends * to executables. It greps for trailing *. Any matching grep fails the hook.
cd "$REPO_ROOT" || exit 1
# ls -F appends * to executables; grep -E replaces the deprecated egrep
STAGED_EXECUTABLES=$(git diff --diff-filter=ACMRTUXB --cached HEAD --name-only | xargs ls -1F | grep -E '\*$')
EXECUTABLES_MISSING=$?
if [ "$EXECUTABLES_MISSING" -eq 0 ]; then
    echo "You tried to commit an executable file. Override with \`git commit --no-verify\` if required." >&2
    exit 1
fi
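One caveat with the ls approach is that it inspects the working tree, while the commit records the file mode stored in the index. Below is a minimal sketch of an alternative that reads the staged mode from `git diff --cached --raw` instead. The helper names executable_paths and check_staged_executables are hypothetical and not part of the hook above, and paths containing tabs or other unusual characters are not handled.

```shell
#!/bin/sh
# executable_paths (hypothetical helper): print paths whose new mode is 100755.
# Input lines are in `git diff --raw` format:
#   :<old mode> <new mode> <old id> <new id> <status><TAB><path>
executable_paths() {
    awk -F '\t' '{ split($1, meta, " "); if (meta[2] == "100755") print $2 }'
}

# Report staged executables; returns 1 if any were found.
# --no-renames keeps one path per line so the parsing above stays simple.
check_staged_executables() {
    found=$(git diff --cached --raw --no-renames --diff-filter=ACM | executable_paths)
    if [ -n "$found" ]; then
        printf 'Staged executable files:\n%s\n' "$found" >&2
        return 1
    fi
}
```

Calling `check_staged_executables || exit 1` from the pre-commit hook would then block the commit.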
I've been trying to make a pre-commit hook that checks if files in the index are formatted correctly. I've tried so many things already but I just can't get the grep to work correctly. This is my code right now:
for FILE in $(git diff --cached --name-only)
do
if [[ "$FILE" =~ \.(c|h|cpp|cc)$ ]]; then
exec C:/dev/uct/clang-format-15.0.3.exe --dry-run -Werror $FILE | grep -c "violations"
fi
done
I expect it to print the number of files that have formatting issues (I actually want to search for the string -Wclang-format-violations but I simplified it for testing).
If I redirect the clang-format output to a file with &>, the violations are correctly printed to that file.
exec C:/dev/uct/clang-format-15.0.3.exe --dry-run -Werror $FILE &> temp.txt
If I then run grep from the command line on that file it works.
grep -c violations temp.txt
I also tried grep on the file from the script but for some reason that didn't produce any output either (though I would prefer to not have to create a file).
exec C:/dev/uct/clang-format-15.0.3.exe --dry-run -Werror $FILE &> temp.txt
grep -c violations temp.txt
What am I doing wrong? I thought I would get this to work in half an hour at most.
The problem with grep is that you're not redirecting stderr, only stdout. (In your redirection to a file, you are redirecting stderr as well, with &>.) To capture stderr in the pipe, use |&, which is bash shorthand for 2>&1 |.
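To see the difference without clang-format, here is a small sketch. noisy is a hypothetical stand-in for any tool that writes its diagnostics to stderr, and the message text is only a sample. The example uses `2>&1 |`, the portable spelling of `|&`.

```shell
#!/bin/sh
# noisy (hypothetical): writes a diagnostic to stderr only, like a linter would.
noisy() {
    echo "warning: code should be clang-formatted [-Wclang-format-violations]" >&2
}

# Plain pipe: only stdout reaches grep, so nothing matches.
# (stderr is discarded here just to keep the terminal quiet.)
plain_count=$(noisy 2>/dev/null | grep -c -e "-Wclang-format-violations" || true)

# Merged pipe: stderr is redirected into the pipe first, so grep sees the warning.
merged_count=$(noisy 2>&1 | grep -c -e "-Wclang-format-violations" || true)

echo "plain: $plain_count, merged: $merged_count"
```

The `|| true` is only there because grep exits with status 1 when it finds no match, which would abort a script running under set -e.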
However, you have a mismatch between what's staged to be committed and what you're looking at. In git, a file isn't staged to be committed, but its contents are. That means that you can stage, for example, only part of a file's contents to be committed. The file's contents in the working directory thus would not match what is staged.
Here's a concrete example:
echo "hello" > hello_world.txt
git add hello_world.txt
echo "world" >> hello_world.txt
At this point, the contents staged in the index (from the git add on line 2) are hello. But the file contains hello world on disk.
git status will show the file both as staged and modified:
Changes to be committed:
(use "git rm --cached <file>..." to unstage)
new file: hello_world.txt
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: hello_world.txt
If you were to run git commit now, only the hello would be committed, not the file as it exists on disk.
That means that you want to look at the staged contents of the file -- which is what will actually be committed -- not just the file on disk. Otherwise you may get false positives or negatives.
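To handle this case, you can pass what's staged to clang-format. (git show :filename will output the staged contents of filename. git cat-file --filters :filename, used below, does the same but also applies the filters configured for that path, such as end-of-line conversion.)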
Try this:
for FILE in $(git diff --cached --name-only)
do
if [[ "$FILE" =~ \.(c|h|cpp|cc)$ ]]; then
git cat-file --filters ":${FILE}" | C:/dev/uct/clang-format-15.0.3.exe --dry-run -Werror --assume-filename="${FILE}" |& grep -c -e "-Wclang-format-violations"
fi
done
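Note that `for FILE in $(...)` splits on whitespace, so a path containing a space would be broken into pieces. A sketch of a line-based filter that avoids this, assuming no file name contains a newline; c_sources is a hypothetical helper name:

```shell
#!/bin/sh
# c_sources (hypothetical): read paths from stdin, one per line,
# and print only those with a C/C++ extension.
c_sources() {
    while IFS= read -r file; do
        case "$file" in
            *.c|*.h|*.cpp|*.cc) printf '%s\n' "$file" ;;
        esac
    done
}
```

In the hook, this could be fed with `git diff --cached --name-only | c_sources | while IFS= read -r FILE; do ...; done`.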
This is probably a simple one for a bash scripter, which I am not.
I'm running a cron job that downloads some data, and then depending on that data, may or may not modify a second file. After the job, I want to git commit one or both files. For the conditional commit, I tried this in a .sh script:
# attempt to capture whether MyNotes.txt was changed
# by counting lines in git status output
mywc=(git status -s MyNotes.txt | wc -l)
echo $mywc found!
if [ $mywc = 1 ]; then
echo Add file for commit
else
echo Nothing to add
fi
I'm pretty much getting nowhere; this thing seems to fail on the first line with syntax error near unexpected token '|'. If I run git status -s MyNotes.txt | wc -l on the command line, I get the numeric output I expect.
What am I doing wrong and how can I make this work?
If there's a more elegant way to determine whether a file changed, feel free to share.
Also, for my edification, how could I get this to work without the interim mywc variable? I.e., if I wanted to just do the command within the if, something like this:
if [[ $(git status -s MyNotes.txt | wc -l) = 1 ]]; then
...
Thanks!
What am I doing wrong and how can I make this work?
Put a dollar sign before the opening parenthesis:
foo=$(command)
What you wrote looks like a bash array assignment:
declare -a letters=(a b c d)
If there's a more elegant way to determine whether a file changed, feel free to share.
Consider this:
$ git diff -s --exit-code README.md || echo has changed
has changed
$ git checkout README.md
Updated 1 path from the index
$ git diff -s --exit-code README.md || echo has changed
The command after the OR (||) runs only if the first command exits with a non-zero code.
Same thing essentially:
$ false || echo false exits with 1
false exits with 1
$ true || echo will not trigger
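An aspect of bash that people overlook is that [ is an ordinary command (a builtin equivalent to test) and [[ is a shell keyword that behaves like one; the closing ] and ]] are simply their last arguments. Both have return codes too, and if tests the return code of whatever command you give it, not just of [ or [[.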
$ if true; then echo yes; else echo no; fi
yes
$ if false; then echo yes; else echo no; fi
no
So for detecting changes in a tracked file:
$ if git diff -s --exit-code README.md; then echo same as in index; else echo changed; fi
same as in index
$ echo 123 >> README.md
$ if git diff -s --exit-code README.md; then echo same as in index; else echo changed; fi
changed
With all of that said...
Just add the file. You don't need to check anything. If it hasn't changed, nothing will happen.
$ echo foo >> myfile
$ git add myfile
$ git commit -m 'maybe changed' myfile
[master b561cc1] maybe changed
1 file changed, 1 insertion(+), 1 deletion(-)
$ git add myfile
$ git commit -m 'maybe changed' myfile
no changes added to commit (use "git add" and/or "git commit -a")
If you need to avoid a non-zero exit code (such as with set -e), just put || true after the command whose exit status you want to ignore:
$ cat foo.sh
#!/bin/bash
set -e
echo foo >> myfile
git add myfile
git commit -m 'maybe changed' myfile
git add myfile
git commit -m 'maybe changed' myfile > /dev/null || true
echo no error here. it\'s fine..
false
echo will never reach this.
Try running that script and see what happens.
You are looking for a way to check whether a file has changed:
git diff --exit-code -s <path>
Every command returns a status code, which can be checked with $?, and 0 means success. Here, git diff returns 0 if the file is unchanged and 1 if it differs.
You can combine that with the && and || operators, which short-circuit, to write a construct like this:
git diff --exit-code -s <path> || echo "should add file"
About your edification, what you wrote is perfectly fine!
As CryptoFool pointed out in a comment, I failed to include a $ in my variable assignment. Simple fix in the first line of my script:
mywc=$(git status -s MyNotes.txt | wc -l)
As matt pointed out in a subsequent comment, doing a git add on a file that hasn't changed has no effect. It won't stage the file for commit. So instead of doing conditional logic to determine whether to git add myfile.txt, I'll just blindly execute git add myfile.txt, which will either stage the file if there are changes, or do nothing if there are no changes. Therefore, my entire script can be replaced with one line:
git add MyNotes.txt
git cherry-pick --strategy=recursive -X theirs -n "$GitHub_SHA"
Hello, I'm specifying -s recursive -X theirs. Why is it still necessary to git rm and git add?
Also, how do I differentiate when to git rm or git add in my bash script?
git cherry-pick --strategy=recursive -X theirs -n "$GitHub_SHA" || merge_conflict=1
if [ "${merge_conflict:-0}" -eq 1 ]
then
    git add --all
fi
-X theirs and -X ours act on conflicting hunks. If a file has been removed by one side, no conflicting hunks are generated. At the same time, the context lines are not generated either. Without the hunks and context lines, the strategy option cannot pick lines.
To differentiate one conflict type from another, you could run git status -s and parse the result in short format. For details, see the Short Format section of the git status documentation. For example, for a file foo.txt that was deleted by us, git status -s foo.txt prints DU foo.txt. If both sides modified it, it prints UU foo.txt.
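As a rough sketch of how a script could act on those codes: the helper below maps one short-format line to a suggested command. resolution_for is a hypothetical name, and the mapping assumes the cherry-picked side should always win. It only prints the command rather than running it, and it leaves every code other than DU and UD for manual inspection.

```shell
#!/bin/sh
# resolution_for (hypothetical): given one `git status -s` line, print the
# command that would keep "their" side of a modify/delete conflict.
# Assumption: the cherry-picked commit should always win.
resolution_for() {
    code=$(printf '%s\n' "$1" | cut -c1-2)   # two-letter status, e.g. DU
    path=$(printf '%s\n' "$1" | cut -c4-)    # path starts after the separating space
    case "$code" in
        DU) echo "git add -- $path" ;;       # deleted by us: keep their file
        UD) echo "git rm -- $path" ;;        # deleted by them: accept the deletion
        *)  echo "# inspect manually: $path ($code)" ;;
    esac
}
```

For instance, `resolution_for "DU foo.txt"` would suggest staging foo.txt, since the version left in the working tree after a modify/delete conflict is the modified one.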
I've made the following bash script to commit the parent repo after a change in a submodule.
The script needs to cd .. to check the parent repo's current branch, but the cd .. does not affect the subsequent commands, I guess because of the subshell.
I've tried to:
1- run cd ../ && before each command
2- make an alias, but that didn't succeed
3- run exec, but then the script didn't continue
#!/bin/sh
#
# An example hook script to verify what is about to be committed.
# Called by "git commit" with no arguments. The hook should
# exit with non-zero status after issuing an appropriate message if
# it wants to stop the commit.
#
# To enable this hook, rename this file to "post-commit".
commit_msg=$(git log -1 --pretty=%B)
if [[ $(git branch | grep \* | cut -d ' ' -f2) == "int1177/next" ]]; then
cd ..
if [[ $(git branch | grep \* | cut -d ' ' -f2) == "B0/next" ]]; then
git add 6_Tests
git commit -m "bs esss"
echo "development branch B0/next has now new commit"
else
echo "development branch isn't B0/next"
fi
else
echo "current branch isn't int1177/next"
fi
Actually, this particular problem is not a bash issue, but rather a Git issue.
Why doesn't "cd" work in a shell script? is valid in general, and is a suitable answer to many other questions. But this particular post-commit hook is trying to chdir out of a submodule into its parent superproject, then make a commit within the parent superproject. That is possible. It may be a bad idea for other reasons (in general it's unwise to have Git commit hooks create commits, even in other repositories; see footnote 1), but in this particular case you're running into the fact that Git finds its directories through environment variables.
In particular, there's an environment variable GIT_DIR that tells Git: The .git directory containing the repository is at this path. When Git runs a Git hook, Git typically sets $GIT_DIR to . or .git. If $GIT_DIR is not set, Git will find the .git directory by means of a directory-tree search, but if $GIT_DIR is set, Git assumes that $GIT_DIR is set correctly.
The solution is to unset GIT_DIR:
unset GIT_DIR
cd ..
The rest of the sub-shell commands will run in the one-step-up directory, and now that $GIT_DIR is no longer set, Git will search the superproject's work-tree for the .git directory for the superproject.
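If the hook should keep its own directory and environment for later steps, one option is to confine the change to a subshell. This is only a sketch; in_parent is a hypothetical helper name, and it clears just GIT_DIR, as discussed above.

```shell
#!/bin/sh
# in_parent (hypothetical): run a command one directory up with GIT_DIR cleared.
# The ( ... ) body is a subshell, so the caller's directory and environment
# are left untouched.
in_parent() (
    unset GIT_DIR
    cd .. || exit 1
    "$@"
)
```

A call such as `in_parent git rev-parse --abbrev-ref HEAD` would then report the superproject's branch while the rest of the hook continues to run inside the submodule.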
As an aside, this:
$(git branch | grep \* | cut -d ' ' -f2)
is a clumsy way to get the name of the current branch. Use:
git rev-parse --abbrev-ref HEAD
instead, here. (The other option is git symbolic-ref --short HEAD but that fails noisily with a detached HEAD, while you probably want the quiet result to be just the word HEAD, which the rev-parse method will produce.)
1. The main danger in this case is that the superproject repository is not necessarily in any shape to handle a commit right now. Edit: or, as discovered in this comment, is not even set up to be a superproject for that submodule, yet, much less to have a submodule-updating commit added.
I have a bare remote repository with source files in which I want to build only the changed files after it has been pushed to. I thought the best way to detect which files have been changed would be by putting the command
changed_files=$(git diff-tree --no-commit-id --name-only -r HEAD) into a post-receive hook.
However, the variable ends up empty as I have verified by echoing it into a file. If I put HEAD^ instead of HEAD, it does show the changed files of the second to most recent commit. However, it doesn't show the most recent changes when I put HEAD but just shows nothing.
Can anyone help me? Or is there a smarter approach to my problem altogether?
I would definitely prefer a lean approach, like automatically triggering a build on push, over one that would have to, e.g., periodically check for changes.
At the point the post-receive hook is executed, all the references have already been updated. Therefore, HEAD means the new head, not the old one.
Diffing HEAD alone may not produce the results you want, since it assumes that there is one non-merge commit and you want to diff with its parent, while you may have pushed a merge or multiple commits.
What you probably want to do is take advantage of the standard input which provides the old and new values. Something like the following will print the changed files as output from the remote side when you push:
#!/bin/sh
while read -r old new ref
do
    # Handle created or deleted branches.
    echo "$old" | grep -qsE '^0+$' && old=$(git hash-object -t tree /dev/null)
    echo "$new" | grep -qsE '^0+$' && new=$(git hash-object -t tree /dev/null)
    git diff-tree --no-commit-id --name-only -r "$old" "$new"
done
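The all-zero check in that loop is what distinguishes the kinds of ref update: an all-zero old value means the ref was created, and an all-zero new value means it was deleted. A small sketch that makes this explicit, for example to skip building on deletions; classify_update is a hypothetical helper name:

```shell
#!/bin/sh
# classify_update (hypothetical): describe one "<old> <new>" pair
# as received on the hook's standard input.
classify_update() {
    old=$1
    new=$2
    if printf '%s\n' "$old" | grep -qE '^0+$'; then
        echo created
    elif printf '%s\n' "$new" | grep -qE '^0+$'; then
        echo deleted
    else
        echo updated
    fi
}
```

Inside the loop, `case $(classify_update "$old" "$new") in deleted) continue ;; esac` would be one way to ignore deleted branches.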
OK, I've figured it out: I was getting a
remote: fatal: ambiguous argument 'HEAD': both revision and filename
error in the push command which I had not noticed. After changing
changed_files=$(git diff-tree --no-commit-id --name-only -r HEAD)
to
changed_files=$(git diff-tree --no-commit-id --name-only -r HEAD --)
everything is working fine. Apparently, this is caused by the hook being executed in the .git directory of the remote repository, and there is a file called HEAD in that directory, which makes referring to the HEAD revision as HEAD ambiguous.