Consider only files being pushed in pre-push hook git - ruby

I want to run lint in a pre-push hook on the code being pushed, without considering the user's uncommitted local changes. For example, if a change to file A is being pushed and the same file also has local modifications, I want to lint the version being pushed.
How can I implement this using git hooks?
Alternate ways I tried:
Forcing the user to reset their local changes in pre-push — too limiting in this case.

There are several ways to get files from git storage and write them to disk, but I don't know of a direct command to say straight away "checkout files A, B and C from commit xxx to that directory on disk".
The simplest way is probably to use git worktree add (but this checks out all files, not just the ones you want):
git worktree add /tmp/myhook.xyz <commit-sha>
The most direct way is to use git --work-tree=... (or GIT_WORK_TREE=...) to target some other directory on disk:
git --work-tree=/tmp/myhook.xyz checkout <commit-sha> -- file1 file2 path/to/file3
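To make the behavior concrete, here is a self-contained sketch of that second command against a throwaway repository (file names and paths are illustrative):

```shell
# Check out specific files from a commit into a separate directory,
# leaving the repository's own working tree (and local edits) alone.
set -eu

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.email demo@example.com
git config user.name demo

printf 'v1\n' > a.rb
printf 'v1\n' > b.rb
git add . && git commit -qm 'pushed version'
sha=$(git rev-parse HEAD)

# Simulate an uncommitted local edit that should NOT be seen.
printf 'local edit\n' > a.rb

# The trick: materialize the committed versions somewhere else on disk.
snap=$(mktemp -d)
(cd "$snap" && git --git-dir="$repo/.git" --work-tree="$snap" \
    checkout -q "$sha" -- a.rb b.rb)

cat "$snap/a.rb"    # prints "v1" -- the committed content, not the local edit
```

Note that checkout with a pathspec also updates those entries in the repository's index; in a hook that is usually harmless, but it is worth knowing.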
How to use this in a pre-push hook:
For each pushed reference, you can:
compare the local commit and remote commit to list the files that were modified,
use the above trick to check out those files from the local commit into a specific destination on disk:
# pre-push:
#!/bin/bash
zero=$(git hash-object --stdin </dev/null | tr '[0-9a-f]' '0')
list_modified_files () {
    # uses $local_oid and $remote_oid read in the loop below
    if [ "$local_oid" = "$zero" ]; then
        return    # deleting the remote ref: nothing to lint
    fi
    if [ "$remote_oid" = "$zero" ]; then
        # new ref: every file in the pushed commit counts as modified
        git ls-tree -r --name-only "$local_oid"
    else
        git diff --no-renames --diff-filter=AM --name-only "$remote_oid" "$local_oid"
    fi
}
while read local_ref local_oid remote_ref remote_oid
do
    echo "'$local_ref' '$local_oid' '$remote_ref' '$remote_oid'"
    tmpdir=$(mktemp -d /tmp/myprepushhook.XXXXXX)
    list_modified_files | xargs -r git --work-tree "$tmpdir" checkout "$local_oid" --
    # run linter on files in $tmpdir ...
    rm -rf "$tmpdir"
done

Related

Git: Can't push after large committed file was part of previous commit [duplicate]

I accidentally dropped a DVD-rip into a website project, then carelessly git commit -a -m ..., and, zap, the repo was bloated by 2.2 gigs. Next time I made some edits, deleted the video file, and committed everything, but the compressed file is still there in the repository, in history.
I know I can start branches from those commits and rebase one branch onto another. But what should I do to merge the 2 commits so that the big file doesn't show in the history and is cleaned in the garbage collection procedure?
Use the BFG Repo-Cleaner, a simpler, faster alternative to git-filter-branch specifically designed for removing unwanted files from Git history.
Carefully follow the usage instructions, the core part is just this:
$ java -jar bfg.jar --strip-blobs-bigger-than 100M my-repo.git
Any files over 100MB in size (that aren't in your latest commit) will be removed from your Git repository's history. You can then use git gc to clean away the dead data:
$ git reflog expire --expire=now --all && git gc --prune=now --aggressive
After pruning, you can force-push to the remote repo*:
$ git push --force
*NOTE: you cannot force-push to a protected branch on GitHub
The BFG is typically at least 10-50x faster than running git-filter-branch, and generally easier to use.
Full disclosure: I'm the author of the BFG Repo-Cleaner.
What you want to do is highly disruptive if you have published history to other developers. See “Recovering From Upstream Rebase” in the git rebase documentation for the necessary steps after repairing your history.
You have at least two options: git filter-branch and an interactive rebase, both explained below.
Using git filter-branch
I had a similar problem with bulky binary test data from a Subversion import and wrote about removing data from a git repository.
Say your git history is:
$ git lola --name-status
* f772d66 (HEAD, master) Login page
| A login.html
* cb14efd Remove DVD-rip
| D oops.iso
* ce36c98 Careless
| A oops.iso
| A other.html
* 5af4522 Admin page
| A admin.html
* e738b63 Index
A index.html
Note that git lola is a non-standard but highly useful alias. (See the addendum at the end of this answer for details.) The --name-status switch to git log shows tree modifications associated with each commit.
In the “Careless” commit (whose SHA1 object name is ce36c98) the file oops.iso is the DVD-rip added by accident and removed in the next commit, cb14efd. Using the technique described in the aforementioned blog post, the command to execute is:
git filter-branch --prune-empty -d /dev/shm/scratch \
--index-filter "git rm --cached -f --ignore-unmatch oops.iso" \
--tag-name-filter cat -- --all
Options:
--prune-empty removes commits that become empty (i.e., do not change the tree) as a result of the filter operation. In the typical case, this option produces a cleaner history.
-d names a temporary directory that does not yet exist to use for building the filtered history. If you are running on a modern Linux distribution, specifying a tree in /dev/shm will result in faster execution.
--index-filter is the main event and runs against the index at each step in the history. You want to remove oops.iso wherever it is found, but it isn’t present in all commits. The command git rm --cached -f --ignore-unmatch oops.iso deletes the DVD-rip when it is present and does not fail otherwise.
--tag-name-filter describes how to rewrite tag names. A filter of cat is the identity operation. Your repository, like the sample above, may not have any tags, but I included this option for full generality.
-- specifies the end of options to git filter-branch
--all following -- is shorthand for all refs. Your repository, like the sample above, may have only one ref (master), but I included this option for full generality.
After some churning, the history is now:
$ git lola --name-status
* 8e0a11c (HEAD, master) Login page
| A login.html
* e45ac59 Careless
| A other.html
|
| * f772d66 (refs/original/refs/heads/master) Login page
| | A login.html
| * cb14efd Remove DVD-rip
| | D oops.iso
| * ce36c98 Careless
|/ A oops.iso
| A other.html
|
* 5af4522 Admin page
| A admin.html
* e738b63 Index
A index.html
Notice that the new “Careless” commit adds only other.html and that the “Remove DVD-rip” commit is no longer on the master branch. The branch labeled refs/original/refs/heads/master contains your original commits in case you made a mistake. To remove it, follow the steps in “Checklist for Shrinking a Repository.”
$ git update-ref -d refs/original/refs/heads/master
$ git reflog expire --expire=now --all
$ git gc --prune=now
For a simpler alternative, clone the repository to discard the unwanted bits.
$ cd ~/src
$ mv repo repo.old
$ git clone file:///home/user/src/repo.old repo
Using a file:///... clone URL copies objects rather than creating hardlinks only.
Now your history is:
$ git lola --name-status
* 8e0a11c (HEAD, master) Login page
| A login.html
* e45ac59 Careless
| A other.html
* 5af4522 Admin page
| A admin.html
* e738b63 Index
A index.html
The SHA1 object names for the first two commits (“Index” and “Admin page”) stayed the same because the filter operation did not modify those commits. “Careless” lost oops.iso and “Login page” got a new parent, so their SHA1s did change.
Interactive rebase
With a history of:
$ git lola --name-status
* f772d66 (HEAD, master) Login page
| A login.html
* cb14efd Remove DVD-rip
| D oops.iso
* ce36c98 Careless
| A oops.iso
| A other.html
* 5af4522 Admin page
| A admin.html
* e738b63 Index
A index.html
you want to remove oops.iso from “Careless” as though you never added it, and then “Remove DVD-rip” is useless to you. Thus, our plan going into an interactive rebase is to keep “Admin page,” edit “Careless,” and discard “Remove DVD-rip.”
Running $ git rebase -i 5af4522 starts an editor with the following contents.
pick ce36c98 Careless
pick cb14efd Remove DVD-rip
pick f772d66 Login page
# Rebase 5af4522..f772d66 onto 5af4522
#
# Commands:
# p, pick = use commit
# r, reword = use commit, but edit the commit message
# e, edit = use commit, but stop for amending
# s, squash = use commit, but meld into previous commit
# f, fixup = like "squash", but discard this commit's log message
# x, exec = run command (the rest of the line) using shell
#
# If you remove a line here THAT COMMIT WILL BE LOST.
# However, if you remove everything, the rebase will be aborted.
#
Executing our plan, we modify it to
edit ce36c98 Careless
pick f772d66 Login page
# Rebase 5af4522..f772d66 onto 5af4522
# ...
That is, we delete the line with “Remove DVD-rip” and change the operation on “Careless” to be edit rather than pick.
Save-quitting the editor drops us at a command prompt with the following message.
Stopped at ce36c98... Careless
You can amend the commit now, with
git commit --amend
Once you are satisfied with your changes, run
git rebase --continue
As the message tells us, we are on the “Careless” commit we want to edit, so we run the following commands.
$ git rm --cached oops.iso
$ git commit --amend -C HEAD
$ git rebase --continue
The first removes the offending file from the index. The second amends “Careless” to reflect the updated index, with -C HEAD instructing git to reuse the old commit message. Finally, git rebase --continue goes ahead with the rest of the rebase operation.
This gives a history of:
$ git lola --name-status
* 93174be (HEAD, master) Login page
| A login.html
* a570198 Careless
| A other.html
* 5af4522 Admin page
| A admin.html
* e738b63 Index
A index.html
which is what you want.
Addendum: Enable git lola via ~/.gitconfig
Quoting Conrad Parker:
The best tip I learned at Scott Chacon’s talk at linux.conf.au 2010, Git Wrangling - Advanced Tips and Tricks was this alias:
lol = log --graph --decorate --pretty=oneline --abbrev-commit
This provides a really nice graph of your tree, showing the branch structure of merges etc. Of course there are really nice GUI tools for showing such graphs, but the advantage of git lol is that it works on a console or over ssh, so it is useful for remote development, or native development on an embedded board …
So, just copy the following into ~/.gitconfig for your full color git lola action:
[alias]
lol = log --graph --decorate --pretty=oneline --abbrev-commit
lola = log --graph --decorate --pretty=oneline --abbrev-commit --all
[color]
branch = auto
diff = auto
interactive = auto
status = auto
Why not use this simple but powerful command?
git filter-branch --tree-filter 'rm -f DVD-rip' HEAD
The --tree-filter option runs the specified command after each checkout of the project and then recommits the results. In this case, you remove a file called DVD-rip from every snapshot, whether it exists or not.
If you know which commit introduced the huge file (say 35dsa2), you can replace HEAD with 35dsa2..HEAD to avoid rewriting too much history, thus avoiding diverging commits if you haven't pushed yet. This comment, courtesy of @alpha_989, seems too important to leave out here.
See this link.
(The best answer I've seen to this problem is: https://stackoverflow.com/a/42544963/714112 , copied here since this thread appears high in Google search rankings but that other one doesn't)
🚀 A blazingly fast shell one-liner 🚀
This shell script displays all blob objects in the repository, sorted from smallest to largest.
For my sample repo, it ran about 100 times faster than the other ones found here.
On my trusty Athlon II X4 system, it handles the Linux Kernel repository with its 5,622,155 objects in just over a minute.
The Base Script
git rev-list --objects --all \
| git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' \
| awk '/^blob/ {print substr($0,6)}' \
| sort --numeric-sort --key=2 \
| cut --complement --characters=13-40 \
| numfmt --field=2 --to=iec-i --suffix=B --padding=7 --round=nearest
When you run the code above, you will get nice human-readable output like this:
...
0d99bb931299 530KiB path/to/some-image.jpg
2ba44098e28f 12MiB path/to/hires-image.png
bd1741ddce0d 63MiB path/to/some-video-1080p.mp4
🚀 Fast File Removal 🚀
Suppose you then want to remove the files a and b from every commit reachable from HEAD; you can use this command:
git filter-branch --index-filter 'git rm --cached --ignore-unmatch a b' HEAD
After trying virtually every answer in SO, I finally found this gem that quickly removed and deleted the large files in my repository and allowed me to sync again: http://www.zyxware.com/articles/4027/how-to-delete-files-permanently-from-your-local-and-remote-git-repositories
cd to your local working folder and run the following command:
git filter-branch -f --index-filter "git rm -rf --cached --ignore-unmatch FOLDERNAME" -- --all
Replace FOLDERNAME with the file or folder you wish to remove from the given git repository.
Once this is done run the following commands to clean up the local repository:
rm -rf .git/refs/original/
git reflog expire --expire=now --all
git gc --prune=now
git gc --aggressive --prune=now
Now push all the changes to the remote repository:
git push --all --force
This will clean up the remote repository.
100 times faster than git filter-branch and simpler
There are very good answers in this thread, but by now many of them are outdated. Using git-filter-branch is no longer recommended, because it is difficult to use and awfully slow on big repositories.
git-filter-repo is much faster and simpler to use.
git-filter-repo is a Python script, available on GitHub: https://github.com/newren/git-filter-repo . When installed, it looks like a regular git command and can be called as git filter-repo.
You need only one file: the Python 3 script git-filter-repo. Copy it to a path that is included in the PATH variable. On Windows you may have to change the first line of the script (see INSTALL.md). You need Python 3 installed on your system, but this is not a big deal.
First you can run
git filter-repo --analyze
This helps you to determine what to do next.
You can delete your DVD-rip file everywhere:
git filter-repo --invert-paths --path-match DVD-rip
Filter-repo is really fast. A task that took around 9 hours on my computer with filter-branch was completed in 4 minutes with filter-repo. You can do many more nice things with filter-repo. Refer to the documentation for that.
Warning: Do this on a copy of your repository. Many actions of filter-repo cannot be undone. filter-repo will change the commit hashes of all modified commits (of course) and all their descendants down to the last commits!
These commands worked in my case:
git filter-branch --force --index-filter 'git rm --cached -r --ignore-unmatch oops.iso' --prune-empty --tag-name-filter cat -- --all
rm -rf .git/refs/original/
git reflog expire --expire=now --all
git gc --prune=now
git gc --aggressive --prune=now
It is a little different from the versions above.
For those who need to push this to github/bitbucket (I only tested this with bitbucket):
# WARNING!!!
# this will completely rewrite your bitbucket refs
# and will delete all remote branches that you didn't have locally
git push --all --prune --force
# Once you pushed, all your teammates need to clone repository again
# git pull will not work
According to GitHub Documentation, just follow these steps:
Get rid of the large file
Option 1: You don't want to keep the large file:
rm path/to/your/large/file # delete the large file
Option 2: You want to keep the large file in an untracked directory
mkdir large_files # create directory large_files
touch .gitignore # create .gitignore file if needed
echo '/large_files/' >> .gitignore # untrack directory large_files
mv path/to/your/large/file large_files/ # move the large file into the untracked directory
Save your changes
git add path/to/your/large/file # add the deletion to the index
git commit -m 'delete large file' # commit the deletion
Remove the large file from all commits
git filter-branch --force --index-filter \
"git rm --cached --ignore-unmatch path/to/your/large/file" \
--prune-empty --tag-name-filter cat -- --all
git push <remote> <branch>
I ran into this with a bitbucket account, where I had accidentally stored ginormous *.jpa backups of my site.
git filter-branch --prune-empty --index-filter 'git rm -rf --cached --ignore-unmatch MY-BIG-DIRECTORY-OR-FILE' --tag-name-filter cat -- --all
Replace MY-BIG-DIRECTORY-OR-FILE with the folder or file in question to completely rewrite your history (including tags).
source: https://web.archive.org/web/20170727144429/http://naleid.com:80/blog/2012/01/17/finding-and-purging-big-files-from-git-history/
Just note that these commands can be very destructive. If more people are working on the repo, they'll all have to pull the new tree. The three middle commands are not necessary if your goal is NOT to reduce the size, because filter-branch creates a backup of the removed file and it can stay there for a long time.
$ git filter-branch --index-filter "git rm -rf --cached --ignore-unmatch YOURFILENAME" HEAD
$ rm -rf .git/refs/original/
$ git reflog expire --all
$ git gc --aggressive --prune
$ git push origin master --force
git filter-branch --tree-filter 'rm -f path/to/file' HEAD
worked pretty well for me, although I ran into the same problem as described here, which I solved by following this suggestion.
The pro-git book has an entire chapter on rewriting history - have a look at the filter-branch/Removing a File from Every Commit section.
If you know your commit was recent, then instead of going through the entire tree, do the following:
git filter-branch --tree-filter 'rm LARGE_FILE.zip' HEAD~10..HEAD
This will remove it from your history.
git filter-branch --force --index-filter 'git rm -r --cached --ignore-unmatch bigfile.txt' --prune-empty --tag-name-filter cat -- --all
Use Git Extensions; it's a UI tool. It has a plugin named "Find large files" which finds large files in repositories and allows removing them permanently.
Don't use git filter-branch before using this tool, since it won't be able to find files removed by filter-branch (although filter-branch does not remove files completely from the repository pack files).
I basically did what was on this answer:
https://stackoverflow.com/a/11032521/1286423
(for history, I'll copy-paste it here)
$ git filter-branch --index-filter "git rm -rf --cached --ignore-unmatch YOURFILENAME" HEAD
$ rm -rf .git/refs/original/
$ git reflog expire --all
$ git gc --aggressive --prune
$ git push origin master --force
It didn't work, because I like to rename and move things a lot. So some big files were in folders that had been renamed, and I think gc couldn't delete those files because tree objects still referenced them.
My ultimate solution to really kill it was to:
# First, apply what's in the answer linked in the front
# and before doing the gc --prune --aggressive, do:
# Go back at the origin of the repository
git checkout -b newinit <sha1 of first commit>
# Create a parallel initial commit
git commit --amend
# go back on the master branch that has big file
# still referenced in history, even though
# we thought we removed them.
git checkout master
# rebase on the newinit created earlier. By reapply patches,
# it will really forget about the references to hidden big files.
git rebase newinit
# Do the previous part (checkout + rebase) for each branch
# still connected to the original initial commit,
# so we remove all the references.
# Remove the .git/logs folder, also containing references
# to commits that could make git gc not remove them.
rm -rf .git/logs/
# Then you can do a garbage collection,
# and the hidden files really will get gc'ed
git gc --prune --aggressive
My repo (the .git) went from 32 MB to 388 KB, something even filter-branch couldn't clean.
git filter-branch is a powerful command which you can use to delete a huge file from the commit history. The file will stay around for a while, and Git will remove it in the next garbage collection.
Below is the full process of deleting files from the commit history. For safety, the process runs the commands on a new branch first. If the result is what you need, reset it back to the branch you actually want to change.
# Do it in a new testing branch
$ git checkout -b test
# Remove file-name from every commit on the new branch
# --index-filter, rewrite index without checking out
# --cached, remove it from index but not include working tree
# --ignore-unmatch, ignore if files to be removed are absent in a commit
# HEAD, execute the specified command for each commit reached from HEAD by parent link
$ git filter-branch --index-filter 'git rm --cached --ignore-unmatch file-name' HEAD
# The output is OK, reset it to the prior branch master
$ git checkout master
$ git reset --soft test
# Remove test branch
$ git branch -d test
# Push it with force
$ git push --force origin master
NEW ANSWER THAT WORKS IN 2022.
DO NOT USE:
git filter-branch
this command might not change the remote repo after pushing. If you clone after using it, you will see that nothing has changed and the repo still has a large size. This command is old now. For example, if you use the steps in https://github.com/18F/C2/issues/439, it won't work.
You need to use
git filter-repo
Steps:
(1) Find the largest files in .git:
git rev-list --objects --all | grep -f <(git verify-pack -v .git/objects/pack/*.idx| sort -k 3 -n | cut -f 1 -d " " | tail -10)
(2) Start filtering these large files:
git filter-repo --path-glob '../../src/../..' --invert-paths --force
or
git filter-repo --path-glob '*.zip' --invert-paths --force
or
git filter-repo --path-glob '*.a' --invert-paths --force
or
whatever you find in step 1.
(3)
git remote add origin git@github.com:.../...git
(4)
git push --all --force
git push --tags --force
DONE!!!
You can do this using the filter-branch command:
git filter-branch --tree-filter 'rm -rf path/to/your/file' HEAD
When you run into this problem, git rm will not suffice, as git remembers that the file existed once in our history, and thus will keep a reference to it.
To make things worse, rebasing is not easy either, because any references to the blob will prevent git garbage collector from cleaning up the space. This includes remote references and reflog references.
I put together git forget-blob, a little script that tries removing all these references, and then uses git filter-branch to rewrite every commit in the branch.
Once your blob is completely unreferenced, git gc will get rid of it.
The usage is pretty simple: git forget-blob file-to-forget. You can get more info here:
https://ownyourbits.com/2017/01/18/completely-remove-a-file-from-a-git-repository-with-git-forget-blob/
I put this together thanks to the answers from Stack Overflow and some blog entries. Credits to them!
Other than git filter-branch (slow but pure git solution) and BFG (easier and very performant), there is also another tool to filter with good performance:
https://github.com/xoofx/git-rocket-filter
From its description:
The purpose of git-rocket-filter is similar to the command git-filter-branch while providing the following unique features:
Fast rewriting of commits and trees (by an order of x10 to x100).
Built-in support for both white-listing with --keep (keeps files or directories) and black-listing with --remove options.
Use of .gitignore-like patterns for tree-filtering
Fast and easy C# Scripting for both commit filtering and tree filtering
Support for scripting in tree-filtering per file/directory pattern
Automatically prune empty/unchanged commit, including merge commits
This works perfectly for me, in Git Extensions:
right-click on the selected commit;
choose "Reset current branch to here";
hard reset.
It's surprising nobody else gave this simple answer.
git reset --soft HEAD~1
It will keep the changes but remove the commit; then you can re-commit those changes.

Backup using GIT - add, commit and push everything including other GIT repositories

I want to build a backup system. I have a server (an old PC) which I boot up via magic packets (Wake-on-LAN); I've written a batch script for this purpose. On the server a git server is running with a couple of repositories, and I can easily push to it.
But here's the problem:
My project folder itself contains a couple of git repositories. If I now add all files with git add ., an error is thrown saying that I should add the submodules. This is the post talking about it: Automatically Add All Submodules to a Repo. I've modified the shell script to work with spaces, and to automatically commit and push everything while executing.
cd "D:\Projekts"
find . -type d | while read x ; do
    if [ -d "${x}/.git" ] ; then
        cd "${x}"
        origin="$(git config --get remote.origin.url)"
        cd - 1>/dev/null
        echo ""
        echo git submodule add "${origin}" "${x}"
        git submodule add "${origin}" "${x}"
    fi
done
echo "done adding all submodules"
git add .
git commit -am "new backup"
git push
echo "done pushing"
And this doesn't throw an error so far. But the folders containing all those repositories are still empty if I clone the repository.
Sidenote: I add the remote to the backup repository like this: $ git remote add --mirror=fetch origin git@<ip>:backup.git
Thanks in advance for your time,
Hellow2 :)

howto find out which git submodule current directory belongs to

setup
i have a git repo located in /home/v/git_repo, in which i have a submodule located in the subdirectory ./a/b/c.
$ cat /home/v/git_repo/.gitmodules
[submodule "foo/bar"]
path = a/b/c
url = git@github.com:username/repo.git
i have the full path, or only the in-repository subpath (which i have implemented in my helper script git-where-in-repo-am-i-currently):
$ pwd
/home/v/git_repo/a/b/c/something
$ git where-in-repo-am-i-currently
a/b/c/something
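Such a helper can be little more than git rev-parse --show-prefix, which prints the current directory relative to the repository toplevel (with a trailing slash). Note that inside a submodule this is relative to the submodule's own toplevel, not the superproject. A quick sketch against a throwaway repo:

```shell
# Minimal stand-in for git-where-in-repo-am-i-currently:
# --show-prefix reports $PWD relative to the repo toplevel.
set -eu

demo=$(mktemp -d)
git init -q "$demo"
mkdir -p "$demo/a/b/c/something"
cd "$demo/a/b/c/something"

prefix=$(git rev-parse --show-prefix)
echo "${prefix%/}"    # prints: a/b/c/something
```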
question
i want to find out (preferably in fish) which submodule this path belongs to, e.g.
$ git which-submodule (pwd)
foo/bar
to later use it to query that submodule's status, like
$ git -C (git rev-parse --git-dir)/modules/(git which-submodule) status
On branch master
Your branch is up to date with 'origin/master'.
and ultimately display this information in my prompt (that part is already implemented).
what i tried
parsing the output of
$ git -C (git rev-parse --show-toplevel) config --file=.gitmodules --get-regexp "path"
submodule.foo/bar.path a/b/c
and comparing my sub-directory path to that of a submodule, but it was rather a mess, with splitting paths into arrays and all kinds of hacks.
For the usual setup you've described here, with the worktree nesting matching the submodule nesting, you can
mytoplevel=`git rev-parse --show-toplevel`
abovethat=`git -C "$mytoplevel"/.. rev-parse --show-toplevel`
Then,
echo ${mytoplevel#$abovethat/}
will get you the submodule path in the superproject, or you can
echo ${PWD#$abovethat/}
to get your current directory's path relative to the superproject.
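The ${var#pattern} constructs here are plain POSIX prefix-stripping parameter expansions; a quick illustration with the paths from the question:

```shell
# ${mytoplevel#$abovethat/} strips the shortest leading match of
# "$abovethat/" from $mytoplevel, yielding a superproject-relative path.
mytoplevel=/home/v/git_repo/a/b/c
abovethat=/home/v/git_repo

echo "${mytoplevel#$abovethat/}"     # prints: a/b/c

cwd=/home/v/git_repo/a/b/c/something
echo "${cwd#$abovethat/}"            # prints: a/b/c/something
```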
So:
me=`git rev-parse --show-toplevel`
up=`git -C "$me"/.. rev-parse --show-toplevel`
subpath=${me#$up/}
git -C "$up" config -f .gitmodules --get-regexp '^submodule\..*\.path$' ^$subpath$
gets you the current repo's submodule name and path from the config entry in its superproject.
Git can be useful in any imaginable build system, though; it doesn't impose restrictions on how things outside its remit are set up. So, short of an exhaustive search of the filesystem namespace, you can't be sure you've found everybody using a given worktree as a submodule checkout; there's just no reason for Git to care how a repository is used.
For instance, if multiple projects all need to run off the same submodule rev, you can have a single repo and worktree serve as a shared submodule for them all: rather than having to go through every single one of them doing synchronized checkouts, and then trusting that you haven't missed one, just use one repo, with one worktree, and point everybody using it at that.
For workflows with that need, this can be compellingly better than the usual setup: all users by definition see a synchronized, current submodule revision, and any client who needs to know "what's new" with an update can e.g. run git -C utils diff `git rev-parse :utils` HEAD; every submodule user effectively has their own tracking branch and can use all of Git's tools to help stay current or resolve conflicts.
So, to recreate your setup, I do:
git init git_repo; cd $_
mkdir a/b; git init a/b/c; cd $_
mkdir something; touch something/somefile;
git add .; git commit -m-
cd `git -C .. rev-parse --show-toplevel`
git submodule add --name foo/bar ./a/b/c -- a/b/c
git add .; git commit -m-
Then I get this when I try it:
$ find -print -name .git -prune
.
./a
./a/b
./a/b/c
./a/b/c/something
./a/b/c/something/somefile
./a/b/c/.git
./.gitmodules
./.git
$ git grl
core.repositoryformatversion 0
core.filemode true
core.bare false
core.logallrefupdates true
submodule.foo/bar.url /home/jthill/src/snips/git_repo/a/b/c
submodule.foo/bar.active true
$ cd a/b/c/something
$ me=`git rev-parse --show-toplevel`
$ up=`git -C "$me"/.. rev-parse --show-toplevel`
$ subpath=${me#$up/}
$ git -C "$up" config -f .gitmodules --get-regexp '^submodule\..*\.path$' ^$subpath$
submodule.foo/bar.path a/b/c
$ echo $me $up $subpath
/home/jthill/src/snips/git_repo/a/b/c /home/jthill/src/snips/git_repo a/b/c
If there's a difference between this setup and what you've described, I'm missing it: I've got the directory structure, the submodule name, the start directory. If you'll step through that and find where the setup or results diverge from yours, I think that'd help.

Test push conflicts on git push via Pre-Receive Hook

I'm making a pre-receive hook on Bitbucket that is supposed to confirm that all pushes made to a branch are up to date with the parent branches.
I mean, over time, we have several branch creations:
Branch creation during time
With the above example of 3 branches (Dev, Feature1, and my Local), I want, before making the push of Local to remotes/origin/Feature1, to git merge the latest Feature1 with the incoming on-push Local code. In this way, I can confirm that whoever is making the push is using the latest version of Feature1, and there will be no conflict.
If there were any conflict, I would return 1 to avoid making the push, and oblige the developer to pull from Feature1 before pushing their code.
This is my script on Pre-Receive Hook.
while read from_ref to_ref ref_name; do
echo "Ref update:"
echo " Old value: $from_ref"
echo " New value: $to_ref"
echo " Ref name: $ref_name"
echo " Diff:"
git clone --progress -v $GIT_URL $CLONE_DIR1
cd $CLONE_DIR1
git checkout -b test remotes/origin/Feature1
git merge --no-commit -m "Merging feature with local on-push code" $ref_name
(....)
done
I've tried with ref_name and with to_ref, with no success.
Can anyone help me?
How can I access the recently pushed code, and merge the parent branch with this code?
This seems like a very odd thing to do, and it is probably doomed to failure. It will certainly be complicated, and you will want to change your test behavior based on which ref(s) are being updated and whether these updates add merge commit(s).
That said, there are some special rules for pre-receive and update hooks, and if you obey them you will get somewhat further:
Do not chdir or cd away from the current directory. Or, if you do, make sure you chdir back, but usually it's not too difficult to make sure that operations that must run in another directory, run as a separate process: either a sub-shell, or another script.
Remove $GIT_DIR from the environment before attempting git commands that need to use a different repository. The reason is that the hook is run in the top level directory with $GIT_DIR set to either .git (non-bare repo) or . (bare repository).
Putting those two together, you might move all your verifier code into a separate script and do something like this:
exitstatus=0
while read from_ref to_ref ref_name; do
... maybe some setup code here to see if $ref_name
is being created or destroyed ...'
case "$ref_name" in
... add cases as needed to choose action based on ref ...
# check_script is expected to exit 0 when the pushed content passes
if ! (unset GIT_DIR; /path/to/check_script arg1 arg2 ...); then
echo "push being rejected because ..."
exitstatus=1
fi
...
esac
...
done
exit $exitstatus
There is still one very big problem here. You want check_script to be able to access any proposed new commits that would become reachable from $ref_name if the hook script exits 0 so that the proposed update to it is allowed. That update has not yet occurred: $ref_name still points to the old SHA-1 $from_ref. Meanwhile, the new SHA-1 in $to_ref might not have any name pointing to it (though it does still exist in the underlying repository).
Among other things, if $to_ref points to new commits (the usual case), any clone you make at this point, via normal git operations, will not contain those commits, so you will not be able to use them.
There are two obvious ways to handle this:
Make a new (temporary) reference that points to $to_ref. You can then see the proposed commits in the clone.
Don't use a clone. Copy the repository some other way, or use the original repository itself directly, e.g., as an "alternate", or by creating a temporary work tree directory and pointing $GIT_WORK_TREE there, or using some of the new git worktree features that have appeared in git 2.6+. (If you choose the manual temporary work-tree method, be sure to think about the normal shared $GIT_INDEX_FILE as well.)
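A minimal sketch of that last variant (the helper name is mine; it cds only inside a subshell, per the earlier advice, and gives the checkout its own index file so the repository's shared index is never touched):

```shell
# checkout_pushed: materialize the files of commit $1 in a fresh temporary
# directory, without cloning, and print that directory's path.
checkout_pushed() {
    local to_ref=$1 gitdir tmpdir
    gitdir=$(git rev-parse --absolute-git-dir) || return 1
    tmpdir=$(mktemp -d) || return 1
    # cd in a subshell only; a private GIT_INDEX_FILE keeps the shared
    # index out of the picture
    (cd "$tmpdir" &&
     GIT_DIR=$gitdir GIT_WORK_TREE=$tmpdir GIT_INDEX_FILE=$tmpdir/.index \
        git checkout -f "$to_ref" -- .) || return 1
    echo "$tmpdir"
}
```

In a pre-receive hook you would call it as `dir=$(checkout_pushed "$to_ref")`, run the checks against the files in `$dir`, then `rm -rf "$dir"`.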
Remember also to check for forced pushes that remove commits from a branch, or even remove-some-and-add-others all in one push.
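One way to detect such an update is `git merge-base --is-ancestor`: the push is a plain fast-forward exactly when the old tip is an ancestor of the new one. A sketch (the function name is mine; the arguments correspond to the hook's stdin fields):

```shell
# is_forced_update: succeed when replacing commit $1 by commit $2 would
# discard commits, i.e. when $1 is not an ancestor of $2 (not a fast-forward)
is_forced_update() {
    ! git merge-base --is-ancestor "$1" "$2"
}
```

Branch creations and deletions (an all-zero old or new SHA-1) still need to be special-cased before this check.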
UPDATE:
This question is resolved for me.
The final code is this:
#!/bin/bash
DIR=xpto/external-hooks/code_review
CLONE_DIR=$DIR/test_conflict_push-$(date +%s)
GIT_URL=myGitUrl
exitStatus=0
read oldrev newrev refname
feature_branch=${refname##refs/heads/}
echo "Feature branch-> $feature_branch"
#Clone feature branch from remote repo to be update via merged.
git clone --progress -v $GIT_URL $CLONE_DIR
currentDir=$PWD
cd $CLONE_DIR
#create branch named 'latest' to put new and modified files
git checkout -b latest remotes/origin/$feature_branch
#go back to PWD, otherwise we can't run git diff
cd $currentDir
# Get the file names, without directory, of the files that have been modified
# between the new revision and the old revision
echo "Getting files"
files=`git diff --name-only ${oldrev} ${newrev}`
echo "Files -> $files"
# Get a list of all objects in the new revision
echo "Getting objects"
objects=`git ls-tree --full-name -r ${newrev}`
echo "objects -> $objects"
# Iterate over each of these files
for file in ${files}; do
# Search for the file name in the list of all objects
object=`echo -e "${objects}" | egrep "(\s)${file}\$" | awk '{ print $3 }'`
# If it's not present, then continue to the next iteration
if [ -z ${object} ];
then
continue;
fi
# Otherwise, create all the necessary sub directories in the new temp directory
mkdir -p "${CLONE_DIR}/`dirname ${file}`" &>/dev/null
# and output the object content into its original file name
git cat-file blob ${object} > ${CLONE_DIR}/${file}
done;
echo "Ready for start merging."
cd $CLONE_DIR
#add new files to branch
echo $(git add .)
#commit added and modified files to the branch
echo $(git commit -a -m "Merge latest to original feature")
#get generated commit id
echo $(git log -1)
#create branch named 'merged' to merge the above committed files
echo $(git checkout -b merged remotes/origin/$feature_branch)
#merge only occurs for added and modified files!
echo "Merging committed files from the 'latest' branch into the 'merged' branch."
mergeResult=$(git merge --no-commit latest)
echo "Merge Result -> $mergeResult"
##to lower case
if [[ "${mergeResult,,}" == *"conflict"* ]]
then
echo "Merge contains conflicts."
echo "Update your $feature_branch branch!"
exitStatus=1
else
echo "Merge doesn't contain conflicts."
echo "Push to $feature_branch can proceed."
exitStatus=0
fi
#remove temporary branches
echo $(git checkout master)
echo $(git branch -D latest)
echo $(git branch -D merged)
#delete temporary clone dir
rm -rf $CLONE_DIR
exit $exitStatus
Many Thanks.

Bash conditional based on whether a file has changed since last pull

I'm writing a script for automatically updating a system of ours. Basically I want to pull and update from a remote hg repository and then run some update scripts. The problem is that these update scripts take a while to run, and most of them only have to be run if there have been changes to their configurations.
Now my script looks like following:
if hg pull -u
then
run scripts
fi
What I want is something like
if hg pull -u && 'some changes were introduced in my/configuration/directory/*'
then
run scripts
fi
Any idea how to do this?
With the templated output of hg incoming you can check, before pulling, which files would be modified by the pull (if any), and act accordingly:
hg incoming --template "{files % '{file}\n'}" | grep SOMESTRING
You can use hg status to get a list of files that have been changed between revisions, for example files modified between tip and its parent (tip^) that are in my\configuration\directory:
hg status my\configuration\directory\** -m --rev "tip^:tip"
I would recommend pulling, checking whether those files have been altered relative to your current revision, updating, and then running your scripts if your config has changed. To me that looks easier than trying to store which revset you started with and figuring it out after the update. (Note that I'm not great with bash/grep, so this is approximate and untested):
hg pull
cfgChngd=$(hg status -m --rev tip my/config/dir/** | grep "my/config/")
hg update
if [ -n "$cfgChngd" ]; then
    runAllTheScripts
fi
You can use the status command to get a list of changed files -- also between different revisions:
HERE=`hg log --template '{node}' -r.`
hg pull -u
if hg st --rev $HERE:. | grep -q .
then
run scripts
fi
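The `grep -q .` in that last snippet is just an "is there any output?" test: `.` matches any character, and `-q` makes grep exit 0 on the first match without printing anything. As a generic helper (name is mine, for illustration):

```shell
# has_output: succeed when the given command writes at least one non-empty
# line to stdout (grep -q . matches any line containing a character)
has_output() {
    "$@" | grep -q .
}
```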
