Get commit hash for a commit in a branch


In a git branch I store the latest commit hash SHA using
latest_sha=$(git log --pretty=oneline | head -1 | cut -d ' ' -f 1)
After a bunch of commits in this branch, how do I get the next commit SHA after latest_sha?
Say 5 commits have been made to this branch after $latest_sha.
I want to always get the SHA of the first commit after latest_sha.
b8eead8ba4ff375911af6
c2452680eb7731e4d36ca
da2e113ca4768f5f34730
95b98d42a6e567ed56fc2
716c4f84a855f48bee55c
6a7223a74269f925cfd9e---I need this one
e945bcfabf3fbafc85084---latest_sha
159df375376ded565bec0
d725350982626f46a8b80
56a4b6ca91d93acc8d751
de584608616b1ed99a554
3cfc15339a98bb286d5baa
6ae834bf36c90fbd81854
fa9bdebd0f814f04ee05ba
cc44c4d9ff14314c1255da
5a6145586a8fdcaa2da659
bfea8cfe121d24a0ff1525
Thanks!

A Git branch name, in a sense, is the latest hash ID. That is, if git log branchX shows you commit b8eead8ba4ff375911af6 first, then branchX is a name representing b8eead8ba4ff375911af6, and:
git show branchX
will show the same commit as:
git show b8eead8ba4ff375911af6
If you need the hash ID for some reason—e.g., because you're going to change the hash ID to which the branch name points, by adding new commits—the simplest command to get it is git rev-parse:
hash=$(git rev-parse "refs/heads/$branch")
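To see this equivalence in action, here is a minimal sketch that uses a throwaway repository. The temporary directory, the branch name branchX, the demo identity, and the commit messages are all made up for the illustration; only git rev-parse and git log come from the discussion above.

```shell
#!/usr/bin/env bash
# Throwaway repository so nothing here touches an existing one.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/branchX   # hypothetical branch name
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git commit -q --allow-empty -m "first"
git commit -q --allow-empty -m "second"

branch=branchX
# Hash the branch name currently resolves to.
hash=$(git rev-parse "refs/heads/$branch")
# Hash of the first commit that git log shows for that branch.
tip=$(git log -1 --format=%H "$branch")

echo "rev-parse: $hash"
echo "log tip:   $tip"
```

Both lines should print the same hash, since the branch name is just a movable label for its tip commit.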
For the rest, see alfunx's answer. Note that if the commits form a diamond-shaped graph, e.g.:
          I--J
         /    \
...--G--H      M--N   <-- branchX
         \    /
          K--L
then there are two commits that immediately follow H, but neither I nor K is an ancestor of the other; they are related only by both being descendants of H and ancestors (grandparents, in this case) of merge commit M. Using git rev-list --ancestry-path ^<anything-identifying-H> <anything-identifying-branchX> will list commits I, J, K, L, M, and N. The listing will start at N and move back to M as its second entry, but at this point, Git has a choice of whether to list J or L. This is where the sorting options you choose go into effect. The default sort is reverse chronological by committer date-and-time stamp.
Having listed either J or L, Git can now list the parent of whichever commit it listed, or the remaining commit on the other fork of history. Git will list one of them. If it chose to list J first, then I, it must now list L and then K in that order; if it chose to list L first, then K, it must now show I and then J in that order. But it might also list them in the order J, L, K, I, for instance; or J, L, I, K. Adding --topo-order constrains git rev-list to avoid interleaving commits from the two legs.
The linearization order in complex graphs is generally problematic: there's no single solution that handles all cases. That's why git rev-list offers multiple sorting options.
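If you want to try the diamond case yourself, the following sketch rebuilds that graph in a scratch repository and lists the commits between H and branchX with --topo-order. The helper function c, the leg branch names top and bottom, the tag H, and the use of empty commits are assumptions made for the demo, not part of the original setup. It uses git log --format=%s rather than git rev-list only so that the labels are printed instead of hashes.

```shell
#!/usr/bin/env bash
# Build the diamond from the answer in a throwaway repository.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/branchX
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: create an empty commit whose subject is the label.
c() { git commit -q --allow-empty -m "$1"; }

c G; c H
git tag H                           # remember H so it can be excluded later
git checkout -q -b top;      c I; c J   # upper leg
git checkout -q -b bottom H; c K; c L   # lower leg, forked from H
git checkout -q branchX
git merge -q --ff-only top          # branchX now points at J
git merge -q --no-ff -m M bottom    # M gets parents J and L
c N

# Commits that descend from H and are ancestors of branchX, legs not interleaved.
order=$(git log --format=%s --ancestry-path --topo-order "^H" branchX | paste -sd ' ' -)
echo "$order"
```

The output starts with N and M, followed by one whole leg and then the other (either J I L K or L K J I). Dropping --topo-order allows the two legs to interleave, depending on the commit timestamps.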

You could do that with rev-list:
git rev-list --ancestry-path HEAD ^${latest_sha} | tail -n1
rev-list lists all reachable commits in reverse chronological order for the given branches/commits. The caret (^) here means "not", which makes Git exclude all reachable commits starting at the given commit.
Concretely that means: Include all commits reachable from HEAD, exclude all commits reachable from ${latest_sha}, and then take the oldest one there using tail.
Edit: Add --ancestry-path to make sure only commits that are in the direct ancestry path between the specified commits are used (as mentioned by @jthill).
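Here is a small end-to-end sketch of that command on a linear history, in case you want to check it before using it on a real branch. The scratch repository, the helper c, the commit labels, and the variable first_new (kept only to compare against) are assumptions for the demo.

```shell
#!/usr/bin/env bash
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: create an empty commit with the given subject.
c() { git commit -q --allow-empty -m "$1"; }

c one; c two
latest_sha=$(git rev-parse HEAD)    # the stored "latest" commit
c three
first_new=$(git rev-parse HEAD)     # the commit we hope to find again
c four; c five

# Everything reachable from HEAD but not from latest_sha; the last line is the oldest.
next_sha=$(git rev-list --ancestry-path HEAD "^${latest_sha}" | tail -n1)

echo "expected: $first_new"
echo "found:    $next_sha"
```

On a history without merges the two hashes match. With merges after latest_sha, more than one commit can directly follow it, and tail -n1 only returns whichever one is listed last.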

git logs have a parent commit field, so you could do something like
git log --pretty=format:"%P %H" | awk '$1 == "<YOUR_HASH>" {print $2}'
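A sketch of the parent-field approach on a scratch repository, with the hash passed to awk through -v instead of being pasted into the program text. The repository, the helper c, and the variable names are made up for the example. Note that %P prints all parents separated by spaces, so for a merge commit $1 is only the first parent and $2 is not the commit hash.

```shell
#!/usr/bin/env bash
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: create an empty commit with the given subject.
c() { git commit -q --allow-empty -m "$1"; }

c one; c two
latest_sha=$(git rev-parse HEAD)
c three
first_new=$(git rev-parse HEAD)     # kept only for comparison
c four

# Each line is "<parent> <commit>"; print the commit whose parent is latest_sha.
child=$(git log --pretty=format:"%P %H" |
        awk -v parent="$latest_sha" '$1 == parent {print $2}')

echo "$child"
```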

Related

Git rev-list: is it possible to output in reverse order using a Git command only, not bash?

Trying to get only one result from the selected commits returns the same output regardless of the output order.
The total list length is 130 commits.
The single result is the same, although the lists differ.
Showing reduced results:
$ git rev-list --reverse origin/a..origin/b -10
c4fe26e8ebc8a2ceb0129a6f7318b08d18126baa
fd302babdae3338a2d780b529ec8499867d1c330
d24b219372ff87ada2c196857f47f7a9c61f1fad
1eaf20b79e69ae4d729a2679bdc40f6b2d22958f
6a76950fd9eee705ed813aec0c44ac58ff3a030c
058e793dbcd507861880b21aacf2dd07d2b079ff
f8bb9225c4101bf1340e35abd609e526d2bde2c1
a01e72042582337ff74f64caa0e5a25ceeba6c8d
bc88772e4cb3be926639da6d71a57aaef507cbf0
315f11516b98454cb8732ac57b9cc53dff9460b5
$ git rev-list origin/a..origin/b -10
315f11516b98454cb8732ac57b9cc53dff9460b5
bc88772e4cb3be926639da6d71a57aaef507cbf0
a01e72042582337ff74f64caa0e5a25ceeba6c8d
f8bb9225c4101bf1340e35abd609e526d2bde2c1
058e793dbcd507861880b21aacf2dd07d2b079ff
6a76950fd9eee705ed813aec0c44ac58ff3a030c
1eaf20b79e69ae4d729a2679bdc40f6b2d22958f
d24b219372ff87ada2c196857f47f7a9c61f1fad
fd302babdae3338a2d780b529ec8499867d1c330
c4fe26e8ebc8a2ceb0129a6f7318b08d18126baa
$ git rev-list --max-count=1 origin/a..origin/b
315f11516b98454cb8732ac57b9cc53dff9460b5
$ git rev-list --reverse --max-count=1 origin/a..origin/b
315f11516b98454cb8732ac57b9cc53dff9460b5
List commits that are reachable by following the parent links from the
given commit(s), but exclude commits that are reachable from the
one(s) given with a ^ in front of them. The output is given in reverse
chronological order by default.
The command I use:
git rev-list --max-count=1 $TARGET_BRANCH..$BASE_BRANCH
From the documentation:
Note that these are applied before commit ordering and formatting
options, such as --reverse.
-<number>
-n <number>
--max-count=<number> Limit the number of commits to output.
Using the following Git version:
git version 2.39.0.windows.1

The cited section from the documentation explains that the meaning of
git rev-list --max-count=1 --reverse $TARGET_BRANCH..$BASE_BRANCH
is:
Collect commits in the range $TARGET_BRANCH..$BASE_BRANCH in the usual way.
Truncate the collected list after the first entry (--max-count=1).
List the remaining commit in reverse order (--reverse).
Of course, if the list has only one entry, the printed result looks the same regardless of whether it is printed forward or in reverse.

As the docs you link say, git rev-list's --max-count is applied very early, before --reverse.
To get the effect you're asking for, use existing tools: git rev-list origin/a..origin/b | tail -1 gets the last entry.
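The difference is easy to reproduce in a scratch repository. In this sketch the repository, the helper c, and the three commit labels are made up; it compares limiting inside Git against limiting after the fact with head.

```shell
#!/usr/bin/env bash
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: create an empty commit with the given subject.
c() { git commit -q --allow-empty -m "$1"; }

c first;  oldest=$(git rev-parse HEAD)
c second
c third;  newest=$(git rev-parse HEAD)

# --max-count is applied first, so --reverse only reverses a one-element list.
limited=$(git rev-list --reverse --max-count=1 HEAD)

# Reversing the full list and truncating outside Git gives the oldest commit.
piped=$(git rev-list --reverse HEAD | head -n 1)

echo "limited in Git: $limited"
echo "limited by head: $piped"
```

Here limited is the newest commit and piped is the oldest one, which is the behaviour the documentation note describes.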

Creating a report from git logs

The dream is to create a script to run through my repos and create a report. I would like the report to contain the repositories, their submodules, the merged PRs associated with each repo and a list of updated files for those PRs. This would be produced between two given dates on master. I have been looking at something like this (where the hashes represent the first merge on the first day and the last merge on the last day):
git log --format='%h - %s' --stat a123456...c123456 > report
One of the issues I am having - and I am unsure of whether this can be done with git or whether it is better to manipulate the report afterwards - is that currently this brings back too much information. That is, I am getting a list of all files touched on each PR. What I would really like is a condensed list, where only the most recent update to a given file is listed. At the moment I am getting something like this:
c123456 - this is the third merge (#3)
../file4
b123456 - this is the second merge (#2)
../file1
../file3
../file4
a123456 - this is the first merge (#1)
../file1
../file2
../file3
../file4
where what I would really like is something like this:
c123456 - this is the third merge (#3)
../file4
b123456 - this is the second merge (#2)
../file1
../file3
a123456 - this is the first merge (#1)
../file2
Any help would be appreciated!

awk to the rescue!
$ awk -v RS= -F'\n' '{for(i=1;i<=NF;i++) printf "%s", $i=!a[$i]++?($i ORS):""; print ""}' file
c123456 - this is the third merge (#3)
../file4
b123456 - this is the second merge (#2)
../file1
../file3
a123456 - this is the first merge (#1)
../file2
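For anyone who finds the one-liner hard to read, here is a spelled-out variant of the same idea that runs on inline sample data. It assumes, like the command above, that the report has one blank-line-separated block per merge with one file per line; the array name seen and the sample report are made up for the sketch, and this is a restatement rather than the exact original command.

```shell
#!/usr/bin/env bash
# RS= puts awk in paragraph mode (records separated by blank lines);
# -F'\n' makes every line of a record a separate field.
condensed=$(awk -v RS= -F'\n' '
{
    for (i = 1; i <= NF; i++)
        if (!seen[$i]++)        # print a line only the first time it appears
            print $i
    print ""                    # blank line between merges
}' <<'EOF'
c123456 - this is the third merge (#3)
../file4

b123456 - this is the second merge (#2)
../file1
../file3
../file4

a123456 - this is the first merge (#1)
../file1
../file2
../file3
../file4
EOF
)

printf '%s\n' "$condensed"
```

Because git log lists the newest merge first, keeping the first occurrence of each file name is what keeps its most recent update.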

Getting tracking information for a Git branch

How can I get tracking information (i.e. remote and branch name) about a specific local Git branch, preferably in one command? There seem to be many ways to do this, e.g.
git rev-parse --abbrev-ref --symbolic-full-name branch_name@{upstream}
However, it returns the upstream in the form 'origin/branch_name', which makes it difficult to figure out the separate parts (e.g. when remote or branch name contains '/'). Is there more reliable solution, preferably using a single Git command?

@RomainValeri's answer suggests this command to display the tracking information:
git for-each-ref --format="%(upstream:short)" refs/heads/<yourBranch>
However, if you want the remote and the branch separated by something other than a slash, you can do this:
# Put your separator between the two placeholders in the format string.
git for-each-ref --format="%(upstream:remotename) %(upstream:lstrip=-1)" \
refs/heads/<yourBranch>
From git-docs,
upstream
The name of a local ref which can be considered “upstream” from the
displayed ref. Respects :short, :lstrip and :rstrip in the same way as
refname above ...
For any remote-tracking branch %(upstream), %(upstream:remotename) and
%(upstream:remoteref) refer to the name of the remote and the name of
the tracked remote ref, respectively. In other words, the
remote-tracking branch can be updated explicitly and individually by
using the refspec %(upstream:remoteref):%(upstream) to fetch from
%(upstream:remotename).
More on lstrip,
If lstrip=<N> (rstrip=<N>) is appended, strips <N> slash-separated
path components from the front (back) of the refname (e.g.
%(refname:lstrip=2) turns refs/tags/foo into foo and
%(refname:rstrip=2) turns refs/tags/foo into refs). If <N> is a
negative number, strip as many path components as necessary from the
specified end to leave -<N> path components (e.g. %(refname:lstrip=-2)
turns refs/tags/foo into tags/foo and %(refname:rstrip=-1) turns
refs/tags/foo into refs). When the ref does not have enough
components, the result becomes an empty string if stripping with
positive <N>, or it becomes the full refname if stripping with
negative <N>. Neither is an error.
Some examples :
Format : "%(upstream:remotename):%(upstream:lstrip=-1)"
Output : <remote-name>:<branch-name>
Format : "%(upstream:remotename) %(upstream:lstrip=-1)"
Output : <remote-name> <branch-name>
If the branch name includes a slash, lstrip=-1 keeps only the last path component, so it won't return the full branch name. Instead, remoteref can be used.
git for-each-ref --format="%(upstream:remotename) %(upstream:remoteref)" refs/heads/<yourBranch>
The output is in this format : <remote-name> refs/heads/<branch-name>
To remove refs/heads/ from the output, pipe the above command to this
sed 's/refs\/heads\///g'
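To check how this behaves with a slash in the branch name, here is a sketch with a local bare repository standing in for the remote. The directory layout, the branch name feature/login, and the variable names are made up; it also strips refs/heads/ with shell parameter expansion instead of sed, which is just one possible alternative.

```shell
#!/usr/bin/env bash
work=$(mktemp -d) && cd "$work" || exit 1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q --bare remote.git          # stand-in for a real remote
git init -q local && cd local || exit 1
git symbolic-ref HEAD refs/heads/feature/login   # hypothetical branch with a slash
git commit -q --allow-empty -m init
git remote add origin "$work/remote.git"
git push -q -u origin feature/login    # -u records the upstream

# Remote name and full remote ref, separated by a space.
info=$(git for-each-ref \
    --format='%(upstream:remotename) %(upstream:remoteref)' \
    refs/heads/feature/login)
remote=${info%% *}                     # text before the first space
remote_ref=${info#* }                  # text after the first space
upstream_branch=${remote_ref#refs/heads/}

# For comparison: lstrip=-1 keeps only the last path component.
last_part=$(git for-each-ref --format='%(upstream:lstrip=-1)' refs/heads/feature/login)

echo "remote: $remote"
echo "branch: $upstream_branch"
echo "lstrip=-1 gives: $last_part"
```

The remoteref variant recovers the full branch name feature/login, while lstrip=-1 yields only login.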

I'd use the built-in -v (verbose) or even -vv (very verbose) flag to get this from git branch output. You might also just grep the branch name to focus on what you want:
git branch -vv | grep <branchName>
Depending on what exactly you want to get, maybe also consider using the plumbing tool :
git for-each-ref --format="%(upstream:short)" refs/heads/<yourBranch>
and make it an alias for convenience
git config --global alias.get-rem '!f() { git for-each-ref --format="%(upstream:short)" refs/heads/$1; }; f'
# then just
git get-rem branch_name
Edit: For the very short part (i.e. "branch" instead of either "refs/remotes/origin/branch" or even "origin/branch"), you can use %(upstream:lstrip=-1) instead of %(upstream:short).

Can I get three-way word diffs using e.g. dwdiff with diff3?

I know there are GUIs that show word diffs in a three-way diff, and there are command-line tools that show two-way word-highlighting diffs.
But is there a command-line way to show three-way diffs with word highlighting, the same way that I can get two-way diffs word-highlighted with diff -u a b | dwdiff -u ?

git diff (which you can use outside a Git repository with --no-index, but that would be a two-way diff) does have a word diff (--word-diff, with a regex if needed, using --word-diff-regex).
See also --color-words.
That would be a command-line way to get a word diff.
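As a quick illustration of the two-way case, the sketch below compares two throwaway files outside any repository. The file names and their contents are made up, and --no-color is added only so that the captured text is free of escape codes.

```shell
#!/usr/bin/env bash
dir=$(mktemp -d)
printf 'the quick brown fox\n' > "$dir/a.txt"
printf 'the slow brown fox\n'  > "$dir/b.txt"

# git diff exits with status 1 when the files differ, so the status is ignored here.
out=$(git diff --no-index --no-color --word-diff "$dir/a.txt" "$dir/b.txt" || true)

printf '%s\n' "$out"
```

In the default plain mode, removed words are wrapped in [- -] and added words in {+ +}, so the changed line reads: the [-quick-]{+slow+} brown fox. This still compares only two inputs; it does not produce a three-way word diff.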

What is the algorithm git uses to find a commit by a partial sha-1 (at least first 4 characters)?

What is the algorithm git uses to find a commit by a partial sha-1 (at least first 4 characters)?
Are there any implementations of such algorithm out there?

One very simple (but inefficient) way to find the full SHA1 given a partial one such as "01234" (a "short SHA1") is:
git rev-list --all --objects | grep ^01234
The actual way is:
git rev-parse --verify 01234
It is illustrated in commit 6269b6b:
Teach get_describe_name() to pass the disambiguation hint down the
callchain to get_short_sha1().
So you can see the algorithm in the sha1_name.c#get_short_sha1() function, which looks in:
loose objects: find_short_object_filename(len, hex_pfx, &ds);
and in pack files: find_short_packed_object(len, bin_pfx, &ds);
(See "Git Internals - Packfiles")
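Both lookups can be compared from the command line. This sketch uses a scratch repository with a single empty commit and takes the prefix from git rev-parse --short so that it is guaranteed to be unambiguous; the repository and the variable names are made up for the demo. It exercises the commands only, not the C functions mentioned above.

```shell
#!/usr/bin/env bash
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git commit -q --allow-empty -m "only commit"

full=$(git rev-parse HEAD)
short=$(git rev-parse --short HEAD)     # unique abbreviated prefix

# The actual way: let Git expand and verify the prefix.
resolved=$(git rev-parse --verify "$short")

# The simple but inefficient way: scan every object name for the prefix.
scanned=$(git rev-list --all --objects | grep "^$short" | cut -d ' ' -f 1)

echo "prefix:   $short"
echo "resolved: $resolved"
echo "scanned:  $scanned"
```

Both approaches return the full hash here. On a large repository the rev-list scan walks every reachable object, while rev-parse only has to look up the matching prefix.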
