git difference does not work properly after git clone - bash

I have a bash script in which I have to clone some repositories and get their differences.
I am trying to get the difference over a date range:
git clone $repository
cd $path
git diff master@{2019-10-1}..master@{2019-10-14} -- package.json
but it shows an error:
warning: Log for 'master' only goes back to Tue, 15 Oct 2019 09:51:16 +0000.
But this repository is old and has many commits.
When I do it locally, on the machine where I had cloned the same repository some weeks back, I get the proper difference.
$ git diff master@{2019-10-1}..master@{2019-10-14} -- package.json
diff --git a/package.json b/package.json
index d29ffcb..8766fde 100644
--- a/package.json
+++ b/package.json
@@ -1,6 +1,6 @@
{
"name": "accountd",
- "version": "0.0.95",
+ "version": "0.0.102",
"main": "dist/src/index.js",
"private": true,
"scripts": {
@@ -8,7 +8,8 @@
"start-dev": "npm run build && nodemon . port=5000 stage",
"start-sand": "npm run build && nodemon . port=5000 stage",
- "test": "nyc --extension .ts --reporter=html --reporter=cobertura --reporter=text mocha -r ts-node/register src/**/*.spec.ts --exit",
+ "test": "mocha -r ts-node/register test/**/*.spec.ts test/**/**/*.spec.ts --exit",
I want the changes made in this specific file over a period of time.

The notation you're using (master@{<date>}) says that you want to refer to the version of master on <date> according to the local reflogs. That is, you're saying that you want to know what this particular clone's master ref pointed to on that date, not which of the commits now on master had been committed by that date.
And git is telling you "this clone didn't have a master ref on that date".
To do what you mean, you first have to find the last commit before the "then" date, then diff against that. There are a number of ways, but something like
git diff $(git rev-list -n1 --before="<date>" --first-parent master) master
might be more what you want.
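A minimal sketch of that idea in a throwaway repository, using the two dates from the question. The repository, the file contents, and the names commit_on, old, and new are made up for illustration, and HEAD stands in for master because the default branch name of a fresh repository varies. It resolves each end of the date range to a commit with git rev-list, then diffs those two commits.

```shell
#!/bin/sh
set -e
# Throwaway repository with two dated commits (illustrative data only).
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

commit_on() {  # commit_on <date> <version>
    printf '{ "version": "%s" }\n' "$2" > package.json
    git add package.json
    GIT_AUTHOR_DATE="$1" GIT_COMMITTER_DATE="$1" git commit -q -m "version $2"
}
commit_on "2019-09-20 12:00:00 +0000" 0.0.95
commit_on "2019-10-10 12:00:00 +0000" 0.0.102

# Last first-parent commit at or before each date (by committer date).
old=$(git rev-list -n1 --before="2019-10-01 00:00:00 +0000" --first-parent HEAD)
new=$(git rev-list -n1 --before="2019-10-14 00:00:00 +0000" --first-parent HEAD)

# Diff the file between the two resolved commits.
git diff "$old" "$new" -- package.json
```

Because this works from commit dates stored in the commits themselves, it should behave the same in a fresh clone as in an old one.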

The @{date} syntax will not get you what you want. I'd go into what you do want, but I got beaten to the answer; see Mark Adelsberger's answer instead. :-)
What to know about the @{...} syntax
With the @{date} syntax, it doesn't matter how old the repository source is. It does not matter how many commits you have. All that matters is what is in this particular clone's so-called reflogs.
Every Git repository—every clone—has its own reflogs. In general, no two different Git repositories will agree as to what goes into the reflogs. The reflogs are not meant for this kind of job. Which kind of job? This kind:
I want the changes made in this specific file over a period of time.
The reflogs tell you absolutely nothing about changes in files, or changes not in files, over time. They tell you about changes in your Git's references over time.
This may leave you puzzling over what, precisely, a ref or reference is. We probably should not go into too much detail here, but every branch name like master is actually a reference whose full spelling is refs/heads/master; every tag name like v1.2 is a ref whose full spelling is refs/tags/v1.2; and every remote-tracking name like origin/master is a ref whose full spelling starts with refs/remotes/. So they're just a generalization of the various names that humans tend to use, to talk about commits. What Git needs, to talk about commits, is their raw hash IDs. The names are, to some extent, just there so that us weak humans don't have to remember arbitrary, big ugly hash IDs.[1]
The key to understanding this, and the @{...} syntax, is to realize that these names (e.g., refs/heads/master) change over time. Right now, your master commit might be a123456.... Yesterday, it might have been some other commit, and tomorrow, it might be yet another commit. Your Git's master will change over time, and every time it does change, your Git keeps a record of what it was. This record only goes back so far: the commits are permanent, but the record of which hash ID master meant, when, is not. Moreover, it's not carried from one clone to another: Every clone's record of what master meant when is private to that one particular Git. In a fresh clone (which has all the commits[2]) there's only one value master has ever had, which is what it has right now.
Note that you can also use @{number}. This selects the number'th entry in the reflog.
To view the actual reflog for any particular ref, run git reflog ref, e.g., git reflog master. The git reflog command has various other sub-commands for dealing with the logs, too. See its documentation for details.
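As an illustration of the "fresh clone" point, here is a small sketch that builds a throwaway upstream repository, clones it, and compares the number of commits with the number of reflog entries. The directory names and the variables commits and reflog_entries are assumptions made up for this example.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# "Upstream" repository with three commits.
git init -q "$work/upstream"
cd "$work/upstream"
for n in 1 2 3; do
    echo "$n" > file.txt
    git add file.txt
    git commit -q -m "commit $n"
done

# A fresh clone receives every commit, but starts its own reflog.
git clone -q "$work/upstream" "$work/clone"
cd "$work/clone"
commits=$(git rev-list --count HEAD)
reflog_entries=$(git reflog show HEAD | grep -c .)
echo "commits: $commits, reflog entries: $reflog_entries"
```

The clone should report all three commits but only a single reflog entry, the one recorded by the clone itself, which is why a date-based reflog lookup has nothing older to find.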
[1] The names do have other functions. For much more about this, see Think Like (a) Git.
[2] Assuming it's not a shallow clone, that is, and not using some of the new "promisor pack" features that are not yet in general use. Also, if you cloned with -b or --single-branch, or the upstream repository is set up in a less-usual fashion, you might not have a branch named master yet at all.

Related

How to count git commit numbers which contains filename (*_test.cpp)?

git log --grep "xxx" only searches commit messages, while git log -- *_test.cpp only shows commits that contain nothing but *_test.cpp files.
Is there a way to show every commit that contains a file matching *_test.cpp? I'd like to count all of commit1, commit2 and commit3 below.
commit1:
/path_a/file_a.cpp;
/path_a/file_b.cpp;
/path_b/file_a_test.cpp;
commit2: /path_c/file_b_test.cpp
commit3: /path_d/file_d_test.cpp
/path_e/file_e_test.cpp
Thanks.
Your title says count, your question body says show.
Count:
git rev-list --count @ -- \*_test.cpp
List:
git rev-list @ -- \*_test.cpp
Show:
git log @ -- \*_test.cpp
and I'd recommend adding either --no-merges or --first-parent to avoid redundantly listing every merge that forwarded changes along with the commit that made it.
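A minimal sketch of the counting command in a throwaway repository. The file layout loosely follows the question; the helper add_commit and the variable count are hypothetical names for this example. Two of the three commits touch a *_test.cpp file, and one of those also touches an ordinary source file.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir -p path_a path_b path_c

add_commit() {  # add_commit <message> <file>...
    msg=$1
    shift
    for f in "$@"; do echo "// $msg" > "$f"; done
    git add -- "$@"
    git commit -q -m "$msg"
}
add_commit commit1 path_a/file_a.cpp path_b/file_a_test.cpp
add_commit commit2 path_c/file_b_test.cpp
add_commit commit3 path_a/file_b.cpp

# The backslash stops the shell from expanding *, so Git sees the pathspec.
count=$(git rev-list --count --no-merges HEAD -- \*_test.cpp)
echo "commits touching *_test.cpp: $count"
```

Here commit1 is counted even though it also changes a non-test file, which is the behavior the question asks for.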
git log -- *_test.cpp only shows commits that contain nothing but *_test.cpp files
That's not how git log works for me, and it's not how it's documented to work. Check your evidence; I think you're misunderstanding whatever led you to say that. For instance, in my git repo:
$ git log -10 --pretty=format: --name-only --no-merges @ -- \*-*.c
refs/files-backend.c
builtin/cat-file.c
builtin/cat-file.c
fmt-merge-msg.c
t/helper/test-drop-caches.c
stable-qsort.c
builtin/update-index.c
read-cache.c
builtin/receive-pack.c
builtin/cat-file.c
builtin/rev-list.c
ref-filter.c
$
but adding the --full-diff option gives a much longer list of files touched by those commits.
I think using the ** wildcard will get you what you want:
git log '**/*_test.cpp'
** matches any number of directories. For more details, see the documentation.

How to only list .png files that have been modified in Git

How would one go about only listing the png files that have been modified in the current branch on Git?
My goal is to copy those files to a different directory (I need to send an email).
Suppose I have:
$ git status
On branch update_assessment_pt1
Your branch is up-to-date with 'upstream/devel'.
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)
new file: assessment/LWR/validation/HbepR1/analysis/hbepr1_plot.py
deleted: assessment/LWR/validation/HbepR1/doc/figures/AxialPowerProfile.pdf
deleted: assessment/LWR/validation/HbepR1/doc/figures/AxialProfile.pdf
deleted: assessment/LWR/validation/HbepR1/doc/figures/CladDisp.pdf
deleted: assessment/LWR/validation/HbepR1/doc/figures/FissionGas.pdf
modified: assessment/LWR/validation/HbepR1/doc/figures/FissionGas.png
deleted: assessment/LWR/validation/HbepR1/doc/figures/InterGasPress.pdf
deleted: assessment/LWR/validation/HbepR1/doc/figures/Mesh.pdf
deleted: assessment/LWR/validation/HbepR1/doc/figures/Power.pdf
modified: assessment/LWR/validation/HbepR1/doc/figures/Power.png
new file: assessment/LWR/validation/IFA_431/analysis/ifa431_plot.py
modified: assessment/LWR/validation/IFA_431/doc/figures/431_bol_rod_power.png
modified: assessment/LWR/validation/IFA_431/doc/figures/431r1.png
modified: assessment/LWR/validation/IFA_431/doc/figures/431r2.png
modified: assessment/LWR/validation/IFA_431/doc/figures/431r3.png
How would I go about getting the following, so I can copy those files?
modified: assessment/LWR/validation/HbepR1/doc/figures/FissionGas.png
modified: assessment/LWR/validation/HbepR1/doc/figures/Power.png
modified: assessment/LWR/validation/IFA_431/doc/figures/431_bol_rod_power.png
modified: assessment/LWR/validation/IFA_431/doc/figures/431r1.png
modified: assessment/LWR/validation/IFA_431/doc/figures/431r2.png
modified: assessment/LWR/validation/IFA_431/doc/figures/431r3.png
Use git diff --cached --diff-filter=M --name-only to obtain these file names. Add -- '*.png' if needed to keep the list filtered to just *.png files—the command will list any to be committed file whose status is M (modified).
Things to know to keep this from just being a "use this magic command" answer
In text, you first called these modified in the current branch. This phrase doesn't mean any one specific thing. Fortunately you then went on to show git status output, where they were listed under Changes to be committed.
Git doesn't store diffs at all. Git stores snapshots—whole files, intact, inside the main unit of storage, which is the commit. That means that in order to see a change, you have to pick two commits: $old and $new. Git will extract both, then compare them. Whatever is different between commit $old and commit $new, Git will tell you about that. The actual change can be any of a number of change-status-es:
A means Added: the file is not in $old and is in $new.
M means Modified: the file is different between $old and $new. The difference could just be the mode of the file: executable, or not.
D means Deleted: the file is in $old, but not in $new.
R, C, T, and some other rare cases can also occur, though some of them may require extra flags to git diff: you won't see an R status unless you enable rename-detection, for instance. (Rename detection defaults to on in the most modern Git versions, but off in older Git versions.)
Using --name-status, git diff will show you the file names and status letters, instead of showing an actual diff. (Try this out to see.) The --diff-filter argument lets you tell Git: only tell me about files whose status meets the letters I pick.
Note, by the way, that the special name HEAD always means the current commit. It does not matter how you made this commit become the current commit, though one typical way is by using git checkout: you git checkout a commit by its hash ID, for instance, and that commit is now checked out and is the current commit. Or, you git checkout a branch name, and the tip commit of that branch is now checked out and is the current commit. There is always[1] a current commit, and you can name it by writing the name HEAD in all uppercase.[2]
All of the above talks about comparing commits, but there are two other places that files can exist, that are not commits. Note that both of these places are temporary: they get wiped out by various operations, and once wiped out, cannot be recovered in Git: you have to copy from these temporary places, into actual commits, to make the files permanent. Once the files are in commits, they're frozen for all time, and can be restored to useful form in the future for as long as the commit itself exists (which tends to be "forever", or as long as the repository exists).
These two places are:
the index, which Git also calls the staging area or (rarely) the cache, and
the work-tree or working tree or any of several variants on this name.
Files that are in the index right now are ready to be committed. Every file that will be committed is in the index right now, even if the index copy matches the current (HEAD) commit copy.
You can, at any time, compare the HEAD commit to whatever is in the index right now. One command that does this is git diff --cached. For every file in HEAD and/or in the index, Git compares the two copies of the file. If they are different, the file is modified. If the index file exists but there is no such file in HEAD, the file is added. If the file exists in HEAD but not in the index, the file is deleted.
You can also, at any time, compare HEAD to the work-tree, or the index to the work-tree. The commands that do this are git diff HEAD and git diff (with no name). Again, for every file on the left-hand side (HEAD or the index), and every file on the right-hand side (in the work-tree), Git compares the two copies of the file.
Last, note that git status runs two git diffs. It does a quick git diff --cached to compare HEAD vs index. Whatever is different here, git status lists that file as to be committed. It also does a quick git diff (with no extra arguments except for --name-only) to compare index vs work-tree. Whatever is different here, git status lists that file as changes not staged for commit.
You wanted to compare HEAD vs index, so you want git diff --cached. You then wanted to list only those files that are Modified, so you can add --diff-filter=M. You didn't want to see the actual differences—nor even the status letters; file names only please!—so you can add --name-only. You also wanted only to list files whose name matches *.png, so add -- '*.png'—the quotes protect the * from the shell; we want Git to see the * so that Git can treat it as a pathspec—to get just those.
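Since the stated goal was to copy those files elsewhere, here is a hedged sketch of one way to do it. The throwaway repository, the file names, and the dest directory are assumptions for illustration. It stages one modified .png, one deleted .pdf, and one new file, then copies only the staged-and-modified .png files.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
dest=$(mktemp -d)
cd "$repo"
git init -q
mkdir figures
echo v1 > figures/Power.png
echo v1 > figures/Old.pdf
git add .
git commit -q -m initial

# Stage one modified .png, one deleted .pdf and one new file.
echo v2 > figures/Power.png
git rm -q figures/Old.pdf
echo plot > plot.py
git add figures/Power.png plot.py

# -z separates names with NUL bytes so unusual file names survive xargs -0.
git diff --cached --diff-filter=M --name-only -z -- '*.png' |
    xargs -0 -I{} cp {} "$dest/"
ls "$dest"
```

The names Git prints are relative to the top of the repository, so this sketch runs from there. Note that cp flattens the directory structure here; files with the same base name in different directories would overwrite one another.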
[1] Actually, this is really almost always. There's a special state in which HEAD exists and contains a branch name, but the branch name itself doesn't exist. This state mostly occurs when you create a new, totally-empty repository. Git requires a branch name like master to identify some existing, valid commit hash ID. There are no commits, so there are no valid hash IDs, so master itself is not allowed to exist. Nonetheless, HEAD holds the name master, so that Git will create the master branch when you make the first commit.
[2] On Windows and MacOS, you can sometimes get away with using head (lowercase) instead of HEAD (all-uppercase). This misbehaves if you start using git worktree add, so it's a bad habit to get into. If you don't like typing HEAD in all capitals, consider using the symbol @, which is a synonym for HEAD.

GitHub Branches: Case-Sensitivity Issue?

I seem to be having an issue with a repository continually recreating branches locally because of some branches on remote. I'm on a Windows machine, so I suspect that it's a case sensitivity issue.
Here's an example couple commands:
$ git pull
From https://github.com/{my-repo}
* [new branch] Abc -> origin/Abc
* [new branch] Def -> origin/Def
Already up to date.
$ git pull -p
From https://github.com/{my-repo}
- [deleted] (none) -> origin/abc
- [deleted] (none) -> origin/def
* [new branch] Abc -> origin/Abc
* [new branch] Def -> origin/Def
Already up to date.
When doing a git pull, the branches in question are capitalized. When I do a git pull -p (for pruning), it first tries to delete lowercased versions of the branches, then create capitalized versions.
The remote branches are capitalized (origin/Abc and origin/Def).
I have tried to temporarily change my Git config such that ignorecase=false (it is currently ignorecase=true). But I noticed no change in behavior. I'm guessing there's something local on my end that's currently holding onto those lowercased branches. But git branch does not show any version of these branches locally.
Short of completely obliterating the repository (a fresh git clone in a separate folder does not pull these phantom branches when trying pulls/fetches), is there anything I can do?
Git is schizophrenic about this.[1] Parts of Git are case-sensitive, so that branch HELLO and branch hello are different branches. Other parts of Git are, on Windows and MacOS anyway, case-insensitive, so that branch HELLO and branch hello are the same branch.
The result is confusion. The situation is best simply avoided entirely.
To correct the problem:
1. Set some additional, private and temporary, branch or tag name(s) that you won't find confusing, to remember any commit hash IDs you really care about, in your own local repository. Then run git pack-refs --all so that all your references are packed. This removes all the file names, putting all your references into the .git/packed-refs flat-file, where their names are case-sensitive. Your Git can now tell your Abc from your abc, if you have both.
   Now that your repository is de-confused, delete any bad branch names. Your temporary names hold the values you want to remember. You can delete both abc and Abc if one or both might be messed up. Your remember-abc has the correct hash in it.
2. Go to the Linux server machine that has the branches that differ only in case from yours. (It's always a Linux machine; this problem never occurs on Windows or MacOS servers because they do the case-folding early enough that you never create the problem in the first place.) There, rename or delete the offending bad names.
   The Linux machine has no issues with case (branches whose names differ only in case are always different) so there is no weirdness here. It may take a few steps, and a few git branch commands to list all the names, but eventually, you'll have nothing but clear and distinct names: there will be no branches named Abc and abc both.
   If there are no such problems on the Linux server, step 2 is "do nothing".
3. Use git fetch --prune on your local system. You now no longer have any bad names as remote-tracking names, because in step 2, you made sure that the server (the system your local Git calls origin) has no bad names, and your local Git has made your local origin/* names match their branch names.
4. Now re-create any branch names you want locally, and/or rename the temporary names you made in step 1. For instance if you made remember-abc to remember abc, you can just run git branch -m remember-abc abc to move remember-abc to abc.
   If abc should have origin/abc set as its upstream, do that now:
   git branch --set-upstream-to=origin/abc abc
   (You can do this in step 1 when you create remember-abc, but I think it makes more sense here so I put it in step 4.)
There are various shortcuts you can use, instead of the 4 steps above. I listed all four this way for clarity of purpose: it should be obvious to you what each step is intended to accomplish and, if you read the rest of this, why you are doing that step.
The reason the problem occurs is outlined in nowox's answer: Git sometimes stores the branch name in a file name, and sometimes stores it as a string in a data file. Since Windows (and MacOS) tends to use file-name-conflation, the file-name variant retains its original case, but ignores attempts to create a second file of the other case-variant name, and then Git thinks that Abc and abc are otherwise the same. The data-in-a-file variant retains the case-distinction as well as the value-distinction and believes that Abc and abc are two different branches that identify two different commits.
When git rev-parse refs/heads/abc or git rev-parse refs/remotes/origin/abc gets its information from .git/packed-refs—a data file containing strings—it gets the "right" information. But when it gets its information from the file system, an attempt to open .git/refs/heads/abc or .git/refs/remotes/origin/abc actually opens .git/refs/heads/Abc (if that file exists right now) or the similarly-named remote-tracking variant (if that file exists), and Git gets the "wrong" information.
Setting core.ignorecase (to anything) does not help at all as this affects only the way that Git deals with case-folding in the work-tree. Files inside Git's internal databases are not affected in any way.
This whole problem would never come up if, e.g., Git used a real database to store its <reference-name, hash-ID> table. Using individual files works fine on Linux. It does not work fine on Windows and MacOS, not this way anyway. Using individual files could work there if Git didn't store them in files with readable names—for instance, instead of refs/heads/master, perhaps Git could use a file named refs/heads/6d6173746572, though that halves the available component-name length. (Exercise: how is 0x6d m, 0x61 a, and so on?)
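For the exercise, a tiny sketch that hex-encodes the name byte by byte, using standard od and tr; the variable name encoded is made up for this example. In ASCII, m is 0x6d, a is 0x61, and so on.

```shell
#!/bin/sh
# Hex-encode a branch name one byte at a time.
encoded=$(printf '%s' master | od -An -tx1 | tr -d ' \n')
echo "$encoded"   # prints 6d6173746572
```

Each input byte becomes two hex digits, which is why the encoded form halves the usable name length.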
[1] Technically, this is the wrong word. It's sure descriptive though. A better word might be schizoid, as used in the title of one episode of The Prisoner, but it too has the wrong meaning. The root word here is really schism, meaning split and somewhat self-opposed, and that's what we're driving at here.
In Git, branches are just pointers to commits. The branches are stored as plain files in your .git directory.
For instance, you may have abc and def files in .git/refs/heads.
$ tree .git/refs/heads/
.git/refs/heads/
├── abc
├── def
└── master
The content of each of these files is just the hash ID of the commit the branch points to.
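A small sketch to see this for yourself in a throwaway repository, assuming Git's default "files" ref storage (not the newer reftable format); the variable names are made up for illustration. It also shows what git pack-refs --all does to the loose file, which matters for the case-sensitivity problem above.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
echo hello > file.txt
git add file.txt
git commit -q -m initial

# The loose ref file holds the same hash that Git reports for the branch.
branch=$(git symbolic-ref --short HEAD)
from_file=$(cat ".git/refs/heads/$branch")
from_git=$(git rev-parse HEAD)
echo "$branch: file=$from_file git=$from_git"

# After packing, the loose file is gone and the name lives in packed-refs.
git pack-refs --all
ls -A .git/refs/heads
grep "refs/heads/$branch" .git/packed-refs
```

Once the ref is packed, its name is a string inside .git/packed-refs rather than a file name, so the file system's case handling no longer applies to it.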
I am not sure, but I think the option ignorecase is only relevant to your working directory, not the .git folder. So to remove the weird capitalized branches, you may just need to remove/rename the files in .git/refs/heads.
In addition to this, the upstream link from a local branch to a remote branch is stored in the .git/config file. In this file you may have something like:
[branch "Abc"]
remote = origin
merge = refs/heads/abc
Notice in this example that the local branch is named Abc (capitalized) but the branch it merges from on the remote is abc (lowercase).
To solve your issue I would try to:
Modify the .git/config file
Rename the corrupted branches in .git/refs/heads, for example renaming abc to abc-old
Try your git pull
The answers supplied by nowox and torek were very helpful, but did not contain the exact solution. Neither the existing references to remote in .git/config nor the files in .git/refs/heads contained any versions of abc or def.
Instead, the problem existed in .git/refs/remotes/origin.
My .git/refs/remotes/origin directory had references to the lowercased versions of these feature branch folders. Some feature branches were made under abc and def using the lowercased versions, but they no longer exist on remote. The creator of these feature branches recently switched to using Abc and Def on remote. I deleted .git/refs/remotes/origin/abc and .git/refs/remotes/origin/def then executed fresh git pull -p commands. New folders, Abc and Def, were created, and subsequent pulls or fetches correctly display Already up to date.
Thanks to nowox and torek for getting me on the right track!
I did the following to solve my problem:
I navigated to the .git/refs/remotes/origin folder.
I deleted the folder with the buggy branch name.
I did git pull in the terminal.
I ran into a similar problem today. I did the following to solve it:
Rename the 2nd branch to another name
Rename the 1st branch to 2nd_branch_old_name
git push origin 1st_branch_new_name

Git how to get hashid of last GitHub release tag

Every 10-20 commits I have a commit of the form "version uped to x.y.z" that is also tagged, since it's a GitHub release point. Example below. I need to get the hash ID of the last such commit so I can use it in a script like "git rebase -i $(hashid)"; it marks a freezing point before which existing commits should not be changed. There are 2 possible ways to get it: search for the last commit whose message starts with "version uped", or search for the last commit that has a tag. I am not skilled with bash, so please assist.
dfd48cd (HEAD -> master, origin/master, origin/HEAD) Operator [:] for GreedyRange removed
b610256 Array GreedyRange docs updated
e6a1446 Embedded docstring updated
825bf83 moved gallery and deprecated_gallery to new folders
9414a55 Kaitai comparison schemas moved to a folder
61e9ccb Padded fixed, negative length check and docstring
ad6148c FixedSized updated, changed build semantics
979538d FixedSized NullTerminated NullStripped fixed, _parsereport and docstrings
4719d67 lib/py3compat updated, supportsintflag supportsintflag more accurate
9c164d4 makefile added xfails profile
672fefa (tag: v2.9.40) version uped to 2.9.40
In this example, output would be: 672fefa52b537c17f5ede90996b9156eb0e040ac
Here is one way:
git tag --sort -v:refname | head -1 | xargs git rev-list -n 1
Explained:
Get the list of all tags, sorted descending by version number
Pick the first one (the one with the highest version number)
Pass it to git rev-list to find the commit hash it references
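The same steps as a sketch in a throwaway repository, with the result kept in a variable for later use. The tags v2.9.9 and v2.9.40 and the variables latest_tag and hashid are made up for this example; v2.9.9 is there to show that the version sort, unlike a plain text sort, ranks v2.9.40 higher. This variant uses git rev-parse with ^{commit} instead of git rev-list, which is a swapped-in alternative rather than the command from the answer.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
git commit -q --allow-empty -m "version uped to 2.9.9"
git tag v2.9.9
git commit -q --allow-empty -m "version uped to 2.9.40"
git tag v2.9.40
git commit -q --allow-empty -m "later work"

# Highest version first; take the top entry.
latest_tag=$(git tag --sort=-v:refname | head -n 1)
# ^{commit} peels an annotated tag down to the commit it marks.
hashid=$(git rev-parse "$latest_tag^{commit}")
echo "$latest_tag -> $hashid"
# In a real repository you would then run: git rebase -i "$hashid"
```

This picks the tag with the highest version number, which is not necessarily the most recently created tag if versions were ever tagged out of order.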

'git diff' inconsistent between CLI and other clients

I'm trying to get a list of changed/added/deleted/etc. files for a commit in my Git repository. When I run the following in the shell, this is the output:
Indragie$ /usr/bin/git diff --name-status 0836
D INPopoverController.h
D INPopoverController.m
D INPopoverControllerDefines.h
D INPopoverWindow.h
D INPopoverWindow.m
D INPopoverWindowFrame.h
D Images/blue_progress_slice.png
M Images/next.png
M Images/pause.png
M Images/play.png
M Images/previous.png
D Images/progress_left_cap.png
When I check the list of changes in Xcode (or any other third party Git client), I see this:
Xcode diff http://cl.ly/2i3P3s0m0i3I10110h3E/Screen_Shot_2011-04-07_at_8.59.18_PM.png
Obviously these are just excerpts of the larger lists, but the point is that they are not the same at all. I've verified that the SHA1 hash of the commit I'm looking at is the same in both the CLI git and in Xcode. I'm new to git so there might be something fairly obvious I'm doing wrong, but even after poring over man pages and git tutorials, I can't seem to find where I'm going wrong. Any help is appreciated.
Are you sure you're looking at the same things?
git diff <commit-id> will show you the differences between your current working tree and the tree at the time of that commit, not the changes introduced by that commit.
git show would show you just that commit's changes.
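A short sketch of the difference in a throwaway repository; the file names and the variables first, since, and own are made up for illustration. The first commit adds a.txt and the second adds b.txt, and both commands are then pointed at the first commit.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
echo one > a.txt
git add a.txt
git commit -q -m "add a"
echo two > b.txt
git add b.txt
git commit -q -m "add b"
first=$(git rev-parse HEAD~1)

# Compares that commit's tree with the current working tree:
# everything that has changed since the commit.
since=$(git diff --name-status "$first")
# Shows only what that commit itself changed relative to its parent.
own=$(git show --name-status --format= "$first")

echo "git diff: $since"
echo "git show: $own"
```

git diff reports b.txt, which was added after that commit, while git show reports a.txt, which that commit itself introduced.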
