Moving some commits to a feature branch but leaving others in the master branch - tortoisegit

I am not sure if I can do anything about this and it is not a huge hardship to leave it as it is.
I did try to fix things by following steps from other SO topics and ended up losing all my help revision commits and files.
Things are a little messy and I will try to explain. The history is OK up to a certain point:
Removing unused resource ID values from resource.h
Can you see that towards the bottom of the screenshot of the log?
Since that time, the majority of the commits are help file revisions:
Deleting help topics and redundant images
Revising help topics and images
Adding new help topics and images
But it gets complicated because with that big chunk of help revision commits I have some code change commits. Eg:
Added SetLoggingPath to CMSATools.
Revised CChristianLifeMinistryUtils::FillStudentsListBox method. Now it reads the students from the publishers database.
The plot thickens, for a small handful of them, eg:
Add Help menu to CPublishersDatabaseDlg.
In those cases the commit is a combination of code changes and help revision changes:
Added OnHelpHelp menu handler.
Started writing Help/HelpPublisherDatabase.html help topic.
The primary issue:
Beginning at this commit in the master branch (Removing unused resource ID values from resource.h), can I make a new feature branch called help-revisions and then move the commits from master to the feature branch?
If it is possible, I am assuming we would need to move just the commits that are purely help revisions. I am not sure how to handle the commits that are a mixture of help changes and code changes.
So, ideally I am hoping to split out all the help revisions into a feature branch so that it can be merged in to the master and look better in the log. Leaving the code tweak commits alone in the master in an appropriate position.
The related matter is the cause of some of this. But I am not going to discuss that here after all.
As mentioned, I am just curious to know if it is possible to improve the history I have as indicated.
I am a lone developer, so I do not have to worry about other individuals' repositories.
Thanks for your help and time.
Update
I have given the rebase a go. I marked all the commits I wanted to split as edit. Then I started the rebase. I ticked edit/split and revised them as I needed until it completed.
Now my log looks like this:
Underneath, it looks like this:
So how do I get rid of that section? I have to fix that before I create the feature branch and do the cherry picking.
So, at the top I now have a new set of all the commits including the split ones.
Got it - did a force push of master branch.

This can be accomplished by a mixture of cherry-picking and rebasing.
Create a new feature branch at a commit that precedes all of the affected commits. Then select all the commits you want on that new feature branch and choose "Cherry-pick commits". After that, you have a branch that contains only the selected commits.
Switch back to the previous branch and rebase it onto the parent of the newly created branch (you will need to enable "force"). Mark all the cherry-picked commits again, select skip, and start the rebase. Afterwards, this branch no longer contains the cherry-picked commits.
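For readers who prefer the command line, the TortoiseGit steps above map roughly onto the plain git commands listed by this small sketch. It only builds and prints the command strings and does not run anything. The function name split_plan and the placeholder commit names are illustrative assumptions; the branch names come from the question.

```python
from typing import List


def split_plan(base: str, feature: str, main: str, picks: List[str]) -> List[str]:
    """List command-line git steps that mirror the TortoiseGit procedure."""
    return [
        # Create the feature branch at the last commit before the affected ones.
        f"git branch {feature} {base}",
        f"git checkout {feature}",
        # Copy the selected commits onto the feature branch.
        "git cherry-pick " + " ".join(picks),
        f"git checkout {main}",
        # In the todo list that opens, change "pick" to "drop" for every
        # commit that was cherry-picked onto the feature branch.
        f"git rebase -i {base}",
    ]


# Placeholder names; substitute real commit hashes from your own log.
plan = split_plan("<base>", "help-revisions", "master", ["<help-1>", "<help-2>"])
for step in plan:
    print(step)
```

Because the rebase rewrites master, a force push is needed afterwards if the branch was already pushed, as the question's update notes.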

Related

How to manually specify a git commit sha?

This answer explains that normally a git commit SHA is generated based on various parameters. However, I would like to know: how can one specify a custom/particular/specific git commit sha (in Bash)?
For example, suppose one wants to create and push a commit to Git with the following sha:
1e23456ffd118db9dc04caf40a442040e5ec99f9
(For simplicity, assume one can assume it is a unique sha).
The XY-problem is a manual mirror script between two different Git servers. It would be more convenient to simply have identical commit SHA's than to keep a mapping of the commits between the Git servers. This is because the manual mirror is more efficient (saving computation time and server bandwidth) if I can skip certain commits from the source server. Yet that means the parent commits change in the target server, with respect to the same commit in the source server. In turn, that would imply the SHA changes, which would require me to keep track of a mapping of the sha's in the source and target server. In short, it would be more convenient to simply override the sha's of the commits to the target server, than to ensure the two servers have the exact same commits (for the few commits that are actually mirrored).
A commit SHA isn't just "normally" generated based on those parameters, it is by definition a hash of those parameters. "SHA" is the name of the hashing algorithm used to generate it.
Rather than trying to change the commit hashes, you should look for an efficient way to track them. One approach would be similar to how plugins like git svn work:
When copying a commit to the mirror, record the original commit hash as part of the new commit's commit message.
Possibly, since you're "skipping" commits in the original repo, each new commit should have multiple source hashes, since it will act like a "squash" of those commits.
Have a script which processes the result of git log and extracts these recorded commit hashes. This can then be used instead of the real commit hashes when determining what new commits to copy from the source.
However, make sure this is all worth it: if the eventual changes are all included, the chances are that git's existing de-duplication and compression will mean the overhead of the "skipped" commits is fairly low.
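A minimal sketch of the tracking approach outlined above, assuming each mirrored commit message records its source hashes on lines of the form "Source-Commit: <hash>". That line format and the function names are assumptions made for illustration, not an existing git convention. The log text would come from something like git log --format=%B on the mirror.

```python
import re
from typing import List, Set

# Hypothetical marker line; the name "Source-Commit" is an assumption.
SOURCE_LINE = re.compile(
    r"^[ \t]*Source-Commit:[ \t]*([0-9a-f]{40})[ \t]*$", re.MULTILINE
)


def mirrored_source_hashes(log_text: str) -> Set[str]:
    """Collect every source hash recorded in the mirror's commit messages."""
    return set(SOURCE_LINE.findall(log_text))


def commits_still_to_copy(source_hashes: List[str], log_text: str) -> List[str]:
    """Return the source commits, in their original order, that the mirror lacks."""
    done = mirrored_source_hashes(log_text)
    return [h for h in source_hashes if h not in done]


# Made-up hashes standing in for real commit ids.
A, B, C = "a" * 40, "b" * 40, "c" * 40
sample_log = (
    "Squash of two upstream commits\n\n"
    f"Source-Commit: {A}\n"
    f"Source-Commit: {B}\n"
)
pending = commits_still_to_copy([A, B, C], sample_log)
print(pending)
```

A squashed mirror commit simply carries several marker lines, which covers the "multiple source hashes" case.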
Since you've already outlined in your question that you have ways of handling your differences, I will assume this question is really and only this:
I would like to know: how can one specify a custom/particular/specific git commit sha (in Bash)?
And not "or do you have any other ideas that I could use instead".
And with that question, the answer is actually quite simple:
You can't.
Git doesn't just calculate the commit id because that's just a by-product of the implementation chosen. The way it is done is a core concept of how git is designed.
The commit id is calculated based upon the content of the commit, and this includes, as you have observed, the link to the parent. Change the parent but keep everything else identical, the commit id still changes.
This is core to how the distributed part of the version control system works, and cannot be changed.
You simply cannot change the id of a commit and keep its contents the same. This is by design.
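To make the point concrete, here is a small sketch of how an object id is derived in a repository that uses SHA-1: the id is the hash of a short header (type and size) followed by the object's content. The commit body below is simplified, and the author identity, timestamp and parent ids are made-up values used only for illustration.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """Return the SHA-1 id Git assigns to an object of this type and body."""
    header = f"{obj_type} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree: str, parent: str, message: str) -> bytes:
    """Build a simplified commit body; identity and timestamp are made up."""
    who = "A U Thor <author@example.com> 1700000000 +0000"
    lines = [
        f"tree {tree}",
        f"parent {parent}",
        f"author {who}",
        f"committer {who}",
        "",
        message,
        "",
    ]
    return "\n".join(lines).encode()


EMPTY_TREE = git_object_id("tree", b"")

# Same tree, author, date and message; only the parent differs.
id_with_parent_1 = git_object_id(
    "commit", commit_body(EMPTY_TREE, "1" * 40, "Mirror commit")
)
id_with_parent_2 = git_object_id(
    "commit", commit_body(EMPTY_TREE, "2" * 40, "Mirror commit")
)
print(id_with_parent_1 != id_with_parent_2)
```

Since the parent id is part of the hashed content, skipping commits on the mirror necessarily changes the id of every commit that follows.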
There have been some attempts at producing commit collisions by carefully constructing distinct commits that end up having the same id.
Here's such a successful attempt (collision): https://www.theregister.com/2017/02/23/google_first_sha1_collision/
'First ever' SHA-1 hash collision calculated. All it took were five clever brains... and 6,610 years of processor time
I don't believe anyone has yet managed to take an arbitrary commit and target a specific commit id with it. The collisions were carefully constructed by manipulating two commits simultaneously according to very specific criteria such that they arrived at the same id, but that id was not chosen by the researchers.
TL;DR: It can't be done
The net effect of the collision(s) generated though is that Git will move away from SHA-1 at some point and go for a system that produces longer, and "more secure" (tm) hashes than what we have today. Since Git also wants to be backwards compatible with existing repositories, this work is not yet fully completed.
From the comment by CodeCaster, it seems I could use the freely choosable bits in the commit message in `git commit -m "some message"` to ensure the sha of the commit ends up with a specific value.
However, based on the comment by Lasse V. Karlsen, I would assume this approach requires non-linear computation resources. I did not look into this in detail, but I imagine that as the commit history grows, the relative impact of the freely choosable bits of the commit message (limited to 5 MB) becomes smaller. I guess that could explain why leveraging these freely choosable bits becomes costly.
So in practice, the answer seems to be: "You could (perhaps, if you spend a lot of computational resources), but you shouldn't.".
how can one specify a custom/particular/specific git commit sha (in Bash)?
One cannot. The commit hash is a value constructed, as you say, by hashing various values together, and the whole point is to uniquely identify a particular commit. You could commit the same set of files at a different time on a different machine and you'd end up with a different commit hash.
The way to ensure that you have the same commits on two different machines is to git pull (or similar) those commits from one machine to the other.
You don't necessarily have to move all the commits -- you could e.g. squash them or cherry-pick only certain commits.

How to find the set of non-merge changes in a TFS branch?

I know I can do a "compare" between two changesets and get a list of the changes made in the period of time between the changesets in question.
However, from that list I would like to exclude all changes that are the result of merge operations only (change types merge; merge, edit; merge, branch; etc.).
My goal is to get a list of what changes (edits, adds, deletes, ...) have been made within the particular branch, including to any files which have also had changes merged into the branch from other branches, without cluttering up the list with changes made in other branches and simply merged into my branch of interest.
How do I do that?
Getting a list of changes to a particular branch is quite easy. In source control, just press Ctrl-G. You can then filter on the branch and get a list of changes, and you can specify the change sets; then select Find. This will include merges though.
This may not completely solve your issue, but it will help I guess.
If you know which source branches have been merged into your branch, you can use the tf merges command, which gives you the changeset numbers at which the merges happened.

Why to not commit changes to version control ... before

I'm a novice developer, working alone. I'm using Xcode and git version control. I'm probably not well organised and am doing things wrong, but I usually decide to commit just to make a safe point before I spoil everything. At that moment I find it difficult to properly describe what I have already done, but I know exactly what I'm going to try next. So when I make the next reference point, the previous one is already named.
So my question is: is there a version control methodology where reference points are described by plans, not facts? Why could this be a bad idea?
The problem with describing a commit based on what you "plan" to do is that you lose accurate accounting of what has been done. Let's say you plan on doing something, but that doesn't work. So you roll back and try something else, and that works. You commit that, but now what you "planned" to do isn't what was actually done.
At that point, you'll need to go back and edit the comments on the previous commit to describe what you actually did or risk losing a record of the change over time. Also, if you are working in a group, you pretty much need to make your comments based on what you actually did so other members of the team can see it and either check what you did or improve on it.
Unless you plan on never working on a team project, your best bet is to just bite the bullet and figure out how to keep track of what you've done since the last commit. I keep a pen and notepad by my side so I can keep track of changes. I also do frequent commits to keep from forgetting what I've done over a long period of time.
ABC, always be committing. While you may be working on projects for yourself and no one is accountable but yourself, it is generally a good idea to commit what has been done rather than what you plan to do.
Branching is designed to save yourself from what you plan to do. Create a branch called 'addnewscreen' or whatever you plan to do. This way you can keep committing all the small changes on your new stuff without polluting your main branch. Once you are happy, merge it back in and make a new branch for what's next.
If you get stuck, the Pro-Git Book has helped me so many times I've lost count. Hopefully this will help you too. Good luck.

What are the advantages of a rebase over a merge in git?

In this article, the author explains rebasing with this diagram:
Rebase: If you have not yet published your branch, or have clearly communicated that others should not base their work on it, you have an alternative. You can rebase your branch, where instead of merging, your commit is replaced by another commit with a different parent, and your branch is moved there.
while a normal merge would have looked like this:
So, if you rebase, you are just losing a history state (which would be garbage collected sometime in the future). So, why would someone want to do a rebase at all? What am I missing here?
There are a variety of situations in which you might want to rebase.
You develop a few parts of a feature on separate branches, then realize they're in reality a linear progression of ideas. Rebase them into that configuration.
You fork a topic from the wrong place. Maybe it's too early (you need something from later), maybe it's too late (it actually applies to previous versions as well). Move it to the right place. The "too late" case actually can't be fixed by a merge, so rebase is critical.
You want to test the interaction of a branch with another branch, but for some reason don't want to merge. For example, you might want to see what conflicts crop up commit-by-commit, instead of all at once.
The general theme here is that excessive merging clutters up the history, and rebasing is a way to avoid it if you didn't get your branch/merge plan right at first. Too many merges can make it hard for a human to follow the history, and also can make it harder to use tools like git-bisect.
There are also all the many cases which prompt an interactive rebase:
Multiple commits should've been one commit.
A commit (not the current one) should've been multiple commits.
A commit (not the current one) had a mistake in it or its message.
A commit (not the current one) should be removed.
Commits should be reordered (e.g. to flow more logically).
While it's true that you "lose history" doing these things, the reality is that you want to only publish clean work. If something is still unpublished, it's okay to rebase it in order to transform it to the way you should have committed it. This means that the final version in the public repository will be logical and easy to follow, not preserving any of the hiccups a developer had along the way.
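The "lost history" can be illustrated with a toy model in which a commit id is a hash over its parent id and its message. This is a deliberate simplification, not Git's real object format, and all names below are made up for the illustration. Replaying the same messages on a new base yields entirely new ids, so the original topic commits are left unreferenced.

```python
import hashlib
from typing import List, Optional


def commit_id(parent: Optional[str], message: str) -> str:
    """Toy id: a hash over the parent id and the message (not Git's format)."""
    return hashlib.sha1(f"{parent}\n{message}".encode()).hexdigest()


def build(base: Optional[str], messages: List[str]) -> List[str]:
    """Stack one commit per message on top of base and return the new ids."""
    ids: List[str] = []
    parent = base
    for message in messages:
        parent = commit_id(parent, message)
        ids.append(parent)
    return ids


root = commit_id(None, "initial")
mainline = build(root, ["main work"])

# The topic branch as originally forked from root.
topic = build(root, ["topic 1", "topic 2"])

# The same changes replayed on top of the current mainline tip.
rebased = build(mainline[-1], ["topic 1", "topic 2"])

# No id is shared: the rebased commits are new objects.
print(set(topic).isdisjoint(rebased))
```

In this model the old topic ids still exist but nothing points at them any more, which corresponds to the state that is eventually garbage collected.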
Rebasing allows you to pick up merges in the proper order. The theory behind merging means you shouldn't have to worry about that. The reality of resolving complicated conflicts gets easier if you rebase, then merge new changes in order.
You might want to read up on Bunny Hopping.

Does TFS lose its link when you move a branch?

My co-worker is trying to merge his development branch back into the baseline. Even though he only modified a couple files, all files in the baseline are being checked out for merging. As if it's a baseless merge. What gives?
I don't experience this and the only difference I can see is that I branched directly from the baseline and he made a branch and then did a "move" on the branch. Does moving a branch mess up the link back to the baseline? He is still able to select the baseline in the GUI so I don't think it's doing a baseless merge since that's only available via command line, but it's behaving like that.
Anyone got some insight or know what else we should check?
This is by design. TFS needs to mark the changeset where you moved the source branch as "already accounted for" so it's no longer a candidate next time you merge.
Merge history is recorded at checkin time by updating all of the pending changes that have their Merge bit set. Ordinarily, this is accompanied by other change types like Edit, Delete, etc. If not, it's just a recordkeeping transaction, as in the case you've encountered (there are other such cases). No files will be modified by checking in the "no-op" merges.
