@drewcsillag
Last active February 19, 2016 21:13
[ see https://github.com/drewcsillag/rebaseit ]
[note
gls = git log --format='format:%h : %s'
]
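If you want to try the gls shorthand yourself, here's a minimal sketch. It assumes a POSIX shell and defines gls as a function rather than an alias, since aliases usually aren't expanded in scripts. The throwaway repo, file name and contents are made up for the demo.

```shell
set -eu

# gls as a shell function; extra arguments are passed through to git log
gls() {
    git log --format='format:%h : %s' "$@"
}

# Throwaway repository (hypothetical demo data)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'tickTime=2000' > zookeeper.conf
git add zookeeper.conf
git commit -q -m 'added zookeeper config'

gls    # one line per commit: abbreviated id, " : ", subject
echo   # format: puts newlines between entries, not after the last one
```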
This is a theoretical example based off of real events.
So I'm in my git repo for my project and I start doing work.
Because git is awesome and I do stupid things, I commit *all* the
time. Between refactoring passes and tests passing, I just
commit, because editor undo only goes so far sometimes. Not only
that, but sometimes you don't find out that you messed something
up along the way until later, and it's good to be able to hunt
down that thing you changed that broke the world.
The problem with it is that you wind up with tons of commits,
that when it comes time to do the pull request, honestly no-one
wants to see. In many cases, you just want to squash it down to
one, but sometimes, a logical series of commits will be more
helpful to your code reviewers. The question is how to get there
from here.
Git is awesome and has a tool called interactive rebase which
allows you to slice, dice and make julienne fries out of your
commits.
So let's start with my fake project. I'm trying to bring up a new
Kafka metric consumer. But I need to bring up Kafka, and then I
need to bring up Zookeeper because Kafka uses Zookeeper to track
stuff. I go through a series of hack and slash things to
eventually bring up a proper Zookeeper cluster running the thing.
So let's see how I got there. [ go to git log ]
So starting from the last merge, I added the metricworker config
file and realized, duh, I need to set up Kafka. Then while
writing the Kafka config file, I realized I need to bring up
Zookeeper as well. I then realized that, in Kafka 0.8, Kafka
consumers need to know where Zookeeper is too.
Ah, right, but when you have apps that use zookeeper, they
shouldn't store everything at the root, much like you wouldn't
have an app writing to the root directory of your hard drive, so
I then point kafka to a subdirectory in Zookeeper.
Unfortunately, I can't spell. Then I add the /kafka subdirectory
suffix to the metric worker configuration.... and I still cannot
spell.
Now that that all worked, I configure Zookeeper for a three node
cluster to see that it all works, and configure everything to use
that.
And so I wind up with 9 oddly-ordered commits that should really
be four: one for zookeeper, one for kafka, one for the metric
worker, and one for an unrelated metricworker configuration
change.
Ugh, so let's see what we can do. Oh, and I did all this in my
master branch. Whoops!
First, let's create a new branch to work in. We've already borked
some stuff up, so let's not make things worse.
So first we create the branch
---
git branch metrics
---
Now verify that it has what we think
---
git log metrics
---
Ok, let's fix our master branch back to what it ought to be.
If you don't know, reset --hard moves the branch and sets git's
index and filesystem state to the commit you specify. Without
--hard, it moves the branch and resets the index state, but
doesn't change anything in the filesystem, which will become
important later.
---
git reset --hard 07ceb00
---
let's just check that it's what it ought to be
---
git log -1
---
It is. Good. Ok, let's get the metrics branch fixed up.
---
git checkout metrics
---
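If you'd rather rehearse the branch-then-reset dance somewhere harmless first, here's a small sketch in a throwaway repo. The commits and file contents are invented, and $base plays the role of 07ceb00.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo 'project' > README
git add README
git commit -q -m 'last merge'
base=$(git rev-parse HEAD)                 # stands in for 07ceb00

# Two commits made on master by mistake
echo 'queuesize=10' > metricworker.conf
git add metricworker.conf
git commit -q -m 'added metricworker config'
echo 'broker.id=0' > kafka.conf
git add kafka.conf
git commit -q -m 'added kafka config'

git branch metrics                 # new branch pointing at the current tip
git reset -q --hard "$base"        # move master back to the last merge
git checkout -q metrics            # carry on working in the new branch

git log --format='%h : %s' master..metrics   # the commits now only on metrics
```

Creating the branch before the reset is what keeps the commits reachable; do it the other way round and you'd be fishing them out of the reflog.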
Now this is where interactive rebase comes into play. Git
interactive rebase allows you to join, edit, and reorder commits.
So let's do it
--- don't press enter here
git rebase -i master
---
The master argument to rebase here is the commit before any of
the ones you want to fiddle with. If you want to think of it
this way, anything you fiddle with will be done on top of this
commit: that is, it'll start with master, then perform any of the
other operations we pick on top of that.
--
press enter
--
Now you're presented with a list of commits: their brief commit
id and part of the first line of the commit message. For each of
these commits, you can do a few different things.
[scroll to show all of them]
I won't read them, because I assume you can read, but in our case
we want to selectively fixup (in rebase terms) some commits into
some others. A fixup melds a commit into the one listed above it
and throws away the fixup's commit message.
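To see what marking a commit as fixup does without driving an editor, here's a rough sketch. It leans on git's GIT_SEQUENCE_EDITOR environment variable to edit the todo list with sed instead of by hand. The commits are invented, and it assumes a sed that accepts -i.bak (GNU and BSD sed both do).

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'project' > README
git add README
git commit -q -m 'last merge'
git checkout -q -b metrics

# Hypothetical history: a config, its typo fix, and an unrelated commit
echo 'zookeeper.conect=localhost:2181' > kafka.conf
git add kafka.conf
git commit -q -m 'added kafka config'
echo 'zookeeper.connect=localhost:2181' > kafka.conf
git commit -q -am 'fix kafka typo'
echo 'queuesize=10' > metricworker.conf
git add metricworker.conf
git commit -q -m 'added metricworker config'

# Same as changing "pick" to "fixup" on the typo line in the editor
GIT_SEQUENCE_EDITOR="sed -i.bak -e '/fix kafka typo/s/^pick/fixup/'" \
    git rebase -i master

git log --format='%h : %s' master..HEAD    # the typo commit is gone
```

In the real session the typo fixes weren't sitting right below the commits they belonged to, so the lines also had to be moved; the sketch skips that part.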
So now we have 4 commits, so far, so good. But that last commit
is a bit problematic.
---
git diff HEAD^
---
It's got 4 changes to three files. Three of those changes should
be merged into the existing commits. The change to
metricworker.conf should be broken out into two pieces: one
to be merged into the initial add, and the part where it sets
queuesize to 20, which should be in a separate commit. Oh, and I
misspelled server in the zookeeper conf as well.
Again, we do interactive rebase. Here, we're only fiddling with the
most recent commit, so we'll rebase from there.
---
git rebase -i HEAD^
---
Here we change pick to edit on that commit, then save and quit.
---
git log
---
Ok, here you see that last commit is still applied, but we want
to un-redo it.
---
git reset HEAD^
---
Remember what I said about git reset. Without --hard, it doesn't
change the filesystem state, just sets the git state to the
commit. So here, we're telling git to think that the git state
is before that last commit we want to fiddle with.
---
git diff
---
And so here we can see the now uncommitted changes.
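Here's the reset behaviour on its own, as a sketch in a throwaway repo (the file and values are invented). A plain reset drops the commit but leaves the edit in the file; adding --hard throws the edit away too.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo 'queuesize=10' > metricworker.conf
git add metricworker.conf
git commit -q -m 'added metricworker config'
echo 'queuesize=20' > metricworker.conf
git commit -q -am 'set queue size to 20'

git reset -q HEAD^                 # no --hard: the commit goes, the file stays
git diff                           # the queuesize change, now uncommitted
after_mixed=$(cat metricworker.conf)

git reset -q --hard                # --hard: the file goes back as well
after_hard=$(cat metricworker.conf)

echo "after plain reset:  $after_mixed"
echo "after reset --hard: $after_hard"
```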
Ok, let's make a new zookeeper commit (fixing the typo)
---
vi zookeeper.conf
---
Ok, now git supports a thing called autosquash which simplifies
doing those fixup things. So we want to ultimately add a new commit
that will be squashed into the first zookeeper commit.
---
git log
---
So we grab the first several digits of its commit id. And do:
---
git commit --fixup XXXXX zookeeper.conf
---
---
git status
---
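If you're wondering what --fixup actually produces, it's just an ordinary commit whose subject is "fixup! " plus the subject of the commit you pointed it at. A small sketch, with made-up file contents:

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo 'sever.1=zk1:2888:3888' > zookeeper.conf      # note the typo
git add zookeeper.conf
git commit -q -m 'added zookeeper config'
target=$(git rev-parse --short HEAD)               # the XXXXX from git log

echo 'broker.id=0' > kafka.conf
git add kafka.conf
git commit -q -m 'added kafka config'

echo 'server.1=zk1:2888:3888' > zookeeper.conf     # fix the typo
git commit -q --fixup "$target" zookeeper.conf     # commit just that file

git log --format='%h : %s'
```

The special subject is what lets a later rebase work out where the commit belongs.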
One down, two to go. Now we do the same thing for the kafka part.
---
git log
git commit --fixup XXXXXX kafka.conf
---
ok,
---
git diff
---
What do we do with this? This should be two separate commits,
one that sets the queue size, and one that fixes up the zookeeper
settings.
Well there's interactive add.
---
git add -i
---
Now it has a bunch of things you can do,
[show help]
but when I'm trying to do this, I want to patch
[press p]
you can choose different files for this, but for our purposes, we
only have just the one, so we press 1 here
[press 1]
and now we've selected all the files we want to tinker with, so we just
press enter to tell git we're done selecting.
Normally interactive add shows you a series of hunks that you can
say to stage or not. In this case, it shows us the whole change
as one hunk, which isn't the most helpful. Let's see what we can
do
[press ? and scroll up to see the whole thing].
Fortunately, we can split this hunk into two pieces as there are a
couple of unchanged lines between the two changes we want.
so we split it
[press s]
ok, we do want to stage this one
[press y]
and don't want to stage this one, as we'll do it separately.
[press n]
ok, we're done, so let's quit.
---
git status
---
And now, we can see that we have some staged changes and some unstaged.
let's commit
---
git commit -m 'set queue size to 20'
git diff
---
ok, and the leftovers we want to get squashed into the original
metricworker commit
---
git log
git commit -a --fixup XXXXX
git status
---
ok, let's exit rebase
---
git rebase --continue
git log
---
And now we see the fixup commits broken out, with the queue size commit.
Let's now squash all those things together.
Because we used the --fixup option in the commit, there's an
option you can pass when starting the rebase to have git "do the
right thing".
---
git rebase --autosquash -i master
---
Here, it already set the fixup commits in the right place.
I do want to fix the zookeeper commit message here since all the others
have 'added' in the past tense, but this has 'add', so I'll choose
'reword' on that commit.
---
git log
---
Now we see that we have the 4 commits we actually want!
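For a non-interactive taste of autosquash, here's a sketch that accepts the todo list exactly as git arranges it. Setting GIT_SEQUENCE_EDITOR to true is the scripted version of opening the editor and saving without touching anything. The history is invented.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'project' > README
git add README
git commit -q -m 'last merge'
git checkout -q -b metrics

echo 'sever.1=zk1:2888:3888' > zookeeper.conf      # typo on purpose
git add zookeeper.conf
git commit -q -m 'added zookeeper config'
zk=$(git rev-parse HEAD)
echo 'broker.id=0' > kafka.conf
git add kafka.conf
git commit -q -m 'added kafka config'
echo 'server.1=zk1:2888:3888' > zookeeper.conf
git commit -q --fixup "$zk" zookeeper.conf

# git moves the fixup! commit under its target and marks it fixup
GIT_SEQUENCE_EDITOR=true git rebase -i --autosquash master

git log --format='%h : %s' master..HEAD
```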
----------------
Now, to err is human and sometimes you may do something stupid like this:
---
git rebase -i master
---
[delete the kafka commit]
---
gls
---
Oy. what do we do now?
Git has something called the reflog which can be your rescue. The reflog
is basically a record of every commit that has been the "all-caps" HEAD
in your repo, including ones that no branch points to any more.
---
git reflog
---
generally when you screw up a rebase, you want to pick the commit
before the "rebase -i (start)"
[highlight it]
in this case it's this one. So we grab the commit id.
[highlight the one below]
---
git log XXXXXX
---
just check to make sure it's what we think it is.
let's grab the id of the kafka commit, as we'll need it in a bit.
now, in our situation, we could just do
---
git reset --hard XXXXX
---
to get back where we were, but sometimes, the lost commit goes a
bit back and you just want to pull that commit in. So you could do
---
git cherry-pick XXXXX
---
to pull the commit in
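Here's the whole accident and rescue as a sketch. One swap to be aware of: instead of reading the HEAD reflog by eye, it uses the branch's own reflog entry metrics@{1}, which is where the branch pointed before the rebase finished. The history is invented and the sed -i.bak trick assumes GNU or BSD sed.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'project' > README
git add README
git commit -q -m 'last merge'
git checkout -q -b metrics
for name in zookeeper kafka metricworker; do
    echo "# $name settings" > "$name.conf"
    git add "$name.conf"
    git commit -q -m "added $name config"
done

# The mistake: delete the kafka line from the rebase todo list
GIT_SEQUENCE_EDITOR="sed -i.bak -e '/added kafka config/d'" \
    git rebase -i master

git reflog                                 # the old commits are all still listed

# Look the lost commit up in the pre-rebase history, then pull it back in
lost=$(git log -1 --format=%H --grep='added kafka config' 'metrics@{1}')
git cherry-pick "$lost" > /dev/null

git log --format='%h : %s' master..HEAD
```

If you wanted everything back exactly as it was, git reset --hard 'metrics@{1}' would do that instead of the cherry-pick.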
But I also realize now that the commits could be in a better
order. Logically speaking, you need to have ZooKeeper first,
because everything else needs it. Next should be the Kafka
commit because the metric worker needs it, then lastly, you want
the metric worker commit. So let's rebase again.
---
git rebase -i master
---
plunk in the commit and reorder
---
gls
---
and there we are.
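Reordering is just moving lines around in the todo list. As a sketch, here's a tiny helper script (reorder.sh is a made-up name) that stands in for doing that by hand, plugged in through GIT_SEQUENCE_EDITOR. It works cleanly here because each invented commit touches its own file.

```shell
set -eu
tmp=$(mktemp -d)                           # holds the repo and the helper script
git init -q "$tmp/repo"
cd "$tmp/repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'project' > README
git add README
git commit -q -m 'last merge'
git checkout -q -b metrics
for name in metricworker kafka zookeeper; do      # committed in the "wrong" order
    echo "# $name settings" > "$name.conf"
    git add "$name.conf"
    git commit -q -m "added $name config"
done

# Hypothetical stand-in for reordering the lines in your editor
cat > "$tmp/reorder.sh" <<'EOF'
#!/bin/sh
todo=$1
{
    grep 'added zookeeper config' "$todo"
    grep 'added kafka config' "$todo"
    grep 'added metricworker config' "$todo"
} > "$todo.new"
mv "$todo.new" "$todo"
EOF
chmod +x "$tmp/reorder.sh"

GIT_SEQUENCE_EDITOR="$tmp/reorder.sh" git rebase -i master

git log --reverse --format='%h : %s' master..HEAD   # oldest first
```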
Sometimes during interactive rebase, I've screwed up something
while trying to edit a commit or somesuch. In such cases, I just
want to start the whole rebase over again. To do that just:
---
git rebase --abort
---
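A quick sketch of bailing out, in a throwaway repo with invented commits: stop the rebase on an edit, make a mess, abort, and check that the branch is back exactly where it started.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'project' > README
git add README
git commit -q -m 'last merge'
git checkout -q -b metrics
for name in kafka metricworker; do
    echo "# $name settings" > "$name.conf"
    git add "$name.conf"
    git commit -q -m "added $name config"
done
before=$(git rev-parse HEAD)

# Mark the first commit as "edit"; the rebase stops there
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1s/^pick/edit/'" git rebase -i master
test -d "$(git rev-parse --git-path rebase-merge)"   # a rebase is in progress

echo 'oops' >> kafka.conf                  # a change we regret

git rebase --abort                         # back to where we started
git log --format='%h : %s' master..HEAD
```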
Now lastly, this top commit is something I only really wanted for
local testing, and is not something that should get pushed, but I
do want to retain the commit for things during code review and
such. How do I push the branch without the commit? Fortunately,
git's push command has some flexibility here.
If this is the first time the branch is being pushed, we do:
---
git push origin HEAD^:refs/heads/metrics
---
This says: push everything up to the previous commit (HEAD^) to
the metrics branch on origin, leaving the top commit behind. The
refs/heads bit is only required if the branch doesn't
already exist on the server, otherwise you could just
---
git push origin HEAD^:metrics
---
Then we go to:
https://github.com/drewcsillag/rebaseit/commits/metrics
And sure enough, it doesn't have the queuesize commit, just as we
wanted.
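You can check the same thing without GitHub. This sketch pushes to a local bare repository standing in for origin; the paths and commits are invented for the demo.

```shell
set -eu
tmp=$(mktemp -d)
git init -q --bare "$tmp/remote.git"       # plays the part of the server
git init -q "$tmp/work"
cd "$tmp/work"
git symbolic-ref HEAD refs/heads/metrics   # work directly on a metrics branch
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$tmp/remote.git"

echo 'queuesize=10' > metricworker.conf
git add metricworker.conf
git commit -q -m 'added metricworker config'
echo 'queuesize=20' > metricworker.conf
git commit -q -am 'set queue size to 20'   # local-only commit on top

# Push everything except the top commit; refs/heads/ is spelled out
# because the branch does not exist on the remote yet
git push -q origin HEAD^:refs/heads/metrics

git --git-dir="$tmp/remote.git" log --format='%h : %s' metrics
```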
Any Questions?