cdybedahl | May. 21st, 2012

Most of the revision control systems that have been used in the Open Source world have followed a fairly similar pattern. You get the code from a repository somewhere, you change it somehow, and then you check it back into the repository. CVS works that way, Subversion works that way and Mercurial works that way, even if they don't all agree on much else.

git does things a little differently. More specifically, it does things a little differently when you want to shove your changes back into the repository: it adds another stage. Instead of changes going directly from your working directory into the repository, with git you add changes to a staging area before you send them on into the repository. The staging area is also known as the "index", but I'll stick with "staging area" here, since that name is much more descriptive of how it works and what it's for.

Let's try to illustrate with a little ASCII diagram. First, the Subversion flow:

Repository -> Working Directory -> Your editor -> Working Directory -> Repository

I hope you're with me so far. Now, the git flow:

Repository -> Working Directory -> Your editor -> Working Directory -> Staging Area -> Repository

See? Not so different. Now, what is this thing for, you may rightly ask. Surely they didn't implement it just to add another step to the commit process. That would be silly. And indeed they did not.

The staging area is where you assemble your commit before you actually create it.

Doing it this way gives you a lot more control over exactly what goes into a commit. Instead of just doing hg commit foo.c and hoping that it includes all the changes you want and only those changes, with git you can do git add foo.c and then check that the staging area holds all the changes you want, and only those, before you send them on their merry way into the repository.
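To make that concrete, here is a minimal sketch of the add, check, commit cycle. Everything in it is made up for illustration: the throwaway repository in a temp directory, the files foo.c and bar.c, and the commit messages. Treat it as one way to try the idea out, not as the only way to work.

```shell
# Assumption: a scratch repository; foo.c and bar.c are placeholder files.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"

printf 'int main(void) { return 0; }\n' > foo.c
printf 'int helper(void) { return 1; }\n' > bar.c
git add foo.c bar.c
git commit -q -m "Initial import"

# Edit both files, but move only foo.c into the staging area.
printf 'int main(void) { return 42; }\n' > foo.c
printf 'int helper(void) { return 2; }\n' > bar.c
git add foo.c

# Check what the next commit will contain before creating it.
# In the short status, foo.c has its M in the first column (staged)
# and bar.c has it in the second column (not staged).
git status --short
staged=$(git diff --staged --name-only)
echo "staged: $staged"

git commit -q -m "Change the exit status in foo.c"

# bar.c is still modified in the working directory, waiting for its own commit.
unstaged=$(git diff --name-only)
echo "still uncommitted: $unstaged"
```

The point of the middle step is that nothing has been committed yet when you look at the staging area, so a mistake there costs nothing to fix.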

Having the staging area also makes it easy to do things that would otherwise be quite tricky and cumbersome. Imagine that you're happily hacking along implementing a new feature in some largish code base. While you're doing that, you happen to spot a bug in the code that has nothing to do with what you're working on. It's an easy one, needing only a couple of lines of changes to the code. And, since you're a conscientious coder, you want to add a test case for the bug. Plus documentation, of course. All those changes you want to commit separately from the stuff you're actually working on.

Now, you can do that with any system. You could check out a separate copy of the source and do the fix there. Or you could save a patch of all your working changes, revert to the last checked in state, make the bugfix, test and doc, commit them and then reapply the patch to get your working state back.

With git, you'd make the bugfix with its related changes, move only those changes into the staging area and then commit them. There's a command, git add -p somefile.pl, that'll go through all the diff hunks in the file and ask, for each one, whether you want it added to the staging area. There's also a more complex interactive mode if hunk-based selection is not enough control, but I've never personally needed to use it. Adding single diff hunks, though, I use maybe not daily but at least several times a week. Mostly for trivial stuff like fixing typos without cluttering up real commits with noise.

The staging area doesn't really let you do anything that's impossible to do in other systems. It just makes it a whole lot easier to do commits that are cleaner, and more likely to contain complete single logical units of work. And if you're something of a scatterbrain, as I can be at times, the ability to do git diff --staged to see exactly what you're about to commit is invaluable. I don't know how many times it's saved me from making commits that would have taken considerable time to undo.

You can bypass the staging area. git commit -a will do the same thing hg commit or svn commit does: send all the changes to tracked files in the working tree to the repository as one commit. git commit that/one/file.py will create a commit with all the changes in that one file.
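Both shortcuts can be tried in a scratch repository. In this sketch the paths other.py and new.py, the file contents and the commit messages are all placeholders. One thing it shows along the way is that git commit -a only picks up files git already tracks, so a brand-new file is left alone.

```shell
# Assumption: a scratch repository; all file names and contents are placeholders.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"

mkdir -p that/one
printf 'print("one")\n' > that/one/file.py
printf 'print("other")\n' > other.py
git add .
git commit -q -m "Initial import"

# Change both tracked files and create one new, untracked file.
printf 'print("one, edited")\n' > that/one/file.py
printf 'print("other, edited")\n' > other.py
printf 'print("new")\n' > new.py

# Commit a single path straight from the working directory, without git add.
git commit -q -m "Edit file.py only" that/one/file.py
after_path=$(git diff --name-only)
echo "left after path commit: $after_path"

# Commit every remaining change to tracked files in one go.
git commit -q -a -m "Edit everything else"
after_all=$(git status --short)
echo "left after commit -a: $after_all"
```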

But it's been a very long time since I used either of those. Yes, adding stuff to the staging area before you create the commit is a bit of extra work. But it's a very tiny bit of extra work compared to how much better your commits will be because you do it.

Calle's Journal, of sorts

The git staging area
