Git, Snapshots, and Names
Central metaphor: Git = immutable snapshot DAG + movable names. Files are incidental; the core objects are snapshots and pointers. Basic operations like push/pull/rebase/merge/amend/fixup are just name moves plus new snapshots.
Motivation
Everyone has experienced the final_final_v2.zip problem. Shared drives and Dropbox give everyone access to the same files, but they don’t answer the deeper questions: who changed what, when, and why? How do you manage multiple versions of a file without getting confused? What if two people edit the same file differently? What if you need to go back, not just to yesterday’s version, but to a precise state from last week?
Git is designed to solve these problems. It isn’t a fancier shared folder. It’s a history machine, tuned for source code.
9.1 The central metaphor
Git consists of two things:
An immutable directed acyclic graph (DAG) of snapshots.
A set of movable names (references) pointing into that graph.
Internally, Git’s object model (blobs, trees, and commits) exists solely to capture snapshots and link them together into a history. This design underpins everything else in Git.
Before diving into these object types, we should clarify what a “snapshot” means in Git. A snapshot in Git is not literally a photograph, nor is it simply the contents of a few files. It represents the entire state of your project at one moment: which files exist, how they are organized into directories, and what each file contains. Each commit captures one such snapshot, and a repository’s history is the graph of these frozen states, each linked to the state it came from.
9.1.1 Cast of characters
blob: the raw content of a file.
tree: the directory structure of a snapshot, mapping names to blobs or subtrees. (This is different from the “graph” in DAG: a tree here means a file hierarchy inside one snapshot, not the larger graph of commits.)
commit: metadata + a pointer to one tree + parent commits.
ref: a human-readable name pointing to a commit (like main).
HEAD: the special ref for “the branch I’m on now.”
index: the staging area for the next commit.
working directory (sometimes called working tree): your files on disk.
Snapshots are the durable objects. Names are how we interact with them.
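One reason snapshots can be treated as durable is that Git names each object by a hash of its content, so the same content always gets the same name. The sketch below illustrates this for blobs, assuming Git's default SHA-1 object format; the helper names git_object_id and blob_id are made up for this example and are not part of any Git API.

```python
import hashlib

def git_object_id(kind: str, body: bytes) -> str:
    """Hash an object as '<kind> <size>\\0<body>' with SHA-1."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()

def blob_id(content: bytes) -> str:
    """Name of a blob holding exactly these bytes (hypothetical helper)."""
    return git_object_id("blob", content)

empty = blob_id(b"")          # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
same_a = blob_id(b"hello\n")
same_b = blob_id(b"hello\n")  # identical content, identical name
changed = blob_id(b"hello!\n")  # any change in content gives a new name

print(empty)
print(same_a == same_b)
print(same_a == changed)
```

Because the name is derived from the content, an object cannot be edited in place: a change produces a different object with a different name.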
9.2 Scene 1: Making a snapshot
The unit of history in Git is the snapshot. Each commit freezes the whole project state, and commits are tied together into a graph that shows how one state led to another.
In everyday language we talk about “changes” to files. In Git’s model, a “change” simply means that one snapshot of the project differs from another: a file has different content, a file was added, a file was removed, or the directory structure itself shifted. Commits record those differences implicitly by pointing to a new snapshot.
git add stages changes into the index.
git commit writes a new tree, then a commit pointing to it.
Finally Git moves a name (your branch) to the new commit.
o   <-- main, HEAD
|
o
|
o
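The three steps above (stage, write a snapshot, move the name) can be sketched as a toy model. This is not how Git is implemented: ToyRepo and its fields are invented for illustration, commit ids are simple counters rather than hashes, and a "tree" is just a sorted tuple of path and content pairs.

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen: a commit cannot be modified once created
class Commit:
    tree: tuple      # sorted (path, content) pairs: the whole snapshot
    parents: tuple   # ids of parent commits
    message: str

class ToyRepo:
    def __init__(self):
        self.commits = {}            # id -> Commit
        self.refs = {"main": None}   # branch name -> commit id
        self.head = "main"           # the branch we are on
        self.index = {}              # staged path -> content

    def add(self, path, content):
        """Stage content for the next commit (like git add)."""
        self.index[path] = content

    def commit(self, message):
        """Freeze the index as a snapshot, then move the current branch."""
        tree = tuple(sorted(self.index.items()))
        parent = self.refs[self.head]
        parents = () if parent is None else (parent,)
        commit_id = f"c{len(self.commits)}"   # toy id, not a hash
        self.commits[commit_id] = Commit(tree, parents, message)
        self.refs[self.head] = commit_id      # the name move
        return commit_id

repo = ToyRepo()
repo.add("README", "v1")
first = repo.commit("initial")
repo.add("README", "v2")
second = repo.commit("update readme")
print(repo.refs["main"], repo.commits[second].parents)
```

Note that the first commit still holds the old README content after the second commit is made; only the name main has moved.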
9.3 Scene 2: Undo and safety nets
Mistakes are inevitable. Git’s design lets you undo changes safely.
Because commits are immutable, “undo” does not destroy history. It moves names (branch pointers) or adds new commits; the existing snapshots stay where they are.
git reset: moves a branch name (and optionally the index and working tree) backward or forward. It rewinds history locally. Safe if the branch has not been shared.
git revert: makes a new commit that undoes the changes from an earlier commit. It never deletes history, so it is the right choice when you need to “undo” something on a shared branch like main.
git reflog: local log of where HEAD and branch refs have been.
git stash: temporarily save uncommitted work as hidden commits.
These features are the safety net. If you “lose” something, it’s usually just that the branch name moved away from it. The commit is still there, and git reflog can find it.
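A toy model can make the safety net concrete. In the sketch below, reset only moves a name, revert_tip adds a new commit, and every move is recorded in a reflog list. All names here are hypothetical, and revert_tip handles only the simplest case (undoing the most recent commit), whereas the real git revert applies the inverse of any chosen commit's changes.

```python
# Toy history: c0 <- c1 <- c2, with main at c2.
snapshots = {"c0": {"a.txt": "1"}, "c1": {"a.txt": "2"}, "c2": {"a.txt": "3"}}
parents = {"c0": None, "c1": "c0", "c2": "c1"}
refs = {"main": "c2"}
reflog = []  # (old position, new position, note) for every name move

def move_branch(name, target, note):
    """Every move of a name is logged before it happens."""
    reflog.append((refs[name], target, note))
    refs[name] = target

def reset(name, target):
    """Move the name; no commit is created or deleted."""
    move_branch(name, target, f"reset: moving to {target}")

def revert_tip(name):
    """Add a commit whose snapshot equals the tip's parent snapshot."""
    tip = refs[name]
    new_id = f"c{len(parents)}"
    snapshots[new_id] = dict(snapshots[parents[tip]])
    parents[new_id] = tip
    move_branch(name, new_id, f"revert: undo {tip}")
    return new_id

reset("main", "c1")            # c2 looks "lost", but it is still stored
lost = reflog[-1][0]           # the reflog remembers where main was
reset("main", lost)            # recover by moving the name back
undo = revert_tip("main")      # shared-branch style undo: history only grows
print(refs["main"], snapshots[undo], [old for old, _, _ in reflog])
```

After the revert, main points at a new commit whose content matches c1, while c2 remains in the history as its parent.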
9.4 Scene 3: Branches as names
A branch is a pointer to a commit. Creating a branch just makes a new name.
When commits are added on that branch, the name moves forward.
Fast-forward merges are trivial: if one branch is already ahead of another, merging just moves the name forward.
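As a rough sketch of why fast-forward merges are trivial, the toy model below treats branches as dictionary entries and checks whether one branch tip is an ancestor of the other. The commit ids, the branch name topic, and the function names are assumptions made for this example.

```python
# Toy graph: A <- B <- C <- D. Each commit lists its parents.
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"]}
refs = {"main": "B", "feature": "D"}

def is_ancestor(ancestor, descendant):
    """True if `ancestor` is reachable by walking parents from `descendant`."""
    stack, seen = [descendant], set()
    while stack:
        commit = stack.pop()
        if commit == ancestor:
            return True
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return False

def try_fast_forward(target, source):
    """Move `target` up to `source` if no new commit is needed."""
    if is_ancestor(refs[target], refs[source]):
        refs[target] = refs[source]
        return True
    return False  # the branches diverged; a real merge would be required

refs["topic"] = refs["main"]             # creating a branch: just a new name
moved = try_fast_forward("main", "feature")
print(moved, refs)
```

No commit is created at any point here: creating topic and fast-forwarding main both only write an entry in refs.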
9.5 Scene 4: Remotes
One of the most important aspects of Git and version control is collaborating with others.
A remote is a mapping to another copy of the graph, usually on GitHub.
origin: your fork, where you push.
upstream: the course repository, where you fetch from.
git fetch upstream updates the remote-tracking names (like upstream/main) to match the latest state of the upstream repository. Your local branches are untouched. Then git rebase upstream/main replays your work on top of the new base. Finally, git push origin my-branch moves the name on your fork.
What does git pull do? It is shorthand for git fetch plus a merge (or rebase, if configured) of the remote branch into your current one.
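The effect of a fetch can be sketched with two toy repositories, each a dictionary of commits plus a dictionary of names. This is an illustration under simplifying assumptions (commit ids are plain strings, and the fetch function is invented here), not a description of Git's transfer protocol.

```python
def reachable(parents, tip):
    """All commits reachable from tip, including tip itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen

# The upstream repository has gained commit C since we last fetched.
upstream_parents = {"A": [], "B": ["A"], "C": ["B"]}
upstream_refs = {"main": "C"}

# Our local copy: we committed X on top of B.
local_parents = {"A": [], "B": ["A"], "X": ["B"]}
local_refs = {"main": "X", "upstream/main": "B"}

def fetch(remote_name, remote_parents, remote_refs, parents, refs):
    """Copy missing commits, then move only the remote-tracking names."""
    for branch, tip in remote_refs.items():
        for commit in reachable(remote_parents, tip):
            parents.setdefault(commit, list(remote_parents[commit]))
        refs[f"{remote_name}/{branch}"] = tip

fetch("upstream", upstream_parents, upstream_refs, local_parents, local_refs)
print(local_refs)
```

After the fetch, upstream/main points at C while the local main still points at X. Integrating the two is a separate step, which is the part that git pull adds on top of git fetch.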
9.6 Scene 5: Merging vs. rebasing
If two branches change the same lines of a file in different ways, Git cannot combine them automatically. This is reported as a conflict, and you must edit the file to decide which version (or what combination) is correct. Once resolved, the merge continues as usual with a new commit.
Suppose development diverges:
A---B---C---E (main)
         \
          D (feature)
Two ways to integrate:
Merge: make a new commit with two parents, preserving both branches’ histories (both lines of development).
Rebase: replay D on top of E, making a new D’. Produces a linear story.
Rebase: A---B---C---E---D'
Merge records what actually happened. Rebase rewrites history to look simpler. Which is better? Merges can clutter, but are safe. Rebases keep history tidy, but should only be used on private branches (not on commits others may already have).
How to remember: in a rebase, the branch you specify is the new base. Your commits are lifted up and replayed “on top” of it. If you can picture which history should be on the bottom (the base) and which set of commits should be placed above, the mechanics of rebase are easier to recall.
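The difference between the two strategies can be sketched on a toy graph. The example assumes feature branched off at C, and it ignores file contents and conflicts entirely; the functions merge and rebase below are illustrative stand-ins, not Git's algorithms.

```python
import copy

# Toy history for the example, assuming D branched from C.
HISTORY = {"A": [], "B": ["A"], "C": ["B"], "E": ["C"], "D": ["C"]}

def ancestors(parents, tip):
    """All commits reachable from tip, including tip itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen

def merge(parents, refs, into, other):
    """Create one commit with two parents and move `into` to it."""
    new_id = "M"
    parents[new_id] = [refs[into], refs[other]]
    refs[into] = new_id
    return new_id

def rebase(parents, refs, branch, new_base):
    """Replay the commits unique to `branch` on top of `new_base`."""
    base_side = ancestors(parents, refs[new_base])
    to_replay = []
    commit = refs[branch]
    while commit not in base_side:        # assumes a linear branch
        to_replay.append(commit)
        commit = parents[commit][0]
    tip = refs[new_base]                  # the bottom: the new base
    for old in reversed(to_replay):       # oldest first
        new_id = old + "'"                # a new commit, not the old one moved
        parents[new_id] = [tip]
        tip = new_id
    refs[branch] = tip
    return tip

merged_parents = copy.deepcopy(HISTORY)
merged_refs = {"main": "E", "feature": "D"}
merge(merged_parents, merged_refs, "main", "feature")

rebased_parents = copy.deepcopy(HISTORY)
rebased_refs = {"main": "E", "feature": "D"}
rebase(rebased_parents, rebased_refs, "feature", "main")

print(merged_parents["M"], rebased_parents["D'"], rebased_refs)
```

In the rebased copy the original D is still stored; feature simply points at D' now. That is why rebasing commits others already have causes trouble: their names still point at the old D.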
9.7 Scene 6: History hygiene
Commits should tell a story. Git has tools to adjust history before you share.
git commit --amend: adjust the most recent commit.
git commit --fixup <sha>: create a fix targeted at an earlier commit.
git rebase -i --autosquash: fold fixups and reorder history.
git push --force-with-lease: update a branch after rewriting, but with a safety check.
This process allows you to present a clean history in the end, without losing the benefit of your intermediate steps.
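To give a feel for what autosquash does, the sketch below reorders a list of commit subjects so that each fixup lands directly after its target. It relies on the fact that git commit --fixup writes a subject of the form "fixup! <target subject>"; the autosquash function itself is a simplified stand-in that matches subjects exactly, and the commit subjects are invented.

```python
def autosquash(subjects):
    """Return a rebase-style todo list of (action, subject) pairs.

    `subjects` is ordered oldest first. A fixup is placed after its target
    (and after any fixups already attached to it); everything else is picked.
    """
    todo = []
    for subject in subjects:
        if subject.startswith("fixup! "):
            target = subject[len("fixup! "):]
            for i, (action, existing) in enumerate(todo):
                if action == "pick" and existing == target:
                    j = i + 1
                    while j < len(todo) and todo[j][0] == "fixup":
                        j += 1            # keep earlier fixups first
                    todo.insert(j, ("fixup", subject))
                    break
            else:
                todo.append(("pick", subject))   # no target found: keep as is
        else:
            todo.append(("pick", subject))
    return todo

history = ["add parser", "add tests", "fixup! add parser"]
plan = autosquash(history)
for action, subject in plan:
    print(action, subject)
```

Carrying out the plan would fold the fixup into "add parser", leaving two tidy commits instead of three.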
This flexibility in editing history is one of Git’s killer features. The promise we made back in the Motivation section—
9.8 Scene 7: Beyond the basics
GitHub adds layers on top of the raw graph.
Pull requests: a proposal to move a branch name on the shared repository (e.g. merging your feature branch into main).
Actions: automation that runs on each push (tests, deployments).
Git also has git bisect, a command that uses binary search over the DAG to find the exact commit where a bug was introduced. It is one of the “killer features” made possible by treating history as immutable snapshots.
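The idea behind bisecting can be sketched as a plain binary search. The example assumes a linear history with a known good commit at the start and a known bad commit at the end, which is simpler than the general DAG that git bisect handles; the commit names and the is_bad test are invented.

```python
def bisect(commits, is_bad):
    """Find the first bad commit.

    `commits` is ordered oldest to newest; commits[0] is assumed good and
    commits[-1] is assumed bad. Returns (first bad commit, number of tests).
    """
    good, bad = 0, len(commits) - 1
    tested = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tested += 1
        if is_bad(commits[mid]):
            bad = mid      # the bug was introduced at mid or earlier
        else:
            good = mid     # the bug was introduced after mid
    return commits[bad], tested

commits = [f"c{i}" for i in range(16)]

def is_bad(commit):
    """Stand-in for running the test suite on a checked-out snapshot."""
    return int(commit[1:]) >= 11

first_bad, tested = bisect(commits, is_bad)
print(first_bad, tested)
```

With 16 commits, only 4 snapshots need to be tested. The search is reliable because each commit is an immutable snapshot, so checking out the same commit always gives the same project state.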