Using Git as a "Poor Man's" Time Machine
15 Feb 2010
Audience: Doesn’t cry when installing new software and using the command line, interested in reading a long rambling post about Garry Winogrand, Chuck Norris and, eventually, Git.
In recent versions of Apple’s OS X there has been a new feature called “Time Machine”. In short, it allows you to step back in time to revisit your file system in previous states and copy files from those previous times. Under the hood, “Time Machine” is a simplified GUI over dated snapshots that use hard links on Apple’s HFS+ file system (not, as is sometimes claimed, the advanced ZFS file system created by Sun). In this series of articles, we will attempt to mimic some of these features in our own creation, powered by Git.
Note: These directions are a little Windows-centric, though it should be easy to adapt them to any operating system.
It Finally Happened
So the day has finally come. All those hours and hours of blood, sweat and tears spent working on the All Important Spreadsheet are gone. Or perhaps it was a colossal document cataloging your famous stamp collection. Maybe it was your latest digital masterpiece, a homage to Garry Winogrand’s “Park Avenue, New York” done entirely in MS Paint?
You’ve checked once…twice…three times and you are finally coming to grips with the fact that your file is missing. Well, the hope is that it’s just missing; maybe it was misplaced or accidentally saved to a different folder. After a frantic search, nothing has turned up.
After the panic subsides you start thinking of what to do next. After searching in other folders, the next logical step would be to retrieve it from yesterday’s backup (you are backing up, right?). So you pull out yesterday’s backup and realize you have just lost hours of productive time. Not only have you spent all morning searching for the file, now you have to somehow recreate all the work that was lost. There has to be a better way!
You Silly Git
The Playschool definition of Git is as follows:
Super-duper undo for files - with cheatcodes. With its amazing branching, merging and other hoopla it is like the [Konami Code](http://en.wikipedia.org/wiki/Konami_Code) crossed with [Portal](http://www.youtube.com/watch?v=iFhPFSjNovA&feature=related) crossed with [Your Mom](http://beltespenner.com/oscommerce/images/i%20love%20your%20mom.jpg) (because she really does try to do what's best for you even if you don't understand it at the time).
The Git Website says the following:
> ## Git is...
>
> Git is a **free & open source, distributed version control system** designed to handle everything from small to very large projects with speed and efficiency. **Every Git clone is a full-fledged repository** with complete history and full revision tracking capabilities, not dependent on network access or a central server. **Branching and merging are fast** and easy to do.
The short of it is this: Git (and other distributed revision control software) lets you record changes to your files in a meaningful way (you get to tag the changes with a descriptive message). Then you can arbitrarily pick and choose, roll back and forward, and branch and merge these changes. We are going to take the baby steps necessary to get you up and running with an “automatic” record (or commit) instead of the “real” way, which is to record these changes as you go.
This is not the intended use for Git. Git was created to be used by somewhat technical people, working independently, with text files. We will be abusing it by using it with/for non-technical people (well you might be a smarty pants but we are using a very dumb approach of “fire and forget” for the commits).
Also, we will be recording one shared pool (repository) of everyone’s work. Normally each person would have their own copy of all the files to work on, and when they are done, everyone’s changes are merged into one repository.
Lastly, Git (and most other versioning software) is made to work on text files. “Why?” you may ask. Because diffing (finding the differences between two versions of the same file) text files is easy peasy, while diffing binary files (images, Word documents, programs, etc.) is hard stuff. Every type of binary file has its own format, which means whoever invented the format had their own vision of how the bits should be ordered and what they really mean.
In some binary formats, to be more efficient, everything gets rewritten, not just the stuff that changed. For example, say you are working on an image and remove the background. If you compare the images side by side, it is obvious to you what has changed. But all the computer knows is bits, and half of the file may have changed for more efficient storage. But I digress…
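To make the text-versus-binary point a little more concrete, here is a rough Python sketch. It is only an illustration under some assumptions: the "document" is made-up data, and a zlib-compressed copy stands in for a binary format that repacks its contents on every save (real formats vary a lot). The text diff pinpoints the single edited line, while the packed versions can only be compared byte by byte.

```python
import difflib
import zlib

# Two versions of a made-up 200-line text file; only line 3 differs.
old_lines = [f"row {i}: value {i * 10}\n" for i in range(1, 201)]
new_lines = list(old_lines)
new_lines[2] = "row 3: value 31\n"

# Text diff: the edit shows up as one removed line and one added line.
diff = list(difflib.unified_diff(old_lines, new_lines, "old.txt", "new.txt", n=0))
changed = [
    line for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]
print("".join(changed), end="")

# Raw bytes: exactly one byte differs between the two versions.
old_raw = "".join(old_lines).encode()
new_raw = "".join(new_lines).encode()
raw_differing = sum(a != b for a, b in zip(old_raw, new_raw))

# Stand-in for a binary format (an assumption): the same data, compressed.
old_packed = zlib.compress(old_raw)
new_packed = zlib.compress(new_raw)
packed_differing = sum(a != b for a, b in zip(old_packed, new_packed))
packed_differing += abs(len(old_packed) - len(new_packed))

print(f"raw bytes that differ:    {raw_differing} of {len(old_raw)}")
print(f"packed bytes that differ: {packed_differing} of {len(old_packed)}")
```

How many packed bytes differ depends on the compressor, so treat the second number as something to observe rather than a fixed result.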
The point is that even though we are using Git in a way its creator hadn’t intended, it is flexible enough for the job. This is a testament to the philosophy of doing one thing and doing it well. Our project today is not an end-all-be-all; it is merely a stop-gap for situations where you don’t have regular, consistent backups and/or you want some of the benefits of version control.
Enough Already, Let’s Get To It
You’re going to hate me. Only a little though.
In the amount of time you spent reading the above drivel you could have already implemented our little project. Well, we laughed, we cried, good times…
Anyway on to it! On to…
Building a “Poor Man’s Time Machine” with Git
It’s not really the worst ever but it’s no Tardis. We will be able to move backward in time, kinda forward-ish, depending on your perspective, and sideways as well. We are mostly going to be concerned with the preventing-the-JFK-assassination-and-returning-to-the-present-day-with-nothing-else-changed rather than the Bill-and-Ted-travel-back-in-time-and-totally-screw-with-the-present-errm-future-err-whatever-dude.
When everything is said and done you will be able to: see what files have been added or changed in the previous day (or hour, or whatever interval you choose) and be able to arbitrarily grab any previous version of any file.
What you need to download (assuming you are running Windows)
Git - Get the “full installer”
Git Windows Extensions - Kind of optional, but nice to play with
Happy Face Wallpaper - Because you’ll make it your background when you’re done, since you will be so happy
What you need to know
What directory you are going to run this on
How to run a scheduled task on your file server
The appropriate bribe for your local sysadmin if you don’t have admin rights
This will take up extra space on your disk, maybe a lot. Depending on the changes, git may have to store a complete copy (or two or three) of any file that changes. It all depends on the number and type of changes.
You will need to watch this like a little baby bird in a shoebox for a while. Eventually, you can probably let it fly on its own, but for now if you drop it out the window…well…the results won’t be pretty.
I never said this was a great solution. If things catch fire, if everything gets deleted, if your server explodes, well, I’m sorry. I’ll send you a very sorrowful e-card, but that’s about it. You’re kinda on your own with this one.
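Since disk space is one of the things you will be babysitting, a small helper for keeping an eye on it may be handy. The following Python sketch is just one possible approach, not part of the original setup: the function names are made up for illustration, and it assumes the history lives in a “.git” folder directly inside the shared folder.

```python
import os


def directory_size(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symbolic links so nothing outside the tree is counted.
            if os.path.isfile(full) and not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def history_overhead(shared_folder):
    """Return (bytes used by .git, bytes used by the working files)."""
    git_dir = os.path.join(shared_folder, ".git")
    history = directory_size(git_dir) if os.path.isdir(git_dir) else 0
    working = directory_size(shared_folder) - history
    return history, working


# Example use (the path is hypothetical):
#   history, working = history_overhead(r"C:\SharedFolder")
#   print(f"history: {history} bytes, working files: {working} bytes")
```

If the first number starts racing ahead of the second, that is your cue to check which large files are changing every day.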
Time Machine Go
Okay, so install everything, I’ll wait…
Good, now let’s tell Git what folder to work on and “initialize” it. And no, don’t worry, “initialize” has nothing to do with “erase”. The easiest way to show you is from the command line, so roll up your sleeves.
During our example we will be using a folder called “SharedFolder”. This will represent the common file-share on our hypothetical server.
    C:\>cd SharedFolder

    C:\SharedFolder>dir
     Volume in drive C has no label.
     Volume Serial Number is 18A3-D0C5

     Directory of C:\SharedFolder

    02/13/2010  12:12 PM    <dir>          .
    02/13/2010  12:12 PM    <dir>          ..
    02/13/2010  12:02 PM                 5 FileOne.txt
    02/13/2010  12:02 PM                 5 FileTwo.txt
                   2 File(s)             10 bytes
                   2 Dir(s)     984,203,264 bytes free

    C:\SharedFolder>git status
    fatal: Not a git repository (or any of the parent directories): .git

    C:\SharedFolder>git init
    Initialized empty Git repository in C:/SharedFolder/.git/

    C:\SharedFolder>git status
    # On branch master
    #
    # Initial commit
    #
    # Untracked files:
    #   (use "git add <file>..." to include in what will be committed)
    #
    #       FileOne.txt
    #       FileTwo.txt
    nothing added to commit but untracked files present (use "git add" to track)
In this example we have initialized the directory (telling git this is where we want to work). Since we have not added any files yet, we have not told git to actually track them. So now we will do just that.
    C:\SharedFolder>git add FileOne.txt

    C:\SharedFolder>git status
    # On branch master
    #
    # Initial commit
    #
    # Changes to be committed:
    #   (use "git rm --cached <file>..." to unstage)
    #
    #       new file: FileOne.txt
    #
    # Untracked files:
    #   (use "git add <file>..." to include in what will be committed)
    #
    #       FileTwo.txt

    C:\SharedFolder>git add .

    C:\SharedFolder>git status
    # On branch master
    #
    # Initial commit
    #
    # Changes to be committed:
    #   (use "git rm --cached <file>..." to unstage)
    #
    #       new file: FileOne.txt
    #       new file: FileTwo.txt
    #
As you have seen, we can use “git add” to be picky about which files we include. In advanced usage you can even tell git which parts of a file to include. The last command, “git add .”, is a shortcut telling git to stage everything in the current folder that is new or has changed.
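If you would rather have a script tell you which files are still untracked, one option is to read git's machine-friendly status output. The Python sketch below assumes your Git supports `git status --porcelain`, where each line starts with a two-character code followed by a space and the path, and `??` marks an untracked file. The function name is made up for illustration, and quoted or renamed paths are not handled.

```python
def classify_status(porcelain_text):
    """Split `git status --porcelain` output into untracked and changed paths."""
    untracked, changed = [], []
    for line in porcelain_text.splitlines():
        if len(line) < 4:
            continue  # too short to hold "XY path"
        code, path = line[:2], line[3:]
        if code == "??":
            untracked.append(path)
        else:
            changed.append(path)
    return untracked, changed


# Sample text mirroring the state after "git add FileOne.txt" above:
# FileOne.txt is staged as a new file, FileTwo.txt is still untracked.
sample = "A  FileOne.txt\n?? FileTwo.txt\n"
untracked, changed = classify_status(sample)
print("untracked:", untracked)
print("staged or changed:", changed)
```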
Lastly, we are going to commit our changes to git. In effect, this is creating a checkpoint within git. Now any time in the future we can roll back to exactly this state, regardless of how many changes we have made, even if we have deleted the files entirely. As long as the hidden “.git” directory is there, all our changes are there too.
    C:\SharedFolder>git commit -am "Initial Commit"
    [master (root-commit) 436162e] Initial Commit
     2 files changed, 2 insertions(+), 0 deletions(-)
     create mode 100644 FileOne.txt
     create mode 100644 FileTwo.txt

    C:\SharedFolder>git status
    # On branch master
    nothing to commit (working directory clean)

    C:\SharedFolder>git log
    commit 436162e50d2075366634064793ef7ef8051da871
    Author: unknown <[email protected](none)>
    Date:   Sat Feb 13 12:12:41 2010 -0600

        Initial Commit

    C:\SharedFolder>
Like the flux capacitor in Dr. Brown’s DeLorean, git is doing most of the work in our little “time machine”. Since all of the hard work has already been done, there is only a small script we need to write to “steer” git.
    cd C:\SharedFolder
    git add . && git commit -am "Daily Update"
Yep, that’s really all there is to it.
So go ahead and save this code as a batch file. Test it a few times. After you run it you should be able to do a “git log” and see a new revision (assuming that changes have been made).
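If a batch file is not your thing, the same idea can be sketched in Python. This is an alternative under a few assumptions, not the script this article uses: it expects `git` to be on the PATH, relies on `git status --porcelain` printing nothing when there is nothing to record, and the `snapshot` function and its `run` parameter are names invented here (the latter only so the logic can be tried out without touching a real repository).

```python
import subprocess


def snapshot(repo_dir, message="Daily Update", run=subprocess.run):
    """Stage everything in repo_dir and commit it.

    Returns True if a commit was made, False if there was nothing to commit.
    """
    def git(*args):
        # check=True raises CalledProcessError if git exits with an error.
        return run(["git", *args], cwd=repo_dir,
                   capture_output=True, text=True, check=True)

    git("add", ".")
    status = git("status", "--porcelain")
    if not status.stdout.strip():
        return False  # working directory clean, skip the commit
    git("commit", "-am", message)
    return True


# Example use (the path is hypothetical):
#   snapshot(r"C:\SharedFolder")
```

Checking the status first means a quiet day ends without an error from `git commit`, which keeps the scheduled task's log a little cleaner.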
At this point all there is left to do is to set up your batch file as a scheduled task and wait…
git init - Create a new git repository
git add SomeFile - Add SomeFile to the list of files that git will track
git add . - Stage every new or changed file in the current folder so that git will track it
git commit SomeFile - Commit the changes to SomeFile
git commit -am "My Message" - Commit all changes to tracked files and tag them with the message “My Message”
git log - Review the past changes
In part two we will be discussing exactly what you can do with this wonderful contraption we have built. We will learn how to compare changes, see new files that have been added, and bring back old files that have been deleted or changed. See you…in the future.