Visualising version history

I’m a redundancy fanboy. In visualisation, different formats suit different personalities and different tasks. With version control, the usual format is just a text log. This is good if you’re scanning for specific terms, but pretty ordinary for other activities – e.g. to get a feel for general trends that have arisen, the pace of change, or the rise and fall of specific contributors.

It’s encouraging, then, to see demos like the following, which shows the evolution of the Python language project (via Dion’s tweet).


code_swarm – Python from Michael Ogawa on Vimeo.

It reminds me of one of the first screencasts by Jon Udell, a fascinating walkthrough of the evolution of a wikipedia page over a year or so. The page he chose for this demo is as memorable as the message of the video itself.

These visualisations are cool as tasters for what might be, but they are “here’s one we made earlier”. Where are the tools to automate all this? I have no doubt such tools have been created in academic research projects, but let’s see them in action. I’d love to see the source code hosts – sourceforge, google code, github, et al – integrate this technology to produce visualisations on the fly.

Wikipedia as a Honeypot

How long until wikipedia becomes a honeypot?

A “Who Wants to Be a Millionaire” contestant is struggling to answer the question, “What year did the Fonz jump the shark?”, and calls out to Lifeline Buddy. Back in 2005, Lifeline Buddy would have googled for the answer. But this is 2007, and “wiki” is now a household name (the media refers to “wiki” and “wikipedia” interchangeably). Lifeline Buddy bangs out “fonz jump shark” into wikipedia’s search field and quickly finds the right page, reporting confidently the year was 1975. Only, it’s wrong; Fonzarelli, of course, jumped the shark in 1977. The producers had entered the fraudulent details at precisely the moment the lifeline was consulted. Contestant takes his $1000 consolation and exits, muttering something about the Britannica under his breath.

Producers wouldn’t stoop so low? If the BBC can do it, draw your own conclusion. The Register, in any event, would have a field day with their latest whipping boy.

Instead of resorting to restricting edits, wikipedia first needs to try out a “heat map” view to help people decide how stable the information is. Not as gaudy as the Ajax Patterns authoring heatmap (which uses a more subtle theme now), but some way for people to know what’s new and what’s old. Again, this comes back to the idea of separating out wiki content from presentation, ideally using some kind of web service. A wiki needs more than one view, even without any Maps/Flickr/Delicious mashup. For example, you could have three standard views:

  • Pure wiki reading, just like wikipedia today.
  • Stability view. e.g. Most content in white as now, but with a few shades of grey to distinguish how old each phrase is (darker grey = past minute, medium grey = past hour, light grey = past day; so a phrase “graduates” from grey to white as it matures).
  • Inspection mode. Full-on data mining interface, using Ajax (of course) to explore history, drill down to author info, etc.
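
To make the stability view a little more concrete, here is a rough sketch of how a phrase’s age could be mapped to a background shade. It assumes you already know when each phrase was last edited; the hex colours and the function names (shade_for_age, render_phrase) are made up for illustration, and only the minute/hour/day thresholds come from the list above.

```python
import html
from datetime import datetime, timedelta

# Thresholds follow the stability view described above.
# The hex values are assumptions; the post only says darker, medium and light grey.
SHADES = [
    (timedelta(minutes=1), "#999999"),  # edited in the past minute: darker grey
    (timedelta(hours=1), "#cccccc"),    # edited in the past hour: medium grey
    (timedelta(days=1), "#eeeeee"),     # edited in the past day: light grey
]
STABLE = "#ffffff"  # older than a day: the phrase has "graduated" to white


def shade_for_age(age):
    """Return a background colour for a phrase last edited `age` ago."""
    for limit, colour in SHADES:
        if age < limit:
            return colour
    return STABLE


def render_phrase(text, last_edited, now):
    """Wrap a phrase in a span shaded according to how recently it changed."""
    colour = shade_for_age(now - last_edited)
    return f'<span style="background-color: {colour}">{html.escape(text)}</span>'


if __name__ == "__main__":
    now = datetime(2007, 1, 1, 12, 0, 0)
    print(render_phrase("jumped the shark in 1977", now - timedelta(days=30), now))
    print(render_phrase("jumped the shark in 1975", now - timedelta(seconds=20), now))
```

In this sketch the freshly edited “1975” would come out in the darkest grey while the long-standing “1977” stays white, which is exactly the hint Lifeline Buddy needed. Working out the last-edited time of each phrase from the revision history is the harder part and is not shown here.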
