Add-On Collections: A Trail Framework for Firefox Goodies

Moz has begun Add-On Collections (via Ajaxian):

Today, we’re excited to introduce a new feature to our website that will expose the niche add-ons that can be hard to find, and gives users a more active role in helping outstanding add-ons bubble to the top. One thing we’ve learned as add-ons have grown in popularity over the years is that once a user finds an add-on they love, they become a fan for life. We see this all the time as people recommend add-ons to their friends and write great reviews. And we’re very happy to see so many bloggers writing about lists of their favorite add-ons.

Now this is interesting for a couple of reasons. First, it’s an example of the URL Trail pattern: Firefox Collections is a framework allowing people to publish and share a collection of links, much like Amazon’s “So you want to …” ListMania. It adds the ability to do something that neither I, nor Vannevar B., mentioned about trails: the ability to subscribe to a trail, in order to see changes made, e.g. here is the RSS feed for the Social Circuit collection.

The second hotness about Collections is that Firefox add-on management is getting easier. You can click on a bunch of add-ons, as if adding them to a shopping cart, and install them all in one go:

It takes Firefox closer to where it should be, IMO: add-ons easily installed with one click, and distros available for specialised needs, e.g. a developer build containing Firebug and other dev tools. Moz themselves might be wary of blessing certain extensions in this way, so the ideal situation might be for them to provide a framework like Collections, where anyone can propose a bundle and the best bundles rise to the top, with a way to yoink a Firefox build containing all the add-ons in a collection.

As We May Think: URL Trails

As We May Think is Vannevar Bush’s legendary paper presaging hypertext, wikis, and the web, penned in 1945. Many features of his fictitious “memex” device have come to fruition on the web, but one thing I always find lacking is his concept of trails:

When the user is building a trail, he names it, inserts the name in his code book, and taps it out on his keyboard. Before him are the two items to be joined, projected onto adjacent viewing positions. At the bottom of each there are a number of blank code spaces, and a pointer is set to indicate one of these on each item. The user taps a single key, and the items are permanently joined. In each code space appears the code word. Out of view, but also in the code space, is inserted a set of dots for photocell viewing; and on each item these dots by their positions designate the index number of the other item.

Thereafter, at any time, when one of these items is in view, the other can be instantly recalled merely by tapping a button below the corresponding code space. Moreover, when numerous items have been thus joined together to form a trail, they can be reviewed in turn, rapidly or slowly, by deflecting a lever like that used for turning the pages of a book. It is exactly as though the physical items had been gathered together from widely separated sources and bound together to form a new book. It is more than this, for any item can be joined into numerous trails.

The owner of the memex, let us say, is interested in the origin and properties of the bow and arrow. Specifically he is studying why the short Turkish bow was apparently superior to the English long bow in the skirmishes of the Crusades. He has dozens of possibly pertinent books and articles in his memex. First he runs through an encyclopedia, finds an interesting but sketchy article, leaves it projected. Next, in a history, he finds another pertinent item, and ties the two together. Thus he goes, building a trail of many items. Occasionally he inserts a comment of his own, either linking it into the main trail or joining it by a side trail to a particular item. When it becomes evident that the elastic properties of available materials had a great deal to do with the bow, he branches off on a side trail which takes him through textbooks on elasticity and tables of physical constants. He inserts a page of longhand analysis of his own. Thus he builds a trail of his interest through the maze of materials available to him.

And his trails do not fade. Several years later, his talk with a friend turns to the queer ways in which a people resist innovations, even of vital interest. He has an example, in the fact that the outraged Europeans still failed to adopt the Turkish bow. In fact he has a trail on it. A touch brings up the code book. Tapping a few keys projects the head of the trail. A lever runs through it at will, stopping at interesting items, going off on side excursions. It is an interesting trail, pertinent to the discussion. So he sets a reproducer in action, photographs the whole trail out, and passes it to his friend for insertion in his own memex, there to be linked into the more general trail.

Now at one level, any web page is a trail. Thanks to the magic of hypertext, any website is effectively a view onto the rest of the web, with the author being able to list links and comment on them. However, I don’t find much in the way of explicit support for this concept of trails.

Basically, I see a trail as a linear sequence through a bunch of resources, with the potential to annotate at each step. On the web, a resource is represented as a URL, so assuming everyone plays ball, building a trail is as simple as building a list of URLs, with some form of annotation for each. That’s the nice thing about URLs – it doesn’t matter if it’s a written article, a photo, or a video; they are all URLs. Furthermore, it doesn’t even have to be served via HTTP; it could be a file on your hard drive (file:), an email address (mailto:), or an IRC channel (irc:). They can all live together in harmony in the same trail.

The closest popular thing I’ve seen to this is Amazon’s ListMania. My favourite ones are those beginning with “So you …” as in So you want to be a geologist or So you like to read about the Trojan War. These capture the spirit of Vannevar B’s tome; as a budding geologist, I might find a book or two on geology and see that list linked from the side, where I can get the (hopefully) expert opinion on the trail I can follow towards mastery.

There is also some interest around outlining, which is basically the trail concept taken in a web context, although you don’t see it being a big part of the web explicitly.

I was chatting to Jon last night and he refers to these as tour guides, and pointed me to The Nethernet, an unusual “game” that appears on websites you surf, based on Firefox plugins. It includes a concept of a “quest” where you can jump between websites, following a trail laid out by someone else.

All of this has been converging lately with several activities happening at Osmosoft. As a project in my spare time, I’ve been working on a tool to caption images. A trail of captioned images would be good as a slideshow. As a project in his spare time, Simon has been working on an excellent video player. The player runs through a playlist, which can be populated in various ways. Each of the videos is effectively a URL, so the playlist is effectively a trail. And in TiddlyDocs, one of our main projects at Osmosoft, a document is a hierarchy of sections, each being a tiddler sitting on a unique (TiddlyWeb-backed) URL. So the document’s table of contents is also a trail. Jeremy realised the implications of this early on, and we’ve begun talking with him about a richer format for the document spec, so it can include annotations and transitions between items. Note that it’s also a hierarchy rather than a list, but that’s fine because (a) a list is a subset of a hierarchy, so the scheme we arrive at will be able to degenerate into a list representation; (b) a hierarchy can still be traversed in a deterministic, linear fashion – as you do when you read a book composed of sections, subsections, and so on.

So where we’re at now is deciding on the format for a trail like this. It will probably be JSON-based and be inspired by OPML at some level. Being a file format which will be baked into people’s content, it needs a little more upfront thinking than a coding exercise.
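To make the discussion concrete, here is the kind of structure I have in mind – purely a hypothetical sketch, not a format we’ve settled on. Each item is a URL plus an annotation, and items can nest, OPML-style:

    {
      "title": "So you want to learn geology",
      "items": [
        {
          "url": "http://example.com/intro-to-geology",
          "annotation": "Start here: a gentle overview.",
          "items": [
            {
              "url": "mailto:geologist@example.com",
              "annotation": "A side trail: any URL scheme works."
            }
          ]
        },
        {
          "url": "http://example.com/fieldwork.jpg",
          "annotation": "Next step once the overview makes sense."
        }
      ]
    }

A flat trail is then just the degenerate case where no item has children.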

The URL Shortener as a Cloud Database

On URL Shorteners

URL shorteners are enjoying their 15 minutes of fame right now. They’ve been around since 2002, but became flavour of the month as soon as half the planet decided to compress their messages into pithy 140-character microblogs. And there is money in it, which is driving a flood of new players into the market – and will ultimately lead to a flood of URL-shortener-induced linkrot. [Update Dec 2011 – I note that the URL shortener I used for a while, 3.ly, is now indeed linkrot :(.]

In passing, I will note the irony that long domain names were the flavour of the month a year ago. Although, maybe it’s not so ironic, since they enjoy a symbiotic relationship with the URL shorteners when you think about it.

Now, I recently realised that URL shorteners could be used as a form of cloud database. The URL is a form of data. And the interesting thing is that they form a cloud database accessible from any Ajax app, because they (a) can be created anonymously; (b) offer JSONP APIs, in some cases (with third-party bootleg APIs available in others); (c) allow you to store relatively long strings. Before you cry “violation of Terms and Conditions”, bear with me – I will get to that later on.
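To give a taste of (b), here is roughly what access from an Ajax app looks like. This is a minimal JSONP sketch; the endpoint, parameter names, and response fields are invented, so substitute the actual API of whichever shortener you use:

    // Generic JSONP helper: load the URL via a script tag and route the
    // response into a temporary global callback.
    function jsonp(url, callback) {
      var name = '_cb' + Math.floor(Math.random() * 1e9);
      window[name] = function (data) {
        window[name] = undefined;
        callback(data);
      };
      var script = document.createElement('script');
      script.src = url + '&callback=' + name;
      document.getElementsByTagName('head')[0].appendChild(script);
    }

    // Hypothetical shortener call - no login, no proxy, just cross-domain JSONP.
    jsonp('http://shortener.example.com/api/create?url=' +
          encodeURIComponent('http://example.com/some/very/long/url'),
          function (response) {
            alert('Stored at ' + response.shortUrl); // hypothetical field
          });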

Character Limits

On (c), just how long can these URLs be? I did a little digging – gave them some huge URLs to convert, using just the homepage of each service. I chose the top services from Tweetmeme’s recent study, minus FriendFeed’s internal shortener, to come up with the four most popular services: tinyurl.com, bit.ly (my candidate for the first URL shortener to appear on the cover of Rolling Stone magazine, in case you ever doubted a URL shortener could be the in thing), is.gd (the one I’ve been using since it was a wee thing spouting three-character shortcuts), and Tweetburner, aka twurl.nl.

I was expecting them all to truncate at around 2083 characters, the traditional limit for IE. Boy, was I wrong!

I started playing around adding really long URLs, playing a “Price Is Right” higher-higher-lower-higher game until I found roughly the capacity of each.
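For the record, the “game” is just a manual binary search. Scripted against a hypothetical accepts(length) predicate – does the service store a URL of this length intact? – it would look something like this:

    // Binary search for the largest accepted URL length.
    // Assumes accepts(lo) is true and the true capacity lies in [lo, hi].
    function findCapacity(accepts, lo, hi) {
      while (lo < hi) {
        var mid = Math.ceil((lo + hi) / 2);
        if (accepts(mid)) {
          lo = mid;      // works at mid, so capacity is at least mid
        } else {
          hi = mid - 1;  // fails at mid, so capacity is below mid
        }
      }
      return lo;
    }

In practice I did the probing by hand on each service’s homepage.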

Note that Bit.ly and Twurl.nl both give the impression they are storing more than their limits, i.e. they don’t show an error message, but instead they silently truncate the URL. Is.Gd does the right thing by telling you what it’s done. Although, the limits are weird – you would think they’d go for IE’s 2083 character limit, or be all binary and go for 2048, rather than cutting off at 2000. I guess 2000 is a simpler number to tell humans about.

The most interesting one here is TinyURL. The actual underlying URL stops working at some point, for reasons unclear – the most characters I found that would work was 8192 – but the entire URL is stored regardless, as you can see at the preview page.

A Legitimate, Related, Use: Shortening an Ajax Unique URL (with Fragment ID Reflecting App State)

The thought of using URL shorteners might sound crazy, useless, and a violation of terms, but it came to me for an entirely legitimate application, which I believe is well within the T’s and C’s. I’m creating a web app right now (very incomplete) where the entire state is captured in the URL (see Unique URL). This saves me from having to set up any storage and (in some respects) makes life easier for users, who don’t have to manage yet another account, log in, etc etc. It certainly lowers the barriers for new users if they don’t have to register in order to save things.
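For the curious, the mechanics are simple. A minimal sketch of the idea – the function names are illustrative, and older browsers would need a JSON library such as json2.js:

    // Serialise the app's entire state into the fragment identifier...
    function saveState(state) {
      location.hash = encodeURIComponent(JSON.stringify(state));
    }

    // ...and restore it on startup. The URL itself is the storage.
    function loadState() {
      var hash = location.hash.replace(/^#/, '');
      return hash ? JSON.parse(decodeURIComponent(hash)) : {};
    }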

Saving the entire state in a URL can lead to a long URL. So with all the hype around URL shorteners, I figured: why not let the user save it to a short URL, if they prefer one for mailing, writing down, or memorising (since some of these services let you specify the key)? And so I might choose to build into the app a little “get short URL for bookmarking and tweeting” button. (Funnily enough, I would have previously called it “bookmark this”, but that would mislead users into thinking that the long URL on top isn’t actually a valid bookmark. Now that everyone understands URL shorteners, I can be more explicit about the button’s purpose.)

The short URL is effectively a holder of the entire state of this user’s application. In fact, this seems like an entirely valid reason to use a URL shortener, so I doubt it’s a violation of anyone’s terms. Worth noting, incidentally, that there are plenty of free image hosts where you can anonymously upload 100K or more, so I doubt a 10K URL is a big deal; and given that the service receives link love and some useful tracking data, it’s probably just as valuable financially as an image sent to an image host.

A Pure Cloud Database

You can see where this is going. An extension to this thought is to simply treat the URL shorteners as cloud databases. As long as it looks like a valid URL, you could store whatever you like there. It turns out you can even store an image or a short MP3 as a data: URI. I have no plans to do this, and I suppose it actually would be a violation of terms in this case, but it’s an interesting idea.

And if the URL was too long, you could always use a linked-list structure – break it up into several short URLs, with the last few characters of each source URL pointing to the previous short URL. (It’s backwards because you can only embed a pointer to a short URL once you know it has been allocated, and you would distribute the last URL in the series.)

  • http://tinyurl.com/mark3 start of the message mark2 (this is the URL you distribute)
  • http://tinyurl.com/mark2 middle of the message mark1
  • http://tinyurl.com/mark1 end of the message
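Here is a sketch of how the allocation might run, purely to illustrate the backwards ordering; shorten(longUrl, key) is a hypothetical call that stores longUrl under a custom key:

    // Store a long message as a chain of short URLs, creating the last
    // chunk first so each chunk can embed the key of the chunk after it.
    function storeAsChain(message, chunkSize, shorten) {
      var chunks = [];
      for (var i = 0; i < message.length; i += chunkSize) {
        chunks.push(message.slice(i, i + chunkSize));
      }
      var nextKey = '';
      for (var j = chunks.length - 1; j >= 0; j--) {
        var key = 'mark' + (chunks.length - j);  // the end chunk gets mark1
        shorten('http://example.com/#' +
                encodeURIComponent(chunks[j] + (nextKey ? ' ' + nextKey : '')),
                key);
        nextKey = key;
      }
      return nextKey;  // e.g. "mark3" - the one you distribute
    }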

There is actually prior art on this concept, I discovered – some anon poster recently created a proof-of-concept cloud DB, with encryption to boot. There were no replies to that post and it seems to have gone unnoticed, which is unfortunate. So allow me to dig it out:

In almost obvious violation of their terms of service (maybe not entirely – they technically are URLs, just with random data tacked on), I’ve created a way to securely store arbitrary-length data on URL shortening services like tr.im, bit.ly, tinyurl, etc.

You have to pass both the message and a key. The key is SHA-1’d, and then the message is encrypted with the key by AES-256. The message is split into 200-byte chunks and it loops through them. For each one, a special salt variable exists for no particular reason, is mixed with the key and a packet identifier number (part 0 = 0, part 1 = 1, so amazingly complex), and all of that is again SHA-1’d. It’s truncated to 14 digits. The part of the data is prepended with a pseudo-URL, and the URL is passed to the URL shortener API, with the 14-digit string used as the custom short URL. The last packet is appended with a special last-packet marker.

http://jsbin.com/ixuda

As We May Think

All this makes me think about what kind of JSON-based cloud services there should be on the web – services explicitly designed for this kind of purpose, and thus better suited to it. I bet you could build something nice along those lines with the TiddlyWeb server.

The biggest restriction with all this is that the services are write-once. E.g. if you make a pretty poster, tinyurl it, and send the link, you can never change the poster someone will see when they visit that link (because the link directly represents the composition of your poster, rather than being a pointer to the composition on the server). So this heavily limits the applicability of the concept, but if users are willing to live with that restriction, it’s a big benefit in terms of simplicity. You could also overcome the restriction using some of the newer URL shortening services that let you log in and maintain your shortcuts. But that would (a) defeat the purpose of simplicity; (b) defeat the purpose of working in Ajax apps, since it would require privileged JSON calls, and privileged JSON calls are wildly insecure.

SVG and VML in One Chameleon File

Why a Chameleon File?

While most browsers support SVG, IE’s unique brand of interoperability does not extend that far; even the latest and greatest incarnation, v8 of IE, has no sign of SVG. And so, we citizens of the innernets are left with two vector graphics formats: VML for IE, SVG for standards-compliant browsers, which I will simply refer to as “Firefox” for concreteness.

There are tools like Project Draw capable of rendering both SVG and VML, and there are convertors like Vector Convertor as well. You can easily end up in a situation where you have two separate files for the same image. In one’s endless quest for a simple life of zen-like existence, this is troublesome. You will have to mail the two files around or host the two files somewhere. And if someone changes one, they probably won’t bother changing the other one.

One solution would be to convert to a raster/bitmap file, i.e. GIF or PNG, which will then render fine in IE and standards-compliant browsers as well as many other places too. However, this isn’t always the best option: (a) if you want to support further editing, you will need the original vector data; (b) it won’t scale as nicely; (c) in some cases, the bitmap size is bigger by orders of magnitude.

So a colleague asked me how one could share this type of data, and it got me thinking about a recent experiment. I need to blog it separately, but the premise is that a single file can be different things to different people. In this case, we did some work yesterday – described here – on making a single file look like VML to something that works with VML (IE) and look like SVG to something that works with SVG (Firefox et al). In the general case, I call this pattern a “Chameleon File”, and the particular kind of chameleon described here a “Vector Graphics Chameleon File”.

Demo

The demo link is below – it shows an ellipse in any browser, using either VML or SVG, whichever is supported:

http://project.mahemoff.com/vector/ellipse.xhtml

IE users are better off using http://project.mahemoff.com/vector/ellipse.html, as the .xhtml version won’t launch automatically in IE – Windows doesn’t recognise the xhtml extension – though it will still launch once you tell Windows to open it with IE. The content is the same; only the URL is different. I’ll explain more below in “Limitations”.

(The example is taken from Wikipedia’s VML page.)

How?

How does it work? Let’s look at the source:

    <html xmlns:v="VML">
    <!--[if IE]>
    <style>v:*{behavior:url(#default#VML);position:absolute}</style>
    <![endif]-->
    <body>
      <v:oval style="left:0;top:0;width:100;height:50" fillcolor="blue" stroked="f"/>
      <svg xmlns="http://www.w3.org/2000/svg" width="100" height="50">
        <ellipse cx="50" cy="25" rx="50" ry="25" fill="blue" stroke="none" />
      </svg>
    </body>
    </html>

From Firefox’s perspective, the file is valid XHTML thanks to the .xhtml suffix, meaning that the svg tag is fair game. We use the old “if IE” comment trick to get Firefox to ignore the style rule; otherwise it will still work, but it will render the style tag’s content (this is XML, not HTML, which would have it un-rendered). It ignores the body and VML v:oval tag, and faithfully renders the SVG. In short, it does what the pure SVG does:

    <?xml version="1.0"?>
    <!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
      "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
    <svg xmlns="http://www.w3.org/2000/svg" width="100" height="50">
      <ellipse cx="50" cy="25" rx="50" ry="25" fill="blue" stroke="none" />
    </svg>

From IE’s perspective, this is just a normal VML file with an svg tag snuck in, which thankfully for our purposes, it ignores. So IE sees the file as just regular old VML:

    <html xmlns:v="VML">
      <style>v:*{behavior:url(#default#VML);position:absolute}</style>
    <body>
      <v:oval style="left:0;top:0;width:100;height:50" fillcolor="blue" stroked="f"/>
    </body>
    </html>

Limitations

Limitations, I have a couple of those.

File Extensions

The extension thing is really annoying (see also here). To get SVG working in Firefox, you need Firefox to see the file as XHTML. But to get VML working in IE, IE must see it as HTML. How do you make the browser “see the file” as something? You either set the MIME type in the HTTP header, or you set the file’s extension. In this case, we’re more interested in the file extension because we want to be able to just mail the file around – in which case there is no MIME type because there’s no HTTP header because there’s no HTTP because they’re viewing it off a file:/// URL – or we want to quickly stick it on a server and not bother faffing with .htaccess-fu and MIME types.
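(For completeness: the .htaccess-fu isn’t much if you do control the server. The classic content-negotiation trick is to serve the same file as XHTML only to browsers that advertise support for it – a sketch, untested in this exact scenario:

    # Serve .html as application/xhtml+xml to browsers that accept it
    # (Firefox), leaving it as text/html for those that don't (IE).
    RewriteEngine On
    RewriteCond %{HTTP_ACCEPT} application/xhtml\+xml
    RewriteRule \.html$ - [T=application/xhtml+xml]

But the whole point here is the no-server, mail-it-around case, so back to extensions.)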

Now that being the case, what file extensions do we need? As far as I can tell, IE must have .html or .htm for the vanilla Windows operating system to open it.

As for Firefox, it needs .svg, .xml, or .xhtml, as far as I can tell.

The problem here is that there is no overlap – IE needs .html or .htm; Firefox needs .svg, .xml, or .xhtml.


I spent a while trying to coerce Firefox to see a .html file as XHTML using doctype and the like, but I can’t do it – any help here would be appreciated.

The consequence is that you have several possibilities: (a) call it .xhtml/.svg/.xml – it will run in Firefox, and IE users will have to “open”, “choose application”, and set IE (they can save that setting); (b) call it .html (or .htm, but that’s just silly) and tell Firefox users to rename the file; (c) distribute two copies of the same file – this defeats the purpose of simplicity to some extent, but since it’s still the same file, it’s not such a big deal; you can keep working with the one file until you’re ready to distribute it. Of (a) and (b), I prefer (a), because asking someone to “open with application X” is less onerous than asking someone to rename a file, which sounds a bit odd. On the other hand, in many enterprise situations you have to optimise around IE users, in which case (b) may well be preferable. You could probably also ship the file with a little instruction to rename it, targeted at non-IE users via CSS/Javascript hacks, which they would see on opening the file in its HTML guise – a sketch follows.
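Here’s the kind of thing I mean – a rough sketch using a reversed conditional comment, so the note renders in anything except IE. (You’d also want to hide the note once the file is opened in its XHTML guise, which I haven’t tackled here.)

    <!--[if !IE]><!-->
    <p>
      Seeing this text instead of a drawing? Rename the file so it ends
      in .xhtml, then reopen it.
    </p>
    <!--<![endif]-->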

SVG and VML Feature Sets

I haven’t experimented much with these, but I did find a larger SVG file that didn’t work properly. I’m hoping Project Draw will introduce an easy way to render both SVG and VML for the same file, which would let me script creation of SVG-VML chameleon files for experimentation; or I might just play around with a converter. Any data on this would be welcome.

The Chameleon File Pattern

… is rather interesting and useful. Some interesting analogies came up in discussion – one of them was the “old woman, young woman” optical illusion. I discovered it’s called a “Boring figure” after the experimental psychologist Edwin Boring. I thought “chameleon file” sounded more appropriate than “Boring file”!

Another analogy from Paul was the Rosetta Stone – same content in a different format. I think this is a good analogy, but at the same time, I guess the pattern could be used to contain different content too. That’s more the case with the JOSH Javascript-HTML Chameleon I’ll discuss later on.

It also reminds me of Zelig, an old Woody Allen flick about a “human chameleon” who tries to be all things to all people, although I think chameleon files have more of their own unique identity.

Chameleon File is a hammer; let’s go find some nails.

Thanks to my colleagues for ideas and inspiration.

Meeting with Aptana’s Kevin Hakman – Talking Jaxer

Regular readers of this blog will know I am a big fan of Javascript on the server. Through its Jaxer project, Aptana has been at the forefront of the server-side Javascript renaissance. As I’m in SF right now (mail or tweet me if you’d like to meet up), I was pleased today to be able to catch up with Kevin Hakman, a great talent in the Ajax world who runs community relations and evangelism for Aptana.

I wanted to get an overview of the state of Aptana and Jaxer, and also understand how it might be used in a TiddlyWiki context to help with accessibility.

We discussed a range of things Jaxer – I will list them below. Kevin reviewed a draft of these points and gave me some links and corrections.

  • Jaxer – as you probably know, Jaxer is an Ajax engine. It’s server-side Javascript, but more than that: you can perform DOM manipulation on the server, and it spits out HTML from there (see the sketch after this list). Very exciting for all the reasons I’ve mentioned in previous blog posts – using a single language, making proxied calls from client to server, etc. The great thing about Jaxer is it’s open source – you can download a version with Apache that “just works”: download, install, and start the server in one click, and the sample apps are up and running. Aptana runs a cloud host for Jaxer (as well as Rails/PHP etc), so you can host with them if you don’t want to host it yourself. They have some interesting value-adds around their host, like support for different environments (dev/test/etc with different parameters, such as extra debugging), which you can easily spin up and turn off again.
  • ActiveJS – ActiveJS is a Jaxer-powered framework modelled on Rails and close to beta. The most mature part is ActiveRecordJS (ORM), but there is now ActiveView and ActiveController as well. As with Rails, you can use them independently. This is exactly what I’ve been hoping for and it sounds like it’s mature enough to start hacking against. Kevin is realistic about uptake and expects it to be a long process.
  • Microformats – Kevin directed me to this Jaxer project which filters parts of a form depending on who is viewing it. You just use a different class name to indicate who can view what, and the server chooses which parts to output. This got me thinking about microformats and how Jaxer could help render them – with a single library, dynamically in the browser and statically from the server. There are a number of microformat libraries which might already be useful sitting on a Jaxer server.
  • TiddlyWiki – Taking TiddlyWiki-based apps and using Jaxer to make them come statically from the server. Since Jaxer supports DOM manipulation, it should certainly be possible. Kevin said accessibility is certainly an argument for Jaxer, though there’s no significant demo to date, so a TiddlyWiki project would make a good demo. His intuition is that it will be 90-10: 90% straightforward, with the remaining 10% causing some complications. It would definitely make an interesting project to do a simple version, e.g. initially just get Jaxer to output each tiddler, showing title and wiki markup.
  • Demo app – Aptana TV is implemented in Jaxer+ActiveJS and there are plans to open source the implementation and make it Jaxer’s “Pet Store”.
  • Interoperating – Jaxer can interoperate with other languages using browser extensions/plugins, since the server is basically Firefox; these are not extensions users have to install, but act as enhancements to the server. Just as a user can install a plugin to Firefox so web apps can make calls to Java for example, a developer can patch the server with the same plugin (or similar, but with the same API anyway?). Kevin said most of these kinds of components are XPCOM, but I guess Jaxer should accept regular Firefox extensions and even Greasemonkey scripts. DotNet COM example. In a simpler sense, you can call out to other languages using the OS command line. System commands have always been possible from the Jaxer server.
  • FF bugs – Jaxer shares Firefox’s bug set and fixes, when you think about it!
  • Performance – Likewise, Jaxer benefits from all the stuff you read about sky-rocketing Javascript performance in the browser. Jaxer will inherit the performance boost from TraceMonkey, which promises to be silly-fast – 20x to 40x speed improvements for JS, according to Moz.
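To give a flavour of the DOM-manipulation point above, here is a minimal sketch, from memory of the Jaxer docs, of how a Jaxer page is put together – treat the details as approximate rather than gospel:

    <html>
      <body>
        <div id="message">placeholder</div>

        <!-- Runs on the server before the page is serialised and sent;
             the server-side DOM is fully scriptable. -->
        <script runat="server">
          document.getElementById('message').innerHTML =
              'Rendered on the server at ' + new Date();
        </script>

        <!-- Functions in a server-proxy block run on the server, but are
             callable from the browser as if they were local. -->
        <script runat="server-proxy">
          function add(a, b) { return a + b; }
        </script>
      </body>
    </html>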

Fun with Fragment Identifiers

I was recently invited to make a statement on WebMasterWorld about something in advanced Javascript at the edge of my understanding. I decided to cover fragment identifiers, something I have an odd obsession about:

There’s a lot of cool stuff going on right now, standards-based graphics and sound foremost in my mind. But I’ll focus here on something I’ve always found oddly fascinating: fragment identifiers and the quirky exploits around them. The humble “frag ID” is not exactly the pinnacle of the Ajax revolution, but it remains misunderstood and under-utilised by developers, and a feature which the browsers could do more to support. By design, fragment identifiers do just as their name suggests – they identify fragments of a document. Type in http://example.com#foo, and the browser will helpfully scroll to the “foo” element. However, they can be used for a lot more than that, thanks to peculiarities in the way they are handled by browsers.

Most importantly, by writing Javascript to change the fragment identifier, you’re able to update the page URL without effecting a page reload. This fact alone means – contrary to a popular Ajax myth – you can support bookmarking and history in an Ajax application. Furthermore, fragment identifier information is never sent to the server, unlike the rest of the URL, which leads to some interesting applications related to security and privacy. Another powerful application is interacting between iframes from different domains, something which is theoretically “impossible” in older browsers and tremendously important for embedding third-party widgets on web pages. And when I began working at Osmosoft, I discovered yet another application: providing startup context to web apps, such as TiddlyWiki, which run off a file:// URL…as they’re unable to read conventional CGI parameter info.

I first researched this stuff for Ajax Design Patterns in 2005, and at that time, there were only a couple of scattered blog articles on the topic. Today, we can see the pattern in action in various high-profile places such as Google Docs and GMail. When I view my trashed documents in the former website, for example, the URL is http://docs.google.com/#trash. That said, many developers still aren’t using this feature to its fullest, and library support is somewhat lacking. It’s encouraging to see IE8 has a new event for monitoring the fragment identifier, which suggests the story is not over for this rarely discussed property.
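The monitoring itself is straightforward. Here’s a sketch of the usual approach at the time of writing – IE8’s native event where available, polling elsewhere:

    var lastHash = location.hash;

    function onHashChanged() {
      // Restore application state from location.hash here.
    }

    if ('onhashchange' in window) {
      window.onhashchange = onHashChanged;  // IE8's new event
    } else {
      setInterval(function () {             // fallback: poll for changes
        if (location.hash !== lastHash) {
          lastHash = location.hash;
          onHashChanged();
        }
      }, 100);
    }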

This is certainly a hot topic, if not well understood or covered; shortly after I wrote it, I was amused to see the topic showed up on none less than Techmeme:

Techmeme: Google's new Ajax-powered search results breaks search keyword tracking for everyone (getclicky.com/blog) - FireMoff (-;

The issue here is summarised in my statement: “fragment identifier information is never sent to the server, unlike the rest of the URL, which leads to some interesting applications related to security and privacy”. These bloggers had noticed that Google was (in some cases) changing their URL structure for searches to use hashes (#q=term) instead of traditional CGI params (?q=term). Thus, when the user clicks through from Google to your web page, you don’t know what the user searched for.

Another thing that recently arose is how readily bookmarking-type sites retain fragment identifiers. For example, if you submit a URL to Digg, it annoyingly strips the fragment ID, whereas Delicious retains it. TinyURL retains it too, but their PHP “create” script removes it, according to a conversation I had with Fred, who created this TinyURL bookmarklet. I can see why sites wouldn’t want to retain it – it may well be just an in-page link – so it’s a tricky issue.

The main point being, fragment IDs – and design questions thereof – are alive and well.

Explaining the “Don’t Click” Clickjacking Tweetbomb

I just noticed some of my Twitter friends posting the following mysterious message:

“Don’t Click: http://tinyurl.com/amgzs6”

It turns out this is a Tweetbomb. If you go to that link and click the button shown below, you end up tweeting the same thing:

… thus all your friends see it and some of them click on it and re-post it, and so on, thus propagating the message across the entire Twittersphere.

TinyURL shut down the redirect quickly and Twitter has responded, but the same attack could arise unless measures are taken. Of which, more later.

So how does this hack work?

The hack is an example of clickjacking. (I’ve heard the term a lot, but only understood its meaning after the investigation of this tweetbomb described here.)

Firstly, it’s using an iframe to embed Twitter.com on the page. The iframe is essentially invisible, due to the CSS structure…more on that in a moment. But first, looking at the source of the iframe, we spot our mysterious message:

<iframe src="http://twitter.com/home?status=Don't Click: http://tinyurl.com/amgzs6" scrolling="no"></iframe>

Well, I didn’t know about this before, but it turns out you can set an initial value for the status field by passing a “status” parameter on the URL. So here, it’s setting the status to our mysterious message. You can try it yourself by visiting http://twitter.com/home?status=Don’t Click: http://tinyurl.com/amgzs6. (It’s safe to do so – nothing will get submitted, honest!) For the lazy and untrusting ;), what you’ll see at that URL is something like this, assuming you’re logged in to Twitter:

However, their job is not done yet, because they also have to trick you into clicking the “update” button. This is where more cunning comes into play. Check out the CSS for the iframe – containing your Twitter.com page – and the fake button contained on this trickster website:

    iframe {
      position: absolute;
      width: 550px;
      height: 228px;
      top: -170px;
      left: -400px;
      z-index: 2;
      opacity: 0;
      filter: alpha(opacity=0);
    }
    button {
      position: absolute;
      top: 10px;
      left: 10px;
      z-index: 1;
      width: 120px;
    }

And the elements:

<iframe src="http://twitter.com/home?status=Don't Click: http://tinyurl.com/amgzs6" scrolling="no"></iframe>

<button>Don't Click</button>

We can see from the CSS z-indexes that the iframe sits on top of the button, and from the iframe’s opacity that it is completely invisible. Hmmm…so it’s on top, but completely invisible. If there was a button there, you wouldn’t be able to see it, but you would still be able to click on it. This is where the layout comes in (position, width, height, top, left). The iframe is positioned just right so that the (invisible) update button appears near the top-left of the web page at (10,10), right where the fake button is. So the Update button is directly above the fake button but invisible. The website shows a fake button and tempts you to click on it, but clicking on it causes the real Twitter button – inside the iframe – to fire. Thanks to the URL status hack, you have just submitted their planted message. And of course, you couldn’t see the message, because the iframe is hidden, and because that part of the iframe isn’t shown anyway due to the negative left and top settings. There is also some filler text on the trickster page, below the fake button – I guess this is in case some browsers expand the iframe height if nothing else is on the page.

The main way Twitter can prevent this kind of attack is to use the old “bust the iframe” idiom. As I explained in Cross-Domain Communication with IFrames, it’s possible for an iframe to access the URL of its container, and actually re-set that value. Thus the common pattern:

    if (top.location.href != self.location.href)
      top.location.href = self.location.href;

If Twitter had that code on its web page, whenever it was loaded inside an iframe, the user would see the entire page clear and refresh to become the Twitter URL that was in the iframe. I actually think it’s a shame they would have to resort to something like that, because there are valid uses for iframes, as WebWait demonstrates. Any site that uses this pattern cannot bathe in the warm glow of WebWait :). But I can see why sites need to take this measure. (Update: Twitter has updated their code to bust out of the iframe, just after I posted this!)

Clay Shirky Talk – London ICA Feb 4, 2009

Just back from Clay Shirky’s talk. I’ve been following him for years, so it was great to see him live.

Notes taken on my phone, typo beware!

Today – launching softback edition of book and updating thesis since then (incl recent events including Obama). “speculative”

“group action just got easier” So what happens when u lower the cost of doing these things

3 weeks ago – international take-your-pants-off-on-mass-transit day!!! (and keeping a straight face) — the event had no special technology but was only made possible by technology

It’s tough to get a group to do anything … Traditionally required public sector (if social cause) or private sector (if profit motive) – pants off day: neither public nor private motives, but cost so low it was possible to take place anyway

  • Ryerson study group on Facebook. Student charged; uni’s model is a newspaper publishing results, and many students will be leeching.

students’ model – just a regular study group, updated for modern tech.

  • the thing Facebook is most like … is Facebook! How do we update academic culture to take advantage of these possibilities? There’s no easy way out of consolidating these two mindsets. The changes will come bottom-up.

Thai fashion student. Article about fly fishing iPhone game. If you ask why she would write it, the answer is “she’s not speaking to you!” And why not write it – it’s cheap.

So then she was the first to post the coup. Everyone was watching and her next post was about a hello kitty iPhone!!!! She’s not a professional journalist and doesn’t want to be one, but she committed an “occasional act of journalism”. But an occasional act times a billion people …. !

Obama. In 2006 a black president seemed impossible. Always a sense he was a noble cause but not going to get anywhere. The “Yes We Can” will.i.am music video set in people’s minds the thought it might be possible.

The artist never asked Obama’s permission, nor was he asked to do it. But u could say the permission was there implicitly, by certain things being released as Creative Commons and by the whole ethos of the campaign, which made lots of media available. In contrast, the McCain campaign wanted complete control, eg provided quotes to cut and paste. Plenty of Obama media like that.

“Sing for Change” shows the Republicans can play this game too…a screw-up by an Obama supporter, but no-one blamed Obama. Those days are past, when the organisation was held completely responsible for the actions of a supporter.

Obama site v functional, mostly for organising groups to ** take action **.

Group “Pres Obama, please get FISA right” – the group’s popularity meant that Obama had to respond.

The big question now is, will he govern like he campaigned? Change.gov launched the day after he got in …. Said “wisdom of crowds”

no. 1 issue was legalisation of marijuana. Clearly special interest group pressure.

Question and Answer

Q How to build org when lieutenants don’t want to, want command and control?
A One way is to find pressure from outside, eg Obama can try to rely on pressure from the media to stop senators negotiating insanity. “institutions are homeostatic”

Q Codes of conduct for social media in public spaces
A The “invisible college” – historic concept of practices of reporting and conducting research.

Q ??? A Doesn’t matter that people can game Digg, because it’s just a media outlet and people can leave it and go elsewhere. But u can’t easily change countries. So still not solved how u can take voting seriously.

Q Why can’t we learn from open source mechanisms?
A Gov mechanism that works best is benevolent dictatorship. Linus, Jimmy Wales, Guido. Because there’s always the threat of forking if the project goes astray. Again, we can’t easily switch governments or move. Moving from the UK to France is harder than switching from OpenBSD to FreeBSD.

Q New business models that will emerge from the current crisis?
A Sanger and Linus exceedingly modest when introducing their products (ie predictions – hard to say). P2P models for money (eg prosper.com – lending from small groups, typically people who know u and will watch u like a hawk). These models – based on emotional connection – may be more accurate than algorithmic models. “mutualisation”. The big question is whether states will support it.

Q Diff b/n FISA and marijuana examples.
A FISA result – in a group that opted in, during a campaign. Once u have to govern, u have to govern everybody.

Q Role for Holders of cultural content at museums.
A Smithsonian project uploading to Flickr. Only got 6000 images out. The tech is available; the institutional change is the key concern. Every day, meetup.com gets someone in their design dept to watch a user working on their website. Similarly, museums etc should do the same thing, low key.

WebWait in the Wild

I haven’t blogged about WebWait since launching it almost two years ago. But it’s been quietly growing into a pretty well-known site among certain communities in various languages. What I like seeing is that people are actually getting value from it…it’s not just a novelty, but a tool for them. Here are some examples.

A conversation about blog timings … and soup recipes:

At December 7, 2008 6:24 PM, Blogger Arfi Binsted said… I love clear soup like this. Kalyn, for some reason, almost 2 months, my browser goes so slowly when I come to your blog. I am not sure where it comes from. Today, I try to open it from the link on my blogroll, it works! Just as well it is a click away from a miracle just to happen for Barbara :) hugs. At December 7, 2008 6:41 PM, Blogger Kalyn said… Maris, this is one of the best parts of blogging in my opinion. Arfi, thanks for telling me. I don’t know what’s going on because I use Webwait.com and check and my blog always seems fine there. I’m going to remember to check back with you and see if it is still a problem. I do hope you’re right that it’s a good omen!

A softly-softly Twitter encouragement for the TechCrunch boys:

liking http://webwait.com/ – http://fav.or.it took 3.04s – (pretty quick!) – @techcrunch took 13s!! – sort it out boys

A tech journalist compares browser speeds:

I used WebWait.com to test how quickly Chrome 0.2, Firefox 3, Safari 3.1, and Internet Explorer 7 loaded the InformationWeek.com home page. The results for three page loads averaged were: Firefox (5.21s) Safari (6.34s), Chrome (6.48s), Internet Explorer (8.90s).

A reviewer notes how much caching helps and I discover in the process I can almost grok Italian geek-speak:

La seconda volta grazie alla cache WebWait ci ha messo 5.17. Un tempo discreto direi. (Roughly: “The second time, thanks to the cache, WebWait clocked in at 5.17. A decent time, I’d say.”)

A blogger produces a “webwait research report” – schweet!:

Ko nih, tanya sket pun tak boleh. Oklaa aku tunggu ko buat positioning tuh. Sambil2 tu jom buat research… :: WebWait Research Report :: Site: farisfakri.com: Average load time after 15 runs: 0.11s telescopictext.com: Average load time after 15 runs: 0.12s google.com: Average load time after 15 runs: 1.11s Blog: ladycoder.com: Average load time after 15 runs: 5.29s blog.farisfakri.com: Average load time after 15 runs: 8.12s soleh.net: Average load time after 15 runs: 9.02s noktahhitam.com: Average load time after 15 runs: 10.07s life4hire.berceloteh.com: Average load time after 15 runs: 11.99s kujie2.com: Average load time after 15 runs: 14.10s berceloteh.com: Average load time after 15 runs: 14.52s Terima kasih Yam, kerana memberi aku kerja. Laporan kajian aku mendapati feedjit, nuffnang, mybloglog, dsb adalah antara yang menjadi punca utama kelembapan loading sesebuah blog. (MM – Google Translate says “Thank Yam, because I gave work. I found the study report feedjit, nuffnang, blogspot, etc. are among the main source of humidity loading a blog.”)

WebWait is just one way to get an impression of speed, as the FAQ explains. And in cases like those above, it can give people a handy snapshot without relying on any browser-specific plugins.

People also love screencapping their WebWait results, as this Google Images search illustrates. It would be nice to somehow make a gallery of those. Anyway, I’ve got some time off coming up, and one of my projects will be to make some long-overdue updates to the site, while ensuring it stays dead simple to use.