What’s New in On-Demand Javascript? ScaleCamp Presentation

These are the slides from my ScaleCamp presentation, covering various techniques for various styles of On-Demand Javascript, including Script Islands, “async” and “defer” attributes, and library support.

One of the interesting things at ScaleCamp is that many of the folks there are dealing with third-party advertisers and analytics providers. This leads to very different on-demand Javascript drivers from those of us mostly dealing with enterprise web apps. If you’re a small fish in a big pond, you’re pretty much held hostage to the script tags of leading advertisers and trusted analytics companies in your industry; if they serve the script slowly, it blocks your page from loading and therefore messes with user experience. That’s why some of these new techniques are so critical, notably the dynamic async script tag technique the Google Analytics came out with at a very timely moment, just a couple of days before this presentation.

Embedded Images in TiddlyWiki Under IE6 via MHTML – Proof-of-concept

tiddlywiki mhtml images (by mahemoff)

I only came across the MHTML image hack over the weekend, while listening to @jeresig on the new jQuery podcast (incidentally not the only jQuery podcast to be launched in the past week or two).

The MHTML image hack lets you embed images with text, just like ye olde data: URI hack, but in a way that works for IE6. MHTML is MIME for HTML.

Of course, I immediately wondered if it could work in a single-page web app like tiddlywiki, and it turns out it can, though my quick exercise still has some problems.

IE6 TiddlyWiki images demo here

As I wrote on the demo itself:

Normally, images are contained in a separate location, pointed at from HTML IMG tags or from CSS background-image properties. However, Tiddlywiki is a style of web app where everything resides in one file. So how to include images?

The usual hack is to embed data: URIs (http://en.wikipedia.org/wiki/Data_URI_scheme). However, no go for IE6 and IE7. Hence, a “newer” technique – newer meaning recently discovered. That is, MHTML (http://www.phpied.com/mhtml-when-you-need-data-uris-in-ie7-and-under/).

I was curious if MHTML worked in single-file HTML pages, reading off a file URI, and to my surprise it does. That said, it’s not perfect at all. Firstly, I had to hard-code the location, because I don’t know how to refer to “the current file” within the MHTML link. (I suppose a workaround would be to output the image file and refer to that with a relative URL, but we lose the benefit of everything being in one file.) Secondly, I played around with various base-64 images and this arrow one (from the phpied.com demo) was the only that worked :(.

So it’s a proof-of-concept with many gaps left for the reader to fill!

Hopefully people play with this further. At 3002 days old and counting, your grandpa’s browser isn’t going anywhere fast.

Fun with Fragment Identifiers

I was recently invited to make a statement on WebMasterWorld about something in advanced Javascript at the edge of my understanding. I decided to cover fragment identifiers, something I have an odd obsession about:

There’s a lot of cool stuff going on right now, standards-based graphics and sound foremost in my mind. But I’ll focus here on something I’ve always found oddly fascinating: fragment identifiers and the quirky exploits around them. The humble “frag ID” is not exactly the pinnacle of the Ajax revolution, but it remains misunderstood and under-utilised by developers, and a feature which the browsers could do more to support. By design, fragment identifiers do just as their name suggests – they identify fragments of a document. Type in http://example.com#foo, and the browser will helpfully scroll to the “foo” element. However, they can be used for a lot more than that, thanks to peculiarities in the way they are handled by browsers.

Most importantly, by writing Javascript to change the fragment identifier, you’re able to update the page URL, without effecting a page reload. This fact alone means – contrary to a popular Ajax myth – you can support bookmarking and history in an Ajax application. Furthermore, fragment identifier information is never sent to the server, unlike the rest of the URL, which leads to some interesting applications related to security and privacy. Another powerful application is interacting between iframes from different domain, something which is theoretically “impossible” in older browsers and tremendously important for embedding third-party widgets on web pages. And when I began working at Osmosoft, I discovered yet another application: providing startup context to web apps, such as TiddlyWiki, which run off a file:// URL…as they’re unable to read conventional CGI parameter info.

I first researched this stuff for Ajax Design Patterns in 2005, and at that time, there were only a couple of scattered blog articles on the topic. Today, we can see the pattern in action in various high-profile places such as Google Docs and GMail. When I view my starred documents in the former website, for example, the URL is http://docs.google.com/#trash. That said, many developers still aren’t using this feature to its fullest, and library support is somewhat lacking. It’s encouraging to see IE8 has a new event for monitoring the fragment identifier, which suggests the story is not over for this rarely discussed property.

This is certainly a hot topic, if not well understood or covered; shortly after I wrote it, I was amused to see the topic showed up on none less than Techmeme:

Techmeme: Google's new Ajax-powered search results breaks search keyword tracking for everyone (getclicky.com/blog) - FireMoff (-;

The issue here is summarised in my statement: “fragment identifier information is never sent to the server, unlike the rest of the URL, which leads to some interesting applications related to security and privacy”. These bloggers had noticed that google was (in some cases) changing their URL structure for searches to use hashes (#q=term) instead of traditional CGI params (?q=term). Thus, when the user clicks through from Google to your web page, you don’t know what the user searched for.

Another thing that recently arose is how easily you can retain fragment identifiers in bookmarking type sites. For example, if you submit a URL to Digg, it annoyingly strips the fragment ID. Whereas Delicious retains it. TinyURL retains it too, but their php “create” script removes it, according to a conversation I had with Fred, who created this tinyURL bookmarklet. I can see why sites wouldn’t want to retain it, because it may well be just an in-page link, so it’s a tricky issue.

The main point being, fragment IDs – and design questions thereof – are alive and well.

Chasing super

A lot of discussion around object-oriented Javascript involves finding cunning methods to get a super reference. This is sometimes a reference to the super class, the super instance, or the super version of the current method. The latest installment in an Ajaxian posting this week on work by Erik Arvidsson. It makes for an interesting enough intellectual exercise and I have full respect for anyone who can pull these things off. But as Erik himself has commented, is it really worth it? (“Once again I find that adding code to make things simpler in JS is unsuccessful. The costs for adding better ways to do things don’t pay for itself. (Except for the inherits function of course.”)

In the case of super…how useful is super – any of the types of super – in practice? In a more general sense, it seems that most discussions of OO and inheritance in Javascript focus more on mimicking the features of typical OO languages and less about how they are actually used. We’re thinking more about the design patterns of implementation – how to “do OO” – than the much more important design patterns of application.

super isn’t very useful for two reasons:

  • Inheritance is overrated.

    Inheritance is often considered the killer app of OO. In fact, the killer app is encapsulation – combining data and behaviour in a single model. Inheritance is a great feature, but it’s icing on the cake compared to the magnificence of encapsulation. Also, as the uber-uber GoF patterns book emphasised, delegation often trumps inheritance. In enterprise Java-land, this has now been firmly entrenched by the dependency injection “revolution”. The idea is basically small dumb things, loosely connected, just like the Unix philosophy and that of Web 2.0. You have a small Strategy class that does just that, and you can inject it into many and varied classes. You just wire up the dependencies, outside of both classes. What you don’t have is a whopping big monolithic class with loads of subclasses that each have their own subclasses doing different things. The delegation model relies on contract inheritance, i.e. Interfaces in Java terms, but not on behaviour inheritance. Inheritance is often confused for design-by-contract, especially in languages like C++ which don’t have an explicit Interface construct.

    The popularity of tools supporting delegation and dependency injection, Spring in particular, means many developers are learning this principle by sheer osmosis if not explicitly. Likewise, the duck typing of languages like Ruby and Python – and, notably for our purposes, Javascript – means you can do this stuff well without any special frameworks. Furthermore, even with Ajax starting to reach some level of maturity, most Ajax apps are orders of magnitude smaller than those enterprise apps whose complexity is a key motivation for inheritance. Inheritance? Good, very good…but overrated.

  • super is a code smell.

    For those occasions when inheritance is appropriate, super still remains inappropriate in most cases. If you take a gander at the aforementioned GoF patterns, you’ll see that most inheritance-related patterns rely on the superclass calling particular methods on the subclass (usually protected methods). These are methods the superclass has explicitly defined and knows about. As long as the protected method fulfills its contract correctly, everything works nicely. There’s no need to call super.

The more fundamental point here? Javascript ain’t Java! Every language is unique.

WordPress “Edit This” Links via Ajax

Working on a WordPress customisation recently, I added an “Edit This” link which only logged-in people can see. To get caching right, the server always outputs the same thing – an invisible link – and only in the browser does the decision get made to show the link or not. This exemplifies the pattern I only ever identified as “Ajax as a Remedy for the Cacheability-Personalization Dilemma”, but which I will now call “Browser-Side Personalisation”.

In this case, the personalised content is not secret, so it’s fine to output it in the HTML, and simply make it invisible by default. And it’s not mission-critical either, so the app degrades nicely if Javascript isn’t present – it simply won’t be displayed and the user will have to go into the “Manage Posts” area and look up the post from there.

To output “Edit This” from the server, within a WordPress loop:

javascript
< view plain text >
  1. <div class="editPostLink"><? edit_post_link() ?></div>

Links of this nature are rendered invisible within layout.css:

  1. .editPostLink { display: none; }

… but we switch them on if the user is logged in:

javascript
< view plain text >
  1. var loggedIn = /wordpress_logged_in/.test(document.cookie);
  2.   if (loggedIn) $(".editPostLink").show(); // it's easy with JQuery

Offline Sound: No Flash, No File

I just did some tinkering with offline sound and it turns out you can embed an audio clip in an HTML file, and play it without using Flash. I could have demo’d this in a raw HTML file, but it was just as easy to stick it in a Tiddlywiki file. So I made a TiddlyWiki called JinglyWiki. Click on the button and it will play a sound. If you grab the file (safest using a tool like Curl or wget, as the browser will sometimes Save As something different), you can run it offline and it will still play the sound.

JinglyWiki started after an office conversation yesterday. We had a flaky wifi connection and were pleased to be able to play a beat on http://instantrimshot.com to signify a re-connection. This got us thinking about playing beats in Tiddlywiki. Jeremy mentioned the possibility of data: URIs, and I thought back to Reinier’s great effort in playing sound without Flash. I wondered if it would work with data: URIs and to my pleasant surprise, it did. At least in FF3, which is all I’ve tested so far. I also forgot how tiny the code is to play audio without Flash – you just add an element to the page.

The point of JinglyWiki is you can stick a 300K file on your local file system or a USB stick and play gratuitous sounds without being online. Your iPod cost you several hundred bucks, whereas JinglyWiki is entirely free. Free as in you can spend your money on beer instead. JinglyWiki may well be the best portable music player you never paid a penny for. Groovy.

It would be so much more useful with some code to dynamically generate WAVs, but that’s a project for another day.

As for the name, it’s a play on the much more important project Phil Hawksworth has initiated to build a JQuery based Tiddlywiki framework: JigglyWiki. I think this nascent effort is going to produce some wonderful plugins for the JQuery community and be an important enough project to derive novelty joke names out of :), so I’m setting the trend here.

The data itself is 15KB for about a second of voice content.

Here’s the plugin code. Trivial or wot?!!!

javascript
< view plain text >
  1. config.macros.jingle = {}
  2. config.macros.jingle.handler = function(place,macroName,params,wikifier,paramString,tiddler) {
  3.  
  4.   button = document.createElement("button");
  5.   button.style.fontSize = "4em";
  6.   button.innerHTML = "Play Me! Play Me!"
  7.   button.onclick = function(ev) {
  8.     startWav(SOUND_URL);
  9.   }
  10.   place.appendChild(button);
  11.  
  12.   // cancel dbl-click so we can follow our natural urge to click on the button incessantly
  13.   story.getTiddler(tiddler.title).ondblclick = function() {}
  14.  
  15. }
  16.  
  17. // non-flash sound handling adapted from http://www.zwitserloot.com/files/soundkit/soundcheck.html
  18.  
  19. var embedEl;
  20.  
  21. function startWav(uri) {
  22.         stopWav();
  23.         embedEl = document.createElement("embed");
  24.         embedEl.setAttribute("src", uri);
  25.         embedEl.setAttribute("hidden", true);
  26.         embedEl.setAttribute("autostart", true);
  27.         document.body.appendChild(embedEl);
  28. }
  29.  
  30. function stopWav() {
  31.   if (embedEl) document.body.removeChild(embedEl);
  32.   embedEl = null;
  33. }
  34.  
  35. // To make a new sound, cut and paste into hixie's handy Data URI Kitchen
  36. // http://software.hixie.ch/utilities/cgi/data/data.pl
  37.  
  38. var SOUND_URL="data:audio/x-wav,RIFF%17%1C%00%00WAVEfmt%20%10%00%00%00%01%00%01%00%40%1F%00%00%40 ........................" // snipped

Frame-Busting Gadgets

In the questions after my @media ajax talk, Simon Willison asked about frame busting. If gadgets sit inside iframes, what’s to stop them from busting the frame, i.e. replacing the container with another website. I notice he made a similar comment when OpenSocial came out. If a gadget can cause iGoogle to go away in place of another website, it could for example launch a phishing attack by presenting a mock iGoogle login page (“iGoogle has timed out. Please re-enter your username and password.”).

At a technical level, there is actually nothing to prevent this. There’s no magic in the container. IFrames are about as good as it gets in modern browser for taming the gadgets, but there are still problems with that model, and one of those is the ability for iframes to bust out. I decided to create a demo of this. The gadget is here. You can drop it into iGoogle and see for yourself. Video below:

[kml_flashembed movie="http://picupper.com/video/framebust.swf" height="700" width="400" /]

(Parenthetically, the video is vertical. Why are videos always horizontal on the internets? It’s a computer, not a TV! It should be just as easy to embed an dodecahedron video if that’s what I fancy at the time.)

At this stage, there’s no technical way around this. One possibility would be for the container to do something in the rarely-used onexit() method, which runs just before the new page is loaded. However, as I learned with WebWait, which is also vulnerable to frame-busting, there’s nothing you can really do at that stage, and you can’t determine if the exit is happening because of a frame-busting event as opposed to the user just typing in a new URL. I suppose onexit() could still be used to log info to the server – if you knew all the times the container exits and which gadgets were present at that time, you might find certain (malicious) gadgets were present more times than you’d expect if they were present randomly. Although, I’m not sure about how reliable it is to make an Ajax call in the onexit(). (In this case, though, it would only need to work some of the time.)

In the future, the technical solution is Caja, if it proves to be production-ready. Caja ensures a safe subset of Javascript, so a container like iGoogle would only ever serve a gadget if it was verified to be safe.

Until then, we have to rely on “social” mechanisms. i.e. user comments in the catalogue and manual checking by container staff. A while ago, iGoogle made it harder to add a gadget directly by URL – you basically have to use the catalogue unless you’re a developer. So the catalogue is effectively acting as a whitelist – once found to be malicious, a gadget could be removed. This isn’t simple though – gadgets are dynamic and the catalogue is effectively just a list of URLs. A malicious author could have their gadget added to the catalogue and then change it, so the gadgets need constant review – a highly ineffective process.

I also mentioned in the presentation that “html” type gadgets are more recommended than “url” gadgets. This is partly because with html gadgets, the container at least has some idea of what’s happening inside the gadget – though not entirely, since the code can still reference external scripts and services. With “url” gadgets, what happens is entirely controlled by an external server.

Relative Paths and On-Demand Calls in Gadgets

A problem with the current opensocial gadget spec is that there’s no relative path support. This means you end up hard-coding any references to Javascripts, CSS stylesheets, images, and services which are distributed along with your gadget.

This is not good. For example, you may have a “prod” setup and a “dev” setup. While developing, you will have to hard-code <script> tags and so on to hit the “dev” location, but then for production, you will want them pointing at your scripts in the production location. Likewise, you might want to ship the bundle to someone else for them to host on their own network.

You can deal with it using a pre-compiler tool – you know, using sed-like token substitution to piff in the right location. But a pre-compilation step is a messy compromise – I just want to distribute the code and be done with it. Another solution would be to output the gadget spec from a script, which would let you solve the problem in one of two ways: (a) inline common Javascripts and stylesheets so that you don’t even need to link to them; (b) piff in the right location on the fly. Again, this is not ideal. It introduces a new server-side language and demands the spec server supports it. My aim here is simple distribution of gadgets – as long as you can host a tree of static files, you can serve my gadgets.

The optimal solution right now is far from ideal, but is the best way I know to do it. You use On-Demand Javascript (and on-demand CSS and on-demand <insert resource type here>). First, you work out the gadget spec location, then you manipulate the URL to find the required Javascript, and then you pull it down using DOM manipulation.

I’m using a compressed version of something like this:

javascript
< view plain text >
  1. var gadgetURL = _args()["url"];
  2.     function loadScript(scriptURL, onLoaded) {
  3.       var baseURL=gadgetURL.replace(/[a-zA-Z0-9_]+.xml(?.*)?/,"")
  4.       if (scriptURL.indexOf("http")!=0) scriptURL = baseURL + scriptURL;
  5.       if (debugMode) scriptURL+="?" + (new Date()).getTime();
  6.       var script = document.createElement("script");
  7.       script.src = scriptURL;
  8.       if (onLoaded) script.onload = onLoaded;
  9.       document.body.appendChild(script);
  10.     }

It’s partly based on a mailing list comment from Arne Roomann-Kurrik. Massaging further, I end up with the following boilerplate block which must be cut-and-paste into each gadget to bootstrap reuse:

javascript
< view plain text >
  1. _IG_RegisterOnloadHandler(function() { var script = document.createElement("script"); script.src = _args()["url"].replace(/[a-zA-Z0-9_]+.xml(?.*)?/,"") + "../core.js"; script.onload = initialise; document.body.appendChild(script); });

After dropping this at the bottom of my gadget, I know that (a) ../core.js will be pulled down and executed – it is relative path from wherever the gadget came; (b) initialise() will be called once core.js has been loaded. Each of my gadget’s has an initialise() function which does gadget-specific initialisation. core.js is a common library and also happens to include loadScript() in case I want to pull down further Javascripts, and also loadStylesheet() so I can grab a stylesheet with relative path.

All this is not at all ideal, for several reasons:

  1. For all the benefit of neatly using a separate Javascript file, you have to cut-and-paste boilerplate code into every page! But at least the boilerplate won’t have to be changed often.
  2. Performance will suffer as the script is loaded later than it should be.
  3. It relies on the “url” being passed into the gadget; to my knowledge, this is not something you can definitely assume will happen, i.e. it’s not mandated by the opensocial standard that the gadget receives a parameter called “url” identifying where the gadget came from.

I have requested on the opensocial spec mailing list to provide direct relative path support and the response has been positive, hopefully we’ll see it soon, thus rendering large parts of this post obsolete. The solution looks to be rather neat – injecting a <base> tag to establish a base URL which all relative paths work against.

Cross-Domain Communication with IFrames

An update in the era of HTML5 (May 6, 2011)

This post has been heavily commented and linked to over the years, and continues to receive a ton of traffic, so I should make it clear that much of this is no longer relevant for modern browsers. On the one hand, they have adjusted and tightened up their security policies, making some of the techniques here no longer relevant. On the other hand, they have introduced technologies that make it easier to do cross-domain communication in the first place.

With modern browsers, you can and should be using postMessage for this purpose.

Library support is now available too. All of the following provide an interface to postMessage where it’s available, but if it’s not, fall back to the primordial techniques described in this article:

Now back to the original post … (from March 31, 2008)

This article explains iframe-to-iframe communication, when the iframes come from different domains. That you can do this effectively is only now becoming apparent to the community, and is now used in production by Google, Facebook, and others, and has powerful implications for the future of Ajax, mashups, and widgets/gadgets. I’ve been investigating the technique and working some demos, introduced in the article.

Background: Cross-Domain Communication

Ironic that in this world of mashups and Ajax, it’s not very easy to do both of them together. Ajax applications run in the browser and such applications were never intended to talk to anything but the server from whence they came. So it’s not easy to mash content from multiple sources, when everything must be squeezed through the originating web server. A few hacks have arisen over the years to deal with this, such as On-Demand Javascript, and the most recent one is a hack involving iframes, which I’ll explain in this article. As we’ll see later, the iframe technique is arguably more secure than On-Demand Javascript, and it’s also better places for communication within the browser, i.e. from one iframe to another.

Related to this article is a demo application and a couple of variants.

The first mention I’ve seen of this hack originated on James Burke’s Tagneto blog in June, 2006, though I’m fairly certain it’s been used in some quarters long before that. It’s now used in production by Google in Mapplets. It’s also used in Shindig for widget-container communication. The technique also happens to be the best way to make safe cross-domain calls from the browser directly to a third-party server, which is why it is employed by Facebook’s new Javascript Client Library.

The Demo

First, let’s see what we can do with this hack.

Demo

In this demo, we have a control on the top-level document affecting something in the iframe and vice-versa. This shows you can run communication in both directions with this technique. Typical of the technique, the communication is between two browser-side components from different domains (as opposed to browser-to-server communication, although there is actually server communication involved in making this happen).

The Laws of Physics: What you can do with IFrames

To understand the hack, we need to understand the “laws of physics” as they apply to iframes and domain policies within the browser. Once you appreciate the constraints in place, the pattern itself becomes trivial. This demo was created to explore and illustrate these constraints, and contains some simple code examples.

Definition I: A “window” refers either to an iframe or the top-level window (i.e. the “main” page). In our model, then, we have a tree-like hierarchy of windows.

Law I: Any window in the hierarchy can get a handle to any other window in the hierarchy. It doesn’t matter where they live within the hierarchy or which domain they come from – with the right commands, a window can always refer to any other window. Parent windows are accessed as “parent”, “parent.parent”, etc., or “top” for the top-level. Child windows are accessed as “window.frames[0]” or “window.frames[name]“. Note in this case that the name is not the iframe’s id, but rather the iframe’s name. (This reflects the legacy nature of all this stuff, relating back to ugly late-90s frames and framesets.) Thus, to get a sibling handle, you might use “parent.frames[1]“.

Law II: Windows can only access each others’ internal state if they belong to the same domain. This rather puts a kibosh on the whole cross-domain cross-iframe thing. All this would be so easy if iframe scripts could talk to each other directly, but that would cause all manner of security shenanigans. HTML 5 does define explicit communication between iframes, but until wide adoption, we have to think harder …

Law III: Any window in the hierarchy can set (but not read) any other window’s location/URL, even though (from Law II) browser security policies prevent different-domain iframes from accessing each other’s internal state. Note: Exact details for this law needs further investigation Again, it doesn’t matter which domain it comes from or its position in the hierarchy. It can always get a handle on another window and can always set the window’s URL, e.g. “parent.frames[1].location.href”. This establishes window URLs as the one type of information on the page which is shared across all windows, regardless of the domain they come from. It seems sensible that a parent can change its child windows’ URLs, BUT not vice-versa; how strange that a child window is allowed to alter its parent’s (or uncle’s, sibling’s, etc.) URLs! The only justification I know of is the old technique of escaping the frame trap, where a website, upon loading, ensures it’s not inside a frame by simply setting the top-level URL – if it’s different to itself – to its own URL. This would then cause the page to reload to its own URL. However, that’s a special case and hardly seems worth justifying this much leeway. So I don’t really know why you can do this, but lucky for us, you can!

Law IV: When you change the URL’s fragment identifier (the bit on the end starting with a #, e.g. http://example.com/blah#fragmentID), the page doesn’t reload. This will already be familiar to you if you’re familiar with another Ajax hack, Unique URLs to allow for bookmarkability and page history. Normally, changing a document’s “href” property causes it to reload, but if you only change the fragment identifier, it doesn’t. You can use this knowledge to change the URL symbolically – in a manner which allows a script to inspect it and make use of it – without causing any noticeable change to the page content.

Exploiting the Laws of Physics for Cross-Domain Fun and Profit – The Cross-Domain Hack (URL Polling version)

The laws above are all we need to get cross-domain communication happening. The technique is simply this, assuming Window A wants to control Window B:

  • Window A changes Window B’s fragment identifier.
  • Window B is polling the fragment identifier and notices the change, and updates itself according to the fragment identifier.

Ta-da!!! That’s the whole thing, in its glorious entirety. Of course, you had to know the laws of physics in order to understand why all this works. It simply relies on the fact that both Window A and Window B have one common piece of state – the URL – and the fact that we can change the URL unintrusively by manipulating only the fragment identifier. For example, in the demo, the iframe’s URL changes to http://ajaxpatterns.org/crossframe/#orange and once the iframe script notices it, it updates the colour.

A few observations:

  • This works in either direction. Parent to child, child to parent. As the demo illustrates.
  • It requires co-operation from both parties; it’s not some magic way to bypass browser security mechanisms. Once Window A changes Window B’s fragment identifier, it’s up to Window B to act on the change; and it’s up to Window B to be polling the fragment identifier in the first place.
  • Polling the fragment identifier happens to be exactly the same technique used in the Unique URLs pattern.

There are a couple of downsides: (a) Polling slows down the whole application; (b) Polling always involves some lag time (and there’s always a trade-off a and b – the faster the response, the more cycles you application uses up); (c) The URL visibly changes (assuming you want to manipulate the top-level window). We’ll now consider a second technique that addresses these (albeit in a way that introduces a different downside).

The Cross-Domain Hack (Marathon version)

Here’s a variant which no longer involves polling or changing any URLs. I learned of it from Juliene Le Comte’s blog, and he’s even packaged it as a library.

Looking back at Law II: “Windows can only access each others’ internal state if they belong to the same domain”. At the time, I made this sound like a bad thing, but as David Brent likes to say, “So, you know, every cloud …”. The law is bad if you state it as “cross-domain iframes can’t play with each others’ toys” (paraphrasing the informal version of Demeter’s law). But it’s good if you spin it as “well, at least same-domain iframes can play with each others’ toys”. That’s what we’re going to exploit here.

As for the demo, the functionality is the same, but since this one involves spawning iframes, I’ve left them intact, and made them visible, for your viewing delight. Normally, of course, they’d be invisible, and the application would look exactly the same as the previous demo.

Here’s how this technique works:

  • Every time Window A wants to call Window B, it spawns a child iframe, “Window B2″ in the same domain as Window B. The URL includes the command being issued (as a CGI parameter, fragment identifier, or any other URL pattern which will be recognised by the destination script).
  • When window B2 starts up, its Javascript inspects the URL, gets a handle on Window B, and updates Window B according to the URL (e.g. a CGI parameter).
  • Window B2 destroys itself in a puff of self-gratified logic.

So in this case, we create a new, short-lived, iframe for every message being passed. Because the iframe comes from the same domain as the window we’re trying to update, it’s allowed to change the window’s internal state. It’s only useful to us on startup, because after that we can no longer communicate with it (apart from by the previous fragment identifier trick, but we could do that directly on the original window).

Window B2 is sometimes called a proxy because it accepts commands from Window A and passes them to Window B. I like to think of it as Pheidippides of fame; it passes on a message and then undergoes a noble expiration. Its whole mission in life is to deliver that one message.

This technique comes with its own downside too. Quite obviously, the downside is that you must create a new iframe for every call, which requires a trip to the server. However, with caching in place, that could be avoided, since everything that must happen will happen inside the browser. So it would simply be the processing expense of creating and deleting an iframe element. Note that the previous variant never changed the DOM structure or invoked the server.

Also, note that in either versions of the hack, there is the awkward matter of having to express the request in string form, since in either pattern, you are required to embed the request on the window URL. There is an inspired extension of this hack that also has some untapped promise in this area. It involves setting up a subdomain and updating its DNS to point to a third-party website. When combined with the old document.domain hack, you end up with a situation where your iframe can communicate with a cross-domain iframe, without relying on iframe. (The technique described in the article is about browser-to-server communication, but I believe this iframe-to-iframe is possible too.)

A Third Hack Emerges: Window Size Monitoring

A newer third hack by Piers Lawson is based around the porous nature of window sizes and the use of window.resize(). Fragment IDs are used like in the first technique here, but instead of polling, window resize events are used to cause a more direct trigger.

Applications

Cross-Domain IFrame-to-IFrame Calls … and Widgets/Gadgets

In the world of mashups, iframes are a straightforward way to syndicate content from one place to another. The problem, though, is limited interaction between iframes; in pure form, you end up with a few mini web browsers on a single page. It gets better when the iframes can communicate with each other. For example, you can imagine having iGoogle open, with a contacts widget and a map widget. Clicking on a contact, the map widget notices and focuses on the contact’s location. This is possible via Gadget-To-Gadget communication, a form of publish-subscribe which works on the iframe hack described here. And speaking of maps, check out Google Mapplets, which are a special form of gadget that work on Google Maps, and also rely on this technique.

In terms of gadgets, another application is communication between a gadget and its container, and this is something I’ve been looking at wrt Shindig. For example, there is a dynamic-height feature gadgets can declare. This gives the gadget developer an API to say “I’ve updated, now please change my height”. Well, an iframe can’t change its own height; it must tell its parent to do that. And since the gadget lives in an iframe, on a different domain as the container (e.g. iGoogle), this requires a cross-domain, cross-iframe, message. And so, it uses this technique (“rpc” – remote procedural call – in shindig terminology) to pass a message to the container.

Cross-Domain Browser-to-Server Calls

The best known technique for calls from the browser to an third-party server is On-Demand Javascript, aka Javascript APIs aka JSON/JSONP. This was obscure in 2005, with Delicious being the best example. Now, it’s big time in Web 2.0 API land, and Yahoo! has exposed almost all of its APIs this way, and Google also provides data such as RSS content via JSON.

It works by spawning a new script element programmatically, an element pointing to an external Javascript URL. Since there’s no restriction on third-party Javascript running, the browser will faithfully fetch and execute the script, and so the script will typically be written to “return” by updating variable and/or calling an event handler.

There are two major security issues with On-Demand Javascript. Firstly, you have to trust the API provider (e.g. Yahoo!) a lot because you are letting them run a script on your own web page. And there’s no way to sanitise it, due to the script tag mechanism involved. If they are malicious or downright stoopid, your users may end up running an evil script which could ask for their password, send their data somewhere, or destroy their data altogether. That’s because whatever your web app can do, so can the third party’s script, even if you’re only trying to get a simple value back. The mechanism forces you to hand over the keys to the Ferrari when all you want is a new bumper sticker. Secondly, what’s to stop other websites also making use of the external Javascript? If your own site can embed the script to call a third-party JS API, so too can a malicious fourth-party. This is fine for public, read-only, data, but what if you’re relying on the user’s cookies to make privileged calls to the third-party? Then the fourth-party’s web app will be just as capable of issuing calls from the browser to the third-party, and they might well be more evil calls than you’re making, e.g. “transferFunds()” instead of “getBankBalance()”. The moral is: Javascript APIs can only be used for serving public data.

Whoa!!! Public data only? That’s a tragic restriction on our cross-domain API! For mashups to be truly useful, it must be personal. We’ll increasingly have OAuth-style APIs where users will tell Site X that Site Y is allowed to read its data. But how can that work in the browser? How can Site Y expose its data so that it’s usable from the browser, but only when Site X is running? It can’t work with On-Demand Javascript. Site Y could try reading the referrer headers to see where the call is coming from, but anyone could write a command-line client with fake headers.

In fact, the answer is to use the iframe hack described in this article. As I mentioned earlier, this is how Facebook gets the job done, with what is essentially the same “power of attorney” delegation model as OAuth (BTW thanks to my colleague Jeremy Ruston for the “power of attorney” OAuth analogy – albeit it was stated in a slightly different context from OAuth).

I haven’t looked too much into the mechanism involved with the Facebook API, but it looks like it’s essentially using a variant of the Marathon technique. From memory, there’s an ever-present invisible facebook.com iframe. Each time your web app make a Facebook call, the Facebook JS library spawns a new “proxy” iframe, which passes the message on to its same-domain ever-present frame, which makes a bog-standard XHR call to Facebook. So now we’re making an XHR call to another domain, which we can get away with because it’s coming from a separate iframe. Once the XHR call returns, I think the message is returned to your application (this happens via another same-domain iframe you must host on your server, though I think that’s unnecessary) and the proxy iframe disappears.

Note that all Facebook ever exposes is a standard web service that relies on the user being logged into Facebook – there’s no Javascript involved. The user must be logged in and must have given permission for the application to access its Facebook details. Effectively, the user is allowing a particular website URL to make Facebook calls, since the application developer must register the URL. If you look back at the iframe algorithms I described earlier, you’ll see that it’s straightforward for Facebook to ensure that only this application (and any other application the user trusts) can access the data. The Facebook.com iframe (whose behaviour is controlled by IFrame and can’t be tampered) simply has to inspect the URL of the parent window and pass it to the server as part of the XHR call. The server can then check that the logged-in user has authorised this application, using the URL to identify it.

As for the first concern of cross-domain Javascript – having to trust the third-party API provider – I believe the iframe technique overcomes this concern too. Facebook.com never gets to run arbitrary Javascript on your server. Of course, you have to trust the Facebook library and the Facebook-provided iframe you’re required to host on your domain, but those could be audited prior to installation. All those things are set up to do is call callback methods inside your top-level application; you could inspect the library and ensure that’s all that will ever happen.

Thus, cross-domain iframe-based communication solves both problems which have plagued On-Demand Javascript. It is slightly more complicated, however.

Conclusions

Yes, this is a somewhat complicated technique. Actually understanding the problem it solves is really the hard part! Once you understand that, and once you understand those laws of physics, the trick is actually quite straightforward (either version of it).

The technique will be critical for gadget containers such as Shindig. As OAuth takes off, we’ll also see the technique used a lot more in mainstream applications and APIs.

With HTML 5, cross-frame messaging will render the hack unnecessary for iframe-to-iframe communication. Indeed, the aforementioned Cross-Domain library uses that technique already for Opera, in a fortuitous twist of fate since Opera doesn’t actually support everything this hack requires. However, the notion of using iframes for cross-domain calls will still be present, no matter how the windows talk to each other.

BlingText and Banner

Ajax, AjaxPatterns

As foretweeted last week, I created a little Ajax app called BlingText.

As you can see, it takes a message and provides some ASCII renderings. In particular, it includes a port of the old UNIX/C Banner utility.

If I do more work on it, the main improvements will be:

  • Options. Let the user specify, for each transformation, parameters such as the fill character (“*”) and amount of spacing.
  • Better OO (internal change). Each of the transformations is at present a terse strategy object, which is good. However, there’s no inheritance going on, so it could be better.