An update in the era of HTML5 (May 6, 2011)
This post has been heavily commented and linked to over the years, and continues to receive a ton of traffic, so I should make it clear that much of this is no longer relevant for modern browsers. On the one hand, they have adjusted and tightened up their security policies, making some of the techniques here no longer relevant. On the other hand, they have introduced technologies that make it easier to do cross-domain communication in the first place.
With modern browsers, you can and should be using postMessage for this purpose.
Library support is now available too. All of the following provide an interface to postMessage where it’s available, but if it’s not, fall back to the primordial techniques described in this article:
Now back to the original post … (from March 31, 2008)
This article explains iframe-to-iframe communication, when the iframes come from different domains. That you can do this effectively is only now becoming apparent to the community, and is now used in production by Google, Facebook, and others, and has powerful implications for the future of Ajax, mashups, and widgets/gadgets. I’ve been investigating the technique and working some demos, introduced in the article.
Background: Cross-Domain Communication
Related to this article is a demo application and a couple of variants.
First, let’s see what we can do with this hack.
In this demo, we have a control on the top-level document affecting something in the iframe and vice-versa. This shows you can run communication in both directions with this technique. Typical of the technique, the communication is between two browser-side components from different domains (as opposed to browser-to-server communication, although there is actually server communication involved in making this happen).
The Laws of Physics: What you can do with IFrames
To understand the hack, we need to understand the “laws of physics” as they apply to iframes and domain policies within the browser. Once you appreciate the constraints in place, the pattern itself becomes trivial. This demo was created to explore and illustrate these constraints, and contains some simple code examples.
Definition I: A “window” refers either to an iframe or the top-level window (i.e. the “main” page). In our model, then, we have a tree-like hierarchy of windows.
Law I: Any window in the hierarchy can get a handle to any other window in the hierarchy. It doesn’t matter where they live within the hierarchy or which domain they come from – with the right commands, a window can always refer to any other window. Parent windows are accessed as “parent”, “parent.parent”, etc., or “top” for the top-level. Child windows are accessed as “window.frames” or “window.frames[name]“. Note in this case that the name is not the iframe’s id, but rather the iframe’s name. (This reflects the legacy nature of all this stuff, relating back to ugly late-90s frames and framesets.) Thus, to get a sibling handle, you might use “parent.frames“.
Law II: Windows can only access each others’ internal state if they belong to the same domain. This rather puts a kibosh on the whole cross-domain cross-iframe thing. All this would be so easy if iframe scripts could talk to each other directly, but that would cause all manner of security shenanigans. HTML 5 does define explicit communication between iframes, but until wide adoption, we have to think harder …
Law III: Any window in the hierarchy can set (but not read) any other window’s location/URL, even though (from Law II) browser security policies prevent different-domain iframes from accessing each other’s internal state. Note: Exact details for this law needs further investigation Again, it doesn’t matter which domain it comes from or its position in the hierarchy. It can always get a handle on another window and can always set the window’s URL, e.g. “parent.frames.location.href”. This establishes window URLs as the one type of information on the page which is shared across all windows, regardless of the domain they come from. It seems sensible that a parent can change its child windows’ URLs, BUT not vice-versa; how strange that a child window is allowed to alter its parent’s (or uncle’s, sibling’s, etc.) URLs! The only justification I know of is the old technique of escaping the frame trap, where a website, upon loading, ensures it’s not inside a frame by simply setting the top-level URL – if it’s different to itself – to its own URL. This would then cause the page to reload to its own URL. However, that’s a special case and hardly seems worth justifying this much leeway. So I don’t really know why you can do this, but lucky for us, you can!
Law IV: When you change the URL’s fragment identifier (the bit on the end starting with a #, e.g. http://example.com/blah#fragmentID), the page doesn’t reload. This will already be familiar to you if you’re familiar with another Ajax hack, Unique URLs to allow for bookmarkability and page history. Normally, changing a document’s “href” property causes it to reload, but if you only change the fragment identifier, it doesn’t. You can use this knowledge to change the URL symbolically – in a manner which allows a script to inspect it and make use of it – without causing any noticeable change to the page content.
Exploiting the Laws of Physics for Cross-Domain Fun and Profit – The Cross-Domain Hack (URL Polling version)
The laws above are all we need to get cross-domain communication happening. The technique is simply this, assuming Window A wants to control Window B:
- Window A changes Window B’s fragment identifier.
- Window B is polling the fragment identifier and notices the change, and updates itself according to the fragment identifier.
Ta-da!!! That’s the whole thing, in its glorious entirety. Of course, you had to know the laws of physics in order to understand why all this works. It simply relies on the fact that both Window A and Window B have one common piece of state – the URL – and the fact that we can change the URL unintrusively by manipulating only the fragment identifier. For example, in the demo, the iframe’s URL changes to http://ajaxpatterns.org/crossframe/#orange and once the iframe script notices it, it updates the colour.
A few observations:
- This works in either direction. Parent to child, child to parent. As the demo illustrates.
- It requires co-operation from both parties; it’s not some magic way to bypass browser security mechanisms. Once Window A changes Window B’s fragment identifier, it’s up to Window B to act on the change; and it’s up to Window B to be polling the fragment identifier in the first place.
- Polling the fragment identifier happens to be exactly the same technique used in the Unique URLs pattern.
There are a couple of downsides: (a) Polling slows down the whole application; (b) Polling always involves some lag time (and there’s always a trade-off a and b – the faster the response, the more cycles you application uses up); (c) The URL visibly changes (assuming you want to manipulate the top-level window). We’ll now consider a second technique that addresses these (albeit in a way that introduces a different downside).
The Cross-Domain Hack (Marathon version)
Here’s a variant which no longer involves polling or changing any URLs. I learned of it from Juliene Le Comte’s blog, and he’s even packaged it as a library.
Looking back at Law II: “Windows can only access each others’ internal state if they belong to the same domain”. At the time, I made this sound like a bad thing, but as David Brent likes to say, “So, you know, every cloud …”. The law is bad if you state it as “cross-domain iframes can’t play with each others’ toys” (paraphrasing the informal version of Demeter’s law). But it’s good if you spin it as “well, at least same-domain iframes can play with each others’ toys”. That’s what we’re going to exploit here.
As for the demo, the functionality is the same, but since this one involves spawning iframes, I’ve left them intact, and made them visible, for your viewing delight. Normally, of course, they’d be invisible, and the application would look exactly the same as the previous demo.
Here’s how this technique works:
- Every time Window A wants to call Window B, it spawns a child iframe, “Window B2″ in the same domain as Window B. The URL includes the command being issued (as a CGI parameter, fragment identifier, or any other URL pattern which will be recognised by the destination script).
- Window B2 destroys itself in a puff of self-gratified logic.
So in this case, we create a new, short-lived, iframe for every message being passed. Because the iframe comes from the same domain as the window we’re trying to update, it’s allowed to change the window’s internal state. It’s only useful to us on startup, because after that we can no longer communicate with it (apart from by the previous fragment identifier trick, but we could do that directly on the original window).
Window B2 is sometimes called a proxy because it accepts commands from Window A and passes them to Window B. I like to think of it as Pheidippides of fame; it passes on a message and then undergoes a noble expiration. Its whole mission in life is to deliver that one message.
This technique comes with its own downside too. Quite obviously, the downside is that you must create a new iframe for every call, which requires a trip to the server. However, with caching in place, that could be avoided, since everything that must happen will happen inside the browser. So it would simply be the processing expense of creating and deleting an iframe element. Note that the previous variant never changed the DOM structure or invoked the server.
Also, note that in either versions of the hack, there is the awkward matter of having to express the request in string form, since in either pattern, you are required to embed the request on the window URL. There is an inspired extension of this hack that also has some untapped promise in this area. It involves setting up a subdomain and updating its DNS to point to a third-party website. When combined with the old document.domain hack, you end up with a situation where your iframe can communicate with a cross-domain iframe, without relying on iframe. (The technique described in the article is about browser-to-server communication, but I believe this iframe-to-iframe is possible too.)
A Third Hack Emerges: Window Size Monitoring
A newer third hack by Piers Lawson is based around the porous nature of window sizes and the use of window.resize(). Fragment IDs are used like in the first technique here, but instead of polling, window resize events are used to cause a more direct trigger.
Cross-Domain IFrame-to-IFrame Calls … and Widgets/Gadgets
In the world of mashups, iframes are a straightforward way to syndicate content from one place to another. The problem, though, is limited interaction between iframes; in pure form, you end up with a few mini web browsers on a single page. It gets better when the iframes can communicate with each other. For example, you can imagine having iGoogle open, with a contacts widget and a map widget. Clicking on a contact, the map widget notices and focuses on the contact’s location. This is possible via Gadget-To-Gadget communication, a form of publish-subscribe which works on the iframe hack described here. And speaking of maps, check out Google Mapplets, which are a special form of gadget that work on Google Maps, and also rely on this technique.
In terms of gadgets, another application is communication between a gadget and its container, and this is something I’ve been looking at wrt Shindig. For example, there is a dynamic-height feature gadgets can declare. This gives the gadget developer an API to say “I’ve updated, now please change my height”. Well, an iframe can’t change its own height; it must tell its parent to do that. And since the gadget lives in an iframe, on a different domain as the container (e.g. iGoogle), this requires a cross-domain, cross-iframe, message. And so, it uses this technique (“rpc” – remote procedural call – in shindig terminology) to pass a message to the container.
Cross-Domain Browser-to-Server Calls
In fact, the answer is to use the iframe hack described in this article. As I mentioned earlier, this is how Facebook gets the job done, with what is essentially the same “power of attorney” delegation model as OAuth (BTW thanks to my colleague Jeremy Ruston for the “power of attorney” OAuth analogy – albeit it was stated in a slightly different context from OAuth).
I haven’t looked too much into the mechanism involved with the Facebook API, but it looks like it’s essentially using a variant of the Marathon technique. From memory, there’s an ever-present invisible facebook.com iframe. Each time your web app make a Facebook call, the Facebook JS library spawns a new “proxy” iframe, which passes the message on to its same-domain ever-present frame, which makes a bog-standard XHR call to Facebook. So now we’re making an XHR call to another domain, which we can get away with because it’s coming from a separate iframe. Once the XHR call returns, I think the message is returned to your application (this happens via another same-domain iframe you must host on your server, though I think that’s unnecessary) and the proxy iframe disappears.
Yes, this is a somewhat complicated technique. Actually understanding the problem it solves is really the hard part! Once you understand that, and once you understand those laws of physics, the trick is actually quite straightforward (either version of it).
The technique will be critical for gadget containers such as Shindig. As OAuth takes off, we’ll also see the technique used a lot more in mainstream applications and APIs.
With HTML 5, cross-frame messaging will render the hack unnecessary for iframe-to-iframe communication. Indeed, the aforementioned Cross-Domain library uses that technique already for Opera, in a fortuitous twist of fate since Opera doesn’t actually support everything this hack requires. However, the notion of using iframes for cross-domain calls will still be present, no matter how the windows talk to each other.