Operation Zero-Shelf

“My shelves are empty. The half-dozen Billy Bookcases I bought from Ikea are now little more than scrap. I have burned my books. A bonfire of ideas and ideals.”
- Terence Eden

I’ve entertained the zero-shelf fantasy since I first read e-books and synced news on the trusty Handera 330. I still have it:


At the time, the content wasn’t around, and the screen was a bit small to be practical anyway. When the iRex came along, I thought that might be the moment, but content remained an issue. Now the time is here: the iPad is fantastic for reading and the Kindle has all the content. And just last night, Amazon popped up with an announcement that the Kindle Touch has arrived in the UK. I’ve resisted Kindles until now. For all their benefits, they look like a bad 1960s sci-fi imagining of what life will be like in the year 1979. The keyboard in particular seems completely out of place on a device that is primarily about reading, and by this point it feels absurd to hold a screen you can’t touch. The Kindle Touch changes all that, and with its battery life and (especially after a week of glorious March weather here) its readability in the sun, it has a killer app that tablets won’t beat for some years.

So it’s time to make it happen. This week, I’ll scan all my books with Delicious Monster or Bookshelved to keep a perpetual record of what I once owned. I’ll allow myself to keep 10 books of sentimental or long-lasting value and another 10 which I haven’t (fully) read. These lucky 20 will be marked with post-it notes. The rest can stay on the shelf for three months and if I feel the need to keep one of them, I will swap those post-it notes around. And on July 1, 2012, those without post-it notes will be shown no mercy! Well, some mercy … they can go to a suitable charity.

Doing It Right, You’re Doing It Wrong

Sometimes I’ll ask a question, say on StackOverflow or suchlike, or maybe even IRL, and get struck down for doing something wrong. The question’s based on an evil assumption, goes the thinking, so don’t encourage him by answering it.

I get it, the importance of informing people of The Right Way, especially if this question-asking friend of mine *cough* hasn’t provided justification for the sly manoeuvre he or she is trying to pull off.

But there’s a twilight zone between outright cowboyery and The Right Way. Fail fast! Code that ships is infinitely more useful to the world than code that sits on your hard drive, irrespective of their respective quality levels. With that in mind, there’s a legitimate need for a theory of pragmatic programming practices: practices which let you incur some technical debt while you get on with the job of launching an experiment and seeing if people find it useful. So instead of fashioning design patterns and guidelines around the best theoretical design, there should be more attention on ideas which explicitly let you cut a few corners, but cut the right corners!

“Close a blind eye and this sh*t goes underground. That’s when all hell breaks loose.”
- Hardened policy wonk in a movie which hasn’t yet been made

A concrete example. In the more liberated languages, such as JavaScript and Ruby, there’s decent object-oriented support, but it’s not a requirement. You can still have global variables and functions sitting out there “in fresh air”, and in Ruby you can scope variables to a single file. For most big systems, The Right Way is indeed object-oriented, for all the reasons OO is great. But for a system you’re cobbling together organically, it’s very possibly overkill: you risk analysis paralysis and end up writing extra infrastructure which may never be justified. On the other hand, globals everywhere may lead you down a dangerous path too early on. So there needs to be more of a science around finding the sweet spots in situations like this, helping you walk the winding trajectory from Hello World to Bells And Whistles. A sketch of the trade-off is below.
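
To make the trade-off concrete, here’s a minimal, hypothetical Ruby sketch (all names invented for illustration). The first version is the corner-cutting script: file-scoped state and bare methods, perfectly adequate while the whole experiment fits in one file. The second is the OO shape the same logic can migrate to once other code needs to share it.

    # quick_and_dirty.rb: file-scoped state and bare methods.
    # Fine while the whole experiment fits in one file.
    FEEDS = ["http://example.com/a.rss", "http://example.com/b.rss"]

    def summarise(url)
      "Fetched #{url}" # placeholder for the real work
    end

    FEEDS.each { |url| puts summarise(url) }

    # grown_up.rb: the same logic, encapsulated once two or three
    # other modules need to reuse it.
    class FeedSummary
      def initialize(url)
        @url = url
      end

      def to_s
        "Fetched #{@url}" # identical logic, now behind an object
      end
    end

    FEEDS.each { |url| puts FeedSummary.new(url) }

The point isn’t that either version is wrong; it’s that the jump from the first to the second should be a deliberate decision, made when the duplication starts to hurt.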

“Cut Every Corner”, as featured in Simpsoncalifragilisticexpiala(Annoyed Grunt)cious

Coding with the Janus Vim Distro

I’m back on Vim. Let me explain why, and what my setup is.

Janus Vim Distro

The background is I’ve been happy, bordering on ecstasy, using Vim for small projects, but found file management difficult for larger projects. I installed the incredibly useful Janus Vim distro a while back; there’s a one-liner curl script provided to install it, shown below.
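
For reference, the bootstrap one-liner looked like this when I ran it; I’m quoting it from memory of the README, so check the Janus docs for the current incantation:

    curl -L https://bit.ly/janus-bootstrap | bash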

The first useful thing about Janus is it ships with a selection of vital coding plugins. The most useful so far have been:

  • NERDTree, for file management.
  • Ctrl-P, for locating files, using IntelliJ-like matching (e.g. “PropertiesController.rb” can be found with “PCo” or “PrC”, and “stars.html” with “st.h”). Combined with its Most Recently Used ordering, it gives the same sense of magic.
  • Ack, for quick ack/grep-like searches.
  • Syntastic, which automatically red-flags syntax errors when you open or save a file (ideally I’d like it to run every few seconds).

Janus also ships with a bunch of useful macros, as well as some handy customisations for those default plugins it ships with. All this is nicely documented in :help janus.

Another nice thing about Janus is it sets up a sane directory structure for you. If you’re not familiar: Vim’s default config scatters each package across role-based directories (much as Linux distributes a package into /var/log, /etc, and so on), which makes it difficult to add or remove a single package. Pathogen fixes this by letting you keep a separate directory per package, and Janus ships with a Pathogen setup by default. Fortunately, most Vim packages are re-distributed in Pathogen form somewhere on GitHub, and Janus provides a ~/.janus directory where you can drop them in. You can also customise your Vim setup with ~/.vimrc.before and ~/.vimrc.after.
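
In practice, adding an extra plugin is usually just a matter of cloning it into ~/.janus. For example (vim-surround here is only an illustration; any Pathogen-compatible plugin works the same way):

    cd ~/.janus
    git clone https://github.com/tpope/vim-surround.git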

Command-Line/Terminal Vim (not MacVim)

Although the Janus README suggests using MacVim, I found it really slow at switching tabs, at least while running the Janus distro. So I’ve reverted to regular shell Vim (installed via brew install vim). This is what I’m more used to anyway; it fits in with my regular workflow, since I can bring it up any time on the command line, and it means I can use it as part of a multi-tab iTerm environment, so I can just cmd-left and cmd-right to drop into a running bash session. What I still need to work out is how to use the mouse with Vim running in iTerm2 (:set mouse=a isn’t working for some reason), which would give back the main benefit of MacVim.

Update: Yes, the mouse now works in terminal mode! Now I can click on tabs, and expand/collapse/open items by clicking in the NERDTree. It just needed ttym (short for ttymouse) to be set in the end; I’ve verified this works with the default TERM setting of “linux” (I think that’s the default, anyway). All you need are the following settings in your Vim config (~/.vimrc.after in the case of the Janus distro). I’ve only tested this in iTerm2.

set mouse=a     " enable mouse support in all modes
set ttym=xterm2 " ttymouse: tell Vim the terminal speaks the xterm2 mouse protocol

NERDTreeTabs

When I tried Janus some months ago, one thing bugged me, and it was the main reason I scurried off to RubyMine. That thing was NERDTree, the file hierarchy plugin: so close to being useful, yet falling stunningly short at the last mile.

What’s missing from NERDTree? It just doesn’t behave how one expects or wants a file tree to behave. You open a new tab, and NERDTree is gone! Imagine if web browsers removed the tab bar every time you opened a new tab! Unnatural, unwanted. Yes, you can set up auto-commands to make it appear when you open a new tab, but even then, you end up with a different instance of the tree each time. So what got me back on Vim was finding this plugin:

NERDTreeTabs

Simply put, it gives the illusion that you’re using the same tree everywhere: you switch tabs (or open a new tab) and the file tree is exactly the same as in the other tab. Really, a file hierarchy is essential for a large project, even if I use Ctrl-P most of the time to switch files. So this discovery was good enough to get me back on Vim. I mapped it to ,n (, is my leader), as shown below.
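
For reference, the relevant lines from my config (these go in ~/.vimrc.after under Janus; the mapleader line is only needed if you haven’t already changed the leader from the default backslash):

    let mapleader = ","
    map <Leader>n <plug>NERDTreeTabsToggle<CR>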

A couple of caveats. First, there’s still no way to keep the file tree in sync with the file you’re working on; it would be cool to have NERDTree highlight the current file, for example, like the kind of sync behaviour most IDEs provide. Second, I haven’t been able to get this to open automatically in Janus, so I have to manually close the regular NERDTree if it’s open. (Somehow, NERDTreeClose and NERDTreeTabsToggle have no effect in .vimrc.after; I’m not sure how to customise Janus to fix this.)

Exuberant Ctags

It’s also worth using Exuberant Ctags for navigation; I installed it with brew install ctags. This lets you hit ctrl-] on a word to jump to the corresponding definition, and it’s also required by some of the default Janus plugins.
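
Usage is simple: generate a tags file at the project root, and Vim picks it up automatically (the project path below is just an example):

    cd ~/projects/myapp
    ctags -R .

Then, with the cursor on an identifier, ctrl-] jumps to its definition and ctrl-t pops you back to where you came from.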

Wish Me A Pony

There are loads of Vi emulators out there now: separate ones for Eclipse, IntelliJ, Cloud9, and Sublime. I’ve never had much luck with them. One of the great benefits of Vim, and the command line in general, is muscle memory. Everything is supremely predictable and it just flows. My experience with these emulators is that I’m always second-guessing what happens if I type anything; it’s not always clear whether I’m in command mode or not. And they often require a lot of customisation to be usable.

My first wish is that someone would build a text editor/IDE with actual Vim inside it, not an emulator. It would ship with appropriate plugins to talk to its host IDE environment.

My second wish is that someone would use Chrome’s Native Client to build a Vim textarea plugin. Imagine if you could just call $('textarea').vim() and every textarea became a Vim control. Not a Vim-in-JavaScript component, but, with the magic of Native Client, an actual Vim component. With such a component, people could build extensions that let Vim enthusiasts use Vim everywhere, and cloud IDEs like Cloud9 could support the real-deal Vim.

Someone Different: Encouraging Twitter Serendipity

I made a very raw cut of an idea that’s been percolating for a while. The idea is to randomly follow a few people for a month or so, then rotate to a few different people, and so on. This would let you immerse yourself in a community for a little while, with the aim of busting out of the filter bubble, learning something new, and maybe making a new friend or two.

Demo: You can try a proof-of-concept here (don’t be put off by the permissions Twitter asks for; it won’t tweet on your behalf, but the permission is unfortunately necessary to manipulate lists). It’s raw, but it will create a real list for you and set its membership. It works for now by simply switching you to one of several manually curated lists, which I rotate when you visit the app (/pages/welcome). If there’s interest, the plan is (a) to make the rotation happen automatically, and (b) to generate the lists from something like Klout or PeerIndex, to ensure you’re following people in the same community.

Right now, the “someone different” people live only in your “someone-different” list, but I’m inclined to have the app follow them in your main stream too. As long as they stay in the “someone-different” list, it’s still easy to unfollow them from the main stream. And if you find you want to keep following someone, you just remove them from the “someone-different” list. (It’s all very ghetto here; I’m basically using the Twitter list in lieu of maintaining a database, along the lines of the sketch below.)
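
For the curious, the list-as-database trick amounts to something like the following hypothetical Ruby sketch, using the twitter gem. The method names approximate the gem’s Lists API as I remember it, so treat them as assumptions rather than exact signatures:

    # Hypothetical sketch: rotate the "someone-different" list to the next batch.
    # Assumes `client` is an authenticated twitter-gem client and `next_batch`
    # is an array of user ids drawn from a manually curated list.
    current = client.list_members("someone-different").map(&:id)
    current.each { |id| client.list_remove_member("someone-different", id) }
    next_batch.each { |id| client.list_add_member("someone-different", id) }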

I recently noticed @jobsworth was also looking for something like this. His ideas are certainly not achieved in this v0.0.1, but hopefully it will get people thinking about what’s possible. Please let me know where you’d like to see this go.

The app’s built in Rails using the very nice twitter-login gem, and hosted on Heroku.

More RSS Client Optimizations: Preventing Re-Fetch

Background: Has the Feed Changed?

I previously mentioned some work I did to cut down processing and IO on an RSS client. Yesterday, I was able to continue this effort with some more enhancements geared around checking whether the feed has changed. These changes are not just important for my server’s performance, but also for being a good internet citizen and not hammering others’ machines with gratuitous requests. Note: everything in this article will be basic hygiene for anyone who’s written any kind of high-scale bot, but I’m documenting it here as it was useful learning for me.

Normally, a fetch requires the client to compare the incoming feed against what has been stored. This requires a lookup on the database and a comparison process. It’s read-only, so not hugely expensive, but it does involve reading a lot (all the items in the feed), and at frequent intervals.

All this comparison effort would be unnecessary if we could guarantee the feed hasn’t changed since the last fetch. And of course, most of the time, it won’t have changed. If we’re fetching feeds hourly, and the feed changes on average once a week, then we can theoretically skip the whole comparison 167 times out of 168, i.e. 99.4% of the time!

So how can we check if the feed has changed?

Feed Hash

The brute-force way to check whether the feed has changed is to compare the feed content with what we received last time. We could store the incoming feed in a file, and if it’s the same as the one we just sucked down, we can safely skip it (“next” it, in Ruby terms).

Storing a jillion feed files is expensive and unnecessary. (Though some people might temporarily store them if they’ve separated the fetching from the comparison, to prevent blockages; I haven’t done that here.) If all we need the files for is a comparison, we can instead store a hash. With a decent hash, the chance of a false positive is extremely low, and the severity of one in this context is also extremely low.

So the feed record now has a new hash field, used like this:

    require 'digest/md5'

    incoming_feed = fetch_feed(feed_record.url)
    incoming_hash = Digest::MD5.hexdigest(incoming_feed.body)
    return if incoming_hash == feed_record.hash # Files match, no comparison necessary

    feed_record.title = incoming_feed.title
    feed_record.hash = incoming_hash # Save the new hash for next time
    # (nb: on an ActiveRecord model, a column literally named "hash" clashes
    # with Ruby's Object#hash, so a name like feed_hash is safer in practice)
    # ... Keep processing the feed. Compare each item, etc.

HTTP If-Modified-Since

The HTTP protocol provides its own support for this kind of thing, via the If-Modified-Since request header. So we should send this header, and we can then expect a 304 Not Modified response in the likely event no change has happened. This saves transferring the actual file, as well as bypassing the hash check above. (However, since it’s by no means supported everywhere, we still need the above check as an extra precaution.)

    require 'net/http'
    require 'time' # for Time#httpdate

    uri = URI(feed_record.url)
    req = Net::HTTP::Get.new(uri.request_uri)
    # httpdate emits the RFC 1123 format HTTP expects (rfc2822 is close, but says "+0000" rather than "GMT")
    req.add_field("If-Modified-Since", last_fetched_at.httpdate) if last_fetched_at
    res = Net::HTTP.start(uri.host, uri.port) { |http| http.request(req) }
    return if res.code == '304' # Not Modified; we don't even need to compare hashes

ETag

Another HTTPism is the ETag, a value that, like our hash, changes whenever the feed content changes. So to be extra sure we’re not re-processing the same feed, and hopefully not even fetching the whole feed, we can save the ETag and send it back in each request’s If-None-Match header. It works like If-Modified-Since: if the server is still serving the same ETag, it will respond with an empty 304.

    req.add_field("If-None-Match", feed_record.etag) if feed_record.etag
    # ...
    return if res.code == '304' # Again, no comparison needed
    feed_record.etag = res['ETag'] # Save it for next time

For the record, about half the feeds I’ve tested (mostly from fairly popular sources, many of them commercial) include ETags. And of those, at least some change the ETag unnecessarily often, which renders it worse than useless in those cases, since it consumes resources for no benefit. Given that level of support, I’m not convinced it adds much value over just using If-Modified-Since, but I’ll leave it in for now; I’m sure the managers of those servers which do support it would prefer it be used. Putting the three checks together gives something like the sketch below.
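
For completeness, here’s a minimal end-to-end sketch combining all three checks. The feed_record fields are the hypothetical names used above; error handling, redirects, and the per-item comparison are omitted:

    require 'net/http'
    require 'time'       # for Time#httpdate
    require 'digest/md5'

    # Fetches a feed, returning early unless it has actually changed.
    def process_feed(feed_record)
      uri = URI(feed_record.url)
      req = Net::HTTP::Get.new(uri.request_uri)
      req.add_field("If-Modified-Since", feed_record.last_fetched_at.httpdate) if feed_record.last_fetched_at
      req.add_field("If-None-Match", feed_record.etag) if feed_record.etag

      res = Net::HTTP.start(uri.host, uri.port) { |http| http.request(req) }
      return if res.code == '304' # Server says nothing changed

      incoming_hash = Digest::MD5.hexdigest(res.body)
      return if incoming_hash == feed_record.hash # Identical body despite a 200

      feed_record.hash = incoming_hash
      feed_record.etag = res['ETag']
      feed_record.last_fetched_at = Time.now
      # ... Parse the body and compare each item, etc.
    end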

Abstraction Begets Fragmentation

Christian Heilmann raises the ugly issue of offline storage. LocalStorage’s synchronous nature makes it slow, WebSQL is deprecated, and IndexedDB’s API induces headaches as well as being unsupported on many browsers. He asks whether the LocalStorage standard should be improved.

One of the predictable responses is “just use a wrapper library”. We could use a fancy wrapper library that gives us the key-value simplicity of LocalStorage, but with an asynchronous API, backed by those more beastly SQL-based solutions should they exist on the device.

Now “a wrapper library” has all the usual concerns of abstractions. Does the common-denominator stance remove useful functionality from the underlying APIs? Does it separate the programmer so much from the bare metal that they can’t get their head around performance issues? And so on. All true. But on the web, where open-source is so heavily used, wrapper libraries have an additional cost: Fragmentation.

So we end up with a dozen storage libraries on GitHub. BackStorage, SpineStorage, OrthopedicStorage, you name it. That’s great, now we can use a nice API that works everywhere. Fast forward, and every programmer and their canine is writing offline single-page apps, so now we need libraries to cache, spool, and throttle messaging back to the server. Those libraries rely on underlying storage. Do they talk straight to the raw storage APIs? But then they’ll spend all their time worrying about those APIs. So maybe they reuse one of those storage libraries. One of them uses BackStorage, another one uses SpineStorage, and a couple more use OrthopedicStorage. Yes, a whole plugin ecosystem gathers around each storage solution.

So which are you? Do you consider yourself a BackStorage man/woman? Card-carrying SpineStorage-head? Certified OrthopedicStorage practitioner? You’ll need to pick one, because each is its own community with its own set of plugins and conferences. Not to mention the learning curve involved in adapting to something new. And the conflicts that arise if you try to use two wrappers in the same app. Wait, did SpineStorage.Login just stomp on that username I was trying to retain with BackStorage.Cache?

Choice is good, and evolving faster than standards is good too, which is why we do benefit enormously from wrapper libraries. But the cost of fragmentation is high. It can be justified if you’re reaping the benefits of magic conferred by a highly opinionated framework. But if it’s just there to cover up for a standard with an unnecessarily confusing API, I’d rather the browsers work on simplifying said API. Because I don’t want to identify myself as a developer skilled in the arts of BackStorage/SpineStorage/OrthopedicStorage.

I’d rather just be a “web programmer”.

The Boot2Gecko Developer Experience: An Interview with Mozilla’s Jonathan Nightingale

A surprise bonus of last week’s MWC trip was the coincidental launch of Boot2Gecko, Mozilla’s operating system based on Firefox (Gecko being Firefox’s underlying layout engine, and B2G being the unofficial brand name for now). They made a joint announcement with Spain’s Telefonica (O2 and GiffGaff’s parent company), with the expectation of releasing a commercial offering this year (though no commitments yet).

Continuing my “a conference is worth more than 140 characters” resolution, I had a chance to sit down with Mozilla’s Senior Director of Firefox Engineering, Jonathan Nightingale, and understand some of the technical issues around the new OS.


Key points from the video, as well as from speaking to some of the other Mozillians:

  • The device could be very cheap, $60 for a SIM-free device is a figure I heard. Needless to say, if they can achieve a price point like that, it will be music to carriers’ ears and simultaneously light up the HTML5 developer community. At 10:00, Jonathan mentions that Telefonica sees the opportunity to produce a smartphone that’s “interactive, smooth, high quality, with a good app selection, that is one-tenth the cost of an iPhone”. From what I’ve heard, this is because of two factors: the target hardware is open and patent-free, and the HTML5 runtime is built over a thin Linux core, so various layers are eliminated.
  • It boots fast, about 14 seconds (I thickly say 4 seconds in the video but I was counting from the splash screen…my argument is invalid). Bearing in mind this is a very early concept, the final version will hopefully be even faster, though it’s possible extra services may make it slower too.
  • Everything is HTML5. The launcher, dialler, camera, gallery, et al. And what’s more …
  • … On the dev build, as Jonathan demos, View Source is available everywhere! On all the apps, and that includes those “system” apps above (launcher, dialler, etc).
  • The Mozilla Marketplace is not exclusive to B2G. It will run on other devices where Firefox runs, like Android. (It turns out that Android apps can be permissioned to add multiple URL shortcuts to the Android launcher, so once a user installs the market, they can install apps directly on the launcher.) This is a little like the Chrome Web Store, which is the app distribution mechanism for ChromeOS, but also works on regular Chrome across the different OSs it runs on.
  • Mozilla’s keen to cultivate an ecosystem of multiple marketplaces with different verticals and business models. They offer their own marketplace, but will happily embrace others.
  • No immediate plans for extensions or a NodeJS service layer like WebOS runs, but Moz is very open to feedback via the standard channels.

PPK argued in last week’s Web Ahead that we may well see a shift away from Android in the next year, as carriers come to terms with Google’s Motorola acquisition. It’s by no means certain, but I did hear people echo this view at MWC. Boot2Gecko will be a key part of the current movement among the smaller OSs to embrace HTML5 developers. And if the price point means developers can easily get their hands on these devices, as opposed to only the usual suspects who are awarded freebies by marketing departments, it stands a real chance in this market.

HTML5 developers should also check out Christian’s Boot2Gecko write-up for more info, as well as the official Boot2Gecko Wiki.

Interview: Intel’s HTML5 Playground and AppUp

I’m back from Mobile World Congress, where I was invited by Intel to speak on HTML5 and demo it as part of their AppUp programme. There, I had a chance to work alongside Intel’s Daniel Holmlund, who recently released a new playground for HTML5 developers. I’m lucky to get to a number of conferences, and my new year’s resolution is to record more talks with the people I meet there, so here’s a quick chat and demo about the HTML5 Playground and the AppUp programme.


The playground is similar to what you’ve seen in tools such as JSFiddle, JSBin, Tinker.io, and HTML5Rocks’ own HTML5 Playground, with the key focus here being a “sharing and learning tool”, in Daniel’s words. You can see that in the V1 release the pre-fab code snippets are prominent, i.e. it’s not just a blank slate. There’s also the ability to share code snippets, powered by AddThis.com.

My own take on it is coloured by a recent overdose of StackOverflow podcasts and talks, in which both founders have impressed on me the critical influence even the most trivial of features wields over user expectations. As Joel Spolsky explained at Hacker News London, a UI is not just about attracting the most eyeballs; it’s about setting up first impressions to draw in some people while (if it does its job) repelling others. In other words, it’s about signifying which class of people the site is intended for, much as any social group has a multitude of signifiers of whether you could be one of them. (A sly comment in one of Joel’s posts put me onto The Culture Code a while back; a very relevant topic in this context.)

Back to the billabong: my point being that this playground very much reflects the goals of sharing and learning by (a) prominently including the library of pre-built examples; and (b) always showing the Share button (though there’s a good argument for making it even more prominent).

Full disclosure: Intel is a consulting client (there’s no expectation to cover anything Intel-related on this blog, it should be noted). I’ll be sharing details of the slides and video et al from the conference once they’re out in the wild.