Defer and Recur with Rails, Redis, and Resque

I’ve put off some scaling-related issues about as long as possible, and am now proceeding to introduce a deferred-job stack. I’ll explain what I’ve learned so far, with the caveat that this isn’t in production yet. I’m still learning.

What it’s all about

Tools like Resque let you perform work asynchronously. That way, you can turn requests around quickly, so the user gets something back immediately, even if it’s just “Thanks, we got your request”. That’s nicer than making the user wait around for 5 minutes, and it keeps your server from locking up in the process. The typical example is sending an email: you don’t want the user’s browser to wait while your server connects elsewhere and fires off the email. Other examples would be fetching a user’s profile or avatar after they provide their social profile info, or generating a report they asked for.

So you set up an async job and respond telling the user their message is on the way. If you need to show the user the result of the delayed job, make the client poll the server and render the result when it’s ready. More power to XHR!

The simple way to do this

The simple way, which worked just fine for me for a long time and I’d recommend for anyone starting, is a simple daemon process. Basically:

  loop do
    # check_database_for_condition and do_something stand in for your own code
    do_something if check_database_for_condition
    sleep 10
  end

The fancy way

The problem with the simple way is it can be hard to parallelise and monitor; you’ll end up reinventing the wheel. So to stand on the shoulders of giants, go install Redis, Resque, and Resque-Scheduler. I’ll explain each.

Redis

Redis, as you probably know, is a NoSQL database. It’s been described as a “data structure server” as it stores strings, lists, sets, sorted sets, and hashes; and assuming Knuth is your homeboy, that’s a mighty fine concept. And it’s super-fast because everything is kept in memory, with (depending on config) frequent persistence to disk for durability.

Resque

Resque is no sneezing matter either, being a tool made and used by GitHub, no less.

Resque uses Redis to store the actual jobs. It’s worth explaining the main components of Resque, because I’ve found they’re often not defined very clearly and if you don’t understand this, everything else will trip you up.

Job. A job is a task you want to perform. For example Job 1234 might be “Send welcome email to [email protected]”. In Resque, a job is defined as a simple class having a “perform” method, which is what does the work [1].

Queue. Jobs live in queues. There’s a one-liner config item in the Job class to say which queue it belongs to. In a simple app, you could just push all jobs to a single queue, whereas in a bigger app, you might want separate queues for each job type. e.g. you’d end up with separate queues for “Send welcome email”, “Fetch user’s avatar”, and “Generate Report”. The main advantage of separate queues is you can give certain queues priority. In addition to these queues, you also have a special “failed” queue. Tasks that throw exceptions are moved to “failed”; otherwise the task disappears.
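To make the job and queue definitions concrete, here is a minimal sketch of what a job class looks like. The class name, queue name, and return string are made up for illustration. The block deliberately doesn’t load the resque gem, so the enqueue call only appears in a comment. Resque calls perform on the class itself, which is why it’s defined as a class method here.

```ruby
# A Resque job is an ordinary Ruby class: @queue is the one-liner that
# names its queue, and the class-level perform method does the work.
class WelcomeEmail
  @queue = :welcome_email

  # Expose the queue name so it can be inspected (Resque reads @queue itself).
  class << self
    attr_reader :queue
  end

  def self.perform(user_id)
    # A real job would look up the user and send the mail here.
    "Sent welcome email to user #{user_id}"
  end
end

# With the resque gem loaded, application code would enqueue the job with:
#   Resque.enqueue(WelcomeEmail, 1234)
# and a worker would later call WelcomeEmail.perform(1234).
puts WelcomeEmail.perform(1234)
```

The arguments you pass when enqueuing are stored with the job and handed back to perform later, so keep them simple (ids rather than whole objects).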

Worker. A worker is a process that runs the jobs. So a worker polls the queues, picks the oldest jobs off them, and runs them. You start workers via Resque’s Rake task, and in doing so, you tell it which queues to run. There’s a wildcard option to run all queues, but for fine-grained optimisations, you could set up more workers to run higher-priority queues and so on.
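The worker behaviour described above can be sketched with a toy in-memory model. This is not Resque’s implementation; ToyWorker and its methods are hypothetical names, and jobs are plain lambdas rather than anything stored in Redis. It just shows the idea of checking queues in a fixed order, taking the oldest job, and moving failures aside.

```ruby
# Toy in-memory model of a worker; an illustration, not Resque's code.
class ToyWorker
  attr_reader :failed

  def initialize(queues, queue_order)
    @queues = queues            # e.g. { "email" => [job, job, ...] }
    @queue_order = queue_order  # queues named earlier are checked first
    @failed = []
  end

  # Take the oldest job from the first non-empty queue and run it.
  # Returns false once every queue is empty.
  def work_one
    name = @queue_order.find { |q| !@queues.fetch(q, []).empty? }
    return false unless name

    job = @queues[name].shift
    begin
      job.call
    rescue StandardError => e
      # Jobs that raise end up in the "failed" list; successful ones just vanish.
      @failed << { queue: name, error: e.message }
    end
    true
  end
end

log = []
queues = {
  "reports" => [-> { log << "report" }],
  "email"   => [-> { log << "email 1" }, -> { raise "SMTP down" }]
}
worker = ToyWorker.new(queues, %w[email reports])
loop { break unless worker.work_one }

puts log.inspect
puts worker.failed.inspect
```

Because “email” is listed first, both email jobs are attempted before the report, which is the priority effect you get from the order of queues given to a worker.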

An important note about the environment. Rails’ environment can take a long time to start, e.g. 30 seconds. You clearly don’t want a 30-second delay just to send an email. So workers will fork themselves before starting the job. This way, each job gets a fresh environment to run off, but you don’t have the overhead of starting up each time. (This is the same principle as Unicorn’s management of Rails’ servers.) So starting the worker does incur the initial Rails startup overhead, but starting each job doesn’t. In practice, jobs can begin in a fraction of a second. You can further optimise this by making a custom environment for the workers, e.g. don’t use all of Rails, but just use ActiveRecord, and so on. But it’s probably not worth the effort initially as the fork() mechanism gets you 80% there.
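Here is a rough sketch of the fork-per-job idea using plain Ruby on a Unix-like system. It’s only meant to illustrate the principle, not how Resque is written; run_job_in_child and the environment hash are hypothetical stand-ins, and the “environment” is just a hash pretending to be a loaded Rails app.

```ruby
# Run a block in a forked child process and return what it produced (as a string).
# The child inherits everything the parent has already loaded, so nothing is re-booted.
def run_job_in_child(environment)
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.write(yield(environment).to_s)
    writer.close
    exit!(0) # leave the child immediately, skipping at_exit handlers
  end
  writer.close
  output = reader.read # blocks until the child closes its end of the pipe
  reader.close
  Process.wait(pid)
  output
end

# Pretend this is the slow Rails boot: it happens once, in the parent.
environment = { booted_at: Time.now }

puts run_job_in_child(environment) { |env| "environment available: #{env.key?(:booted_at)}" }
```

Anything the child changes in memory is thrown away when it exits, which is what gives each job its fresh environment.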

Resque-Scheduler

For many people, Resque alone will fit the bill. But certain situations also call for an explicit delay, e.g. “send this email reminder in 5 days”; or repeat a task, e.g. “generate a fresh report at 8am each day”. That’s where Resque-Scheduler comes in [2].

Resque-Scheduler was originally part of Resque, so it basically extends the Resque API. The “schedule”, i.e. the set of repeated tasks, is represented as a cron-like hash structure and can be conveniently kept in a YAML file.

Delayed jobs are created by your application code. It’s basically the same call as when you add the job directly to Resque, but you need to specify an additional delay or time argument.
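As a sketch of both flavours, here is what a schedule entry might look like, parsed from YAML into the hash structure mentioned above. The job names (GenerateReport, SendReminder) and queue name are made up. The Resque calls are only shown in comments since this block doesn’t load the gems, and 5.days assumes ActiveSupport as you’d have in a Rails app.

```ruby
require "yaml"

# A cron-like schedule for a repeated task, as it might appear in a YAML file.
schedule_yaml = <<~YAML
  daily_report:
    cron: "0 8 * * *"
    class: GenerateReport
    queue: reports
    description: Generate a fresh report at 8am each day
YAML

schedule = YAML.safe_load(schedule_yaml)
puts schedule["daily_report"]["cron"]

# With resque-scheduler loaded, the schedule would be handed over with:
#   Resque.schedule = schedule
#
# Delayed (one-off) jobs come from application code instead:
#   Resque.enqueue_in(5.days, SendReminder, user_id)     # relative delay
#   Resque.enqueue_at(some_time, SendReminder, user_id)  # absolute time
```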

The cool thing is jobs are persisted into Redis, so they will survive if the system, or any of its components (Redis/Resque/Resque-Scheduler), goes down. I was confused at first as I thought they were added to some special Resque queue. But no, they are actually in the Redis database. I found this by entering keys * into Redis’s command-line tool (redis-cli), which yielded some structures including “resque:delayed:1372936216”. When I then entered dump resque:delayed:1372936216, I got back a data structure which was basically my job spec, i.e. {class: 'FeedHandler', args: ['http://example.com']}.

So Resque-Scheduler basically wakes up every second or so, and does two things: (a) polls Redis to see if any delayed jobs should now be executed; (b) inspects its “schedule” data structure to see if any repeated jobs should now be executed. If any jobs should now be executed, it pushes them to the appropriate Resque queue.
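Part (a) of that loop can be sketched with a toy model. Delayed jobs are bucketed by Unix timestamp, mirroring keys like “resque:delayed:1372936216”, but here the buckets live in a plain Ruby hash rather than Redis. promote_due_jobs is a hypothetical name, and the second job is invented for the example.

```ruby
require "json"

# Move every delayed job whose timestamp has passed onto its target queue.
# Returns the number of timestamp buckets that were promoted.
def promote_due_jobs(delayed, queues, now)
  due = delayed.keys.select { |timestamp| timestamp <= now }.sort
  due.each do |timestamp|
    delayed.delete(timestamp).each do |payload|
      job = JSON.parse(payload)
      (queues[job["queue"]] ||= []) << job
    end
  end
  due.size
end

delayed = {
  1372936216 => [JSON.generate("class" => "FeedHandler", "args" => ["http://example.com"], "queue" => "feeds")],
  1372939999 => [JSON.generate("class" => "FeedHandler", "args" => ["http://example.org"], "queue" => "feeds")]
}
queues = {}

promoted = promote_due_jobs(delayed, queues, 1372936216)
puts "promoted #{promoted} bucket(s); feeds queue now holds #{queues["feeds"].size} job(s)"
```

Only the bucket whose time has arrived is moved; the later one stays put until a future tick. From then on it’s an ordinary queued job that any worker can pick up.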

Notes

  1. Conceptually a job definition is little more than a function definition, rather than a full-blown class. But being a class is the more Rubyesque way to do it and also makes it easy to perform complex tasks as you can use attributes to hold intermediate results, since each job execution will instantiate a new job object.

  2. I evaluated other tools, e.g. Rufus and Clockwork, but what appeals about Resque-Scheduler is it persists delayed jobs and handles both one-off and repeated jobs.

What Everyone Should Know About REST: Talk at Web Directions Code

Here are the slides:

Slides: What Everyone Should Know About REST

Sketchnotes

Thanks UX Mastery for the sketchnotes, they are awesome! (Seriously, I would be much more swayed to speak at any conference with sketchnotes because it’s a straightforward permanent memento, a better snapshot than slides or video.)

Overall, it was great to be associated with another fine Web Directions conference and the Melbourne Town Hall venue was amazing. I only regret that we were so busy scrambling on the Android app, after launching just a few days earlier, to be around the whole time. But this being my hometown — I’ll be back!

Talk Structure

I spoke at Web Directions Code on Friday, a talk on REST. I’ve been putting a lot of this into practice lately, and the talk was really an attempt to convey the main practical things every developer should know. The structure was:

  • Everyone should know about REST because it’s not just about websites anymore. Devices, whether computers, fridges, or wearable glasses, are connected, and device-to-device communication happens with web standards, i.e. HTTP. The talk covered three things about REST: Simplicity+Consistency; Security; Caching.
  • Simplicity+Consistency: Emphasising Developer Experience (#devexp) was a way to frame the general concepts, i.e. URLs, HTTP methods, response types.
  • Security: How the web is becoming SSL-only, and various authentication schemes. I referenced the latest Traffic and Weather, which has a good discussion on this.
  • Performance+Scalability: Mostly about caching. I’ve been musing on REST caching quite a bit for Player FM’s API (most recently thinking about a kind of reverse patch protocol, where the server can send out diffs that get cached), and explained some of the standards and tricks for squeezing efficiency out of the network.

What Wasn’t Covered

  • I didn’t go into the REST acronym or the general theory of REST as an architectural pattern arising from specific forces.
  • SSL and caching. Good Twitter conversation afterwards about this point, that you can’t cache in the middle of an SSL connection. The answer is to split the connection in the middle and run SSL on either side, with a trusted cache seeing plain-text in the middle. This is how Cloudflare works, and the CEO Matthew Prince chimed in to say it will be free soon. (At least, SSL from client to Cloudflare.) So that means the SSL-protected web could triple overnight.

JavaScript swims downstream with the web

Roy Fielding’s original REST dissertation (published in 2000) has an interesting section on Java versus JavaScript, which I’ve not come across before and has certainly stood the test of time. In particular, the biggest benefit is explained to be nothing more complicated than performance, obviously a huge deal these days.

Extract below with obligatory PSD artwork. Emphasis mine.

[Background: Speaking about REST at Web Directions Code soon and finding myself on a long day of interstate flights, I bit the bullet and finally read Roy Fielding’s original REST thesis cover-to-cover. And if anyone has a good way to export highlights from a personal document in Kindle, please let me know as it’s apparently unsupported.]

6.5.4.3 Java versus JavaScript

REST can also be used to gain insight into why some media types have had greater adoption within the Web architecture than others, even when the balance of developer opinion is not in their favor. The case of Java applets versus JavaScript is one example.

The question is: why is JavaScript more successful on the Web than Java? It certainly isn’t because of its technical quality as a language, since both its syntax and execution environment are considered poor when compared to Java. It also isn’t because of marketing: Sun far outspent Netscape in that regard, and continues to do so. It isn’t because of any intrinsic characteristics of the languages either, since Java has been more successful than JavaScript within all other programming areas (stand-alone applications, servlets, etc.). In order to better understand the reasons for this discrepancy, we need to evaluate Java in terms of its characteristics as a representation media type within REST.

JavaScript better fits the deployment model of Web technology. It has a much lower entry-barrier, both in terms of its overall complexity as a language and the amount of initial effort required by a novice programmer to put together their first piece of working code. JavaScript also has less impact on the visibility of interactions. Independent organizations can read, verify, and copy the JavaScript source code in the same way that they could copy HTML. Java, in contrast, is downloaded as binary packaged archives — the user is therefore left to trust the security restrictions within the Java execution environment. Likewise, Java has many more features that are considered questionable to allow within a secure environment, including the ability to send RMI requests back to the origin server. RMI does not support visibility for intermediaries.

Perhaps the most important distinction between the two, however, is that JavaScript causes less user-perceived latency. JavaScript is usually downloaded as part of the primary representation, whereas Java applets require a separate request. Java code, once converted to the byte code format, is much larger than typical JavaScript. Finally, whereas JavaScript can be executed while the rest of the HTML page is downloading, Java requires that the complete package of class files be downloaded and installed before the application can begin. Java, therefore, does not support incremental rendering.

Roy Fielding

Testing HTTPS Locally

As I’m migrating the player over to HTTPS, one challenge is partial content, leading to an incomplete padlock and strikethrough domain warning like this:

And the harsh but fair warning, upon inspection: “However, this page includes other resources which are not secure. These resources can be viewed by others while in transit, and can be modified by an attacker to change the look of the page.”

So to fix this locally, a nice setup for Ruby/Rails devs is Pow + Tunnels. Both are super-simple to set up.

Pow is a local server, so if you usually run Rails on http://localhost:3000, you can one-click install Pow and all you need is to symlink your Rails folder to ~/.pow. Then you have a local server, sans port, like http://player.dev. Then, just install Tunnels and it will simply pipe https://player.dev into http://player.dev.

Now you can open Chrome devtools’ resource tab and fish out any connections which are still http. Ideally host them locally, or at least change the links to https ones, at possible loss of cache performance. Still, did you see various posts recently about ISPs injecting crapware script tags into people’s pages? OMG I know right! Seriously, https-everywhere is where the web is heading. Even public sites aren’t immune.
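If you’d rather not eyeball devtools, a rough first pass is to grep the rendered HTML for resources loaded over plain http. Here is a small sketch of that; insecure_resources is a hypothetical helper, the sample URLs are made up, and the regexes only catch src attributes and link tags in static markup, so anything injected by scripts will still need the devtools check.

```ruby
# List resource URLs in an HTML string that are still loaded over plain http.
# Ordinary <a href> links are ignored, since they don't trigger the padlock warning.
def insecure_resources(html)
  src_pattern  = /<(?:script|img|iframe|audio|video|source)\b[^>]*\bsrc=["'](http:\/\/[^"']+)["']/i
  link_pattern = /<link\b[^>]*\bhref=["'](http:\/\/[^"']+)["']/i
  (html.scan(src_pattern) + html.scan(link_pattern)).flatten.uniq
end

page = <<~HTML
  <link rel="stylesheet" href="http://cdn.example.com/site.css">
  <script src="https://player.dev/app.js"></script>
  <img src="http://img.example.com/logo.png">
  <a href="http://example.com/">An ordinary link</a>
HTML

puts insecure_resources(page)
```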

Post hoc ergo propter hoc: Posterous Moved

For some years, I ran a little posterous blog called Mini Software As She’s Developed, the little brother to this blog’s great uncle’s step-cousin. It was effectively a pastebin to throw random things at. Now that Posterous is to be decommissioned, I’ve migrated it to an archival blog on WordPress.

It’s archival because these days, my glorified pastebins you shouldn’t subscribe to are:

  • My Gist Stream.
  • My Notes Community. You see, I’ve stumbled onto a nice Posterous replacement, which is a Google Plus community. Google Plus pages can work that way too, but they are more cumbersome to deal with, requiring a new window each time. G+ is not a Posterous replacement in the sense that you can’t mail things to it, or tweet them etc, but it’s quick to share stuff and works nicely from mobile. Plus one thing Posterous never sorted out was commenting – it required a Posterous login. Whereas anyone on Plus can leave a comment. So it’s turned into a nice way to create a public thread, but one that won’t spam followers when published.

Blinking WebKit

  • Speed. When Alex Russell talks about greater speed [1], I take it fractally. At micro level, it means actual day-to-day web development and debugging is faster; and at macro level, it means browsers and web standards move faster. Google works the same way; it is a company which cares deeply about speed; at macro level, that means pushing Kurzweil’s broader interpretation of Moore’s Law to its limit, and at micro level, it means great victory for every nanosecond that can be shaved off a search query.

  • Inevitable. The writing has been on the wall for years. Chrom{e/ium} has been heavily driving WebKit and it’s only natural they should want to lead the project. Cutting-edge WebKit is already there on desktop and mobile; in the future, it will need to be there in more contexts, i.e. Android webviews, Google TV or what becomes of it, Glass, cars, etc.

  • Dart. I can’t get a grip on how much Dart is growing, I’m too out of the loop. But if it is indeed growing to the point that it gets to survive and be blessed internally, it will be part of Blink. No question.

  • Safari. I’ve read some people say to the effect “you’re doing it wrong if not already testing on Safari as they’re already different”. Well yeah if you’re writing a mission-critical trading app. But let’s be honest; this business about testing on all browsers comes with a big wink and a sizeable nudge. Most of us can and do get by testing only occasionally on Safari. Even more so for Windows developers who don’t even have access to a modern Safari. I don’t see Apple adopting Blink anytime soon, I’m not even sure the importance of this fork will filter up to Apple’s seniors for some time. And this is a good old fashioned fork; WebKit and Blink will be significantly different. So the net effect for developers is more testing on Safari. And compensated by less testing on …

  • …Opera. My heart sank a little for Opera on reading this news; so it’s good to know Opera was in on the secret. If not when they made the decision to adopt WebKit, then at least some point before the Blink news dropped. Blink will certainly be stronger for Opera’s contributions.

  • Samsung. Samsung has to be considered a major part of today’s browser ecosystem. They get to pick the browser that goes into most smartphones after all, and it’s no secret they are on a collision path with Google. Last night’s news of a major collaboration with Mozilla (on Servo) is more evidence of that. Should Samsung start shipping Firefox as the default browser, the web really will have four major mobile engines (including IE here). It feels like battle lines have been drawn, but that’s probably more about the coincidence of timing. Also worth mentioning Amazon as a similar company with potential to grow into a major influence on the web ecosystem, via Silk. One can assume they will adopt Blink.

  1. http://infrequently.org/2013/04/probably-wrong/

Shorthand Parameters

Here is a weird abuse of default variable values to support shorthand variable names. It’s valid Ruby.

  def area(r=radius)
    Math::PI * r * r
  end

Simple example, but you get the point. It lets you tell the external world what a parameter is all about, but keeps the implementation shorthand. Parameter names can be much more verbose than this and functions can be longer, so you don’t want to keep repeating a long name. For example:

  def damage_level(force=force_exerted_by_car)
    force = 0 if force < 0
    acceleration = force / mass
    # …
  end

Now you might say “just declare it in the first line”, but I prefer small code and there could be several such lines.

You might say “mention it in a comment”, but I prefer self-documenting code. Comments go out of date and clutter up code. (Strictly speaking, the long name here is a comment, but it’s more likely to be maintained.)
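One thing worth knowing about this trick: Ruby only evaluates a default value when the argument is left out. So the long name acts purely as documentation as long as callers always pass a value. The sketch below shows both cases, assuming nothing called radius is defined anywhere.

```ruby
def area(r=radius)
  Math::PI * r * r
end

# Passing the argument works, and the default expression is never evaluated.
puts area(2)

# Omitting it makes Ruby evaluate `radius`, which doesn't exist here.
begin
  area
rescue NameError => e
  puts "area with no argument fails: #{e.class}"
end
```

So the parameter is effectively still required; the caller just gets a less obvious error than the usual ArgumentError.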

[Update: I don’t often mention Pi, but when I do, it’s on March 14: Pi Day. Thanks to the reader who pointed it out!]

Revoking OAuth Tokens From Google, Twitter, etc

The URLs below let you manage and revoke permissions you’ve given to third parties via Google, Twitter, Facebook, etc. It’s not only useful for security, but also for testing while developing such tools. By deleting the connection, you can see what a user will see the first time they connect.

Mainly writing this because I keep searching for these things and don’t have much luck (as I forgot about MyPermissions). Being personalised URLs, they don’t show up in searches (which is a wasted opportunity, since they are all static and could have just been public placeholders). Hopefully this post will show up in the future when I search for “revoke OAuth Tokens”. You can find links to more of these services on MyPermissions.

A Bash Logging Utility

With a long-running script, it’s convenient to see checkpoint log messages indicating what stage it’s at and how long it’s taken.

Most scripts simply run date to show the boring long date format: Fri Mar 29 21:07:39 MST 2002. Info overload! You don’t want to know what month it is, whether you’re in the middle of a weekend, or what timezone you’re in! More to the point, you want to know how much time has elapsed, not what time it is now; you want to know the script’s age.

So here’s a little utility to make it easy. Just call “age” and it will output time since the script began in 00:00:00 format.

I also made another function “announce” which you can use to announce the current function is running. With larger bash scripts, I tend to break them into functions with a list of calls at the bottom; so I can quickly bypass unnecessary crunching by commenting out the call. “announce” makes it easy to see which is running. And if you wanted, you could easily automate announcing for each function…making aspect-oriented Bash the place to be.

No, let’s not use that date format

Doing the rounds is XKCD’s endorsement of the ISO 8601 date format. Let’s avoid that, because as another XKCD reminds us, you don’t just invent new standards in the hope of wiping out the old ones.

I don’t know how serious the proposal is, but I’ll bite:

  • 2013-02-27 is used by no-one; so it will confuse everyone.
  • Real people don’t use leading zeroes.
  • It’s still ambiguous. Given dates are already messed up, there’s really no reason to assume 2013-02-03 is the logical MM-DD order.

No, the real answer is either include the month name (or abbreviation), or (in a digital context) use the “N days ago” idiom. (Note that “N days ago” does suffer from one major issue, which is it goes stale if caching the content.)

Sure, if the context is filenames or something technical, use this format or just plain old 20130227, which will sort nicely (https://twitter.com/CastIrony/status/307014830752149504). I often do use this format for backups. But for humans, stick to what they know.
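For what it’s worth, here is how those options might look in Ruby. The format strings are standard strftime directives. days_ago is a hypothetical helper that assumes the date is not in the future, and it takes “today” as an argument so the output doesn’t depend on when you run it.

```ruby
require "date"

date = Date.new(2013, 2, 27)

puts date.strftime("%-d %b %Y") # month abbreviation, no leading zero: 27 Feb 2013
puts date.strftime("%Y%m%d")    # compact, sortable form for filenames: 20130227

# The "N days ago" idiom; remember it goes stale if the output is cached.
def days_ago(date, today)
  n = (today - date).to_i
  case n
  when 0 then "today"
  when 1 then "1 day ago"
  else "#{n} days ago"
  end
end

puts days_ago(date, Date.new(2013, 3, 2))
```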