Migrating WordPress

After another WordPress migration [1], here’s my 2p on WordPress migrations: Don’t even think about following WordPress’s standard backup-and-restore instructions. The Duplicator plugin captures everything in one go, including comments, themes, etc as it dumps both the database and the file structure in a single archive. Also, you can easily scp the dump onto the new machine; whereas the usual procedure requires rebuilding WordPress, installing a plugin, and then worrying about the dump being too big when you try to upload it over HTTP (as there’s still no obvious way to scp and restore WXR files, really!)

Duplicator’s restore phase isn’t too clear from the docs, but I found the answer here. You don’t have to manually install WordPress. You simply create a new temp folder somewhere under web servers public/ folder, open your browser, and navigate to the install.php file there in this folder. It will do the rest for you!

  1. Now that Linode has dropped its minimum box to $10/month (and a decent spec at 1GB RAM with SSD storage), I decided to consolidate and move the utility box on Linode instead of DigitalOcean.

Logging the caller

It’s so useful to automagically log caller line number and so on. Most languages make it possible via hack involving throwing an exception to yield a stack trace. Some languages explicitly provide this info. In Ruby, it’s possible with the caller array.

Here’s how I used it just now:

  1. def logg
  2.   caller_info = caller[0].gsub! /^.+\/(.*)\.rb:(.*):in `(.*)'/, '\\2:\\3:\\2'
  3.   Rails.logger.debug "[#{caller_info}] - #{id}. Thread#{Thread.current.object_id.to_s(36)}"
  4. end

This will output caller_info in the format: [series_feed:fetch:123]. Which is the file:method:line_number. It’s derived, via the initial regex, from the slightly less log-friendly caller string, path/to/series_feed.rb:123:infetch’.

How to use different favicons for development, staging, and production

It’s useful to have different favicons for each environment during development, as Pamela recently pointed out. Here’s how I do it.

First, generate the images. Most graphics editor have some kind of Colorize tool and if you do it often, use ImageMagick to change colors programmatically. If you’re lucky enough to have a green brand logo, a cool combo would be traffic lights. Justsaying.

Now, set the favicons according to your environment. Should you just use /favicon.ico location or point to the location in head? The answer is both. /favicon.ico is useful for third-party plugins whose HTML you don’t control, as well as quick hack pages you can’t be bothered configuring with proper metadata. A link tag in html head is useful for revving the icon when it changes, to bust any cached version. So use both.

To set favicon in HEAD, ensure the path is dynamically generated instead of hard-coded:

  1. %link(rel="icon" href="#{favicon_path}" type="image/x-icon")

The path is generated like such as:

  1. def env_suffix
  2.   Rails.env.production? ? '' : "-#{Rails.env}"
  3. end
  4.  
  5. def favicon_path
  6.   asset_path "favicon#{env_suffix}.ico"
  7. end

It’s arguably bad form to use “production” strings in production, so the env_suffix nonsense above will hide it. If you don’t care about that, just call your production icon favicon-production.ico and save yourself a little hassle.

As mentioned earlier, we also want just /favicon.ico to exist. A quick way to do that is copy it when the apps starts or define /favicon.ico as a route and serve the file as favicon_path, defined above, with sufficient cache expiry time (e.g. 1 day). e.g.:

  1. def favicon
  2. path = ActionController::Base.helpers.asset_path "favicon#{env_suffix}.ico"
  3. send_file path, type:"image/x-icon", disposition:"inline"
  4. end

For bonus points, you might want to use similar techniques to provide overriding stylesheets for development and staging. Then you can introduce a text label or change background color, etc.

Rails quirk: Adding to DateTime

Just came across a weird one with DateTime. Adding an int value will increment by a day, not a second as you may expect:

  1. $ x=1
  2. 1
  3. $ x.class
  4. Fixnum
  5. $ DateTime.new(2000, 1, 1, 0, 0, 0)+x
  6. Sun, 02 Jan 2000 00:00:00 +0000

But adding by a second is possible using explicitly 1.second. The strange thing is both inherit from FixNum and essentially act as the same number. So if you want it to mean seconds, one way to achieve it is use “x.seconds”.

  1. $ x=1.second
  2. 1 second
  3. $ x.class
  4. Fixnum
  5. $ DateTime.new(2000, 1, 1, 0, 0, 0)+x
  6. Sat, 01 Jan 2000 00:00:01 +0000

How HTTP client libraries should support redirects

I don’t think I’ve ever come across an HTTP library (and I’m including XMLHttpRequest here) that covers all bases on redirects. In the hope of pushing the field forward, here is my wishlist. I’m limiting this to just GETs as it gets even more complicated with the mutating actions. (Several browsers do XMLHttpRequest wrong in the case of POST redirects.)

  • Follow redirects by default (it’s the most typical need after all), but provide an option to turn that off.
  • Allow a limit on redirect-following recursion, to prevent too much redirecting [1].
  • Prevent infinite redirect cycles. Which arise due to misconfiguration or malicious denial-of-service efforts [1].
  • Report back the chain of redirection, with corresponding status codes (ie was it permanent?).

It’s really the last point that’s universally missing. For services like Player FM which need to retain a record of permanent redirects, it’s necessary. The lack of this features means having to write custom code to follows redirects. Which isn’t a giant burden, but it’s error-prone, a pain to test, and has to be done for every kind of resource.

  1. The recursion limit requirement has the side benefit of preventing infinite redirect cycles, so that alone may be adequate for a library. But it’s worth treating a separate requirement because: (a) it can be even more optimal if handled separately (ie it can terminate even before the limit is reached); (b) if the library does rely on some limit to prevent infinite recurison, the library should nevertheless enforce some maximum, e.g. 100, and also set some sane default, e.g. 5.

Photo Credit: Smabs Sputzer via Compfight cc

Key-based cache expiry: A developer’s primer

Key-based cache expiry is a powerful pattern for efficient and reliable caching. I’ve been using it on for some time now after reading DHH’s original post and it’s worked well. This post explains the more conventional approaches to explain how key-based caching arises as a fruitful alternative.

So then, how does caching normally work, and what’s so bad about that anyway?

Clear after some time. Sure, you can say “this stock price table expires in 5 minutes” and then re-render it every 5 minutes (or longer if no-one immediately requests it). The problem is, you’re often making a big compromise on both ends. On the one hand, you’ll end up with stale results when the stock price changes during this cache window, e.g. if it changes 2 minutes after you serve it, you’re sending wrong data for another 3 minutes. And on the other hand, what if it only changes once a day? Most of the time you’re needlessly re-retrieving data, re-calculating and re-rendering it. Wasteful.

Clear manually. Seeing the problems of time-based expiry, you could be tempted to just keep the cache up to date manually. So when you’re notified of a price change, you explicitly update the value in the cache. This will certainly be more efficient, but the logic gets pretty complex as you scale up. You end up with NxM code complexity as all N bits of code needs to be aware of which M cache items could be affected by changes.

So one technique is easy but inefficient and inaccurate; the other is efficient and accurate, but hard. Let’s see if we can find a sweet spot which is easy AND efficient AND accurate.

Don’t clear. With key-based cache expiry, everything’s put there forever and never cleared. How is that possible? Because it takes advantage of the cache’s built-in automatic expiry mechanism. We must use a cache, such as Memcached or Redis, which supports some kind of expiry based on least-recently-used (LRU) or similar selection. In that sense, we have reached our application’s sweet spot by offloading complexity to the cache framework.

How this works is the keys must reflect a version or timestamp of the object being cached, e.g. a key might be “article-123-201404070123401″, generalised as “type-id-timestamp”. Normally clients won’t request the object by version, so you’ll need to do a quick lookup to find the object’s latest version or timestamp [1]. Then you retrieve it from the cache, or write through to the cache if it’s not already present. And the important thing is you write to it with infinite expiry.

The technique can be used at many levels – HTTP caching, memcached, persistent databases. I first asked about it here and I’ve since used it effectively in production on Player FM’s website and API. It’s certainly how various frameworks handle asset serving (ie compiling CSS with a timestamp and so on), and it’s also an official part of Rails 4, and I expect other frameworks in the future. So it’s a pattern programmers should be familiar with in an era where performance is held in high esteem, and rightly so.

  1. Looking up timestamp or version is work you don’t have to do with manual expiry, so it’s again a trade-off that makes this slightly less efficient, but a lot easier. Furthermore, if you arrange things right, you can actually have clients request the latest version/timestamp for all but the original resource (when they are requesting several resources in succession).

Photo Credit: Paolo Margari via Compfight cc

Load-balancing Rails with Nginx

Well this was some fine undocumented black magic. I’ve got Player FM behind a load balancer now, using the following Nginx config. I’ll explain some more about the overall upgrade later.

# App load balancer

upstream playerhost {
 server 192.168.1.1;
 server 192.168.1.2;
}

server {

  server_name playerhost;

  location / {

    proxy_set_header Authorization "Basic blahblahblah==";
    proxy_next_upstream http_500 http_502 http_503 http_504 timeout error;

    # http://stackoverflow.com/questions/16159998/omniauth-nginx-unicorn-callback-to-wrong-host-url
    proxy_set_header Host $http_host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header Client-IP $remote_addr;
    proxy_set_header X-Forwarded-For $remote_addr;

    proxy_redirect http://playerhost http://player.fm;
    proxy_redirect https://playerhost https://player.fm;
    proxy_pass http://playerhost;

  }
}

Notes

  • I recommend using a distinct name for the backend (I’ve used “playerhost”). Most tutorials use the non-descript “backend”, but it’s a useful indicator if something’s going wrong as you’ll see this ID pop up in URLs and HTML content.
  • You don’t have to use basic auth for the backends. You could just firewall them off the public internet or deal with them being public. Being public is not clean and will cause the site to be hit by search bots and so on unnecessarily. But closing it off altogether is not ideal, because it’s useful for diagnostics to go into the backend servers. So I expose them via basic auth. The “blahblahblah” is base64 of your basic auth username:password.
  • The site mostly worked without set_header, but I had some weird oAuth redirect problems and occasional HTML problems. In both cases, I’d be seeing that “playerhost” URL instead of the actual domain. This fixed it.
  • The proxy_redirect commands were an earlier attempt to fix https redirections, and worked, but still left the problem mentioned in the previous point. It may not be necessary at all, after adding the set_headers. I haven’t tested that yet.

Photo Credit: Graham Cook via Compfight cc

Why my Nexus is fantastic and why my next phone won’t be a Nexus

My Nexus 5 is great. It runs pure Android, it’s super-fast with Kitkat, screen is great, and it was great value as a SIM-only purchase. I can be confident it will always be running the latest Android too, which means not just more toys but improved security. And it’s almost a necessity as an Android app publisher to own a Nexus device, for testing purposes (pure Android is the starting point for all other devices/OSs to deviate from, so Nexus with stock Android is the least deviation from the sum total of all things Android).

So why will my next phone not be a Nexus?

One word: tethering. Many people claim they can make a day without charging their phone. I can too, with the right settings. But not if I want to tether. Tethering drains battery hard, not surprising that turning your phone into a modem/router would do that. Not that Nexus battery is bad at all, it’s probably about average for a high-end. But forget about lasting a day when tethering.

The thing is, you see these products like “Kindle+3G”, “iPad with data plan”, and think why bother. I have true-unlimited 4G for ~ £20/month (thanks Three) and a phone capable of sharing it with any device I damn please. As well as Kindle e-reader and tablets, I’m sometimes testing other phones and devices which either aren’t phones (e.g. iPod touch) or are cheapo PAYG phones without a data plan. Sometimes others need to grab a connection or I need to work on PC too. All of these things become full-fledged smartphones through the magic of tethering.

Similarly, if I go abroad and get a local SIM, that’s another time I really want to tether. I can bypass silly hotel internet altogether by getting a local SIM and sharing the connection.

Bottom line, I want to tether without having to worry my phone won’t last the morning. So a phone without replaceable battery doesn’t cut it. I seriously miss being able to carry a battery in my pocket and another in my bag, pretty much guaranteeing there will always be charge. Sure there are various portable ways to charge on the go, I know them well and use them all the time. It’s not the same as having portable batteries. AKA Sod’s law ensures you won’t have it when you need it.

I only wish the manufacturers would embrace it and provide front-loading slots instead of forcing me to rip off a fragile plastic lid every day. And support hot-swapping (which IIRC Nexus S did, but nothing since).

So my next phone is likely to be a Samsung or HTC, one with portable battery and plenty of charge. Or a Nexus if it does indeed support battery changing. But it seems the priority is understandably on keeping the product simple and as cheap as possible. That means a single battery for life.

Linode API – Reference Data

Reference data in Linode API (and things built on top of it like Ansible’s module) isn’t really documented anywhere, so I’m dumping some of it here using the Ruby Linode gem. I assume this stuff is the same for all users (theoretically some plans or data centres might vary, but I doubt it). Of course it will go out of date as Linode adds new options, so feel free to fork it and re-run using the setup code provided there.

Easiest to read the raw Gist here