Google to embrace Swift?!!

In other news, Microsoft to embrace Linux. Hahahaha ha ah oh wait.

NextWeb reports Google’s considering Swift as a first-class Android language. I’m not surprised companies like Facebook and Uber are embracing Swift, as it’s sufficiently open and highly attractive in an environment where iOS is king.

But no, Google won’t embrace it. Just because something is open source doesn’t mean everyone has equal influence. There’s still control at the top of the repo, and this is why Google forked WebKit: an effort to control its own destiny instead of relying on the very same company that controls Swift.

Furthermore, Google and Android are still engaged in a tired, ongoing battle with Oracle over Java, another language that is open – for varying definitions of “open”. And Oracle isn’t even a direct competitor.

It’s true that there’s a lot to be said for a more dynamic, scripty language on Android. While Android Studio has done a lot to improve the developer experience for Android devs, much of the work is exploratory UI, something a language like Swift can help with.

If Google were to embrace a dynamic language, and assuming they don’t start from scratch, there are really only three contenders: Go and Dart, since Google completely control them (more likely Dart as it’s more suited to UI). And JavaScript, since it’s immensely popular and under no one company’s control (and as a bonus, Dart is designed to compile nicely to JS, so Google can still support it as a higher level alternative). Just as Swift has made iOS a much more approachable platform for casual developers, embracing the world’s most popular language could be a nice boost for developer traction in the Android world.

Progressive Web Apps have leapfrogged the native install model … but challenges remain

I visited Google’s Progressive Web Apps shindig the other day and I’m pleased to see the progress browsers have made towards appiness in the past 2 years. Much of what I wished for in what were frankly the web’s darkest days is now available in 3 major browsers and counting.

The install model has truly gone from being a non-starter to something that more closely maps the needs of users than any native platform has achieved. Here I will reflect on the current state of the web as a platform for apps, identify some remaining concerns, and propose where the biggest wins will come from.

The web can do apps

While the HTML5 era (2008-2012, say) introduced many conventional app components, they emerged in parallel with the mobile revolution. I don’t say “so-called revolution”. It was an actual revolution in every way, changing forever how apps are designed, developed, and distributed. HTML5 was already ushering in a flood of new tech; it was never going to be possible to make it all mobile-savvy at the same time, and this led to a world where people went crazy for apps.

The progressive web movement plugs the gap in several ways:

  • Push notifications. These are the lifeblood of many apps. Critical to the functionality of messaging apps like Skype and Slack, and – in a world of fickle users and high churn rates – vital to retention for all apps.
  • Background processing. Doing stuff when the user doesn’t have your app open is also vital for a modern app’s functionality, performance and offline capability. This is about the app acting as your digital assistant, not just something you interact with for the time it’s on the screen in front of you.
  • Low-level APIs. As part of the extensible web manifesto, developers now get access to the low-level innards of the web stack. This not only helps the standards to evolve, but lets developers deliver useful functionality unanticipated by committee-driven standards processes.

Furthermore, it comes at a time when browser performance is strong and web debugging tools are built into all the major browsers and have become stupendously useful. All of which means, it’s now possible to replicate the functionality and interface of many popular native apps.

But how will users install web apps?

Until the progressive web movement came along, websites never had a chance on mobile. Browser bookmarking features were about the same as in 1995 – just a flat list. Users didn’t know how to install to the homescreen, and even with libraries prompting it, there was no confidence the site would work offline. Offline tech itself was limited, and the website couldn’t do anything in the background, as mentioned above.

Now – with progressive web apps – two things have changed. Firstly – at least on Android – web apps have been elevated in their presence. The task switcher presents each recent website alongside each recent native app; they are all equivalent. The traditional 38-step “add to homescreen” process has been replaced by a simple menu item in Chrome. And most importantly, the browser will proactively prompt an install.

The progressive web’s install model rocks

A common argument for native apps has been the importance of app stores (insert $legally_acceptable_synonym for app store on your platform of choice). The web’s counter-argument has heretofore been either “Ah but the web has search engines” or “Ah but there are x billion apps on the app store and only the top 10 get any installs”. Neither of these arguments holds much water for me.

SEO is a real thing and search rankings are as much or as little a meritocracy as app rankings are. For every startup blowing $5K a day on Facebook app install ads, there’s another startup paying for fake forum posts in the hope of Google juice. The “long tail” argument also applies just as much to the web.

So how does “progressive web” improve things? By letting users progress from a fly-by visit to a fully installed app, on their own terms.

The conversion funnel from “non-user” to “user” for a traditional app looks like this:

  1. User discovers app.
  2. User installs app. Waits a minute or two for download. App is installed.

Getting users from (1) to (2) is extremely hard for developers. Most users don’t want to clutter up their phone with hundreds of apps, don’t like to go through the hassle of downloading the app, and don’t want to feel the remorse of installing a lemon (even a free lemon). Pulling off a large install base relies on a difficult-to-achieve store ranking, a viral loop that is fleetingly rare in practice, or several dollars of ad spend per paid install.

The progressive web install looks like this:

  1. User discovers website.
  2. User kicks the tyres. Grants one or two permissions.
  3. User re-visits or re-stumbles on the website a few times.
  4. User says yes when the browser prompts a homescreen install (or installs it explicitly).

This is a much more logical transition. Instead of making the install decision based on the store listing, the user makes the decision based on actually interacting with the app. At no point are they obligated to install it, but as they gain confidence in it, they can decide to do so. From the developer’s point of view, it’s easier to win long-term users if you have a product that’s compelling (and if not, why do you care about installs? You will lose users anyway if you don’t have good retention).

Admittedly, going from (3) to (4) is still hard. If the user hasn’t yet installed the app, how confident can you be that they will re-visit your site often enough to be prompted? Part of the answer comes from background notifications, which means the user can still be engaged even without that install. As well, if the user’s social contacts keep recommending the same app, it will likely lead to an install prompt. Compare that to a native share, which would often lead to nothing unless the content was amazingly compelling.

Indeed, it’s likely developers will care less about installs in general, as long as they still have users who are engaged via notifications and spontaneous interactions they wouldn’t otherwise see. Spontaneous usage is particularly compelling when you consider the physical web. There’s no way I’m going to install an app for the restaurant I’m in or the airport I’m flying to … but I’ll gladly open a rich web app from a prompt that shows up on my phone.

Remaining concerns

So that’s it huh? The web came back in the third act and triumphed. No. Not even. There are still many challenges ahead.

Challenge: Apple (“There’s a 586 billion dollar elephant in the room and it’s not happy Jan”)

Here are some facts about Apple, which – when combined – lead to skepticism about any efforts to progress the web:

  • Apple and iOS are hugely influential. It’s the platform companies care about most when it comes to development efforts.
  • Browser innovation on iOS is controlled by Apple. Google, Mozilla, and Opera may produce their own iOS browsers, but they will still run on Apple’s engine, and any progressive web tech on iOS therefore relies entirely on Apple’s whims.
  • Apple’s incentives for browser innovation are “mixed” at best. While it wants a great user experience – including web interaction – it also has a lot to fear from a platform beyond its control that lets developers “write once, run many”.
  • Apple is sticky. Most mainstream users don’t care about detailed OS features like homescreen widgets and notification interfaces. Even the tiny iPhone 4 screen wasn’t enough to make most users look elsewhere, and that was at least a clear, visible advantage for the competition. So how much will users care about a better web experience if they can still get the same apps natively? It’s a moderate OS benefit and may help Android (and other platforms) with retention, potentially clawing back market share in the long term, but it’s unlikely to sway many users away from Apple. It’s too difficult a concept to even explain, let alone for anyone to really care about if they aren’t already using it.

Don’t hold your breath for an iOS which supports native-level video calling, gaming, and podcasting. Some features will make their way over time, but by then, there will be even more features on both native and – on other platforms – web. The only question worth asking is, how much does it matter?

For me, the answer is “not as much as you might think”. People are still making web apps anyway. The whole thing about progressive web apps is they are progressive, not binary. So your web app can still work quite nicely on iOS, but do even more on other platforms. This won’t apply to all genres, e.g. it’s quite useless to make a voice calling app without push notifications.

Furthermore, there is a whole class of developers with iOS apps but lacking apps on Android, Windows, and other platforms. Enhancing their existing web presence is an increasingly bright alternative to hiring dedicated native developers, considering they usually have a web app and developers already (even more so if they are one of many companies now running Node/JavaScript on the server side). It may not give as perfect an experience as a native app, but it’s infinitely better than doing nothing on those platforms and may well be as good as they would produce anyway outside of iOS.

Challenge: Discoverability (“Websites on a plane”)

It’s still difficult to find good, installable web apps. There are some hints when you’re already using one – e.g. the prompts to install to the home screen or receive push notifications, and the color scheme declared in the web app manifest. However, if I want to find an installable app to do X, where do I go? On native, I can just search in the store.

On web, I can search in … Google? Nope. I’ll usually get a pile of ugly and ad-ridden sites that happen to be old enough to have reached high rankings. Thankfully, Google does care more about performance and mobile-friendliness now, but it still doesn’t come close to the app browsing experience of a native app store.

This is exactly what Chrome Web Store should be doing in 2016. I hope Google is working to finally bring the web store to mobile. (Or, amid much controversy, integrate web apps into Google Play.) And other browsers are similarly working on this problem.

Until then, there are curated showcases.

Challenge: Native keeps moving forward (“2010 called.”)

Native remains a fast-moving target. The web may have caught up on many features of a modern smartphone, but native has moved on to power virtual reality, cars, and home appliances. Additionally, there are still many basic functions that aren’t yet possible on the web, though they are being debated and worked on. For example, I still can’t make a full-fledged offline podcast app because of SSL and cross-domain restrictions. Bluetooth, USB, background audio … these APIs are all being worked on, but aren’t there yet.

Challenge: Payments (“Shut up and take anyone’s money”)

Frictionless payments obviously drive a huge amount of activity in the app world, and this is a realm where the web really hasn’t changed since the introduction of smartphones. In-web payments are a complicated four-way problem – there are users, browsers, operating systems, and payment providers. Add security, UX, and privacy to the mix, and you see why there are casual games earning millions each day on native platforms but nothing on the web.

If this can be cracked in the context of progressive web apps, game-changer.

Challenge: Native app streaming is also progressive

The progressive engagement model is no longer exclusive to the web. Never afraid to boil the ocean, Google has now begun previewing native apps by streaming them from the cloud. Yes, it relegates your device to a “dumb” display unit and runs the app on Google’s servers, at least until you decide to install it. It’s a very different type of progressive engagement, but it may steal some of the web’s progressive thunder nonetheless, especially if Apple were to follow suit.

Conclusion

Progressive web technologies are making it possible to go beyond just rich websites to “real deal” digital assistants like people have become accustomed to with native apps. The install model mirrors the way an app or service builds trust over time, and for this reason, it goes beyond the binary “installed or not” situation of regular native apps. While many challenges remain, the good news is … it’s progressive. Developers can already see the benefits by sprinkling these technologies into their existing websites, and they can build on them as browsers and operating systems increase support.

How to show dates to humans

First, how not to show them:

“Hey come to our amazing concert — 5/6!”

Now, how to show them:

“Hey come to our amazing concert — Wednesday May 18, 2016!”

I admit the former is more concise, bt cncs dsnt lwys mn bttr even if you can parse it.

The rules are simple; please do this when you mention a date (a small code sketch follows the list):

  1. Include the year. There are 80 trillion web pages and most of them were written before a few months ago, so if I see a date without context, I have no evidence it refers to a time in the future. It could be any time in the last 2 decades.
  2. Name the month. Let’s not get involved in a big debate about MMDD versus DDMM versus YYYYMMDDAAAA🙏🙏🙏🙏ZZZZzzzz. When we’re displaying dates to regular users, keep it simple and use a format everyone immediately understands – the month name. Or an abbreviation thereof. I realise that’s not international-friendly, but the date presumably appears with surrounding text, so use the same language for the month and use one of many i18n frameworks to localise it if you have multiple languages. [1]
  3. Name the weekday. Come on, would it kill you to tell me what day this is on as well? That’s a big deciding factor for many people and helps to plan for the event and remember exactly when it happens.
  4. Count it down. Here’s where digital formats can better traditional printed formats. The date display can be dynamic, so you can show a countdown when it’s appropriate. Again, it helps to make the date more meaningful and can also create some excitement around the event.
  5. Add to calendar. In some cases, you might provide support for adding the date to users’ calendars. There are unfortunately no great standards for this, but there are tools.
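
To make rules 1-4 concrete, here’s a small Ruby sketch using only the standard library (the method name and countdown thresholds are my own invention, not a prescription):

require 'date'

# Format a date per rules 1-4: weekday, month name, day, year, plus a
# countdown when the event is near.
def human_date(date, today: Date.today)
  formatted = date.strftime('%A %B %-d, %Y')   # e.g. "Wednesday May 18, 2016"
  days_away = (date - today).to_i
  case days_away
  when 0 then "#{formatted} (today!)"
  when 1 then "#{formatted} (tomorrow)"
  when 2..14 then "#{formatted} (in #{days_away} days)"
  else formatted
  end
end

human_date(Date.new(2016, 5, 18), today: Date.new(2016, 5, 11))
# => "Wednesday May 18, 2016 (in 7 days)"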

Any others?

  1. Credit Daniel for the reminder.

2016 Tech Predictions

Here are my predictions for 2016.

1. Swift everywhere

Swift 2 built on Swift’s popularity and it’s clear this language will fly far beyond the confines of iOS. It’s more open than anyone could have expected from Apple and unambiguously the future as they see it; it interops well with the giant legacy of Objective-C components, and developers genuinely dig it without a whiff of reality distortion.

Swift is Apple’s answer to Node – app developers will use it to make their first move from client to server. More than that, it has the promise of a “write once, run many” framework. Sure, these frequently lead to mediocre apps, but a lot of developers already have mediocre apps on the web, Android, and (if they have an app at all) Windows. It’s not hard to see the attraction of a turnkey solution that will crank out half-decent apps for platforms that aren’t running in the CEO’s pocket.

2. Microsoft open sources Edge #MoarSatya

Microsoft recently open sourced Edge’s JavaScript engine (“Chakra”). Edge is already free as in beer, as was its predecessor – MS Internet Explorer – and pretty much every other web browser, ever. Yet Edge (and IE) remains closed source, even though every other major browser has a fully open-source engine, and Firefox and Chrome are essentially open source in their entirety. I don’t need to go into a long monologue here about the pros and cons of open source; suffice to say, I believe open sourcing Edge will improve its quality and compliance with emerging web standards.

There is an additional reason to open source Edge: it will help to unlock what must be one of the major strategic goals for Edge – to run on platforms beyond Windows. MS has been busy buying and building apps compatible with iOS, Android, and OSX. Browsers are no longer dumb clients – they sync user settings and data across devices, and MS wants Edge users to remain Edge users when they move away from their PC. Furthermore, much of the web is developed on OSX, and MS will make it easier for developers to build first-class Edge experiences if they can ensure Edge is running there without making developers jump through hoops.

3. Google helps users discover mobile web apps

Google has been pushing the progressive web app mantra for a couple of years now, and there’s a huge range of problems that can now be solved using a web app. While the “install to home screen” prompts are helpful if you’re already using an app, how do you discover the app in the first place? Serendipity only gets you so far. Maybe you can use Google Search? Hardly. You’ll see several native apps first (if searching on Android), followed by some crap web page which has top place because it’s been online since 2003 (before its developer had heard of JavaScript, and it shows). Meanwhile, you have Chrome Web Store exclusively for desktop and the Play Store not showing web apps.

It’s clear Google cares about the web and making web apps thrive, and its search business depends on this. So how will it bring the app store experience to the mobile web? I won’t be so bold as to predict web apps in the Play store – the last thing the Android team wants is millions of “glorified bookmarks” contaminating the listings. Chrome Web Store for mobile? Maybe. Better mobile web app search? Very likely. However it happens, I believe it will be a lot easier to find the best “timezone web app”, “calculator web app” etc for your phone by the end of 2016.

4. Netflix produces daily news/entertainment show

Netflix will create a new daily show in the mould of The Daily Show and The Late Show. This would be a big departure from their typical evergreen model, which has certainly been vital in building a diverse catalogue under their full control. But there is good reason to expand in this way – ongoing shows mean users can always log in and expect to see something fresh and topical. No more frustrating moments hunting around for something decent when you’ve finished a multi-season binge. Additionally, they benefit from viral clips circulating with that Netflix watermark. (Also, the distinction between evergreen and topical shows is not entirely clean-cut; old talk show interviews can still generate giant numbers on YouTube, while series such as House Of Cards will look aged before long.)

The 2016 election is sure to be a perfect backdrop to launch this onto their captive audience, probably with a companion podcast.

5. Podcasting as an art form

Just as TV has become something of an art form in recent years, we will see podcasting viewed in the same light, and as something with distinct properties from radio. Needless to say, the bingeworthy nature of Serial is a big part of this, but it’s also a result of business serials like Startup and fiction like Night Vale, Limetown, and The Message using the medium to its fullest.

6. All-You-Can-Eat video streaming from Google

Google already has all-you-can-eat music streaming subscriptions and also released YouTube Red as an ad-free version of YouTube with offline capability. A natural next step for Google Play would be all-you-can-eat video. It would be similar to Amazon, which still has premium videos for purchase or rental, but has all-you-can-eat on Prime. Indeed, Amazon is part of the reason Google should be doing this – thus far, they have made it all but impossible to consume Prime videos on Android (it requires sideloading Amazon’s marketplace app, and even then, a lot of videos aren’t compatible). This leaves Netflix as the only viable all-you-can-eat platform in most markets, and Google therefore stands to bolster Android itself as well as generate revenue from such a service.

[Updated – Bit about browsers being open-source. Stuart pointed out Safari, which certainly qualifies here, isn’t.]

Developer Relations: A Five-Level Maturity Model

Having worked on both sides of developer relations, here are some thoughts about different levels of maturity for developer relations.

LEVEL 0: No developer relations

No internal effort is made to promote the platform, support developers, or capture their feedback.

LEVEL 1: Informal

No official developer relations staff or programme, but some developer relations handled by other functions: PR may be promoting the platform, while business development may be partnering with and supporting developers.

LEVEL 2: High-touch

High-touch, often stealthy, relations with prized partners (i.e. large, established companies or those with sufficient resources to build showcases for new features). This is a “don’t call us, we’ll call you” outreach which may entail the platform providing funding or direct technical capability to build out the integration, often working with as-yet unannounced technology so it can be launched with a set of poster-child applications.

LEVEL 3: Evangelism

Promoting, explaining, and supporting the platform at scale via conferences, partnerships, and online media. Proactive efforts to recruit large numbers of developers to use the platform.

LEVEL 4: Advocacy

A two-way relationship in which the platform’s own staff see themselves as not just advocating for the platform, but as advocating for developers using the platform. With this mindset, developer relations plays an active role in feeding back real-world bugs and feature requests, and in building supporting tools to improve the developer experience.

LEVEL 5: Quantified

Metrics-driven approach in which the return on investment for developer relations is understood and outreach efforts can be quantified, both with high-touch partners and at scale.


Now some caveats about this.

First, how not to use this model. Any maturity model immediately makes you think companies should be ascending to the top level, but that is not the case and not the intention here. Ascending comes at a cost that may not be justified; clearly, a pure platform company (e.g. Twilio, Stripe) has a lot more incentive to get to the top than a product company with an experimental “labs” API, for example. There are financial costs, additional risks, and distraction to the rest of the organisation; all of that needs to be weighed up. The purpose of this model, then, is to provide useful labels to facilitate these kinds of decisions, not to imply one level is intrinsically better than another.

So the way to actually use this model is simply to be true to yourself. Where are you now and where do you want to be? If you’re happy at level zero, scale any devrel back. If you want to shoot for level 5, start ramping up. Companies often differ widely between official and actual practices. A company may have no official developer relations programme, but instead have a technical marketing team or a super-engaged developer team who perform the same function. Likewise, no amount of fancy business cards will compensate for a developer relations programme that doesn’t develop and rarely relates. Hopefully, this model helps people to understand where they’re at.

Final caveat: Turns out you can’t pigeonhole a complex organisation into a simple number rating. The lines will blur when applying these definitions to $YOUR_FAVORITE_EXAMPLE. You may apply these definitions to a whole company, a single division, or a single product.

(Updated same day: moved maturity levels to top of article)

Bloom Filters and Recommendation Engines

I’ll explain here what Bloom filters are and how you might find them useful for recommendation engines. I haven’t used them yet in production — just sharing what I’ve been learning.

Why Bloom Filters?

I was thinking about recommendation system algorithms. Not the main algorithm – how do you generate good recommendations? – but an important “side algorithm”: how do you keep track of recommendations users have previously dismissed? All the genius recommendations in the world aren’t going to matter if you keep showing the same results.

The most obvious solution here would be to track everything. Simply store a new record for every “dismissal” the user makes. But that’s a lot to store in a high-scale system, e.g. if 10 million users dismissed 20 items each, you have 200 million records to store and index.

So this is where Bloom filters come in as a highly compressed way to store a set of values. The catch is: it’s fuzzy. It’s not really storing the set; instead, it’s letting you ask the question you want to ask, which is: “Is X in this set?” and coming back with a probabilistic answer. But that’s okay for something like a recommendation system.

Here’s an example. A user Jane has dismissed three articles, identified with their IDs: 123, 456, 789.

Under the traditional model, we perform a standard set inclusion check (e.g. check if a database row exists) and come out with a definite answer:

Jane:
  123
  456
  789

Q: Is article 888 in the "Jane" set?
Algorithm: Check if 888 is in [123, 456, 789]
A: No. I'm sure about that.

Under the fuzzy Bloom filter model, we end up with some funny value as a fuzzy representation of the whole set, and then we can get a probabilistic answer about set inclusion.

Jane:
  01101001 (this is the Bloom filter)

Q: Is article 888 in the "Jane" set?
Algorithm: Check against the Bloom filter (details below)
A: Probably not. But maybe. About 5% likelihood it's in the set.

Deriving the Bloom filter

So in the previous example, how did we end up with that representation of the set (what I playfully refer to as 01101001)? And what did we do with it?

It’s fairly simple. Remember, this is the only thing we store and the set builds up over time. So the Bloom filter starts out as empty and each new set member adds something to it.

The real representation is a bit vector; let’s go with 8 bits: 00000000

So when user Jane is created, her Bloom filter is 00000000.

Jane dismisses article 123. Now what we do is compute some hashes of 123 using different algorithms. Since we have decided to make our Bloom filter 8 bits, each hash algorithm should give a number between 0 and 255, so we can store the result. Let’s assume we use two hash algorithms to hash 123. One ends up with 64 (01000000) and the other with 33 (00100001). So now our Bloom filter is:

01100001

When we get a 1, we set the bit to 1. When we get a zero, we do nothing. So yes, over time, this will fill with 1s. That’s why we have to choose a big enough Bloom filter size.

Going on, the next dismissal is 456. And maybe we end up with hash values 01001001 and 01100000. So the first of these has added a new “1” to our previous value of 01100001:

01101001

And finally, we might end up with 01001000 and 00100000 for ID 789, neither of which light up any new bits. So we still have the same Bloom filter as before.

01101001

Is X in the set?

Now we have Jane’s Bloom filter, 01101001. This is a fuzzy representation of [123, 456, 789]. We can then ask, for any given value, is it in the set?

e.g. if our recommendation algorithm comes up with 888, should we show it to Jane? We don’t want to show it if it is in that set of previous dismissals. We compute using the same hash algorithms as before and perhaps we end up with 00101100. It lit up a different bit (the 6th one), so we can say categorically that it’s not in the set. We know that for sure because if it was in the set, all those bits would be on. Since we know it’s not in the set of dismissals, we can confidently recommend it to Jane.

Take another recommendation we might end up with – 456. Do we show it to Jane? Well, is it in the set of previous dismissals? We compute and get 01101001. It fits within our Bloom filter, so there’s a good chance it was in the list of values that was used to build up the filter. But no guarantee. We might end up with a value of 00001000 for another ID, e.g. 555. This would also fit the Bloom filter and we can be no more certain that it was in the original set than we can be for the 456 value. So, it’s probabilistic. You can be certain some things aren’t in the set, but you can’t be certain something was in the set. For a recommendation of 456 or 555, we can’t be sure. So in either case, we will not show Jane the recommendation and look deeper for more certain values.
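
To make this concrete, here’s a minimal Ruby sketch. It uses the more common formulation where each hash function picks a single bit position, rather than OR-ing in whole hashed bytes as in the hand-worked example above, but the behaviour is the same: a definite “no” or a probabilistic “maybe”. The class and method names are mine, not from any particular library.

require 'digest'

class BloomFilter
  def initialize(size_in_bits: 1024, hash_count: 3)
    @size = size_in_bits
    @hash_count = hash_count
    @bits = Array.new(size_in_bits, 0)
  end

  def add(value)
    positions(value).each { |i| @bits[i] = 1 }
  end

  # false means "definitely not in the set"; true means "probably in the set".
  def include?(value)
    positions(value).all? { |i| @bits[i] == 1 }
  end

  private

  # Derive hash_count bit positions by hashing the value with different seeds.
  def positions(value)
    (1..@hash_count).map do |seed|
      Digest::SHA256.hexdigest("#{seed}:#{value}").to_i(16) % @size
    end
  end
end

dismissals = BloomFilter.new
[123, 456, 789].each { |id| dismissals.add(id) }
dismissals.include?(456)  # => true  (probably dismissed, so skip it)
dismissals.include?(888)  # => almost certainly false (safe to recommend)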

Fine tuning

The example above just magically decided to use a Bloom filter of 8 bits and hand-waved around the hash algorithms. In practice, you will need to decide on those things, and the filter will probably be hundreds or thousands of bits; otherwise, every bit will quickly fill up to become 1. A cool thing is that there are precise calculations that can help you estimate exactly how big the Bloom filter should be, based on the expected number of items in it, combined with your tolerance for error. (If your algorithm can easily generate lots of good recommendations, you could have quite a high tolerance, because it would be easy to skip over any potential matches.)
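
For reference, the standard sizing formulas are easy to code up: for n expected items and a target false-positive rate p, the optimal number of bits is m = -(n * ln p) / (ln 2)^2 and the optimal number of hash functions is k = (m / n) * ln 2.

# Estimate how big a Bloom filter should be for n items and a target
# false-positive rate p.
def bloom_filter_size(expected_items, false_positive_rate)
  m = -(expected_items * Math.log(false_positive_rate)) / (Math.log(2)**2)
  k = (m / expected_items) * Math.log(2)
  { bits: m.ceil, hashes: k.round }
end

bloom_filter_size(20, 0.05)    # => { bits: 125, hashes: 4 }  (the "20 dismissals per user" case)
bloom_filter_size(1000, 0.01)  # => { bits: 9586, hashes: 7 }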

References

Considering the recommendation problem made me recall this article about how Medium uses Bloom filters and also led me to a useful tutorial on the topic.

Sidekiq 4’s performance boost

Mike Perham managed to turbo-boost Sidekiq for v4, making it six times faster. This in itself is good news for those of us who use it and his write-up is also of interest. #perfmatters

The perf tricks that made this possible:

  1. Redis -> worker communication (dispatching new jobs to work on): Instead of a single, global thread on the client taking requests from Redis and locally dispatching them, every worker now gets its own direct line to Redis.
  2. Worker -> Redis communication (reporting when a job is complete): Instead of workers constantly updating the server, there’s now a client-side proxy that updates it in batches every few seconds, i.e. it buffers up the pending updates and periodically sends them in a multiplexed message (a rough sketch of this idea follows the list).
  3. Refactored to do direct thread manipulation instead of relying on Celluloid.
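
Here’s a rough sketch of the batching idea in (2). This is not Sidekiq’s actual implementation – the class name and the Redis key are made up – it just illustrates the buffer-and-flush pattern:

class BatchedAcknowledger
  def initialize(redis, flush_interval: 2)
    @redis = redis
    @buffer = []
    @lock = Mutex.new
    # A single background thread flushes the buffer every few seconds.
    @flusher = Thread.new do
      loop do
        sleep flush_interval
        flush
      end
    end
  end

  # Called by each worker thread when it finishes a job.
  def acknowledge(job_id)
    @lock.synchronize { @buffer << job_id }
  end

  def flush
    batch = @lock.synchronize { @buffer.slice!(0..-1) }
    return if batch.empty?
    # One round-trip for the whole batch instead of one call per job.
    @redis.srem('jobs:in_progress', batch)
  end
end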

Very interesting that (1) and (2) are almost the inverse of each other. Redis → worker job assignment has been switched from a global model to a per-worker model, while Worker → Redis job completion reporting has been switched from a per-worker model to a global model. So that’s the time-honoured pendulum swing between centralisation and decentralisation, in a nutshell.

Also, as a commenter notes, it’s not obvious how much has been gained by the withdrawal of Celluloid. Removing a library can not only increase complexity, but can also be counter-productive to performance if the library captures years of performance boosts you’d otherwise have to learn yourself. Nevertheless, in the case of Celluloid, it was really there to simplify the multithreaded programming effort, and given how important this is to Sidekiq, it’s the kind of thing that often makes sense to take full control of. (The dubious refactorings are those where some peripheral feature like logging just had to be home-made. In the case of mission-critical functionality, there’s often a lot to be said for DIY.)

When your app composes tweets: Dealing with metadata

For those who don’t know, Twitter converts every URL to its own “t.co” shortener URL. So no matter how short or long your original URL is, the t.co URL will end up as a fixed character length, and that character length does count towards the 140 limit.

Any sane Twitter client will hide this complexity from end-users. The character-count algorithm will be smart enough to take this into account and show the remaining characters.

But as a coder, you need to incorporate that logic yourself.

You should also know that Twitter’s API won’t automatically truncate a tweet, so if your app tries to send a long one, Twitter will return an error. So your tweet-posting app will need to truncate the tweet to 140 characters.

So I was coding up an auto-tweet setting, which requires you to estimate the length of a tweet. The code looks like:

TWEET_LENGTH = 140
TWITTER_URL_LENGTH = 19 # !!Danger - read on!!

def compose_message(episode)
  hashtag = '#nowplaying'
  url_and_hashtag_suffix = " #{episode.url} #{hashtag}"
  max_title_length = TWEET_LENGTH - (1 + TWITTER_URL_LENGTH + 1 + hashtag.length)
  "#{truncate(episode.title, max_title_length)}#{url_and_hashtag_suffix}"
end

And then, with a long title, it failed. Can you guess why?

The answer is because I apparently went to sleep for three years, and when I woke up, the world had composed hundreds of billions of tweets. Many of them include URLs, which means the t.co length has crept up to 22 characters – 23 for SSL URLs – rising at about 1 character a year. Yes, if your tweet has a link in it, you now have to be 2.5% more concise in describing the link (that’s 3/(140 – 19)).

Thankfully, there’s an API for this: Twitter’s help/configuration endpoint reports the current t.co length (short_url_length and short_url_length_https).

So your code could periodically crawl the config API and aggressively cache the result. Or alternatively, have your build script download it to your code base at compile-time, if it hasn’t seen an update for a while.
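
As a rough sketch (not any particular library’s API – twitter_client here is a placeholder for an already-authenticated HTTP client), the caching could look something like this:

require 'json'

class TwitterConfigCache
  DEFAULT_SHORT_URL_LENGTH = 23     # https t.co links as of early 2016
  REFRESH_INTERVAL = 24 * 60 * 60   # refresh at most once a day

  def initialize(twitter_client)
    @client = twitter_client
  end

  def short_url_length
    refresh if @fetched_at.nil? || Time.now - @fetched_at > REFRESH_INTERVAL
    @length || DEFAULT_SHORT_URL_LENGTH
  end

  private

  def refresh
    response = @client.get('/1.1/help/configuration.json')
    @length = JSON.parse(response).fetch('short_url_length_https', DEFAULT_SHORT_URL_LENGTH)
  rescue StandardError
    # Keep the previous/default value if the call fails.
  ensure
    @fetched_at = Time.now  # either way, back off until the next interval
  end
end

Then TWITTER_URL_LENGTH in the snippet above becomes a call to short_url_length rather than a hard-coded 19.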

I haven’t checked in detail, but there are probably some open-source Twitter packages (gems, NPM modules etc) that include this config data and keep it up to date.

Note this also affects images and video – the above config URL also provides the length of a media item.

The Class1-Class2 Naming Antipattern for Associations

Join classes (aka join tables) are entities whose main purpose is to associate one object (aka record) with another in a NxN relationship.

A common and decades-old pattern, which is almost always wrong, is to name these classes after both association classes.

Examples of NxN relationships:

  • PersonStock maps owners to their stock.
  • UserFeed maps users to feeds they are subscribed to.
  • StudentCourse maps students to their courses.

What’s wrong with these names? First, they are awkward to say and cumbersome to deal with in code (as a general rule, multi-word entities are best avoided, because it becomes confusing and ambiguous when they are combined with other words). Second, they are redundant to anyone who is looking at the class’s foreign keys (admittedly, some redundancy is okay if it makes the code more understandable, but one should always be wary of a naming scheme which could be auto-generated by a trivial script). Third, and the biggest complaint: it’s wholly unnatural to anyone versed in the domain, and therefore not a good model of reality. Only programmers and DBAs use terminology like this; domain specialists do not.

The fundamental problem is it frames these associations as being entirely about the things they associate, instead of treating the association as a first-class citizen, which is inevitably how they are treated by a practitioner in the field you’re modelling. Once you start seeing the association as a model in its own right, you can start to enrich it with meaningful properties and behaviours. And this is typically true in the real world – associations are more than just dumb pairings of item A and item B.

More than just modelling these associations and finding appropriate names, this mindset can also prompt you to talk with domain specialists about what the NxN join concepts in the domain actually are.

Revisiting these examples:

  • PersonStock is better modelled as Ownership. Now that we have a concept of “ownership”, we can think about things like when it was created (ownership.created_at) and what kinds of conditions must be met to create an “ownership”. You could do this kind of reasoning with a “PersonStock” thingy, but it’s more mental gymnastics and takes you a step away from domain specialists.

  • UserFeed is better modelled as Subscription. Now we can attach properties of the subscription, e.g. a ranking/rating indicating how much the user loves any particular feed. This data may then be used to determine how the user is notified of updates and perhaps how the “river of news” is sorted. Or maybe a visibility attribute indicating who can see the subscription, i.e. is it public that a given user is subscribed to a given feed. (A minimal Rails sketch of this follows the list.)

  • StudentCourse is better modelled as Enrolment. Now we can record a “passed” or “grade” attribute against the enrolment and consider pre-conditions for creating an Enrolment, such as looking at the user’s past Enrolments.
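
Here’s a minimal Rails-flavoured sketch of the Subscription example (the attribute names are illustrative, not prescribed):

class Subscription < ActiveRecord::Base
  belongs_to :user
  belongs_to :feed

  # Because the association is a first-class model, it can carry its own data,
  # e.g. a rating column and a visibility column.
  validates :rating, inclusion: { in: 1..5 }, allow_nil: true
end

class User < ActiveRecord::Base
  has_many :subscriptions
  has_many :feeds, through: :subscriptions
end

class Feed < ActiveRecord::Base
  has_many :subscriptions
  has_many :subscribers, through: :subscriptions, source: :user
end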

Not all associations have a natural word to describe them, but even when they don’t, it’s worth thinking really hard about coming up with a new term. The Class1-Class2 name is almost always the road to pain.

A simple way to speed up Vim Ctrl-P plugin: Delegate to Ag

Ctrl-p is “Intellisense for Vim”, allowing you to quickly jump to a file by searching for a few letters or even fancy camel-case type searches. (e.g. find article_editor.rb by searching for “ae”).

However, doing all this requires a search index, aka cache, to be maintained. That can be very frustrating with a big project, as the index takes 5-10 seconds to update, which is not a good thing when you’re desperately trying to jump around files. This delay would be fine if Ctrl-P worked in the background but, due to Vim limitations, it can’t, so you frequently have to run the refresh from the command line and wait for the update.

Or do you?

No you don’t. Here is a trick that lets you never wait for ctrl-p again! Just add this to your vimrc:

let g:ctrlp_user_command = 'ag %s -i --nocolor --nogroup --hidden
      \ --ignore .git
      \ --ignore .svn
      \ --ignore .hg
      \ --ignore .DS_Store
      \ --ignore "**/*.pyc"
      \ -g ""'

It’s taken straight from here. The cool thing about this trick is it doesn’t just speed up indexing, it completely removes the need for it. This is achieved by relying on the command-line tool Ag, aka Silver Searcher. It’s a brilliant grep replacement I would recommend to anyone, being exponentially faster than grep (as in, you can happily search a whole hard drive in real time).

I’ve used Ag for years but never realised it could be piped into Ctrl-P!

That page also includes some matching optimisation, but seriously the Ag trick was all I needed. Searching is now completely instantaneous and I never need to worry about the index going stale again.

The update has been pushed to my dotfiles.