Podcasts ascend in cultural influence
Four-year cycles - whether elections, Olympics, or World Cups - have long been a revealing way to observe changes in technology. (Fun fact: four years represents more than a tripling in tech capability, if you take Moore’s Law as a generality, i.e. doubling every two years gives 2^(4/2) = 4, and doubling every 18 months gives 2^(4/1.5) ~= 6.3.) In this election cycle, podcasts were a hot topic, with some dubbing it “The Podcast Election”.
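For anyone who wants to check the Moore’s Law aside, here is a minimal sketch of the arithmetic. It assumes capability doubles once per fixed period (18 months or two years are the usual readings); the function name `growth_factor` is just my own label.

```python
def growth_factor(years: float, doubling_period_years: float) -> float:
    """Capability multiple after `years`, assuming one doubling per period."""
    return 2 ** (years / doubling_period_years)


# Compare the two common readings of Moore's Law over one four-year cycle.
for period in (1.5, 2.0):
    factor = growth_factor(4, period)
    print(f"doubling every {period} years -> {factor:.1f}x over four years")
```

Either way, a four-year gap comfortably clears a tripling.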
Both candidates went on presidential podcast tours alongside traditional rallies. Trump, in particular, appeared on at least fourteen podcasts, including some of the most popular podcasters across all genres. His VP, JD Vance, appeared on even more podcasts than Trump.
Just as telling was the pressure and intrigue around a Joe Rogan appearance. The Joe Rogan Experience is by no means a conventional political show, but it is easily the most popular podcast in terms of audience and cultural influence. After much drama, Trump appeared weeks before the election, quickly followed by Vance. That led to even more speculation around a Harris appearance, which never happened. Some blamed its absence for her losing the election, or at least saw it as an indicator that winning candidates in future will need to be the kind of person who is comfortable appearing in this kind of space.
The election also shows the limits of politicians generating their own podcasts. Their external appearances were more decisive, and that’s because those podcasts have spent years building up loyal followings. Anecdotally, politicians around the world have taken note and are gearing up in their relations with podcasters.
In other podcasting news, Lex Fridman demonstrated the possibilities of AI translation when he interviewed Argentina’s president. Fridman recorded the interview with him speaking English and with President Milei speaking Spanish; his podcast was then able to publish all-English and all-Spanish versions, both in their own voices.
In coming years, we can expect to see all forms of media available in any language. This is an example of innovation that (a) boosts quality - since it’s possible to keep original voices and match face movements in video - while (b) smashing costs at the same time, since it’s completely automated. The implications are dramatic, as anyone with talent no longer needs to speak a major language in order to be discovered by the global community. This will lead to a further flattening of the world and a winner-takes-all dynamic across media. It may also help with language preservation as it takes away some of the major motivations for people to learn a second language.
On the theme of podcasts and influence, there was also a new podcast designed to promote an activist investor campaign, so that’s a first (as far as I know). The prominent activist investor Elliott launched the StrongerSouthwest podcast as part of an activist campaign against Southwest. It’s already gone, though, so I guess it was either a misfire or an overnight success?
On a personal note, I predicted “2008 will be the Podcast Election” in ye olde podcast FAQ. That probably says more about my techno-optimistic bias than anything meaningful about podcasts or politics.
Nuclear energy glows up
Nuclear fission energy is enjoying a renaissance on multiple fronts.
Institutions concerned with climate change have, to some degree, warmed to nuclear energy. The European Union, for example, finally included nuclear as part of its sustainable activities taxonomy (albeit in a “transitional” state for now). Anecdotally, there appears to be a generation gap among environmentalists, with older members sticking to the long-held party lines while newer members see the compromise as valid. While renewable energy tech has improved immeasurably, there’s still a big gap between tomorrow’s potential and today’s immediate needs.
Sanctions and sentiment have caused many nations to become more open-minded about energy solutions. Oil was made an exception in the sanctions against Russia, and it drives the economies of other countries with frozen relations.
Then there’s AI and the ever-growing needs of cloud providers. This was the marked change in 2024. Microsoft, Amazon, Google, and Meta all have big plans to fuel their data centres this way, which will validate the technology in the eyes of governments and businesses. These companies have driven entire revolutions when it comes to software, hardware, and network infrastructure, meaning we’re likely to see their brightest engineers breathe new life into nuclear, an industry that stagnated in the west for decades.
AI speech captures attention
The biggest AI story of 2024 was equal parts drama and substance, namely The Her Incident. In May, OpenAI announced ChatGPT’s upgraded voice assistant with an impressive set of demos. While ChatGPT already had a voice assistant for premium users, most people hadn’t noticed because (a) they’re not paying (the source of many falsehoods about ChatGPT’s capabilities) and (b) it was not very good, particularly because it had a habit of jumping in with a pile of verbiage after the user made the slightest pause, making it the world’s most superintelligent interrupter. Even founder/CEO Sam Altman was rubbishing it in his All-In interview, which as it turns out was just a few days before they would announce the voice upgrade.
I’ll get the drama out of the way. The voice in the demos sounded suspiciously like Scarlett Johansson, the star of Her, the 2013 movie that envisioned the near-future reality of a lonely dude forming a romantic relationship with his intelligent voice assistant. Turned out she had been approached by OpenAI prior to this to license her voice for training, and declined. She subsequently expressed dismay and threatened a lawsuit as OpenAI pulled the Johansson-like voice option (“Sky”).
As for the voice feature itself, remarkable. It’s getting tantalisingly close to speaking with a real human assistant. It’s fast. It gets straight to the point instead of waffling on like a long-form text-based chat session. It’s not perfect, but you can easily interrupt it and say something else if you feel it’s missed the point or is waffling on.
While it might take a little while longer to match the conversation skills of an average human, one can’t lose sight of the fact that it already holds great advantages over any human, i.e., vast built-in knowledge on any subject, the ability to look things up online and make sense of them in a fraction of a second, and always being present. Furthermore, many people just find it easier to work with voice than reading and typing - there’s a reason text-based social media like Twitter and Reddit never captured the masses like Instagram or TikTok. Even those who find comfort in text gain the ability to multitask using their voice at the same time, or to use it while walking around or doing chores, just as people already listen to music and podcasts.
For these reasons, I believe we’re close to the tipping point envisioned in Her. The core functionality is already there; it’s just a matter of hardware integration to make communication seamless and always-available instead of having to pull out your phone and launch into a specific screen within a specific app.
Machines learn to reason
Voice was far from the only OpenAI upgrade this year. OpenAI also launched o1, their first model to be explicitly based on Chain Of Thought reasoning (CoT). Anthropic and Google likewise enhanced the reasoning capabilities of their models.
To understand CoT, it’s important to be aware that a basic LLM works by producing one “token” (~ word) at a time, being guided by what has been composed up to that point. This is append-only, which is to say, there is no backtracking or explicit planning capability. It’s remarkable how well this has worked in practice, but it still leaves opportunity for advanced strategies that would be closer to the way humans reason.
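To make the append-only loop concrete, here is a toy sketch. `TOY_MODEL` is a made-up lookup table standing in for the model and only looks at the previous token; a real LLM conditions on everything composed so far and scores every token in its vocabulary.

```python
import random

# Hypothetical toy "model": maps the previous token to candidate next tokens.
TOY_MODEL = {
    "<start>": ["podcasts"],
    "podcasts": ["are"],
    "are": ["everywhere", "popular"],
    "everywhere": ["<end>"],
    "popular": ["<end>"],
}


def generate(model, max_tokens=10, seed=0):
    """Produce one token at a time, only ever appending to the output."""
    rng = random.Random(seed)
    tokens = ["<start>"]
    for _ in range(max_tokens):
        candidates = model[tokens[-1]]       # guided by what came before
        next_token = rng.choice(candidates)  # pick one candidate
        if next_token == "<end>":
            break
        tokens.append(next_token)            # append-only: no backtracking
    return tokens[1:]                        # drop the start marker


print(" ".join(generate(TOY_MODEL)))
```

Note there is nowhere in the loop to revise an earlier token, which is the limitation the reasoning strategies try to work around.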
In the simplest case, prompt engineering can be used, like appending an instruction with “think about this problem, outline the steps involved first, and address it step by step”. This can encourage the LLM to break down its response into sub-tasks and ensure each sub-task is more aware of what subsequent sub-tasks will need of it. Nevertheless, this is still a sequential activity.
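As a rough illustration of that prompt-engineering trick, the instruction can simply be tacked on to whatever the user asks. The helper name `with_step_by_step` is my own invention, and the resulting string would still need to be sent to whichever LLM you use.

```python
STEP_BY_STEP_SUFFIX = (
    "Think about this problem, outline the steps involved first, "
    "and address it step by step."
)


def with_step_by_step(prompt: str, suffix: str = STEP_BY_STEP_SUFFIX) -> str:
    """Append a 'reason step by step' instruction to a user prompt."""
    return f"{prompt.rstrip()}\n\n{suffix}"


print(with_step_by_step("How many podcast episodes fit on a 64 GB phone?"))
```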
Chain-of-Thought goes a step further by baking reasoning directly into the model. With o1, OpenAI trained the model on examples of reasoning, where a complex task is broken into sub-tasks. Thus the LLM becomes inherently more capable at breaking any complex task into sub-tasks.
A side benefit of this “divide-and-conquer” logic is a degree of transparency, where users can get some insight into the LLM’s inner workings. Indeed, ChatGPT updated its user interface for o1 to explicitly show steps as they’re being worked on before any response is produced. During this preparation phase, it conceptually has the ability to cycle through steps, going back and forth, before it finally begins outputting.
In some cases, the LLM might “think” for 30 seconds or more, while for simpler tasks, it can immediately output an answer just like earlier models did. A future enhancement which will speed things up is parallel processing, where the LLM can determine in advance that certain sub-tasks may be computed independently of each other, thus running them in parallel. Since a single human cannot multitask, this is another example where LLMs will be able to surpass human capability.
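The parallel idea can be sketched with ordinary threads. This is purely illustrative: `solve_subtask` is a hypothetical stand-in for a model call, and I’m assuming the sub-tasks have already been identified as independent of each other.

```python
from concurrent.futures import ThreadPoolExecutor


def solve_subtask(subtask: str) -> str:
    """Hypothetical stand-in for a model working on one sub-task."""
    return f"answer to {subtask!r}"


def solve_in_parallel(subtasks):
    """Run independent sub-tasks concurrently, returning results in input order."""
    # max_workers must be at least 1, even for an empty list.
    with ThreadPoolExecutor(max_workers=len(subtasks) or 1) as pool:
        return list(pool.map(solve_subtask, subtasks))


print(solve_in_parallel(["outline the intro", "check the figures"]))
```

The results come back in the same order as the sub-tasks, so a final step could stitch them into a single response.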
Consumer robotics becoming a reality
After centuries of mythology and science-fiction - and decades of actual research - artificial humanoids are now becoming a reality. LLMs have been the most obvious beneficiary of AI research recently, but robotics stands to benefit from many of the same underlying technologies at both a software and a hardware level. In the same way LLMs benefitted from years of spend from video game enthusiasts, the nascent field of humanoid robotics is benefitting indirectly from spend on LLMs due to the value they are already delivering to their customers today.
China has long been investing heavily to automate manufacturing, thereby hoping to keep its lead even as the industry is disrupted by robotics, which could make it viable for some countries to promote a resurgence of local factories. Humanoid robotics is likely to form part of China’s strategy in this effort, and Chinese car maker BYD is also operating Walker robots on its assembly line.
In the US, robotics startup Figure AI raised at a $2.6 billion valuation and announced Figure 02 in August. Figure robots worked on BMW’s assembly line as part of a trial this year.
These industrial applications - where the robot’s environment is controlled and limited to repetitive tasks - should help to build confidence, revenue, and capability to expand into domestic applications over the next decade.
Tesla is one company aiming to provide home robots and appears to be making good progress on its Optimus line. No-one would ever bet on Musk’s timelines, but for what it’s worth, the company claims it will be using 1000 robots internally next year. Tesla is a recent entrant and may not yet have caught up, but it has among the deepest pockets, and Musk has a track record of taking mere ideas from zero to one. He also now finds himself with significant political power that will be helpful in attracting funding and navigating regulation.
AlphaFold enriches the foundation of bio-engineering
In 2024, DeepMind’s Demis Hassabis and John Jumper were awarded the Nobel Prize in Chemistry for their contribution to solving the “folding problem”. This is a decades-old effort to figure out how the shape of a protein molecule will turn out, based solely on the sequence of amino acids that constitute its raw ingredients. Being able to make such predictions is far more than an intellectual exercise, as it unlocks the ability to generate therapies through software models, which is much faster and easier to automate than conventional lab-based techniques.
As with LLMs, AlphaFold achieves this by training on verified pairings of inputs and outputs. In this case, the inputs are amino acids and the outputs are 3-dimensional shapes. As was mentioned with respect to robotics, this is another area where the technology is converging across domains, leading to economies of scale that make all of those domains more productive.
While the Nobel was awarded for earlier work, DeepMind achieved another milestone this year in launching AlphaFold 3. This is the first major version to expand its scope beyond protein molecules, into predicting the shape of DNA and other molecules, including small molecules. It is also more “automated” insofar as it doesn’t rely on certain knowledge and rules of physics and chemistry, i.e., it is more capable of figuring out these things on its own just by pattern-matching. This mirrors the broader trend in AI away from the pile of “hard-coded if-then statements” to “it just works” if you throw enough data and compute at the problem.
There’s increasing interest in AI drug discovery, with one drug already in Phase 2 testing, and one of the enabling technologies is software that models how the body will respond to such interventions. Hence the relevance of AlphaFold. Its applications also go beyond therapies to industrial design such as new forms of plastics.
Social media fragments
Social media continued to fragment in 2024. The latest wave of X departures came after the US election, in which its owner-CEO completed his transition from self-described politically neutral to giving full-throated support to one candidate. In my experience, the algorithm has become unbearable under Musk’s management, not necessarily because of any political bias, but because everything is heavily off-topic from my interests and because I’ve lost any meaningful connection to my contacts. Add potential bias to the mix and it’s easy to see why people continue to abandon the platform.
Surprisingly, it was BlueSky that picked up most of the followers rather than Meta’s Threads. Ever since 2022 - when Musk announced his takeover - there has been a never-ending cycle of communities moving from one place to another and back again. I wouldn’t confidently predict BlueSky will still be the big winner in 2025, but it would certainly be pleasing to see it reach critical mass, since there’s a world of potential in its open standards and ability for anyone to create a feed algorithm.
The problem for all these platforms is fragmentation. Just as social media apps can rise quickly when they have a large user base, so can they collapse quickly as their user base dwindles. For this reason, it’s not surprising that apps are focused on curating the best content from across the whole network instead of worrying about the follower network. It makes them more attractive to new users who don’t have any contacts present or who have lost all their contacts to another platform. Nevertheless, it also removes one of the most powerful aspects of these apps in the first place. As Mark Zuckerberg mentions in just about every interview, his company is all about social connections, which makes it ironic to see an app like Threads focus so much of users’ attention on the celebrities of Instagram instead of those they follow.
The biggest beneficiary may be a discussion app that never pretended to be about social connections. Reddit has always been about topics first, posts and comments second, user reputations a mere afterthought. After disappearing third-party apps in 2023, the site went public in 2024, making it the highest-profile tech IPO of the year. The stock has performed exceptionally even taking into account the bull market backdrop, with a stock price gain of over 200%.
Reddit operates in a similar space to X and its clones, all of them being geeky and fundamentally text-based platforms designed for discussions, debates, and rants rather than the shallow entertainment scrollfests of Facebook, Instagram, and TikTok. In a media landscape that is increasingly full of bots talking to other bots, Reddit’s topic-centric moderation model puts it in a good place as both a user experience and a training ground to sell on to AI platforms.
Smart glasses focus the road ahead for extended reality
AI may have taken the limelight away from headsets in recent years, but the Company Now Known As Meta continued to dive into the metaverse with a range of announcements concerning Extended Reality (XR).
In this space, augmented reality (AR) has taken priority over virtual reality (VR). It seems that manufacturers have concluded VR is “cool” for games and demos, but it’s too isolating, too binary, insofar as it requires the user to don the headset and enter a separate mode of reality. You’re either in VR or you’re not. That’s similar to the experience of sitting down to use a desktop PC. In contrast, AR is more like a smartphone, and smartphones overtook PCs because you can take them anywhere and use them for tasks that are as short or as long as you wish. They’re integrated into your life rather than splitting your time into IRL and Second Life.
With Oculus and Apple Vision Pro (another new product of 2024), their version of Augmented Reality is more accurately called Mixed Reality (MR). Cameras mounted on the headset give you the illusion you’re interacting with the real world while you’re in fact entirely locked inside the headset. From personal experience, both devices are getting impressively close to reality, but they still suffer from the headset form factor. You’re either in it or you’re not, and few people will dare to step out in public with a $3000 machine strapped to their head.
All of this makes Meta’s Orion Glasses the standout XR event of 2024. These glasses are very close to a conventional pair of specs, but deliver true augmented reality via waveguides. The way this works is by projecting light directly onto your retina via the glass, rather than “lighting up pixels” on the glass the way a conventional monitor or VR headset would, and more or less how Google Glass worked. Waveguides promise to deliver a higher quality display while making the real world look more like the real world.
Orion is still in the early prototype stage, but reports from the few external reviewers are optimistic, albeit on a tight set of use cases. The Verge reported a game of Pong worked “surprisingly smoothly, and I noticed little to no lag in the game”. For an external Messenger call, “I could see and hear him well in the 2D window floating in front of me”. Meanwhile, Meta is pushing forward with its Ray-Ban partnership - smart glasses with camera, AI, and audio - which will gradually help it build up expertise in many other necessary technologies outside of the key waveguide functionality. They appear to have avoided the stigma of Glass, and as I mentioned in my grading of 2023 predictions, they are going out of their way to force this issue early on by releasing a variant that is transparent and therefore shows the camera proudly to anyone who cares. Public acceptance is vital to taking such a device mainstream and eventually replacing the smartphone.
Silicon Valley 🤝 Washington
I mostly keep this blog politics-free, but can’t avoid observing the relevance of the US election to the technology sector. The next president’s inner circle now includes some of the most prominent venture capitalists and entrepreneurs. The next vice-president is a former venture capitalist who ascended to politics with the help of a prominent investor and is steeped in Silicon Valley culture. All of this especially matters because the president himself has shown little interest in technology, meaning his VP and advisors will be largely running the agenda.
Dominic Cummings recently remarked that Silicon Valley for a long time enjoyed keeping its distance and not having to worry about the political class in Washington, who were largely ignorant of technology. He noted that this was no longer feasible once technology became both a major influence on political power via social media and a matter of national security, as with LLMs. He also argued the EU has stifled innovation through its privacy regulation (GDPR) and AI regulation (AI Act), something he notes has been echoed even by EU-friendly actors such as President Macron.
The tech folks in the White House are likely to block and repeal most efforts towards regulation, wanting to avoid the fate of the EU and stay competitive with China. While Vance has expressed some admiration for Lina Khan’s clamping down on monopolistic practices, I don’t think this will extend to the FTC’s stance on acquisitions, since there’s a view that VCs and founders are being denied an exit. Thus it seems likely the next four years will see a number of high-profile companies - private unicorns and some public small/mid caps - being acquired.
There’s also concern about lagging military technology and admiration for the more modern example set by DefenseTech startups such as Anduril, which focuses on a fast-moving, software-first approach and billing against specs rather than hourly labour, making its work arguably more in line with the needs of its clients in the defense industry. There are likely to be more initiatives aiming to foster this industry.