The Border! Rockets! Words Become Videos!

Episode Audio

Recorded 2/16/2024.

Episode Notes

The episode opens with casual chatter about a possible Starship launch while the hosts are near Brownsville, followed by a long discussion of major AI announcements from Google and OpenAI. Andrew explains Gemini 1.5's large context window claims and the limits of simply increasing tokens, while the group shifts into a detailed examination of Sora's text-to-video results, including improved realism, physics, camera motion, and remaining failures. They repeatedly return to the role of compute, model scaling, and the economics of AI progress.

The latter half of the episode focuses on how AI tools may change creative work and how people should adapt by learning to code, prompt, and experiment rather than resist. The hosts also discuss ChatGPT memory, personalized interactions, and Andrew's experiment turning Sora footage into 3D video for Vision Pro. Near the end, Justin gives a Suno Valentine's Day song generator pick, Brian recommends Star Trek: Lower Decks, and the panel briefly covers Rick and Morty, Dune 2, Madame Web, and speculation that the new Fantastic Four film is set in the 1960s.

Key topics

  • Token counts and what context windows enable: Andrew explains tokens, context windows, and why Gemini 1.5's claimed 10-million-token context is significant for reading large documents and answering questions across them. He cautions that retrieval is easier than true reasoning over long inputs.
  • Compute cost as the limiting factor in large-model features: The hosts repeatedly note that larger context windows and better AI outputs require more compute, which makes them expensive. Andrew frames cost and infrastructure as the key constraint on how far these systems can be pushed.
  • How Sora's video generation improves realism: The discussion highlights Sora's improved video quality, with examples involving physics, shadows, reflections, camera movement, and 3D consistency. Andrew explains that scaling compute and training choices like using full-resolution source images helped drive the jump.
  • Residual failure modes in AI video: Even while praising Sora, the hosts note that some outputs still fail, such as odd behavior around a plastic chair. This keeps the evaluation grounded and cautious.
  • Physics, camera language, and realism in generated video: Justin and Brian emphasize that the videos feel convincing because they mimic drone footage, GoPro footage, and natural camera placement, not just object motion. Andrew stresses that the system appears to understand cause and effect in a limited but important way.
  • Compute demand and the economics of AI infrastructure: Andrew ties progress in video generation to large increases in compute and says demand for compute will continue to grow, linking this to the value of chips and the economics of AI infrastructure.
  • Open source vs. guarded commercial models: Andrew says that major labs may lead on general-purpose models, while specialized open-source systems could emerge for narrow use cases like car chases or diner scenes. He also notes that guarded products can be less useful despite being strong technically.
  • Education and skill-building in an AI era: The speakers argue that learning to code, learning to prompt, and making time to experiment with new tools are valuable long-term skills. They compare this to learning the internet early and say the ability to direct machines matters.
  • AI memory and personalization: The hosts discuss ChatGPT memory, custom voices, remembered facts across chats, and increasingly personalized interactions. Brian also argues that patient LLMs may help people stay engaged and keep learning.
  • Stereoscopic conversion of AI-generated video: Andrew describes a workflow for taking Sora clips with sideways camera motion, offsetting them into left-eye and right-eye views, and packaging them for Vision Pro or Quest as 3D video.
  • Easy personalized song creation with Suno: Justin recommends a Valentine's Day web form at v-day.suno.ai that creates personalized songs based on a name and memory, producing multiple genre versions for sharing.
  • Current sci-fi and genre media recommendations: The conversation includes positive reactions to Lower Decks, Strange New Worlds, Rick and Morty, and Dune 2, along with strong dislike for Madame Web. These are discussed as current viewing picks or reactions.
  • Fantastic Four casting and setting speculation: The hosts read a Fantastic Four Valentine's Day card as evidence that the film may be a 1960s period piece, and they discuss the cast and Herbie the robot with enthusiasm.
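The context-window point above can be made concrete with a back-of-the-envelope token estimate. This is a hypothetical helper, not anything shown in the episode; it assumes the common rule of thumb of roughly four characters per English token, whereas real BPE tokenizers vary by model.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate using the ~4 characters/token rule of thumb.

    Real tokenizers (BPE) differ by model and language; this is only
    a ballpark for sizing documents against a context window.
    """
    return max(1, round(len(text) / chars_per_token))


def fits_in_context(text: str, window_tokens: int = 10_000_000) -> bool:
    """Check whether a document plausibly fits a given context window.

    10M tokens is the research figure cited for Gemini 1.5; shipped
    limits were smaller, so treat the default as an upper bound.
    """
    return estimate_tokens(text) <= window_tokens


# A ~300-page book is roughly 600,000 characters -> ~150,000 tokens,
# comfortably inside a multi-million-token window but far beyond
# the ~100k-token windows common before these announcements.
book = "x" * 600_000
```

The interesting consequence Andrew highlights is that fitting a document is the easy part; answering questions that require reasoning across all of it is much harder than retrieving a single passage.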
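Andrew's stereo-conversion workflow can be sketched as a temporal offset: when the camera tracks sideways, a frame a few steps later shows the same scene from a slightly shifted horizontal position, so pairing frame t with frame t+d approximates left- and right-eye views. The sketch below is a hypothetical illustration of that idea (the function name, frame labels, and eye-ordering convention are assumptions; a real pipeline works on decoded image data and calibrates the offset per clip).

```python
from typing import List, Tuple


def stereo_pairs(frames: List[str], offset: int = 3,
                 camera_moves_right: bool = True) -> List[Tuple[str, str]]:
    """Pair each frame with one `offset` frames later to fake stereo.

    With sideways camera motion, frame t and frame t+offset capture the
    scene from slightly different horizontal positions. If the camera
    moves right, the later frame is treated here as the right eye's view
    (an assumed convention -- real workflows tune this per clip).
    Returns (left_eye, right_eye) tuples.
    """
    pairs = []
    for i in range(len(frames) - offset):
        later = frames[i + offset]
        if camera_moves_right:
            pairs.append((frames[i], later))
        else:
            pairs.append((later, frames[i]))
    return pairs


# Stand-in frame labels; a real workflow would pass decoded video frames
# before packaging the pairs as side-by-side 3D for Vision Pro or Quest.
clip = [f"frame{i}" for i in range(10)]
```

The trade-off in this approach is that anything moving on its own (people, cars) gets a false depth cue from the temporal gap, which is why it works best on clips dominated by smooth lateral camera motion.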

Picks

  • Justin Robert Young: v-day.suno.ai — Justin explicitly presents this as his pick and recommends it as a simple way to generate personalized Valentine's songs from a name and shared memory.
  • Brian Brushwood: Star Trek: Lower Decks — Brian explicitly gives it as an old pick that is new again and praises the latest season as really good.
  • Andrew Mayne: Rick and Morty — Andrew says he finished the last season and thought it was one of the show's better seasons, despite a weak first episode.