ByteDance's new AI turns photos into movie sequences; Mistral launches a content moderation API; Walt Disney forms new AI business unit; Saudi to invest $100b in AI through "Project Transcendence"
UK launches business-oriented AI safety platform; VCs warn AI startups to prioritize revenue in 2025; there's a strong community of women building AI startups; AI chatbots are the new priests
TikTok’s parent company ByteDance has just previewed X-Portrait 2, a new portrait animation model built in cooperation with researchers from Tsinghua University, China’s best-ranked university in science and technology.
The model is an evolution of the first-generation X-Portrait technology (which was presented earlier this year at SIGGRAPH) and provides similar functionality to Runway’s Act-One tool, though the comparison with the latter is not entirely fair as Act-One generates a full video sequence, not just AI avatars.
X-Portrait 2 takes two inputs: a static portrait image and a driving video. It uses the facial expressions and speech patterns of the person in the driving video to animate the person in the static image, generating expressive and realistic videos as an output.
What’s remarkable about X-Portrait 2 is how well it captures even subtle facial expressions from the inputs and preserves them in the outputs, thanks to improvements in its expression encoder, which has been pre-trained on a large dataset likely sourced from user-generated content on TikTok or Douyin.
X-Portrait 2 relies on a generative diffusion model to accurately render the facial expressions and head movements in the output video, including pouting, tongue-out, cheek-puffing and frowning. Diffusion models have become the norm in image and video generation over the last 12-24 months because they offer greater control and flexibility and are less susceptible to mode collapse than GANs.
The other notable innovation of X-Portrait 2 compared to its predecessor is the ability to disentangle appearance and motion, allowing the new model to focus solely on facial expressions and then faithfully transfer those expressions across different filmmaking techniques and styles, for example from live-action shots to animation clips. The researchers therefore argue that X-Portrait 2 could be deployed as a fast iteration tool in the video production process of Hollywood-quality animations or movies, saving valuable time and resources for visual effects teams.
X-Portrait 2 is one of several portrait animation solutions I’ve seen trending on social media over the last month. It feels like many of the companies that launched general purpose video generators earlier this year (Kuaishou’s Kling AI, MiniMax’s Video-01, Hailuo AI, Luma AI’s Dream Machine, Pika or Runway’s Gen-3 Alpha) are now looking to optimize their technology for more specific use cases such as AI avatars because they sense there could be a market for this technology among professional-quality video creators.
Something to keep in mind is that these image-to-video approaches don't fully generate lip-sync performances from a synthetic voice the way platforms such as Synthesia do, since they rely on an existing, expressive video to drive the character’s facial movements. And, as with other general purpose video models, there are open questions about how well they generalize in real scenarios since the results presented are always cherry-picked to illustrate the best outputs.
It’s also noteworthy that more than half of the companies listed above come from China, which is culturally more open toward digital humans and also has a thriving animation and interactive entertainment community that produces high-quality user-generated content optimized for social sharing.
And now, here is this week’s news:
❤️Computer loves
Our top news picks for the week - your essential reading from the world of AI
VentureBeat: Mistral AI takes on OpenAI with new moderation API, tackling harmful content in 11 languages
Bloomberg: AI Will Transform Medicine. There's Just One Catch
FT: UK government launches new AI safety platform for businesses
Bloomberg: Saudis Plan $100 Billion AI Powerhouse to Rival UAE’s Tech Hub
TechCrunch: AI startups will need ‘quality of revenue’ to raise in 2025, seed VCs warn
Business Insider: For women building AI startups, finding community has been the key to success
Washington Post: A deepfake showed MLK Jr. backing Trump. His daughter calls it ‘vile.’
Axios: What AI knows about you
The Information: The Generative AI Spending of 50 Companies, From Coke to Walmart
New York Times: An ‘Interview’ With a Dead Luminary Exposes the Pitfalls of A.I.
WSJ: The Giant Supercomputer Built to Transform an Entire Country—and Paid For by Ozempic
Business Insider: AI chatbots are the new priests
FT: AI’s huge power needs give oil majors incentive to invest in renewables, says Adnoc boss
Reuters: Walt Disney forms business unit to coordinate use of AI, augmented reality
⚙️Computer does
AI in the wild: how artificial intelligence is used across industry, from the internet, social media, and retail to transportation, healthcare, banking, and more
The Verge: Google could add AI replies to its handy call-screening feature
The Verge: Microsoft Outlook now has dynamic AI-powered themes
FT: The doctors pioneering the use of AI to improve outcomes for patients
BBC: Instagram to use AI to catch teenagers who lie about their age
ZDNet: Android smartwatches can now transcribe and summarize your voice notes, thanks to AI
Bain & Company: Five Functions Where AI Is Already Delivering
The Verge: Prime Video will let you summon AI to recap what you’re watching
Reuters: UAE's ADNOC to deploy autonomous AI in the energy sector for the first time
The Guardian: Dutch publisher to use AI to translate ‘limited number of books’ into English
Fortune: Apple darling Goodnotes expands to provide AI tools for ‘less technically inclined teachers’
Time: How AI Is Being Used to Respond to Natural Disasters in Cities
🧑‍🎓Computer learns
Interesting trends and developments from various AI fields, companies and people
TechCrunch: AI-powered parenting is here and a16z is ready to back it
WSJ: AI Saves Ad Agencies a Lot of Time. Should They Still Charge by the Hour?
Engadget: Google's Vids AI video maker is rolling out to most Workspace tiers
The Verge: Microsoft is bundling its AI-powered Office features into Microsoft 365 subscriptions
No Priors: Ep. 89 with NVIDIA CEO Jensen Huang
TechCrunch: Anthropic teams up with Palantir and AWS to sell AI to defense customers
WSJ: QXO Hires an AI Chief to Help Sell Items Like Pipes and Lumber
FT: Employers look to AI tools to plug skills gap and retain staff
Business Insider: Anthropic is gaining on OpenAI in this key area of the AI market
Business Insider: Nvidia robotics executive tells BI how the company is predicting the future of robotics by building it
The Information: Google Accidentally Reveals ‘Jarvis’ AI That Takes Over Computers
The Verge: Did OpenAI just spend more than $10 million on a URL?
The Verge: Even Microsoft Notepad is getting AI text editing now
Wired: The $50 Million Movie Here De-Aged Tom Hanks With Generative AI
VentureBeat: Microsoft’s new Magnetic-One system directs multiple AI agents to complete user tasks
VentureBeat: AIRIS is a learning AI teaching itself how to play Minecraft
WSJ: If Your Tattoo Was Designed by AI, Does It Have a Soul?
TechCrunch: As generative AI gets better, what will happen to artists?
VentureBeat: Meet the startup that just won the Pentagon’s first AI defense contract
Reuters: Singapore's Keppel to buy Japanese AI-ready data centre
TechCrunch: Google is opening an AI hub in oil-rich Saudi Arabia
TechCrunch: Can Pictionary and Minecraft test AI models’ ingenuity?
Time: The Gap Between Open and Closed AI Models Might Be Shrinking. Here’s Why That Matters
MIT Technology Review: How ChatGPT search paves the way for AI agents
VentureBeat: Mike Verdu of Netflix Games leads new generative AI initiative
UKTN: GovGPT: British government launches AI chatbot for businesses
Wired: The Guy Behind the Fake AI Halloween Parade Listing Says You’ve Got It All Wrong
The Verge: Perplexity debuts an AI-powered election information hub
FT: Meta’s plan for nuclear-powered AI data centre thwarted by rare bees
Forbes: How AI Will Make It Easier To Understand Your Dental X-Rays
Forbes: This AI Model Could Keep Thousands Of Cancer Patients From Getting Unnecessary Treatments
The Information: Corporate Spending on OpenAI Threatens Salesforce, Other Enterprise Apps
VentureBeat: xAI woos developers with $25/month worth of API credits, support for OpenAI, Anthropic SDKs
WSJ: The Budget Hawk Atop a Tech Giant’s $64 Billion Spending Spree
The Economist: Why your company is struggling to scale up generative AI
Business Insider: Morgan Stanley's new innovation head lays out his plan for more OpenAI-type partnerships
Reuters: A year on, Intel's touted AI-chip deals have fallen short
Business Insider: Watch the AI robots that Jeff Bezos just invested in fold laundry and put eggs in a carton
VentureBeat: UC San Diego, Tsinghua University researchers just made AI way better at knowing when to ask for help
Business Insider: OpenAI's former head of 'AGI readiness' says that soon AI will be able to do anything on a computer that a human can
Business Insider: Sam Altman explains OpenAI's shift from open to closed AI models
VentureBeat: Nvidia AI Blueprint makes it easy for any devs to build automated agents that analyze video
Business Insider: Snowflake CEO explains why 'the insidious thing' about AI hallucinations isn't the occasional error
TechCrunch: OpenAI has hired the co-founder of Twitter challenger Pebble