Maligned - December 03, 2025
AI news without the BS
Here’s what actually matters in AI today. No fluff, no hype - just 5 developments worth your time.
Today’s Top 5 AI Developments
1. Humanoid Robots Learn to Walk in 15 Minutes 🤖
Forget days or weeks of training. Researchers have cut humanoid robot locomotion training to just 15 minutes using off-policy RL and deliberately minimalist rewards. That’s more than a speedup: it tightens the iteration loop for sim-to-real transfer, and the resulting control policies stay robust on rough terrain and under perturbations.
Source: arXiv Link: https://arxiv.org/abs/2512.01996v1
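To make “minimalist rewards” concrete, here’s a toy Python sketch of a locomotion reward with just two terms: track a commanded velocity and stay upright. The terms and weights are illustrative assumptions, not the paper’s actual reward.

```python
import numpy as np

def locomotion_reward(forward_vel, target_vel, fell_over):
    """Toy 'minimalist' locomotion reward (illustrative terms and
    weights, not the paper's): track a commanded velocity, stay upright."""
    # Velocity-tracking term, decaying with squared error.
    vel_term = np.exp(-4.0 * (forward_vel - target_vel) ** 2)
    # Small bonus for every step the robot hasn't fallen.
    alive_bonus = 0.0 if fell_over else 0.2
    return vel_term + alive_bonus
```

The pairing matters: off-policy algorithms replay every stored transition, so the sample efficiency comes from the learner rather than from elaborate reward shaping.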
2. Smarter Quantization for Bigger LLMs: NVFP4 Gets an Upgrade ⚙️
Large language models keep growing, but low-precision formats like NVFP4 often run into stability and accuracy problems. A new method called “Four Over Six” refines NVFP4 quantization by choosing each block’s scale adaptively, preventing divergence during training and improving accuracy. The payoff: more efficient, stable, and accurate LLMs on hardware like NVIDIA Blackwell GPUs.
Source: arXiv Link: https://arxiv.org/abs/2512.02010v1
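For intuition, here’s a toy sketch of block-wise FP4 quantization: NVFP4 quantizes tensors in small blocks that share one scale, mapping values onto the FP4 (E2M1) grid, whose largest magnitude is 6. The adaptive variant below, which tries a couple of candidate clipping points per block and keeps whichever reconstructs better, is my illustrative guess at what per-block adaptivity can look like, not the paper’s actual rule.

```python
import numpy as np

# Magnitudes representable in FP4 (E2M1), the element format of NVFP4.
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_block(block, max_target=6.0):
    """Quantize one block with a single shared scale: map the block's
    absmax to `max_target`, round to the nearest FP4 value, dequantize."""
    scale = np.abs(block).max() / max_target + 1e-12
    scaled = block / scale
    idx = np.abs(np.abs(scaled)[:, None] - FP4_GRID).argmin(axis=1)
    return np.sign(scaled) * FP4_GRID[idx] * scale

def quantize_adaptive(block, candidates=(6.0, 4.0)):
    """Toy 'adaptive' scaling: try candidate clipping points per block
    and keep the one with the lowest reconstruction error."""
    outs = [quantize_block(block, m) for m in candidates]
    return outs[int(np.argmin([np.abs(block - o).sum() for o in outs]))]

x = np.random.randn(16).astype(np.float32)  # NVFP4 blocks are 16 elements
print(np.abs(x - quantize_adaptive(x)).mean())
```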
3. Generated Videos Fail Physics 🤦‍♀️: Objects Fall Too Slowly
Despite claims of “world modeling,” current video generators consistently botch basic physics: objects fall significantly slower than real-world gravity dictates, and heavy and light objects don’t even fall at the same rate, violating Galileo’s equivalence principle. Simple fine-tuning can partially fix this, but it’s a stark reminder that these models aren’t true physical simulators yet.
Source: arXiv Link: https://arxiv.org/abs/2512.02016v1
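The “falls too slowly” claim is easy to test on any generated clip: track a dropped object’s vertical position per frame and fit the free-fall law d = ½gt². A sketch, assuming you’ve already converted tracked positions to meters:

```python
import numpy as np

def apparent_gravity(fall_dist_m, fps=24.0):
    """Least-squares fit of d = 0.5*g*t^2 to per-frame fall distances
    (meters, starting at release). Real-world g is ~9.81 m/s^2."""
    t = np.arange(len(fall_dist_m)) / fps
    return 2.0 * np.sum(fall_dist_m * t**2) / np.sum(t**4)

# Toy clip where objects fall at half real gravity:
d = 0.5 * 4.9 * (np.arange(10) / 24.0) ** 2
print(apparent_gravity(d))  # ~4.9 -> "falls too slowly"
```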
4. Unpacking the LLM Black Box: Concept-Aligned Interpretability 🧠
Understanding what an LLM “thinks” is crucial for control and safety. “AlignSAE” introduces Sparse Autoencoders that can reliably link specific internal model features to human-defined concepts. This allows for precise “concept swaps” and targeted interventions, making LLMs less opaque and more steerable.
Source: arXiv Link: https://arxiv.org/abs/2512.02004v1
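The generic version of the intervention is simple to picture: encode an activation into the SAE’s sparse feature space, suppress the feature tied to one concept, dial up another, then decode back. The SAE below uses random placeholder weights and made-up feature indices; AlignSAE’s contribution is making those feature-to-concept links reliable in a real model.

```python
import numpy as np

class SparseAutoencoder:
    """Minimal SAE: activation -> wide sparse features -> activation.
    Weights are random placeholders, not a trained model."""
    def __init__(self, d_model=64, d_feat=512, seed=0):
        rng = np.random.default_rng(seed)
        self.W_enc = rng.normal(0, 0.02, (d_model, d_feat))
        self.W_dec = rng.normal(0, 0.02, (d_feat, d_model))

    def encode(self, h):
        return np.maximum(h @ self.W_enc, 0.0)  # ReLU keeps features sparse-ish

    def decode(self, f):
        return f @ self.W_dec

def concept_swap(sae, h, feat_off, feat_on, strength=5.0):
    """Suppress one concept feature, activate another, decode the edit."""
    f = sae.encode(h)
    f[feat_off] = 0.0       # turn the old concept off
    f[feat_on] = strength   # turn the new concept on
    return sae.decode(f)

h = np.random.default_rng(1).normal(size=64)   # a stand-in activation
h_swapped = concept_swap(SparseAutoencoder(), h, feat_off=10, feat_on=42)
```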
5. TUNA: A Unified Approach to Multimodal AI 🖼️💬
Most multimodal models use separate systems for understanding and generating different data types. TUNA (Taming Unified Visual Representations) builds a truly unified continuous visual representation, allowing images and videos to be processed end-to-end for both understanding and generation tasks. This architecture leads to state-of-the-art performance across various multimodal tasks, showing that a coherent design pays off.
Source: arXiv Link: https://arxiv.org/abs/2512.02014v1
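In rough strokes, “unified” means one continuous visual latent feeding both directions of the pipeline, instead of one encoder for understanding and a separate tokenizer for generation. The sketch below is a structural caricature with placeholder linear maps, not TUNA’s architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(d_in, d_out):  # placeholder for a real learned module
    W = rng.normal(0, 0.02, (d_in, d_out))
    return lambda x: x @ W

visual_encoder = linear(3072, 256)      # pixels/frames -> shared latent
understanding_head = linear(256, 1024)  # latent -> e.g. LLM embedding space
generation_head = linear(256, 4096)     # latent -> e.g. decoder conditioning

frames = rng.normal(size=(8, 3072))     # 8 flattened "frames"
z = visual_encoder(frames)              # one representation...
qa_inputs = understanding_head(z)       # ...serves understanding
gen_inputs = generation_head(z)         # ...and generation, end to end
```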
That’s it for today. Stay aligned. 🎯
Maligned - AI news without the BS