Maligned - January 24, 2026
AI news without the BS
Here’s what actually matters in AI today. No fluff, no hype - just 5 developments worth your time.
Today’s Top 5 AI Developments
1. LLMs Get Agentic: Smarter, With a Sandbox 🤖
This research introduces “LLM-in-Sandbox,” which lets an LLM explore a virtual computer to tackle non-code tasks. It shows that capable LLMs can spontaneously leverage external resources, manage long contexts, and format outputs correctly, even without task-specific training. This pushes LLMs closer to general agentic intelligence by enabling robust tool use and exploration.
Source: arXiv | Link: https://arxiv.org/abs/2601.16206v1
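If you want a feel for the mechanics, here’s a minimal sketch of a sandboxed explore-act loop. Assumptions are mine: `query_llm` is a hypothetical stand-in for a real model client, and a temp directory stands in for the paper’s virtual computer (real use needs container/VM isolation, since the model’s commands run via the shell).

```python
# Minimal sketch of an LLM-in-Sandbox style loop, under the assumptions above.
import subprocess
import tempfile

def query_llm(history: list[str]) -> str:
    """Placeholder for a real LLM call. Should return the next shell
    command to try, or 'DONE' when the task is solved."""
    return "DONE"  # stub so the sketch runs end to end

def run_in_sandbox(task: str, max_steps: int = 8) -> list[str]:
    history = [f"TASK: {task}"]
    with tempfile.TemporaryDirectory() as workdir:
        for _ in range(max_steps):
            command = query_llm(history)
            if command.strip() == "DONE":
                break
            # Execute the model's command inside the sandbox and feed the
            # output back as the next observation, so the model can explore.
            result = subprocess.run(
                command, shell=True, cwd=workdir,
                capture_output=True, text=True, timeout=30,
            )
            history.append(f"$ {command}\n{result.stdout}{result.stderr}")
    return history

print(run_in_sandbox("summarize the files in this directory"))
```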
2. Video Models Level Up Robot Control 🦾
NVIDIA’s “Cosmos Policy” transforms large pretrained video models into state-of-the-art robot policies without architectural changes. By encoding actions and future states as latent frames, it lets these models directly learn complex robot movements and plan trajectories. This approach significantly outperforms existing methods, making video models efficient brains for advanced robotics.
Source: arXiv | Link: https://arxiv.org/abs/2601.16163v1
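The core trick, as summarized above, is making actions look like more video: project an action chunk into the shape of a latent frame and append it along the time axis, so the pretrained backbone needs no architectural changes. A hedged sketch of that encoding, with illustrative names and shapes (not NVIDIA’s actual code):

```python
# Sketch of the "actions as latent frames" idea: a learned projection maps an
# action chunk into the same shape as one latent video frame. All shapes and
# names here are illustrative assumptions.
import torch
import torch.nn as nn

B, T, C, H, W = 2, 8, 16, 32, 32   # batch, latent frames, channels, spatial
ACTION_DIM = 7                     # e.g., 6-DoF end-effector pose + gripper

class ActionAsLatentFrame(nn.Module):
    def __init__(self, action_dim: int, c: int, h: int, w: int):
        super().__init__()
        self.c, self.h, self.w = c, h, w
        # Project an action vector up to one latent frame's worth of values.
        self.proj = nn.Linear(action_dim, c * h * w)

    def forward(self, actions: torch.Tensor) -> torch.Tensor:
        # actions: (B, K, action_dim) -> (B, K, C, H, W) pseudo-frames
        b, k, _ = actions.shape
        return self.proj(actions).view(b, k, self.c, self.h, self.w)

video_latents = torch.randn(B, T, C, H, W)   # from the pretrained encoder
actions = torch.randn(B, 4, ACTION_DIM)      # a chunk of future actions
encoder = ActionAsLatentFrame(ACTION_DIM, C, H, W)

# Append action "frames" along the time axis; the video backbone then treats
# them like any other frames, so no architectural change is needed.
sequence = torch.cat([video_latents, encoder(actions)], dim=1)
print(sequence.shape)  # torch.Size([2, 12, 16, 32, 32])
```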
3. AI That Discovers, Not Just Generates 💡
“Test-Time Training to Discover” (TTT-Discover) lets LLMs perform reinforcement learning at inference time to actively find new state-of-the-art solutions for specific scientific problems. It’s not just generating; it’s training to discover novel solutions in math, GPU kernels, algorithms, and biology, setting new benchmarks with open models. This flips the script on how we use LLMs for research.
Source: arXiv | Link: https://arxiv.org/abs/2601.16175v1
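The “training at inference” idea is easiest to see on a toy problem. The sketch below runs a tiny REINFORCE loop against one fixed problem instance; in the paper, the policy is an LLM and the reward comes from a problem-specific verifier (a test suite, proof checker, or benchmark), but the shape of the loop is the same:

```python
# Toy illustration of test-time RL in the spirit of TTT-Discover: keep
# updating a policy on a single problem until the reward improves. Everything
# here is deliberately tiny and illustrative, not the paper's method.
import torch

TARGET = 42
candidates = torch.arange(100)                 # toy "solution space"
logits = torch.zeros(100, requires_grad=True)  # policy over candidates
opt = torch.optim.Adam([logits], lr=0.1)

def reward(x: torch.Tensor) -> torch.Tensor:
    # Problem-specific scorer; for an LLM this would be a test suite,
    # proof checker, or simulator evaluating the generated solution.
    return -(x.float() - TARGET).abs()

for step in range(200):
    dist = torch.distributions.Categorical(logits=logits)
    sample = dist.sample((64,))                         # batch of attempts
    r = reward(candidates[sample])
    advantage = r - r.mean()                            # simple baseline
    loss = -(dist.log_prob(sample) * advantage).mean()  # REINFORCE
    opt.zero_grad()
    loss.backward()
    opt.step()

print("best candidate:", candidates[logits.argmax()].item())  # typically 42
```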
4. Making Multimodal LLMs Adversary-Proof 🛡️
A new method, Feature-space Smoothing (FS), significantly boosts the certified robustness of Multimodal LLMs (MLLMs) against adversarial attacks. The authors prove theoretically that FS preserves feature integrity, and in practice it cuts attack success rates from around 90% to just 1% without retraining the MLLMs themselves. This is a critical step for deploying trustworthy and secure multimodal AI systems.
Source: arXiv | Link: https://arxiv.org/abs/2601.16200v1
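Feature-space smoothing is a cousin of classic randomized smoothing, applied to the encoder’s features rather than raw pixels: perturb the features with Gaussian noise and aggregate the predictions, so the certified behavior is the model’s majority behavior under noise. A minimal sketch with stand-in modules (the paper’s exact construction and certificates are in the link above):

```python
# Hedged sketch of feature-space smoothing. The encoder and head below are
# stand-ins, not a real MLLM; averaging logits is a simplification of
# certified variants, which count class votes under noise.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256))  # stand-in
head = nn.Linear(256, 10)                                           # stand-in

def smoothed_logits(x: torch.Tensor, sigma: float = 0.25, n: int = 32):
    feats = encoder(x)                                   # (B, 256) features
    # Add Gaussian noise in feature space and average the predictions; an
    # adversarial pixel perturbation must now move the whole noise cloud.
    noisy = feats.unsqueeze(0) + sigma * torch.randn(n, *feats.shape)
    return head(noisy).mean(dim=0)                       # (B, 10)

x = torch.randn(4, 3, 32, 32)
print(smoothed_logits(x).shape)  # torch.Size([4, 10])
```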
5. Text-to-Image Just Got Faster & Better 🎨
Representation Autoencoders (RAEs) are proving to be a stronger, simpler foundation than VAEs for large-scale text-to-image (T2I) diffusion transformers. This research shows RAEs lead to faster convergence, better image generation quality, and improved stability across various model scales. It’s a fundamental improvement making T2I models more efficient and powerful, opening doors for unified multimodal models.
Source: arXiv | Link: https://arxiv.org/abs/2601.16208v1
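The practical difference from a VAE, as far as the summary goes: the encoder is a frozen, pretrained representation model and only the decoder gets trained, so the latent space is deterministic and already semantically organized. A stand-in sketch (modules are illustrative, not the paper’s architecture):

```python
# Hedged sketch of the RAE idea: a frozen pretrained encoder supplies the
# latent space for a diffusion transformer, with a separately trained decoder
# mapping latents back to pixels. Contrast with a VAE, which learns both
# directions jointly and samples its latents.
import torch
import torch.nn as nn

class RAE(nn.Module):
    def __init__(self, dim: int = 768):
        super().__init__()
        # Stand-in for a frozen pretrained encoder (e.g., a ViT backbone).
        self.encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, dim))
        for p in self.encoder.parameters():
            p.requires_grad = False            # representation stays fixed
        # Only the decoder is trained to reconstruct pixels from features.
        self.decoder = nn.Linear(dim, 3 * 64 * 64)

    def encode(self, x):                       # deterministic, unlike a VAE
        return self.encoder(x)

    def decode(self, z):
        return self.decoder(z).view(-1, 3, 64, 64)

rae = RAE()
x = torch.randn(2, 3, 64, 64)
z = rae.encode(x)    # the diffusion transformer is trained on these latents
print(z.shape, rae.decode(z).shape)
```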
That’s it for today. Stay aligned. 🎯
Maligned - AI news without the BS