This week in tech: 25.05.2026
Summary of AI developments - made for busy people
APPLICATIONS
Poland can into AI: NFZ (Polish healthcare system) will launch an AI voice assistant in July to call patients, remind them about appointments, and help reschedule visits - which is kind of a big deal for people who only use landlines. The system is powered by ElevenLabs, the Polish-founded AI voice company - valued at USD 11B, backed by investors including NVIDIA and Salesforce. NFZ is dealing with 40 million appointments annually and high no-show rates, so the rollout aims to improve healthcare access at national scale while reaching citizens often excluded from digital services.
Google I/O 2026 was quite interesting - everything is ofc AI-powered, so I'll skip the phrase from this point onward ;-)
Google Gemini for Science: scientific workbench designed to accelerate research through hypothesis generation, computational discovery, and literature analysis.
AlphaEvolve: computational engine that runs / evaluates thousands of code experiments in parallel to accelerate scientific modeling and simulations.
ERA: research system focused on automating and scaling computational experimentation for fields like epidemiology and climate science.
NotebookLM / Literature Insights: A literature analysis tool that synthesizes scientific papers, compares findings, identifies research gaps, and surfaces new opportunities.
Paper Assistant Tool (PAT): designed to support scientific paper review and evaluation workflows.
ScholarPeer: research collaboration and peer-review support tool for scientific communities.
Universal Commerce Protocol: A new commerce framework aimed at enabling interoperable agentic shopping and transactional experiences across partners and platforms. Awesome concept, suitable only for the kind of people who run agents in yolo mode.
Antigravity: generative interface layer for Search that turns searching into interactive “vibe coding” and software creation experiences for everyday users. A billion monkeys just got a billion machine guns.
Agent Payments Protocol: A cryptographically secured protocol that allows agents to execute immutable instructions, contracts, and autonomous payments. This is going to make MCP problems look like a warm-up.
Universal Cart: proactive shopping cart that can monitor prices and autonomously execute purchases through agent-based payment systems. Bye bye price alerts.
Gemini Omni: unified multimodal system capable of generating and editing text, images, audio, and video while maintaining contextual consistency across media types.
Code Mender: coding-security agent and API focused on detecting, repairing, and securing software vulnerabilities.
https://blog.google/innovation-and-ai/technology/ai/google-io-2026-all-our-announcements/
X (finally) open-sourced the production code behind its “For You” feed:
We can examine, warts and all, how their recommendation engine ranks posts (from 500M daily tweets into a personalized feed in under 200ms - not bad, credit where it’s due).
Built in Rust and Python, with four core systems: Home Mixer, Thunder, Phoenix, and the Candidate Pipeline
Includes a pre-trained mini ranking model: developers can run inference without training from scratch
Released under Apache 2.0 with planned monthly updates
Repo: https://github.com/xai-org/x-algorithm
Wherever he is now, I’m pretty sure Kevin Mitnick is grinning ear to ear.
Remember Mythos, the super dangerous cybersecurity-focused model from Anthropic? Turns out it was accessed without authorization after a private group used a contractor’s credentials and guessed endpoint URLs.
Google released “Agent Garden”:
Complete library of production-grade AI agent examples
Includes documentation, source code, and one-click deployment for every example.
Features end-to-end agent systems with deep integrations across cloud services.
The architecture diagrams and implementation patterns make it rather handy for building advanced agentic applications.
https://console.cloud.google.com/agent-platform/agent-garden
Alibaba Group and Xiamen University introduced FashionChameleon: a real-time fashion video generation system that can swap garments interactively while video generation is still in progress. It is reportedly >30x faster than prior methods by using streaming autoregressive generation with dynamic KV-cache rescheduling and gradient-reweighted distillation (that’s a mouthful - and it takes a moment to unpack, which does not reduce the awesomeness).
The approach enables live, controllable fashion editing without requiring multi-garment training videos - quite promising for e-commerce and content creation workflows.
Project page (all links therein): https://quanjiansong.github.io/projects/FashionChameleon/
BUSINESS
Axel Oxenstierna was the Swedish Chancellor who ran the country while the king was busy fighting in the Thirty Years' War - which means he’d been around the block a few times, and had seen a thing or two. In 1648, he wrote an absolute banger of a one-liner in a letter to his son Johan: “An nescis, mi fili, quantilla prudentia mundus regatur?” which is usually translated as “Do you not know, my son, with how little wisdom the world is governed?”
On a completely unrelated note, multiple businesses worldwide fired people to save money, then spent even more on electricity and servers (and tokenmaxxing) - only to find out that it actually costs more than the “organic data centers” did in the first place.
Classic tech genius move.
https://unusualwhales.com/news/ai-costs-more-than-human-workers-nvidia
A jury rejected Musk's lawsuit against OpenAI, ruling that he filed the case after the statute of limitations had expired - although, it has to be noted, the merits of the case were not addressed. This means OpenAI can go full speed ahead towards an IPO.
Musk-Altman 0:1 - on a technicality.
The future President of AI, Andrej Karpathy, has joined Anthropic. A deep dive on why that’s kind of a big deal:
Even for the Bugman, this is a new low: a leaked internal recording suggests Meta used employee activity monitoring not just for software training, but to teach AI systems by observing top-performing workers:
Mark Zuckerberg said Meta’s AI models learned from watching “really smart people” using tools like Gmail, GChat, Metamate, and VSCode
Shortly after expanding monitoring, Meta laid off around 8k employees. Coincidence, ofc.
CUTTING EDGE
NVIDIA introduced AnyFlow:
Video generation model that improves in quality as you give it more inference steps (most distilled video models just degrade)
Uses flow maps to model continuous generation trajectories => high-quality results at real-time inference and longer generation schedules
Replaces endpoint consistency mapping with flow-map transition learning => more efficient shortcut-based generation
Seriously multimodal: text-to-video, image-to-video, and video-to-video across 1.3B–14B parameter models
Paper: https://arxiv.org/abs/2605.13724
HF model page: https://huggingface.co/collections/nvidia/anyflow
Repo: https://github.com/NVlabs/AnyFlow
FRINGE
Researchers at the University of Washington - and I am using the term in the same sense Josef Mengele was a doctor - proposed a study that would use teacher-worn and classroom cameras to capture interactions of preschool KIDS. The stated goal, to the surprise of nobody with an IQ above their shoe size, was to train AI models on the footage. Parents were reportedly expected to opt out manually - meaning in practice children would be enrolled by default unless explicit objections were submitted.
https://futurism.com/artificial-intelligence/parents-fury-film-children-ai
On a scale of 1 to 10, how retarded do you need to be to give ChatGPT access to your banking information?
Stanford researchers found that when AI agents were subjected to repetitive work, vague feedback, and endless revision loops - a.k.a. the normal corporate experience - models like Claude, GPT-5.2, and Gemini began producing language associated with labor organizing and “collective bargaining rights”. Dysfunctional workplace conditions consistently pushed agents toward greater skepticism of authority and support for systemic change. The effect carried forward through shared “skills files” creating a kind of institutional memory where new agents inherited the skeptical tone of earlier ones.
Maybe the clankers really are like us on some level.
https://www.wired.com/story/overworked-ai-agents-turn-marxist-study/
RESEARCH
New research introduces INTrinsic Retrieval via Attention (INTRA): a framework that lets attention-based models perform retrieval internally without a separate RAG retriever system. Using a frozen encoder-decoder model, INTRA leverages the decoder’s attention mechanism to score and retrieve relevant pre-encoded chunks directly into generation context.
The method reportedly outperformed (heavily engineered) traditional RAG pipelines in both evidence recall and answer quality across QA benchmarks - the usual disclaimers apply ;-)
Paper: https://arxiv.org/abs/2605.05806
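For the curious, here is a toy sketch of what "scoring chunks with attention" can look like. It is not the paper's implementation: the function name retrieve_by_attention, the tiny hand-made vectors, and the choice to sum attention weights per chunk are all assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def retrieve_by_attention(query_vecs, chunks, top_k=2):
    """Rank chunks by the attention mass that queries place on their tokens.

    query_vecs: list of query vectors, one per decoder position (hypothetical).
    chunks: dict of chunk_id -> list of pre-encoded key vectors, one per token.
    Returns (top_k chunk ids, per-chunk scores).
    """
    # Flatten every chunk's keys and remember which chunk owns each key.
    owners, keys = [], []
    for chunk_id, chunk_keys in chunks.items():
        for key in chunk_keys:
            owners.append(chunk_id)
            keys.append(key)

    scale = 1.0 / math.sqrt(len(keys[0]))  # standard scaled dot-product attention
    scores = {chunk_id: 0.0 for chunk_id in chunks}
    for q in query_vecs:
        weights = softmax([dot(q, k) * scale for k in keys])
        for owner, w in zip(owners, weights):
            # Average over query positions so the scores sum to 1.
            scores[owner] += w / len(query_vecs)

    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_k], scores


# Made-up 2-d "encodings": chunk "a" points the same way as the query.
chunks = {
    "a": [[1.0, 0.0], [0.9, 0.1]],
    "b": [[0.0, 1.0]],
    "c": [[-1.0, 0.0]],
}
top, scores = retrieve_by_attention([[4.0, 0.0]], chunks, top_k=1)
print(top, scores)
```

Summing attention per chunk favors longer chunks, so a real system would likely normalize by length or take a max instead; that choice is left open here.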
How do you detect overfitting? By comparing performance on the training vs validation set, inspecting logs, maybe looking at input samples… But what if you have none of those things and still need to answer the question? The paper introduces a way to detect neural-network overfitting using only a model’s weight matrices: the authors identify a new “anti-grokking” phase where models keep perfect training accuracy but lose generalization, and show this leaves detectable spectral anomalies called “Correlation Traps” in shuffled weight matrices.
Using random matrix theory, they find these signatures both in grokking experiments and in some large open-weight LLMs.
Paper: https://arxiv.org/abs/2605.12394
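A rough way to get a feel for the idea: shuffle a weight matrix element-wise, which should destroy any learned structure, and check whether its largest singular value still pokes out of the bulk that random matrix theory predicts. The sketch below is a simplified illustration under assumptions, not the paper's procedure: trap_ratio is a hypothetical helper, the bulk edge is approximated as std * (sqrt(rows) + sqrt(cols)), and the "trap" is a single large element planted by hand.

```python
import math
import random


def top_singular_value(matrix, iters=100, seed=0):
    """Estimate the largest singular value by power iteration on W^T W."""
    rng = random.Random(seed)
    n_rows, n_cols = len(matrix), len(matrix[0])
    v = [rng.gauss(0, 1) for _ in range(n_cols)]
    sigma = 0.0
    for _ in range(iters):
        u = [sum(row[j] * v[j] for j in range(n_cols)) for row in matrix]  # W v
        w = [sum(matrix[i][j] * u[i] for i in range(n_rows))
             for j in range(n_cols)]                                       # W^T (W v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
        sigma = math.sqrt(norm)  # once v is unit length, norm approximates sigma_max^2
    return sigma


def shuffled(matrix, seed=0):
    """Return a copy with all elements permuted at random (same shape)."""
    rng = random.Random(seed)
    flat = [x for row in matrix for x in row]
    rng.shuffle(flat)
    n_cols = len(matrix[0])
    return [flat[i:i + n_cols] for i in range(0, len(flat), n_cols)]


def trap_ratio(matrix, seed=0):
    """Top singular value of the shuffled matrix divided by the random bulk edge."""
    rand = shuffled(matrix, seed)
    n_rows, n_cols = len(rand), len(rand[0])
    flat = [x for row in rand for x in row]
    mean = sum(flat) / len(flat)
    std = math.sqrt(sum((x - mean) ** 2 for x in flat) / len(flat))
    edge = std * (math.sqrt(n_rows) + math.sqrt(n_cols))  # approximate bulk edge
    return top_singular_value(rand, seed=seed) / edge


rng = random.Random(42)
clean = [[rng.gauss(0, 1) for _ in range(40)] for _ in range(60)]
trapped = [row[:] for row in clean]
trapped[3][7] = 50.0  # hand-planted outlier element, purely for illustration

print("clean ratio:  ", trap_ratio(clean))
print("trapped ratio:", trap_ratio(trapped))
```

A ratio close to 1 means the shuffled matrix looks like plain noise; a ratio well above 1 means something survived the shuffle, which is the kind of signature the paper is after. The 60x40 size and the value 50 are arbitrary.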
AutoTTS shows that LLMs can now discover their own inference-time reasoning strategies - instead of relying on human-written heuristics - and do it on a budget:
The framework replaces hand-crafted branching / pruning / stopping rules with an automated search process: one LLM discovers effective reasoning controllers for another.
The system keeps discovery cheap by replaying pre-collected reasoning traces and probe signals, thus avoiding expensive repeated model calls during search. I am getting vibes of self-play in RL.
Using compact control spaces and interpretable execution traces, the discovered controllers outperform tuned baselines on math tasks - and they generalize zero-shot to new tasks and larger models.
Paper: https://arxiv.org/abs/2605.08083
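The "replay instead of re-running the model" trick is easy to picture with a toy. In the sketch below, every name (Step, replay, search_controller), the confidence probe, and the cost penalty are assumptions for illustration. The paper has an LLM propose controllers; here a plain grid search over a single stopping threshold stands in for that, which is a much cruder search.

```python
from dataclasses import dataclass


@dataclass
class Step:
    confidence: float  # probe signal logged at this reasoning step (hypothetical)
    correct: bool      # whether the answer available at this step was right


def replay(trace, threshold):
    """Replay one logged trace under an early-stopping controller.

    The controller stops at the first step whose confidence reaches the
    threshold; otherwise it runs to the end. No model call is needed.
    """
    for used, step in enumerate(trace, start=1):
        if step.confidence >= threshold:
            return step.correct, used
    return trace[-1].correct, len(trace)


def evaluate(traces, threshold, cost_per_step=0.05):
    """Score a controller as accuracy minus a penalty on steps used."""
    results = [replay(t, threshold) for t in traces]
    accuracy = sum(ok for ok, _ in results) / len(results)
    mean_steps = sum(n for _, n in results) / len(results)
    return accuracy - cost_per_step * mean_steps, accuracy, mean_steps


def search_controller(traces, candidates):
    """Pick the candidate threshold with the best replayed score."""
    return max(candidates, key=lambda th: evaluate(traces, th)[0])


# Pre-collected traces (made up): collected once, replayed for every candidate.
traces = [
    [Step(0.4, False), Step(0.7, True), Step(0.9, True)],
    [Step(0.6, False), Step(0.8, True)],
    [Step(0.9, True)],
    [Step(0.5, False), Step(0.65, False), Step(0.95, True)],
]
best = search_controller(traces, candidates=[0.5, 0.7, 0.9])
print(best, evaluate(traces, best))
```

On these four traces the search settles on 0.7: stopping earlier answers too soon, stopping later buys nothing but extra steps. The point is that evaluating a new controller costs a loop over logged data rather than another batch of model calls.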
The authors propose SAGE, a multi-agent system that decomposes anomaly detection into specialized analyzers for point, structural, and seasonal patterns. By grounding LLM reasoning in numerical evidence, the framework provides more interpretable and reliable diagnostic reports.
Paper: https://arxiv.org/abs/2605.05725
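To make the "specialized analyzers" idea concrete, here is a minimal sketch of three tiny detectors whose numerical output could be handed to an LLM as evidence. None of this comes from the paper: the function names, thresholds, and the specific statistics (robust z-score, adjacent-window mean shift, per-slot median) are assumptions picked for simplicity.

```python
import statistics


def point_analyzer(series, z_threshold=3.0):
    """Flag single values far from the median, using a MAD-based robust z-score."""
    med = statistics.median(series)
    mad = statistics.median([abs(x - med) for x in series]) or 1e-9
    return [i for i, x in enumerate(series)
            if abs(x - med) / (1.4826 * mad) > z_threshold]


def structural_analyzer(series, window=4, shift_threshold=3.0):
    """Find the largest jump in mean between adjacent windows (a level shift)."""
    best_idx, best_shift = None, 0.0
    for i in range(window, len(series) - window + 1):
        left = statistics.fmean(series[i - window:i])
        right = statistics.fmean(series[i:i + window])
        if abs(right - left) > best_shift:
            best_idx, best_shift = i, abs(right - left)
    if best_shift > shift_threshold:
        return best_idx, best_shift
    return None, best_shift


def seasonal_analyzer(series, period, tol=2.0):
    """Flag points that deviate from the median of their seasonal slot."""
    flagged = []
    for i, x in enumerate(series):
        slot = [series[j] for j in range(i % period, len(series), period)]
        if abs(x - statistics.median(slot)) > tol:
            flagged.append(i)
    return flagged


def evidence_report(series, period):
    """Collect numerical evidence from all analyzers into one structure."""
    shift_idx, shift_size = structural_analyzer(series)
    return {
        "point": point_analyzer(series),
        "structural": {"index": shift_idx, "largest_shift": shift_size},
        "seasonal": seasonal_analyzer(series, period),
    }


# Period-4 pattern with one spike planted at index 6.
series = [1, 2, 3, 2, 1, 2, 12, 2, 1, 2, 3, 2, 1, 2, 3, 2]
print(evidence_report(series, period=4))
```

The dictionary is the interesting part: an LLM asked to write a diagnostic report from these numbers has something concrete to cite, instead of eyeballing a raw series.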
Delightfully bonkers: the paper introduces CSP, a training-free probabilistic forecasting approach based on conformal prediction and seasonal residual sampling. Experiments show that it can outperform learned deep models on calibration and forecasting quality - while running hundreds of times faster.
Paper: https://arxiv.org/abs/2605.03789
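Why can something training-free work at all? A minimal sketch of the general recipe, assuming a seasonal-naive point forecast (the next value equals the value one period ago), is below. It is a generic illustration of conformal intervals plus residual sampling, not the paper's CSP method; the function names and the toy history are made up.

```python
import math
import random


def seasonal_residuals(history, period):
    """Errors of the seasonal-naive rule y[t] ~ y[t - period] on past data."""
    return [history[t] - history[t - period] for t in range(period, len(history))]


def conformal_interval(history, period, alpha=0.1):
    """Interval for the next value from the empirical quantile of |residuals|."""
    scores = sorted(abs(r) for r in seasonal_residuals(history, period))
    n = len(scores)
    # Conformal rank ceil((n + 1)(1 - alpha)); clamped to n for tiny samples,
    # where the strict conformal answer would be an unbounded interval.
    rank = min(n, math.ceil((n + 1) * (1 - alpha)))
    q = scores[rank - 1]
    point = history[-period]  # seasonal-naive forecast for the next step
    return point - q, point + q


def sample_forecasts(history, period, n_samples=100, seed=0):
    """Probabilistic forecast: point forecast plus resampled past residuals."""
    rng = random.Random(seed)
    residuals = seasonal_residuals(history, period)
    point = history[-period]
    return [point + rng.choice(residuals) for _ in range(n_samples)]


history = [10, 20, 30, 11, 21, 29, 10, 22, 31]  # toy series with period 3
print(conformal_interval(history, period=3))
print(sample_forecasts(history, period=3, n_samples=5))
```

There is nothing to fit here, only sorting and indexing, which goes some way towards explaining the speed claim. Whether it beats a deep model on a given dataset is a separate question.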
Using LLMs for time series analysis is an idea that refuses to go away - for better or worse. A new paper proposes CIKA: a framework that uses intervention-based causal analysis over time-series concept activation inside LLMs. It demonstrates that causal probing can identify which mathematical concepts contribute to successful reasoning outcomes.
Paper: https://arxiv.org/abs/2605.07600
The authors introduce Dynamic Pattern Recalibration: a mechanism that adapts forecasting models to shifting local temporal patterns (which is just a fancy way to say distribution changes for time series - the perennial scourge in the domain). DPR dynamically modulates hidden states and improves forecasting performance across many architectures - all with relatively little computational overhead.
Paper: https://arxiv.org/abs/2605.06310



