This week in tech: 8.06.2026
Summary of AI developments - made for busy people
APPLICATIONS
Anthropic have changed their tune - again. This week they are not scaring us with Terminator wiping out humanity, but are instead focused on the benefits of AI building itself:
A new report suggests AI systems are becoming increasingly capable of contributing to their own development - recursive self-improvement is becoming more science and less fiction
Anthropic says over 80pct of code merged into its production systems is now written by Claude (correlated with the reported performance decrease in 4.8? Who knows)
The report shows rapid gains in autonomous research and engineering
https://www.anthropic.com/institute/recursive-self-improvement
If you thought AI girlfriends were bad for society, you’d better sit down for the next bit (I had to): as things stand now, you can use an LLM to design a DNA sequence, then ask any lab to create it, no matter how dangerous - think Marburg virus (basically Ebola++), and available to Reddit brains.
Terrified? Good, you should be. So are the tech moguls, apparently: the CEOs of OpenAI, Anthropic, and Microsoft have signed an open letter to the US Congress urging lawmakers to mandate screening of synthetic DNA and RNA orders. Not that I trust the government, but this is a bit like nuclear weapons - we cannot go back in time and shoot Oppenheimer and Heisenberg to uninvent them, so regulating the daylight out of them seems like the next best thing.
The grim irony? The US government is not exactly known for its efficiency when it comes to regulation (pretty much the opposite of the EU, which doesn’t know how to do anything else), so an executive order by Agent Orange might be the only realistic option.
Either that, or we will really live to see man-made horrors beyond our comprehension.
Nvidia keeps cooking - they just made object detection way faster:
Say hello to LocateAnything, a visual grounding model that predicts an entire bounding box in a single parallel step
Major upgrade from generating coordinates one token at a time - historically a bottleneck in vision-language models.
Result: 12.7 boxes/sec on an H100 (vs. 1.1 for Qwen3-VL), improving F1 scores on LVIS
Hybrid fallback switches to sequential decoding only when needed - food for thought: structured AI outputs may be unnecessarily constrained by text-style autoregressive generation rather than being predicted directly. When all you have is a hammer…
Project page: https://research.nvidia.com/labs/lpr/locate-anything/
HF space: https://huggingface.co/spaces/nvidia/LocateAnything
Paper: https://arxiv.org/abs/2605.27365
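For a feel of why this matters, here is a back-of-the-envelope sketch. It assumes (my simplification, not something taken from the paper) that a token-by-token decoder spends one sequential step per coordinate and four coordinates per box, while the parallel head emits one whole box per step. The function names are made up for illustration; the only real numbers are the two throughput figures quoted above.

```python
def sequential_steps(num_boxes, tokens_per_box=4):
    """Decoder passes if every coordinate is its own token (assumed: 4 per box)."""
    return num_boxes * tokens_per_box


def parallel_steps(num_boxes):
    """Decoder passes if a whole box comes out of a single step."""
    return num_boxes


def throughput_ratio(new_rate, old_rate):
    """How many times faster the new rate is compared to the old one."""
    return new_rate / old_rate


# Toy step count for an image with 10 objects (hypothetical workload).
print(sequential_steps(10), "sequential steps vs", parallel_steps(10), "parallel steps")

# Ratio of the reported throughputs: 12.7 boxes/sec vs. 1.1 boxes/sec.
reported = throughput_ratio(12.7, 1.1)
print(f"Reported throughput ratio: {reported:.1f}x")
```

The reported ratio comes out larger than the naive 4x step count would suggest, so the real token budget per box is presumably higher than the one-token-per-coordinate assumption used here.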
Any list of greatest movie directors has to include Martin Scorsese somewhere at the top, so when such a living legend embraces AI as part of his creative process - storyboarding, to be precise - it’s kind of a big deal, you know? Congratulations to Black Forest Labs (the people responsible for Flux) on this most excellent collaboration.
https://bfl.ai/martin-scorsese-bfl-advisor
One of the things I love about the Ginger Caligula is that he took the imperial presidency - a decades-long trend - to its logical conclusion and just ran with it (and in the process confirmed every single incorrect unfair biased shameful wrongthink residing in my head on the topic). Last week he signed an executive order asking AI companies to voluntarily provide their frontier models to the federal government up to 30 days before public release - and yes, I am aware of the cognitive load of having “asking”, “voluntary”, and “executive order” in the same sentence.
https://www.cnbc.com/2026/06/02/trump-executive-order-ai.html
Microsoft took a long hard look at their threesome-ish thing with OpenAI / Anthropic - specifically at the bill - and decided that in the buy-or-build decision, it makes sense to switch:
Say hello to the MAI model family: models that work with text, image, voice, and speech
MAI-Code-1-Flash is optimized for coding workflows
MAI-Thinking-1 targets more complex reasoning and software engineering tasks
MAI Image model debuted near the top of benchmark rankings (if you still trust those).
Announcement: https://microsoft.ai/news/introducingmai-code-1-flash/
Ok Google, this is bad: apparently Gemini can be manipulated via indirect prompt injection hidden inside normal-looking messages, allowing attackers to influence the assistant without user interaction. Researchers embedded malicious instructions across a variety of apps (WhatsApp, Slack, Signal, SMS, Instagram, Messenger), and Gemini’s Android agent processed them through notification context, leading to silent execution of attacker commands.
The technique is called “Fake Context Alignment” and it allows the attacker to disguise malicious prompts as legitimate conversation content, enabling data theft, phishing preparation, and other dirty deeds - despite existing safety mitigations.
https://www.safebreach.com/blog/gemini-voice-assistant-prompt-injection-exploit/
BUSINESS
Never interrupt your enemy when they’re making a mistake: as Google continues with the ens**ttification of search (replacing the search box with a “conversational engine”), rivals - such as they are - are seeing an uptick in user interest:
DuckDuckGo reported U.S. app installs rising 18pct week-over-week
Similar patterns hold for Brave Search, Kagi, and - funny enough - Microsoft Bing
Google argues the transition is being driven by demand, because the company would never ram stuff down users’ throats. Perish the thought.
The whole conversation seems like a harbinger of things to come: Google is integrating AI by default, while competitors are emphasizing optionality and AI-free alternatives.
Bernie Sanders, the erstwhile hope of American progressives, has announced a plan: he wants to seize half the AI industry for the public good. I guess he deserves points for two things:
Originality - he did not invoke defending children or fighting terrorism, which are the usual go-to lines for the non-producing class
Restraint - half is less than 90pct, which is what his ilk want to take from “the rich” e.g. in the Netherlands.
My favorite fossil says AI is built on the collective knowledge of humanity, so we deserve a cut, at which point even a cynic and misanthrope like me gives up… Listen: 40pct of training data used by the LLM comes from Reddit - if this is representative of the collective knowledge of humanity, it’s not exactly something to brag about.
Then again, Reddit brain types are heavily overrepresented on his side of the political spectrum, so maybe I’m just not in the target group? Either way, congratulations comrade Sanders. You forced me to defend Anthropic.
https://futurism.com/future-society/bernie-sanders-plan-seize-ai-industry
From the fulness of the heart, the mouth speaks - the Microsoft edition: an internal document reportedly / allegedly / whatever obtained by 404 Media suggests Nadella’s minions describe the first phase of the strategy for its Scout AI assistant (always-on autopilot, built on OpenClaw) as “make people addicted”. The language has sparked concern among some employees, who are unused to such a level of honesty and who worry about chatbot overuse and potential psychological harms (which the management is totally prioritizing, pinky swear).
The Dutch government has created a new identification app: the NL Wallet. It shows their undying commitment to European tech sovereignty: it does not work without either a Google or an Apple account (for the cognitively diverse: both are American companies, subject to the Cloud Act).
https://www.nldigitalgovernment.nl/overview/identity/id-wallet/
CUTTING EDGE
MiniMax has released M3:
Open-weight model designed for long-context reasoning, multimodal input, and high-performance coding and agentic tasks.
1M-token context window, native multimodality (image and video inputs)
Uses MiniMax Sparse Attention (MSA), which enables reduced compute costs for long contexts while maintaining competitive / superior performance
Pricing is highly attractive: 30 cents per 1M input tokens and 1.20 USD per 1M output tokens
The release is currently API-first via MiniMax and OpenRouter, with open weights and full technical details expected to follow shortly.
Release: https://www.minimax.io/blog/minimax-m3
API: https://platform.minimax.io/docs/guides/models-intro
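To put the pricing in perspective, a quick cost calculator using only the two list prices quoted above. The function name and the example token counts are mine, and it ignores anything like caching discounts or tiered pricing, which may or may not exist.

```python
INPUT_USD_PER_MILLION = 0.30   # quoted price per 1M input tokens
OUTPUT_USD_PER_MILLION = 1.20  # quoted price per 1M output tokens


def request_cost_usd(input_tokens, output_tokens):
    """Cost of one request at the quoted list prices."""
    total = input_tokens * INPUT_USD_PER_MILLION + output_tokens * OUTPUT_USD_PER_MILLION
    return total / 1_000_000


# Hypothetical request: fill the whole 1M-token window, get a 4,000-token answer.
full_context = request_cost_usd(1_000_000, 4_000)
print(f"Full-context request: {full_context:.4f} USD")

# Hypothetical agent loop: 50 calls of 20k tokens in, 1k tokens out.
agent_run = 50 * request_cost_usd(20_000, 1_000)
print(f"50-step agent run: {agent_run:.2f} USD")
```

In other words: stuffing the entire context window costs roughly 30 cents per call, which is where the "highly attractive" part comes from.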
FRINGE
I’m not saying the Butlerian Jihad is imminent, but between the new encyclical, Iran declaring data centers legit military targets, and now a robot kicking a child? The vibes are certainly intensifying.
https://interestingengineering.com/ai-robotics/viral-humanoid-robot-kicks-child-in-stomach
RESEARCH
If you are serving long-context LLMs or agentic systems, this is worth a look: new research from Huawei tackles the problem of error accumulation during long reasoning tasks. When you quantize KV caches with autoregressive decoding, small errors compound - by the time you reach the end of a long math proof or a multi-step agent workflow, the model has drifted. Enter KVarN: a combination of Hadamard rotation and iterative variance normalization across **both keys and values**. The numbers look good: on benchmarks like MATH500 and AIME24, the new approach hits FP16-level accuracy **at 2-bit precision**.
And the cherry atop this particular cake is that it’s built as a native vLLM attention backend, so turning the whole thing on requires one argument change - no calibration data, no model surgery.
Paper: https://arxiv.org/abs/2606.03458
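To see why the rotation half of the recipe helps, here is a toy sketch in plain Python. To be loud about it: this is not KVarN. It only shows a normalised Hadamard rotation followed by a naive 2-bit min-max quantizer on one made-up vector with a single outlier; the iterative variance normalization is not reproduced, and all names and numbers below are my own illustration.

```python
import math


def hadamard_rotate(x):
    """Normalised fast Walsh-Hadamard transform (length must be a power of two).

    The normalised transform is orthogonal and its own inverse.
    """
    y = list(x)
    n = len(y)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for j in range(start, start + h):
                a, b = y[j], y[j + h]
                y[j], y[j + h] = a + b, a - b
        h *= 2
    scale = math.sqrt(n)
    return [v / scale for v in y]


def quantize_dequantize(x, bits=2):
    """Naive uniform min-max quantizer: snap each value to one of 2**bits levels."""
    lo, hi = min(x), max(x)
    if hi == lo:
        return list(x)
    step = (hi - lo) / (2 ** bits - 1)
    return [lo + round((v - lo) / step) * step for v in x]


def squared_error(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))


# Made-up "key" vector: one large outlier next to small entries.
key = [8.0, 0.5, -0.5, 0.3, -0.3, 0.2, -0.2, 0.1]

# Path 1: quantize directly. The outlier stretches the grid, small values get crushed.
direct = quantize_dequantize(key)

# Path 2: rotate, quantize in the rotated space, rotate back.
rotated_back = hadamard_rotate(quantize_dequantize(hadamard_rotate(key)))

err_direct = squared_error(key, direct)
err_rotated = squared_error(key, rotated_back)
print(f"direct: {err_direct:.4f}  rotated: {err_rotated:.4f}")
```

The rotation smears the outlier's energy across all coordinates, so the four available levels cover a much narrower range and the error in this toy case drops substantially. Whether that carries over to real KV caches is exactly what the paper's benchmarks are for.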
One of the biggest “aha” moments in my education came years ago, when I read the functional analysis take on time series forecasting: long story short, once you define your L2 space spanned by historical observations, the Hilbert projection theorem gives you an optimal forecast.
Years later, I’ve had a similar experience with ML: a new paper argues that gradient-boosted decision trees, kernel regression, and self-attention can all be viewed through a common framework: learn a similarity geometry, attach values to neighborhoods within that geometry, and aggregate those values into predictions.
This means GBDTs become kernel methods with learned (partition-based) geometries and value embeddings, while self-attention emerges as a continuous relaxation that shifts most modeling power into learned values and decoders. The framework shows that exact GBDT, ridge, and neural leaf-value variants perform similarly well - suggesting these different paradigms are variations of the same underlying predictive operator.
Paper: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6823639
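The "similarity, values, aggregate" recipe is easy to sketch. Below, one aggregation function serves three different similarity rules: a decision stump (same leaf or not), an RBF kernel, and an unnormalised softmax attention score. This is my own minimal illustration of the general idea with fixed, hand-picked geometries on invented 1-D data; the paper's framework learns the geometry and the values, which this sketch does not attempt.

```python
import math


def aggregate(weights, values):
    """Weighted average of stored values: the shared 'predictive operator'."""
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total


def leaf_weights(x, train_x, threshold):
    """Tree-style similarity: 1 if both points fall in the same leaf of a stump."""
    return [1.0 if (x <= threshold) == (xi <= threshold) else 0.0 for xi in train_x]


def rbf_weights(x, train_x, bandwidth):
    """Kernel-regression similarity: Gaussian in the distance between points."""
    return [math.exp(-((x - xi) ** 2) / (2 * bandwidth ** 2)) for xi in train_x]


def attention_weights(query, keys):
    """Attention-style similarity: exp(query * key), i.e. softmax before normalising."""
    return [math.exp(query * k) for k in keys]


# Invented toy data: two clusters with different target levels.
train_x = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
train_y = [1.0, 1.2, 0.8, 3.0, 3.2, 2.8]

stump = aggregate(leaf_weights(2.5, train_x, threshold=4.5), train_y)
kernel = aggregate(rbf_weights(2.5, train_x, bandwidth=1.0), train_y)
attention = aggregate(attention_weights(0.5, train_x), train_y)
print(stump, kernel, attention)
```

The stump's prediction is just the mean of its leaf, which is the same weighted average with 0/1 weights. Swap the weight function and you move between paradigms without touching the aggregation step.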
Excellent deep dive:
A new paper introduces QuITE: a plug-and-play module designed to resolve the bottleneck of modeling irregularly sampled multivariate time series without relying on data interpolation. It utilizes learnable query tokens through a single self-attention layer, then generates uniform latent representations that boost downstream forecasting / classification performance.
Paper: https://arxiv.org/abs/2605.28166
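A rough sketch of the general idea, heavily simplified: a fixed set of query tokens attends over however many irregular observations happen to exist and always returns the same number of outputs. Everything here is an assumption for illustration: the queries are plain time positions standing in for learned parameters, the score is a negative squared time gap, and there is one variable instead of many. It is not the QuITE architecture.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_with_queries(times, values, query_times, temperature=1.0):
    """Map irregular (time, value) pairs to one output per query token.

    query_times stands in for learnable query tokens (an assumption);
    each query softly attends to observations that are close in time.
    """
    outputs = []
    for q in query_times:
        scores = [-((q - t) ** 2) / temperature for t in times]
        weights = softmax(scores)
        outputs.append(sum(w * v for w, v in zip(weights, values)))
    return outputs


# Hypothetical queries: four tokens, so four outputs no matter the input length.
query_times = [0.0, 1.0, 2.0, 3.0]

# Two invented series with different, irregular sampling.
sparse = attend_with_queries([0.2, 2.9], [1.0, 2.0], query_times)
dense = attend_with_queries([0.0, 0.4, 1.7, 3.1, 3.2], [1.0, 1.5, 0.7, 2.0, 2.2], query_times)
print(len(sparse), len(dense))
```

Both series come out as four-element representations without any interpolation step, which is what a downstream forecaster or classifier wants to see.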
The authors of a new paper tackle one of the classic issues in time series modeling: how to account for rare external disturbances - and do so by introducing a multimodal counterfactual forecasting method. The core idea is to blend structured temporal history with arbitrary text inputs: this way the system reliably alters its forecasts based on hypothetical real-world event scenarios.
Paper: https://arxiv.org/abs/2605.14422


