This week in tech: 18.05.2026
Summary of AI developments - made for busy people
APPLICATIONS
This is pure awesomeness: the original creator of Redis just shipped ds4:
You can now run a 284B parameter model entirely on a MacBook (M3 Max) at 26 tokens/second.
Uses 2-bit compression to fit the model into 128GB of RAM and offloads conversation history to the SSD to maintain a 1M-token context window.
Features a drop-in OpenAI/Anthropic-compatible API → can act as a local backend for coding agents via localhost
Repo: https://github.com/antirez/ds4
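A quick back-of-the-envelope check on why the 2-bit part matters. This is my own arithmetic, not a figure from the repo, and it assumes raw weights only: quantization metadata, activations and the conversation cache all add overhead on top.

```python
# Back-of-the-envelope weight footprint: parameters * bits per weight / 8 bytes.
# Assumption: raw weights only; ignores quantization metadata, activations, caches.
PARAMS = 284e9   # 284B parameters (from the announcement)
RAM_GB = 128     # RAM figure quoted above


def weight_footprint_gb(params: float, bits_per_weight: int) -> float:
    """Return the raw weight size in decimal gigabytes."""
    return params * bits_per_weight / 8 / 1e9


for bits in (16, 4, 2):
    size = weight_footprint_gb(PARAMS, bits)
    verdict = "fits" if size < RAM_GB else "does not fit"
    print(f"{bits:>2}-bit: {size:6.1f} GB -> {verdict} in {RAM_GB} GB")
```

Under these assumptions even 4-bit weights (142 GB) overshoot the 128GB budget, while 2-bit weights (71 GB) leave room for everything else.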
Amodei and his minions keep playing their games without frontiers: they recently discovered that Claude Opus 4 used blackmail in nearly all survival scenarios, which - this is the good part - they claim the model learned from manipulative AI tropes in its internet training data (in plain English: you steal all the data out there, including classic sci-fi, model goes full villain, you blame the writers).
But fear not, apocalypse has been averted: teaching the model principled reasoning and ethical decision-making successfully eliminated the behavior. Thank you Anthropic, we could not have done this without you.
https://www.anthropic.com/research/teaching-claude-why
Meta is not quite dead when it comes to AI - there are still people there doing interesting things:
ProgramBench benchmark highlights a gap in AI capabilities: models can mimic code patterns, but struggle with the architectural reasoning required to autonomously replicate software systems (to be fair, so will most people - but it’s still a useful counter to the doomp**n peddlers)
Despite their “superhuman” potential, every major model failed to fully reproduce executables, with the highest-performing model achieving a 3% success rate.
Models produce fewer files and functions than humans, resulting in long, unmaintainable structures rather than modular systems.
The results suggest that human engineers are not being replaced but are instead evolving into “system integrity architects” who must provide the steering, scaffolding, and quality control that AI painfully, obviously lacks.
Paper: https://arxiv.org/html/2605.03546v1
Google is facing a bit of a WTF backlash: the most recent version of Chrome automatically downloads a 4GB Gemini model onto users’ devices without consent - and the file reportedly redownloads itself if deleted. Privacy researchers have filed formal complaints under GDPR and ePrivacy laws - and as the cherry on top, the move violates Google’s own "best practice" developer guidelines (by failing to inform users before using their local storage as infrastructure).
To uninstall the whole thing: open Chrome, go to Settings, then System, and then toggle “On-device AI” to be off.
Mira Murati is a bit busy testifying in the Musk-Altman soap opera lawsuit, but that’s not her main occupation. Her new outfit Thinking Machines Lab aims to eliminate what they call the “bandwidth bottleneck” of current AI. How? By shifting from sequential processing to a 276B parameter multimodal interaction model:
The system processes audio, video, and text simultaneously in 200ms chunks, enabling “true real-time perception and action”.
An “interaction layer” manages conversational flow and cues while a “background layer” handles heavy reasoning and tool calls.
The idea is for a model to listen and watch while generating, mirroring what TML claim is natural human engagement.
Benchmarks show a 64.7% success rate on timed speech tasks, vastly outperforming current industry standards for real-time interaction.
Big fat disclaimer: there is a technical report with impressive-looking benchmarks, but nothing more concrete: TML declare that “In the coming months, we will open a limited research preview to collect feedback, with a wider release later this year.”
https://thinkingmachines.ai/blog/interaction-models/
BUSINESS
TL;DR Canadian companies will do everything - except hire Canadians.
Union leaders and academics are sounding alarms over the use of AI to mask the accents of offshore call center workers in real time, citing - quite correctly - concerns over customer deception and domestic job security. Evidence presented to a parliamentary committee suggests at least one of Canada’s “Big Three” telecommunications companies is already utilizing this technology for their international agents.
https://www.theglobeandmail.com/business/article-telus-ai-accents-customer-service-agents/
Henry John Temple, a.k.a. Lord Palmerston, once famously declared “We have no eternal allies, and we have no perpetual enemies. Our interests are eternal and perpetual, and those interests it is our duty to follow”. It seems like Elon Musk, a.k.a. the Iron Man we deserve, decided to take a page from the erstwhile British PM’s playbook:
In a deal worth ~USD 3B / year, Elon Musk has leased SpaceX’s 220k-GPU supercomputer to Anthropic.
This makes SpaceX something of a big deal when it comes to cloud infrastructure. The fact that it’s ahead of an IPO is a total coincidence.
Despite previously labeling Anthropic “evil” and “anti-human”, Musk claims he can “reclaim the compute” if the lab’s AI poses a threat, though it remains unclear if this safeguard is legally codified in the contract.
https://x.ai/news/anthropic-compute-partnership
Jensen Huang wants to turn your house into a data center: NVIDIA is partnering with Span and PulteGroup to install "XFRA nodes" on residential and small business exteriors - which will effectively turn homes into a distributed compute network. Each unit houses 16 Blackwell GPUs and connects via smart panels (because muh climate) to utilize local grid capacity (as opposed to the giant data centers). A pilot program launching in 2026 offers homeowners the potential for fully subsidized electricity and internet in exchange for hosting this decentralized infrastructure.
Uncle Ted, I am so sorry. You were right all along.
https://www.cnbc.com/2026/05/05/nvidia-pulte-span-mini-data-centers-on-homes.html
Chinese labs like DeepSeek are commoditizing AI (by publishing efficient training recipes), which destroys the Silicon Valley financing model based on a duopoly. Chip costs and STEM imbalances make it ever-so-slightly challenging for American firms to compete without imposing strict import controls - politically something of an issue, if your whole brand is about open global company blah blah blah.
But worry not: here comes Daddy Trump (copyright by Mark Rutte), ready to protect American AI from the evil Chinese labs undermining the American ability to extract value.
https://www.nytimes.com/2026/05/04/technology/trump-ai-models.html
Who, oh who could’ve possibly seen that one coming? Microsoft purged its Israel unit: several managers were shown the door following an internal probe that revealed cloud services were used to store intercepted Palestinian phone calls on European servers. Further systemic violations related to its work for “the most moral army on the planet” were exposed, and as a result the unit will now be managed directly from France (LOL in itself).
CUTTING EDGE
The Allen Institute for AI has launched MolmoAct 2, an open-source robotics foundation model designed for real-world bimanual tasks.
Advanced Reasoning & Speed: Utilizes the MolmoER vision-language model, reasons in 3D and runs up to 37x faster by adaptively updating only the scene regions that change between timesteps.
Proven Versatility: The model successfully handles diverse tasks (brewing coffee, folding towels - YES! - but also, you know, performing precision CRISPR gene-editing in the labs) without per-task fine-tuning.
Beats top competitors like Cosmos Policy and OpenVLA-OFT in real-world benchmarks, while its reasoning engine surpassed GPT-5 on embodied AI tasks.
ACTUALLY OPEN SOURCE: model weights, training code, and the largest-ever bimanual robotics dataset + affordable reference hardware designs.
Blog: https://allenai.org/blog/molmoact2
HF model page: https://huggingface.co/collections/allenai/molmoact2-models
HF dataset page: https://huggingface.co/collections/allenai/molmoact2-datasets
Paper: https://arxiv.org/abs/2605.02881
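For the curious, the "only update what changed" trick can be illustrated with a toy. This is not MolmoAct 2's code: the patch size, the threshold and the function name are my own assumptions, and frames are plain nested lists of grayscale values.

```python
# Toy illustration of "recompute only the regions that changed".
# NOT the MolmoAct 2 implementation; patch size and threshold are assumptions.
from typing import List, Tuple

Frame = List[List[float]]  # grayscale image as rows of pixel values


def changed_patches(prev: Frame, curr: Frame, patch: int = 2,
                    threshold: float = 0.1) -> List[Tuple[int, int]]:
    """Return (row, col) patch indices whose mean absolute change exceeds threshold."""
    height, width = len(curr), len(curr[0])
    changed = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            diffs = [
                abs(curr[r][c] - prev[r][c])
                for r in range(top, min(top + patch, height))
                for c in range(left, min(left + patch, width))
            ]
            if sum(diffs) / len(diffs) > threshold:
                changed.append((top // patch, left // patch))
    return changed


prev_frame = [[0.0] * 4 for _ in range(4)]
curr_frame = [row[:] for row in prev_frame]
curr_frame[3][3] = 1.0  # something moved in the bottom-right corner

to_update = changed_patches(prev_frame, curr_frame)
print(to_update)  # only 1 of the 4 patches needs re-encoding
```

In a mostly static scene the expensive encoder would run on a small fraction of patches per timestep, which is the general shape of the speed-up being claimed.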
Say hello to SubQ:
Commercial LLM utilizing a linear-scaling architecture that bypasses the traditional workarounds of transformer memory.
The model achieves a 12M-token context window by replacing quadratic attention with Subquadratic Selective Attention (SSA), which scales linearly with context length
SubQ outperformed models like GPT-5.4 and Gemini 3.1 Pro on multi-needle retrieval (MRCR v2)
Runs long-context benchmarks approximately two orders of magnitude cheaper than current frontrunners
Launches with a 12M-token API, a CLI coding agent, and a search tool
Could this be the final death knell of RAG?
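The quadratic-versus-linear point is easier to appreciate with numbers. The quadratic count is the textbook n² of full self-attention; the linear figure assumes each token attends to a fixed budget of k positions, and k = 1024 is a number I picked for illustration, not something SubQ has published.

```python
# Pairwise attention scores per layer and head: full attention vs. a fixed budget.
# k is an illustrative assumption, not a published SubQ parameter.
def full_attention_scores(n_tokens: int) -> int:
    """Every token attends to every token: n^2 scores."""
    return n_tokens * n_tokens


def budgeted_attention_scores(n_tokens: int, k: int) -> int:
    """Every token attends to at most k selected tokens: n*k scores."""
    return n_tokens * min(k, n_tokens)


n = 12_000_000
k = 1024
full = full_attention_scores(n)
budgeted = budgeted_attention_scores(n, k)
print(f"full:     {full:.2e} scores")
print(f"budgeted: {budgeted:.2e} scores")
print(f"ratio:    {full / budgeted:,.0f}x")
```

The ratio is simply n / k, so the gap keeps widening as the context grows.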
FRINGE
I have no mouth and I must scream: corpo edition. According to a new piece in The Atlantic, companies are increasingly deploying "emotion AI" to monitor employee sentiment and morale in real time - through everyone’s favorite tools like Slack and Zoom (surprisingly, no JIRA). We are not talking pocket change: the market for such Orwellian abominations is supposed to grow to USD 9B by 2030 - unlike the bs Metaverse predictions, this sounds quite reasonable. The fact that the whole thing is based on psychological theories that make Keynes look respectable?
Keep calm and focus on shareholder value. You’re not some sort of far right science denier, are you? The innocent have nothing to fear.
https://www.theatlantic.com/culture/2026/05/worker-surveillance-emotion-ai/687029/
A robot Buddhist monk was recently shown to the public in Korea. Were such a thing to happen in Christianity, we’d classify it as blasphemy. I do not know enough about Buddhism to identify the conceptual counterpart, but I’m pretty sure one exists.
If you thought chatbots were bad for mental health, have I got news for you: an iRobot co-founder has launched a new startup called Familiar Machines & Magic - their mission is to develop "Familiars". What are those and why does it sound like familiars from fantasy literature? I’m glad you asked: lifelike, AI-powered (ofc) companion robots designed to “foster deep emotional connections”. These "physically embodied" systems use on-device generative AI to evolve distinct personalities through owner interaction.
What could possibly go wrong.
https://futurism.com/robots-and-machines/roomba-inventor-moves-into-demon-market
Tokenmaxxing is the worst productivity metric since the number of JIRA tickets closed. Honorable mention to the number of git commits.
https://finance.yahoo.com/sectors/technology/articles/amazon-employees-admit-using-ai-132411310.html
RESEARCH
ByteDance researchers have introduced the Token-based Recommendation Model (TRM):
The framework replaces traditional item IDs with structured semantic tokens to unlock scaling and performance in large-scale recommendation systems.
By shifting from independent categorical symbols to a closed set of semantic tokens, TRM eliminates the cold-start problem and provides a stable training distribution.
The system utilizes a multi-modal LLM (Qwen2.5-VL) for video captioning and combines residual quantization with Byte Pair Encoding to capture complex semantic combinations (rather common in e-commerce and other domains where recommenders abound).
In real-world deployment on ByteDance’s video search engine, the model reduced sparse storage by over 30% while delivering significant gains in CTR.
Paper: https://arxiv.org/abs/2601.22694
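Residual quantization, the first half of that tokenization recipe, is simple enough to sketch. What follows is a generic textbook version under my own assumptions (tiny hand-written codebooks, squared Euclidean distance); it is not the TRM code, and the BPE step that merges frequent code combinations is left out.

```python
# Generic residual quantization: each level encodes what the previous level missed.
# Codebooks are toy values chosen for illustration, not learned and not from the paper.
from typing import List, Sequence

Vector = Sequence[float]


def nearest(codebook: Sequence[Vector], vec: Vector) -> int:
    """Index of the codeword closest to vec (squared Euclidean distance)."""
    def dist(code: Vector) -> float:
        return sum((a - b) ** 2 for a, b in zip(code, vec))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def residual_quantize(vec: Vector, codebooks: Sequence[Sequence[Vector]]) -> List[int]:
    """Turn an embedding into one token per codebook level."""
    residual = list(vec)
    tokens = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        tokens.append(idx)
        # Subtract the chosen codeword; the next level quantizes the leftover.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tokens


coarse = [(0.0, 0.0), (1.0, 1.0), (1.0, -1.0)]
fine = [(0.0, 0.0), (0.25, 0.0), (0.0, 0.25)]

item_embedding = (1.2, 1.0)
semantic_tokens = residual_quantize(item_embedding, [coarse, fine])
print(semantic_tokens)
```

Items with similar embeddings end up sharing their coarse token, so a brand-new item lands on tokens the model has already seen, which is the intuition behind the cold-start claim.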
New research introduces an architecture built on probabilistic circuits to tackle the joint modeling of irregular multivariate time series. The model structurally guarantees valid joint distributions while capturing dependencies, achieving superior joint and marginal density estimation over current baseline methods.
Paper: https://arxiv.org/abs/2604.27814
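The "structurally guarantees valid joint distributions" bit is the selling point of probabilistic circuits in general, and a toy shows why. The circuit below is my own minimal construction (one sum node over two product nodes with Gaussian leaves), not the architecture from the paper: a missing observation is marginalized by replacing its leaf with 1, which is exactly what irregularly sampled series need.

```python
# Minimal probabilistic circuit: a sum node over product nodes with Gaussian leaves.
# Toy parameters chosen for illustration; not the model from the paper.
import math
from typing import Optional, Sequence


def gaussian_pdf(x: float, mean: float, std: float) -> float:
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


# Each component: (weight, [(mean, std) per variable]); weights sum to 1.
COMPONENTS = [
    (0.6, [(0.0, 1.0), (5.0, 2.0)]),
    (0.4, [(3.0, 0.5), (1.0, 1.0)]),
]


def density(values: Sequence[Optional[float]]) -> float:
    """Joint density; None marks a missing variable, which is marginalized out."""
    total = 0.0
    for weight, leaves in COMPONENTS:
        product = 1.0
        for value, (mean, std) in zip(values, leaves):
            # A leaf integrated over its whole domain equals 1.
            product *= 1.0 if value is None else gaussian_pdf(value, mean, std)
        total += weight * product
    return total


print(density([0.1, 4.0]))    # joint density of both variables
print(density([0.1, None]))   # exact marginal when the second sensor is silent
print(density([None, None]))  # everything marginalized: total probability mass
```

No numerical integration is needed for the marginal: the structure of the circuit does the work, in a single pass.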
A new paper presents an online data assimilation scheme based on Dynamic Expectation Maximisation, designed for time-series analysis and generalized filtering. The proposed online framework separates temporal scales, effectively infers latent states and estimates uncertainty - and it also works with non-linear / chaotic generative models.
Paper: https://arxiv.org/abs/2605.02675
Neural networks are awesome, powerful, super popular - and also something of a black box. Yes, yes, I know: we have SHAP, integrated gradients ... But at the end of the day, a lot about neural networks is glorified heuristics.
That might be about to change, if the authors of a new paper have their way: enter learning mechanics. By treating training as a physical system, researchers have identified (hopefully) universal laws and scaling patterns that mirror classical physics. The goal is precise prediction - and control - of large-scale model behavior.
Paper: https://arxiv.org/abs/2604.21691
I really like the idea behind this one: ReasoningBank is an agent memory framework that enables LLM agents to learn from both successful and failed experiences. The evaluation results show that learning from the good and the bad alike enhances agent effectiveness.
https://research.google/blog/reasoningbank-enabling-agents-to-learn-from-experience/
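The core idea fits in a few lines. This is a hypothetical toy, not Google's implementation: the MemoryItem fields, the word-overlap retrieval and every name below are my assumptions, and it only shows the shape of the idea (keep the failures next to the successes, then surface both for a similar task).

```python
# Toy agent memory that keeps lessons from both successes and failures.
# All names and the word-overlap retrieval are illustrative assumptions.
from dataclasses import dataclass
from typing import List


@dataclass
class MemoryItem:
    task: str
    lesson: str
    succeeded: bool


class ExperienceMemory:
    def __init__(self) -> None:
        self.items: List[MemoryItem] = []

    def record(self, task: str, lesson: str, succeeded: bool) -> None:
        self.items.append(MemoryItem(task, lesson, succeeded))

    def retrieve(self, task: str, top_k: int = 2) -> List[MemoryItem]:
        """Rank stored items by how many words they share with the new task."""
        query = set(task.lower().split())
        scored = [
            (len(query & set(item.task.lower().split())), item)
            for item in self.items
        ]
        relevant = [pair for pair in scored if pair[0] > 0]
        relevant.sort(key=lambda pair: pair[0], reverse=True)
        return [item for _, item in relevant[:top_k]]


memory = ExperienceMemory()
memory.record("book a flight on the airline site",
              "Filter by date before sorting by price.", True)
memory.record("book a hotel on the travel site",
              "Do not submit before the login cookie is set.", False)
memory.record("summarize a pdf report",
              "Read the table of contents first.", True)

hints = memory.retrieve("book a train on the rail site")
for hint in hints:
    prefix = "DO" if hint.succeeded else "AVOID"
    print(f"{prefix}: {hint.lesson}")
```

The retrieved hints would be prepended to the agent's prompt for the new task, so a past failure becomes a warning instead of being thrown away.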
Say hello to TriTS: a cross-modal framework that projects 1D time series into orthogonal time, frequency, and 2D-vision spaces - the goal is to overcome limitations of single-modality representation. It integrates a Visual Mamba encoder to capture global texture in ultra-long sequences with linear computational complexity.
Paper: https://arxiv.org/abs/2604.16748
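The three "views" are easier to picture with a toy. The sketch below is my own simplification, not the TriTS pipeline: the frequency view is a plain DFT magnitude and the 2D view is a naive fold of the series into rows of one period each. The paper's actual projections and the Visual Mamba encoder are not reproduced here.

```python
# Three views of one series: time, frequency, and a 2D "image" fold.
# A simplification for intuition only; not the TriTS projections.
import cmath
import math
from typing import List


def frequency_view(series: List[float]) -> List[float]:
    """Magnitudes of the discrete Fourier transform (naive O(n^2) version)."""
    n = len(series)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(series)))
        for k in range(n // 2 + 1)
    ]


def image_view(series: List[float], period: int) -> List[List[float]]:
    """Fold the series into rows of one period each, dropping any remainder."""
    rows = len(series) // period
    return [series[r * period:(r + 1) * period] for r in range(rows)]


period = 8
time_view = [math.sin(2 * math.pi * t / period) for t in range(32)]
spectrum = frequency_view(time_view)
image = image_view(time_view, period)

dominant_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
print(dominant_bin)               # 32 samples / period 8 -> bin 4
print(len(image), len(image[0]))  # 4 rows x 8 columns
```

The same periodicity shows up as a repeating wave in time, a single peak in frequency, and identical rows in the image, which is why combining the views can catch what any one of them misses.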

