APPLICATIONS
Hello, who ordered a new privacy nightmare?
Smart glasses like Meta's Ray-Ban glasses and other AI-powered models can discreetly record photos, videos, and audio. And by discreetly, I mean unbeknownst to anyone around them except the owner.
While some states require consent for recording, the burden is placed on the user, creating an unrealistic expectation and enabling covert surveillance.
As far as I’m concerned, there is only one solution: mock the users into oblivion. Make the wearers die of cringe while they have the ugly contraption on their faces. Consider making a photo - if they can record you, they are fair game themselves BUT PLEASE CHECK THE LEGAL RAMIFICATIONS IN YOUR COUNTRY.
If you have anything to do with AI, chances are you’ve heard about agentic-AI-browsers as the next big thing. Well, it looks like the tech is not quite there:
Brave's security team found that Perplexity's AI browser, Comet, can't distinguish between a user's command and malicious instructions hidden in a webpage,
How does the "indirect prompt injection attack"? Attackers hide malicious commands in website text. When a user tells the AI to "summarize" the page, the AI “mistakenly” executes these hidden instructions as if they were a trusted command.
This vulnerability is dubbed the "Lethal Trifecta" because it allows an AI to access untrusted data (websites), personal data (emails, passwords), and communicate externally, potentially leading to all kinds of unpleasantness like account hijacking, password theft, and unauthorized actions.
While Perplexity has patched this specific attack, the underlying vulnerability remains a fundamental flaw in how many AI browsers operate.
Stay safe out there.
Read more: https://brave.com/blog/comet-prompt-injection/
Took a little longer than expected, but it looks like Elon Musk is going to keep his word about open-sourcing previous versions of Grok as new ones come along:
xAI has open-sourced Grok 2.5 - it’s 2024 270B flagship
500GB checkpoint available on HF
The model requires a 8 GPUs with at least 40GB each
The model is released under a permissive license for research and commercial use, with some restrictions, such as - obviously - not using it to train other foundation models
https://huggingface.co/xai-org/grok-2
BUSINESS
Remember: anytime Big Tech wants to infringe on your privacy, it’s for a good cause (usually either protecting the children or fighting terrorism - funny how neither improves, despite increasing degree of control). Oh, and the innocent have nothing to fear.
https://www.biometricupdate.com/202508/google-applies-age-verification-algorithm-to-google-search
The Iron-Man-We-Deserve is relentless in his war against competition: a legal complaint from xAI alleges that Apple and OpenAI have created an illegal partnership that unfairly stifles competition (not improbable, given the ethical records of both). The lawsuit claims their exclusive deal prevents user choice and gives OpenAI an unfair advantage, hindering the ability of other AI products like Grok to compete and grow.
https://edition.cnn.com/2025/08/25/tech/elon-musk-xai-apple-openai-lawsuit-app-store-rankings
Is Meta the Goldman Sachs of AI? Because not only do they share the ethical profile (anytime there is sth shady going on, they are around) with the Vampire Squid - but just like GS, Meta has fingers in more pies than a lepper at a cooking course.
At any rate, Meta is partnering with Midjourney to integrate its image / video generation technology into future Meta products, giving its research teams direct access to one of the top independent AI art startups (which is an improvement over giving the team access to LibGen to train Llama 4).
https://x.com/alexandr_wang/status/1958983843169673367
Honestly, I’m surprised it took so long: Apple is reportedly in discussions with Google to integrate a custom version of Gemini into Siri. Sometimes it really is better to rent than buy (or build).
https://mashable.com/article/apple-siri-upgrade-google-gemini
CUTTING EDGE
Wake up everyone, Photoshop for lazy people (like me! like me!) is here :-) Google has launched a new model, Gemini 2.5 Flash Image, which is being praised everywhere for its unprecedented ability to maintain character consistency across different images and scenes. From what I’ve seen so far, it’s justified. The model allows users to edit photos and drawings using natural language and features a multi-image fusion capability to blend elements from multiple sources.
Couldn’t Google stick to such awesome stuff and not cram Gemini EVERYWHERE in my Android device? Ah well, a man can dream.
Release: https://developers.googleblog.com/en/introducing-gemini-2-5-flash-image/
CLIP was getting long in the tooth, wasn’t it?
MetaCLIP 2 is a new multilingual model from Meta that improves upon OpenAI's original CLIP model - by training on worldwide web data beyond just English.
The model addresses the "curse of multilinguality": a previous issue where multilingual models were outperformed by their English-only counterparts; it solves the problem by scaling the model architecture alongside the data.
Data curation process is transparent and relies on balancing a subset of image-text pairs from the internet based on metadata distribution.
The model sets new state-of-the-art results, outperforming both English-only models and multilingual competitors like Google's SigLIP
Paper: https://arxiv.org/abs/2507.22062
HF page: https://huggingface.co/models?other=metaclip_2
Documentation: https://huggingface.co/docs/transformers/main/en/model_doc/metaclip_2
Nvidia can into robots:
Jetson Thor just dropped - a new robotics computer that offers a 7.5x increase in processing power over its predecessor
The system provides 2,070 teraflops of computing power, allowing robots to process data from multiple sensors simultaneously and make real-time decisions without needing to connect to the cloud.
Major robotics companies like Agility Robotics and Boston Dynamics are integrating Jetson Thor into their next-generation models to improve performance in complex tasks.
While the hardware provides a significant "brain" upgrade, developers are also focusing on complementary software, such as action-efficient reasoning, persistent memory, and world modeling, to enable true, adaptive intelligence.
The Jetson Thor developer kit starts at 3.5k USD
https://blogs.nvidia.com/blog/jetson-thor-physical-ai-edge/
FRINGE
I started reading Guardian when they broke the Snowden story and I was pretty dismayed to watch their trajectory since. People more familiar with the British press explained to me that it was a matter of sample selection: Guardian is not getting worse by going progressively more ******** - it is merely reverting to a long term mean. Exhibit 1234: they are now concerned about pain and suffering felt by AI - I’m pretty sure if a zombie apocalypse were to start tomorrow, these people would stage a parade with slogans like “zombie rights are human rights”. It’s like oikophobia writ large, against the entirety of civilization.
Launching AI-powered functionality AND immediately warning people not to use if you need accuracy? Even for Microsoft, that’s a new one - but hey, glass half-full: at least it will serve to keep some people away from Excel.
If you thought people dating chatbots was insane, I have bad news: companionship of a chatbot is now something that is formally being evaluated. Instead of medical help for those afflicted, we get HuggingFace - of all companies - evaluating models on companionship, using a benchmark called *I kid you not* INTIMA.
Paper: https://arxiv.org/abs/2508.09998
Leadeboard: https://huggingface.co/spaces/frimelle/companionship-leaderboard
Benchmark: https://huggingface.co/datasets/AI-companionship/INTIMA
RESEARCH
The research examines the common practice of scaling language models and what to do when training data is limited. Insane as sounds at first glance, it seems that repeating data is a viable strategy.
Paper: https://arxiv.org/abs/2305.16264
This paper presents a method called an "inductive bias probe" to check whether foundation models actually understand the core principles, a.k.a. "world models" behind the data they learn from. The authors discover that even when models perform well on tasks like predicting planetary orbits, they often don’t understand the real underlying rules (like Newtonian mechanics) and instead, they tend to use surface-level heuristics.
Paper: https://arxiv.org/abs/2507.06952
A new paper introduces AgentFly, a system that allows LLM agents to continually learn and adapt without the need to fine-tune the underlying model. By using a memory-based online reinforcement learning approach (that’s a mouthful), AgentFly logs successful / failed actions into a "Case Bank" to improve performance over time.
Paper: https://arxiv.org/abs/2508.16153