LLMs used tactical nuclear weapons in 95% of AI war games, launched strategic strikes three times

January 25, 2026 · 孙亮 · Source: dev资讯

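Upturned eaves and hanging lanterns: in a bazaar decorated in traditional Chinese style, the mood was lively and festive, with the culture, art, and cuisine of China and Saudi Arabia presented side by side. The "Cultural Bazaar" event, jointly organized by China's Ministry of Culture and Tourism and Saudi Arabia's Ministry of Culture, was recently held in the Saudi capital, Riyadh, and drew a large audience.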

Do we need to be polite to AI robots?

Rescued by "阴伟达"? Just as things were on the brink of collapse, "阴伟达" burst onto the scene and became a lifesaving "shot in the arm."

Remaining focused and ignoring the naysayers.

Google's N

During development I ran into a caveat: Opus 4.5 can’t test or view terminal output, especially for an app with unusual functional requirements. But despite working blind, it knew enough about the ratatui terminal framework to implement whatever UI changes I asked for. There were a large number of UI bugs that were likely caused by Opus’s inability to create test cases, namely failures to account for scroll offsets, which resulted in incorrect click locations. I spent 5 years as a black-box Software QA Engineer who was unable to review the underlying code, so this situation was my specialty. I put my QA skills to work by messing around with miditui and reporting any errors to Opus, occasionally with a screenshot, and it was able to fix them easily. I do not believe these bugs say anything inherent about LLM agents being better or worse than humans, as humans are most definitely capable of making the same mistakes. Even though I am adept at finding bugs and suggesting fixes, I don’t believe I would have avoided causing similar bugs had I coded such an interactive app without AI assistance: QA brain is different from software engineering brain.
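To make the scroll-offset bug concrete, here is a minimal sketch of how a click row can be mapped to a list item once the view has been scrolled. It is not miditui's code and does not use ratatui's API: `ListView`, its fields, and `clicked_item` are hypothetical names chosen for illustration, and the sketch assumes one item per terminal row.

```rust
/// Hypothetical description of a scrollable list region in a terminal UI.
/// These names are illustrative assumptions, not taken from miditui or ratatui.
#[derive(Debug, Clone, Copy)]
struct ListView {
    top: u16,             // screen row of the first visible line
    height: u16,          // number of visible rows
    scroll_offset: usize, // index of the item drawn on the first visible row
    item_count: usize,    // total number of items in the list
}

/// Map the screen row of a mouse click to the index of the clicked item.
/// Returns `None` when the click is outside the list or below the last item.
fn clicked_item(view: ListView, click_row: u16) -> Option<usize> {
    // Reject clicks above or below the visible region.
    if click_row < view.top || click_row >= view.top.saturating_add(view.height) {
        return None;
    }
    // Row relative to the top of the list region.
    let visible_row = (click_row - view.top) as usize;
    // The step that is easy to forget: add the scroll offset.
    // Without it, clicks land on the wrong item after scrolling.
    let index = view.scroll_offset + visible_row;
    (index < view.item_count).then_some(index)
}

fn main() {
    let view = ListView { top: 2, height: 5, scroll_offset: 10, item_count: 13 };
    println!("{:?}", clicked_item(view, 3)); // Some(11)
    println!("{:?}", clicked_item(view, 6)); // None: row is visible but past the last item
}
```

Keeping the mapping in a pure function like this means it can be covered by unit tests without rendering or viewing a terminal at all, which is the kind of check that would catch this class of bug early.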