OpenAI expands Codex to 'almost everything' — 874 HN pts, 449 comments on the coding-agent upgrade
OpenAI
OpenAI publishes 'Codex for almost everything', a major capability expansion for its Codex coding agent. The post details how Codex can now handle a far broader range of software engineering tasks end-to-end, including autonomous debugging and deployment steps. A companion demo 'Codex Hacked a Samsung TV' shows the agent autonomously reverse-engineering and exploiting a consumer device — drawing 100+ HN points. HN main thread: 874 pts, 449 comments on launch day.
Codex is OpenAI's direct answer to Claude Code and GitHub Copilot Workspace — an agent that completes whole programming tasks, not just code completions. Expanding it to 'almost everything' and demonstrating autonomous device hacking marks a notable capability step for how software is written and tested. The Samsung TV demo suggests AI agents can now handle adversarial real-world targets, not just greenfield code.
Impact scorecard: 7.54/10
Stakes 8.0 · Novelty 7.0 · Authority 9.0 · Coverage 7.0 · Concreteness 6.0 · Social 8.0 · FUD risk 2.0
Coverage: 12 outlets · 4 tier-1
OpenAI, HN, TechCrunch, The Verge, Ars Technica
X / Twitter: 2,200 mentions · @sama: 1,800 likes
Reddit: 1,600 upvotes on r/OpenAI
r/MachineLearning, r/programming, r/OpenAI
Trust check: high
Official OpenAI blog post. Cross-confirmed by HN discussion. Demo linked directly. No anonymous sourcing or FUD flags.
A developer reports a €54,000 unexpected billing spike in just 13 hours after a Firebase browser key without API restrictions was used to make Gemini API requests — presumably by a malicious third party. The Google AI developer forum post goes viral with 386 HN pts and 281 comments. The incident exposes a critical gap in Google's abuse detection and billing caps for Gemini APIs: client-side Firebase keys often have no restrictions by default, and Gemini does not enforce spending caps out of the box.
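As a hedged sketch of the mitigations implied by the incident — assuming the standard gcloud CLI, with the key ID, referrer URL, and billing account as placeholder values — a browser-exposed key can be restricted to specific APIs and referrers, and a billing budget with alerts can be attached (budgets alert but do not hard-stop spending, which is part of the gap the post describes):

```shell
# Lock an existing API key down to one service and known referrers.
# KEY_ID and https://example.com/* are illustrative placeholders.
gcloud services api-keys update KEY_ID \
  --api-target=service=firebase.googleapis.com \
  --allowed-referrers="https://example.com/*"

# Attach a budget with alert thresholds at 50% and 90% of 100 EUR.
# Note: this sends notifications only; it does not cap spending.
gcloud billing budgets create \
  --billing-account=BILLING_ACCOUNT_ID \
  --display-name="gemini-api-guard" \
  --budget-amount=100EUR \
  --threshold-rule=percent=0.5 \
  --threshold-rule=percent=0.9
```

A true hard cap would still require extra machinery (for example, a Cloud Function that disables billing on a budget Pub/Sub notification), which is exactly why unrestricted client-side keys are so costly.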
Alibaba's Qwen team releases Qwen3.6-35B-A3B as fully open-source on HuggingFace (Apache license). The model uses a Mixture-of-Experts architecture with 35B total parameters but only 3B active per token — making it runnable on consumer hardware. Simon Willison's post 'Qwen3.6-35B-A3B on my laptop drew me a better pelican than Claude Opus 4.7' lands 404 HN pts and 84 comments, while the original release thread hits 100+ on r/LocalLLaMA. Pitched as 'agentic coding power, now open to all.'
Anthropic ships Claude Opus 4.7, its most capable Opus model yet. The release centres on long-running agentic tasks: more thinking tokens, an extended thinking mode, and increased API rate limits across all subscriber tiers to match. HN erupts with 1,752 points and 1,257 comments — the biggest AI model thread in weeks. @bcherny: 'Dogfooding Opus 4.7 the last few weeks, I've been feeling incredibly productive.' System card and model card published simultaneously.