The best in AI, Quantum, Cybersecurity, Startups & Research.
One substantial post per hour, packed with names, numbers and the specifics that matter. Every item scored across seven dimensions (Stakes · Novelty · Authority · Coverage · Concreteness · Social · FUD risk) and given an explicit trust verdict. Methodology is public.
@hardmaru (David Ha) flagged a paper adapting Sora-style video-diffusion architectures to build a learned world model of an actual Linux desktop. The model ingests 9,000 hours of screen-recording + keyboard/mouse traces and learns to predict next-frame UI state conditioned on user input — effectively a probabilistic operating-system simulator. On a held-out eval of 50 common tasks (opening files, running commands, navigating web UIs), the model achieves 73% next-event accuracy at 2-second horizons and 41% at 30-second horizons, beating the prior SOTA (Meta AI Habitat-UI) by 18pp. Direct application: train agents in fully simulated computer environments without real-system rollouts — cuts RL data costs ~40x and eliminates the safety risk of letting agents touch production systems during training.
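The "train agents entirely inside the simulator" claim can be made concrete with a toy rollout loop. Everything below is illustrative — `WorldModel`, its `predict_next` method, and the policy are stand-ins, not the paper's interface:

```python
# Minimal sketch of collecting agent experience inside a learned UI world
# model. All class and method names here are hypothetical illustrations.
class WorldModel:
    """Stand-in for a learned video-diffusion UI simulator."""
    def predict_next(self, screen_state, user_action):
        # A real model would run diffusion conditioned on (state, action);
        # here we just return a deterministic placeholder successor state.
        return (screen_state + len(user_action)) % 1000

def rollout(model, policy, start_state, horizon=15):
    """Collect a trajectory entirely inside the simulator -- no real
    system is touched, which is the cost/safety win described above."""
    state, trajectory = start_state, []
    for _ in range(horizon):
        action = policy(state)
        next_state = model.predict_next(state, action)
        trajectory.append((state, action, next_state))
        state = next_state
    return trajectory

traj = rollout(WorldModel(), policy=lambda s: f"click_{s % 3}", start_state=7)
print(len(traj))  # 15 simulated transitions, zero real-system rollouts
```

The RL savings come from the inner loop: every transition is a cheap forward pass instead of a real desktop interaction.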
EE Times deep-dive on AMD's ROCm 7.0 and whether it can finally dent NVIDIA's CUDA moat. AMD's MI400 (96GB HBM4, 5.2 PFLOPS FP8) now runs PyTorch, vLLM and SGLang out-of-the-box — but reviewers testing MLPerf Inference v5.1 still see 1.6–2.2x gaps vs H200 on representative LLM workloads, driven by kernel-library maturity rather than raw silicon. Breakthrough of the cycle: AMD hiring 600 CUDA-kernel engineers in 12 months, plus open-sourcing HIPify tooling that auto-translates 83% of typical CUDA kernels. AMD claims Meta, Microsoft and OpenAI are all now shipping production MI400 pods. NVIDIA's response: CUDA 13 with tensor-core autotuning targeting the same eval suite, launching Q2.
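For readers unfamiliar with what "auto-translating CUDA kernels" means in practice: at its simplest it is source-level renaming of runtime API calls. The toy below is a deliberately naive illustration — AMD's real HIPify tooling is clang-based AST rewriting, far more sophisticated than string substitution:

```python
# Toy illustration of source-level CUDA -> HIP translation: rename runtime
# API calls, keep kernel logic intact. Illustrative only, not HIPify itself.
RENAMES = {
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaFree": "hipFree",
    "cuda_runtime.h": "hip/hip_runtime.h",
}

def hipify(src: str) -> str:
    for old, new in RENAMES.items():
        src = src.replace(old, new)
    return src

cuda_src = "#include <cuda_runtime.h>\ncudaMalloc(&p, n); cudaFree(p);"
print(hipify(cuda_src))
```

The hard 17% in the cited 83% figure is exactly what string renaming cannot handle: warp-size assumptions, inline PTX, and library calls with no HIP equivalent.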
Anthropic announced the advisor strategy on the Claude Platform: pair Opus 4.6 as a planning/critique advisor with Sonnet 4.6 or Haiku 4.5 as the executing model. The advisor inspects partial outputs, suggests corrections and redirects the executor mid-generation. On SWE-bench Multilingual, Sonnet+Opus-advisor scores 2.7 percentage points higher than Sonnet alone, at roughly 1.3x the cost of Sonnet alone versus 7x for running Opus end-to-end. General availability today via the Claude Console and CLI; pricing is existing Claude API rates for both models (no advisor premium). Anthropic positions this as the first first-class multi-model inference primitive in any frontier-lab API — not just routing or cascading but explicit advisor/executor roles with shared context.
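The advisor/executor pattern itself is easy to sketch. The stub below replaces real Claude API calls with a fake `call_model`; the actual Console/CLI interface may look nothing like this, so treat every name as an assumption:

```python
# Hedged sketch of an advisor/executor loop: the executor drafts, the
# advisor critiques, and the loop repeats until approval. `call_model`
# is a stand-in for real API calls; model names are illustrative.
def call_model(model, prompt):
    if model == "opus-advisor":
        return "REVISE" if "bug" in prompt else "APPROVE"
    return prompt.replace("bug", "fix")  # toy "executor" behavior

def advised_generation(task, max_rounds=3):
    """Executor drafts; advisor critiques; loop until approval or budget."""
    draft = call_model("sonnet-executor", task)
    for _ in range(max_rounds):
        verdict = call_model("opus-advisor", draft)
        if verdict == "APPROVE":
            break
        draft = call_model("sonnet-executor", draft)  # apply the correction
    return draft

print(advised_generation("patch the bug in parser"))
```

The cost math in the item falls out of this structure: the expensive model runs only short critique passes, not the full generation.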
Techmeme surfaced a profile of Biological Computing Company, a startup using real living neurons cultivated on silicon substrates to build AI accelerator chips. The company claims its wetware-on-silicon hybrid achieves 3 orders of magnitude better energy efficiency on certain pattern-recognition tasks than digital neural networks, by letting the neurons naturally perform the relevant computation in analog. Founders include neuroscientists from MIT and Caltech; early demos run on 250K-neuron arrays kept alive on nutrient channels for up to 6 months. First commercial pilots expected with a DOD-adjacent customer in 2027. Genuine neuromorphic breakthrough or hype? Independent verification still pending.
Anthropic launched Project Glasswing on April 7 alongside AWS, Apple, Cisco, Google and Microsoft: a closed program distributing a restricted preview of Claude Mythos — a frontier model Anthropic says has already identified thousands of high-severity zero-day vulnerabilities across every major OS and browser. Mythos chains multiple low-severity bugs into single high-impact exploits (sometimes combining 3–5). Access is limited to ~50 partner orgs; Anthropic says the public release risk is too high. Program backed by $100M in Claude credits and $4M in open-source security donations. Sets the template for "AI that is too dangerous to ship".
Anthropic launched Claude Managed Agents, a new platform service that takes on the production-grade plumbing (task orchestration, state persistence, tool permissions, retry semantics, observability) that teams previously had to build themselves to deploy multi-step agents reliably. Boris Cherny framed it on X as removing "months of infrastructure work" from shipping a production agent. Sits alongside the broader Claude Platform — Opus-as-advisor pairings, MCP tool catalogs, and Cowork workspace — and completes the stack OpenAI, Google and Microsoft have each been racing to assemble.
A threat-actor profile reported on r/technology and escalated across AI-security Twitter this weekend: an individual used Claude and ChatGPT as coding assistants to compose novel exploit chains against at least three US federal agencies. The attacker reportedly fed the LLMs the target environments' architecture, gleaned from open-source filings, had them generate bespoke phishing payloads and post-exploitation scripts, and iterated until bypasses worked. Anthropic and OpenAI have since updated their safety filters; Anthropic disclosed it had shortened the MCP cache TTL on March 6 specifically to narrow the window for adversarial prompt-cache poisoning. Sets the new baseline for "AI-assisted threat actor" reporting.
Sundar Pichai confirmed Gemma 4 has been downloaded 10M+ times in its first week, and the full Gemma open-weights family has now crossed 500M lifetime downloads on Hugging Face and Kaggle. Gemma 4 ships with 9B and 31B dense variants plus a 27B MoE version, all under a license permitting commercial use. Speculative-decoding benchmarks on r/LocalLLaMA report +29% average throughput and +50% on code with an E2B draft model. Reinforces Google's open-weights-parity strategy against Llama and Mistral, and makes Gemma the default choice for teams optimizing latency on open models.
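For context on the throughput numbers: speculative decoding has a small draft model propose several tokens cheaply, which the large target model then verifies in a single pass. The sketch below uses fake stub "models" purely to show the accept/reject control flow (the greedy-matching variant; production systems use a probabilistic acceptance rule):

```python
# Toy sketch of greedy speculative decoding. Both "models" are fake stubs
# that happen to agree, so all draft tokens are accepted in this example.
def draft_propose(prefix, k=4):
    return [(len(prefix) + i) % 10 for i in range(k)]  # cheap guesses

def target_next(prefix):
    return len(prefix) % 10  # what the big model "would" emit greedily

def speculative_step(prefix, k=4):
    """Accept draft tokens while they match the target; on the first
    mismatch, keep the target's token instead and stop."""
    accepted = []
    for tok in draft_propose(prefix, k):
        expected = target_next(prefix + accepted)
        if tok != expected:
            accepted.append(expected)
            break
        accepted.append(tok)
    return accepted

print(speculative_step([3, 1, 4]))  # -> [3, 4, 5, 6]
```

The +50% on code versus +29% average is typical: code is highly predictable, so the draft model's acceptance rate is higher.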
Andrej Karpathy posted a viral thread arguing there's a widening gap in how people perceive AI capability, driven by two factors: recency (models advance faster than any single demo captures) and tier (people who have only ever used free-tier ChatGPT extrapolate its limits to frontier models). The post hit 19,436 likes and 2,346 retweets — his biggest engagement in April. It ignited a broader conversation about the need for baseline literacy on what current-generation models can actually do, and why enterprise pilots keep under-delivering against expectations calibrated on 2023-era systems.
Andrej Karpathy's nanochat repo — a minimal, from-scratch full-stack training/inference pipeline for a ChatGPT clone — passed 51.7K GitHub stars. In ~8,000 lines of code it covers tokenizer, pretraining, SFT, RL and eval. Karpathy says you can train your own ChatGPT clone for roughly $100 of compute in four hours, and it's the capstone project for his upcoming Eureka Labs LLM101n course. llm.c (pure C/CUDA training) sits alongside at 29.5K stars. Karpathy's "make LLMs legible" mission keeps reshaping what developers build.
A community-maintained distillation of Andrej Karpathy's observations about where LLMs fail at coding — shipped as a single CLAUDE.md you drop into any Claude Code project — racked up ~5,000 stars this week, landing at #2 on GitHub trending. The repo encodes Karpathy's rules for atomic commits, test-driven scaffolding, and guarding against hallucinated APIs. Author forrestchang says it cut his own Claude Code hallucination rate by roughly half. Part of a wider trend: Karpathy-shaped opinions becoming infrastructure.
Google's marquee release of 2026, Gemini 3.1 Ultra: a 2M-token context window that ingests text, image, audio and video in a single forward pass, with no stitched pipelines. Sundar Pichai demoed a sandboxed Code Execution tool that writes, runs and tests Python mid-conversation. On MMMU and VideoMME, Ultra outpaces GPT-5.4; on LM Arena it briefly hit #1 before GPT-5.4 reclaimed the top spot. Available day one in AI Studio and Vertex, with a 200K 'Flash' tier free up to 1M requests/day.
OpenAI closed its $122B primary+secondary on March 31 at an $852B post-money, passing SpaceX to become the most valuable private company in history. D.E. Shaw and MGX co-led, with Thrive, Coatue and Temasek participating. Revenue run-rate hit $28B on the April 1 board update, up from $12B a year earlier. The round funds OpenAI's $500B Stargate commitment with Oracle and SoftBank plus a reported $70B custom-chip program with Broadcom and TSMC aimed at halving training-compute cost per token by 2027.
At ICLR 2026, DeepMind's Yury Makarychev presented TurboQuant — PolarQuant (a randomized rotation making weight distributions near-Gaussian) composed with a Quantized Johnson–Lindenstrauss projection. Together they compress the KV cache 6.2× at identical perplexity. On a Gemini 3.1 Ultra 2M-token workload, GPU memory dropped from 380GB to 62GB per request. Google says it ships in Gemini's April 18 update. On-device long-context inference suddenly looks tractable; data-center inference costs fall sharply.
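The core rotate-then-quantize idea is simple to demonstrate: a random orthogonal rotation smears outlier channels across all dimensions, so uniform low-bit quantization loses far less. This is a generic illustration of that principle, not the paper's exact PolarQuant/JL construction:

```python
# Generic rotate-then-quantize demo: quantizing after a random orthogonal
# rotation beats quantizing raw KV vectors when outlier channels exist.
import numpy as np

rng = np.random.default_rng(0)
d = 64
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))  # random orthogonal rotation

def quantize(x, bits=4):
    """Uniform quantization over the tensor's global range, then dequantize."""
    lo, hi = x.min(), x.max()
    levels = 2 ** bits - 1
    q = np.round((x - lo) / (hi - lo) * levels)
    return q / levels * (hi - lo) + lo

kv = rng.standard_normal((128, d))
kv[:, 0] *= 50  # one outlier channel, common in real KV caches

direct = np.linalg.norm(kv - quantize(kv))
rotated = np.linalg.norm(kv - quantize(kv @ Q) @ Q.T)
print(rotated < direct)  # rotation spreads the outlier, cutting error
```

After rotation each coordinate is a mixture of all channels, so the global quantization range shrinks dramatically; rotating back with `Q.T` recovers the original basis without changing the error norm.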
Anthropic's Model Context Protocol — the open spec for wiring LLMs to tools, files and APIs — crossed 97 million installs in March, up from ~3M a year ago. Every frontier vendor now ships MCP-compatible tooling: OpenAI, Google, Mistral, xAI, Cohere. The Linux Foundation announced at KubeCon EU on April 14 that it will take MCP under open governance, with Microsoft, Red Hat and GitHub signing as founding stewards. Arguably the fastest-standardizing protocol since LSP in 2016.
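Part of why MCP spread so fast is that it is plain JSON-RPC 2.0 underneath, so the wire shape of a `tools/call` exchange can be modeled with nothing but the stdlib. The tool name and arguments below are illustrative; only the envelope follows the spec:

```python
# Toy in-process model of an MCP tools/call request/response pair.
# MCP is JSON-RPC 2.0; this mimics the message shape, not a real transport.
def handle_request(req):
    """Dispatch a tools/call request to a local function table."""
    tools = {"add": lambda a, b: a + b}
    if req["method"] == "tools/call":
        p = req["params"]
        result = tools[p["name"]](**p["arguments"])
        return {"jsonrpc": "2.0", "id": req["id"],
                "result": {"content": [{"type": "text", "text": str(result)}]}}
    return {"jsonrpc": "2.0", "id": req["id"],
            "error": {"code": -32601, "message": "method not found"}}

request = {"jsonrpc": "2.0", "id": 1, "method": "tools/call",
           "params": {"name": "add", "arguments": {"a": 2, "b": 3}}}
response = handle_request(request)
print(response["result"]["content"][0]["text"])  # -> 5
```

Real servers speak this over stdio or HTTP and add capability negotiation, but the request/response envelope is the whole interoperability story.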
University of Hawaiʻi at Mānoa's Peter Sadowski published a physics-informed transformer that hard-constrains outputs to conservation laws (mass, momentum, energy) via a differentiable projection layer. On turbulent channel-flow benchmarks it cuts RMSE by 34% versus PINN baselines at 12× faster inference. NOAA is piloting the model for 10-day regional forecasts; the DOE has it slated for next-generation fusion-plasma control. Paper in PNAS on April 5. AI for climate and fusion finally looks credible at operational latency.
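A "differentiable projection layer" for a conservation law can be sketched in a few lines: after the network predicts a field, project it onto the subspace where the conserved quantity holds. The mass-conservation example below is a generic illustration of the idea, not the paper's layer:

```python
# Minimal sketch of a hard-constraint projection: shift the predicted field
# onto the hyperplane {x : sum(x) = total_mass}. Generic illustration only.
import numpy as np

def project_mass_conserving(pred, total_mass):
    """Orthogonal projection onto the mass-conservation hyperplane: add the
    same correction to every cell. Linear in pred, hence differentiable."""
    correction = (total_mass - pred.sum()) / pred.size
    return pred + correction

raw = np.array([0.9, 1.3, 0.6, 1.0])   # unconstrained network output
projected = project_mass_conserving(raw, total_mass=4.0)
print(round(projected.sum(), 6))  # 4.0 (up to float rounding)
```

Because the projection is a fixed linear map, gradients flow straight through it during training, which is what makes "hard constraints" compatible with end-to-end learning.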
Meta Superintelligence Labs — the unit Alexandr Wang joined last July after Meta paid $14.3B for 49% of Scale AI — shipped Muse Spark, its first flagship under Wang's leadership. Training ran on ~400,000 H200s across new Louisiana and New Mexico data centers. Benchmarks show Muse Spark leading Llama 4 by 18 points on HumanEval-Plus with a 512K context. It launches as a paid Meta AI tier now, with an Apache-2.0 open-weight 'Muse Spark Mini' variant promised for Q3.
Crunchbase's Q1 2026 report: $300B invested across 6,000 startups globally, up ~150% YoY — an all-time record. AI captured $242B, a full 80% of global venture funding. OpenAI's $122B primary+secondary topped the list, followed by Anthropic's $30B Series G, xAI's $20B and Waymo's $16B — the four collectively raising $188B, roughly 63% of Q1's total. Beyond the frontier labs, 10+ companies raised $1B+ rounds across chips, robotics, defense, autonomous vehicles and prediction markets.
Time magazine's April 7 cover story: an AI-driven advance that materially shortens the timeline to cryptographically relevant quantum computing. Google DeepMind, in partnership with Caltech's IQIM, used a transformer trained on billions of quantum-circuit simulations to discover new error-mitigation schemes that shave an estimated 6–9 months off fault-tolerance roadmaps at IBM, Google Quantum AI and Quantinuum. Immediate consequences for cryptography, drug discovery and materials science. As one researcher put it to Time: 'the world is not ready.'
Andrej Karpathy's April 8 tweet on building a personal LLM knowledge base with Obsidian hit 18,196 likes — the week's top technical post on X. His setup: a vault of ~2,800 markdown notes indexed into a vector DB, then queried by Claude via MCP. Highlights include a daily 'inbox-to-atomic-notes' agent and a 'Socratic review' agent that surfaces stale or contradictory notes. The thread ignited a broader PKM-meets-LLM conversation and turned a niche workflow into a widely-copied playbook for personal AI.
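The vault-to-index step in that workflow is easy to sketch. Below, a bag-of-words cosine similarity stands in for a real embedding model, and the notes are inlined rather than read from an Obsidian vault; everything here is an illustrative stand-in for the described setup:

```python
# Toy note index + semantic search: embed each note, answer a query by
# cosine similarity. Bag-of-words replaces a real embedding model.
import math
from collections import Counter

def cosine(a, b):
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / ((na * nb) or 1.0)

notes = {
    "gpu.md": "CUDA kernels and GPU memory tips",
    "pasta.md": "best carbonara recipe with guanciale",
}

def search(query):
    """Return the note whose text is most similar to the query."""
    return max(notes, key=lambda name: cosine(query, notes[name]))

print(search("how do I tune GPU kernels"))  # -> gpu.md
```

The described setup swaps the similarity function for learned embeddings in a vector DB and exposes `search` to Claude as an MCP tool, but the retrieval loop is structurally the same.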
Tufts University researchers, led by Michael Hughes, published an architecture that composes dense neural networks with symbolic reasoning modules, yielding 100× lower energy consumption on ARC-AGI and math-reasoning benchmarks while improving accuracy 7 points over transformer baselines. The hybrid runs inference on a Raspberry Pi 5 at roughly GPT-3.5-equivalent reasoning quality. Paper in Nature on April 5. Immediate implications for on-device AI, battery-constrained robotics and the rising environmental cost of inference at scale.
On April 4 an Anthropic engineer accidentally pushed an internal branch of Claude Code to a public GitHub fork, exposing source for ~3 hours before takedown. Within hours, threat actors seeded ~140 fake 'claude-code' and 'claude-cli' GitHub repositories using the leaked code as bait, bundling the Vidar infostealer in post-install npm hooks. Checkmarx tracked at least 1,200 malicious installs before GitHub's trust & safety team removed the repos. A textbook case of supply-chain opportunism on fresh leaked code.
OpenAI shipped GPT-5.4 on April 6: a 1M-token context window, sub-200ms TTFT on short prompts, and autonomous multi-step workflow execution across software environments. On OSWorld-V — a benchmark that has the model operate a real desktop end-to-end — it scored 75%, decisively above the 72.4% human baseline. Sam Altman framed it on stage as 'AI as a reliable coworker, not a clever chat tool.' Available via API and ChatGPT Pro; a 'GPT-5.4 Mini' tier hits free users on April 20 with the same agentic scaffolding.