Meta Superintelligence Labs debuts Muse Spark — 58% on Humanity's Last Exam in Contemplating mode, 10x less compute than Llama 4 Maverick
·ai.meta.com
Meta announced Muse Spark on April 8, 2026, the first model from the new Meta Superintelligence Labs (MSL) under Alexandr Wang. It is natively multimodal with tool use, visual chain-of-thought, and a 'Contemplating mode' for parallel multi-agent reasoning. Disclosed benchmarks: 58% on Humanity's Last Exam and 38% on FrontierScience Research, both in Contemplating mode. Meta claims 'over an order of magnitude less compute' than Llama 4 Maverick to reach equivalent capability, roughly 10x training efficiency. Over 1,000 physicians curated the health-reasoning training data. Muse Spark is proprietary, breaking from Meta's open-source stance; it is already live in the Meta AI app and on meta.ai, and is rolling out to WhatsApp, Instagram, Facebook, Messenger, and Meta's AI glasses.
Muse Spark is the first public output of Zuckerberg's $14B hire of Alexandr Wang and proves MSL can ship. Shipping closed weights is a structural break from Meta's Llama strategy, and a signal that the open-source-by-default era for frontier Western labs is ending. The 10x compute-efficiency claim against Llama 4 Maverick, if it holds under independent testing, repositions Meta from catch-up to parity with Google and OpenAI.
Impact scorecard: 8.7/10
Stakes: 9.0 · Novelty: 8.0 · Authority: 10.0 · Coverage: 10.0 · Concreteness: 8.0 · Social: 8.0 · FUD risk: 2.0
Coverage: 14 outlets · 6 tier-1 (ai.meta.com, about.fb.com, Bloomberg, CNBC, TechCrunch, Help Net Security, …)
Reddit: 76 upvotes on r/singularity (also r/artificial)
Trust check: high. Primary source is Meta's own blog plus corroboration from Bloomberg, CNBC, and TechCrunch (tier-1). Parameter count and training compute are not disclosed; benchmark claims come from the vendor, not an independent eval. Noted in fudRisk.
Kronos (accepted at AAAI 2026, arXiv 2508.02739) is the first open-source foundation model pre-trained on financial candlestick (K-line) sequences. A specialized tokenizer quantizes multi-dimensional OHLCV data into hierarchical discrete tokens, and a decoder-only autoregressive transformer is pre-trained on 12 billion K-line records from 45 global exchanges. Against the leading time-series foundation model (TSFM) and the best non-pretrained baseline, Kronos posts 93% and 87% higher RankIC respectively on price-series forecasting, 9% lower MAE on volatility forecasting, and a 22% improvement in generative fidelity for synthetic K-line sequences. Model code, weights, and a demo are open on GitHub (shiyu-coder/Kronos); the repo is currently GitHub-trending.
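The core idea, turning continuous OHLCV bars into discrete tokens that a decoder-only transformer can model autoregressively, can be sketched with a deliberately simplified uniform binner. Kronos's actual tokenizer is hierarchical and learned; the function name, bin count, and normalization here are illustrative assumptions, not the paper's method.

```python
import numpy as np

def tokenize_ohlcv(window, n_bins=256):
    """Toy K-line tokenizer: min-max normalize each OHLCV channel over
    the window, then uniformly bin each value into a discrete token id.
    (Illustrative only; Kronos uses a hierarchical learned quantizer.)"""
    window = np.asarray(window, dtype=float)          # shape (T, 5): O, H, L, C, V
    lo = window.min(axis=0, keepdims=True)
    hi = window.max(axis=0, keepdims=True)
    scaled = (window - lo) / np.where(hi > lo, hi - lo, 1.0)  # in [0, 1]
    tokens = np.minimum((scaled * n_bins).astype(int), n_bins - 1)
    return tokens                                     # ids in [0, n_bins)

# Three candlestick bars: open, high, low, close, volume.
bars = [[100, 102,  99, 101, 5_000],
        [101, 103, 100, 102, 6_200],
        [102, 104, 101, 103, 4_800]]
ids = tokenize_ohlcv(bars)  # one token id per channel per bar
```

Once bars are token sequences, pre-training reduces to standard next-token prediction, which is what lets a decoder-only architecture transfer to forecasting and generation tasks.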
Google Research published Simula in Transactions on Machine Learning Research (April 16, 2026): a framework that reframes synthetic data generation as mechanism design, using reasoning-driven construction rather than sample-level optimization. The team (Tim R. Davidson, Benoit Seguin, Enrico Bacis, Cesar Ilharco, Hamza Harkous) generated datasets of up to 512K data points across five benchmarks spanning cybersecurity (CTI-MCQ, CTI-RCM), legal reasoning (LEXam), math (GSM8k), and multilingual knowledge (Global MMLU). Results show 'better data scales better': a 10% accuracy gain on math reasoning with Gemini 2.5 Flash as teacher and Gemma-3 4B as student. The four-step recipe is global diversification → local diversification → complexification → quality checks. Complexification helped math but hurt legal reasoning; the paper warns that mechanism design is domain-dependent.
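The four-step recipe can be pictured as a pipeline of dataset-level transforms. This is a hypothetical sketch only: the real stages are driven by an LLM teacher and reasoning prompts, and every function and field name below is an assumption for illustration.

```python
from dataclasses import dataclass
import random

@dataclass
class Item:
    prompt: str
    difficulty: int = 1

def global_diversify(seed_topics, n):
    # Step 1: spread coverage across high-level topics (mechanism-level choice).
    return [Item(prompt=f"{random.choice(seed_topics)} problem #{i}") for i in range(n)]

def local_diversify(items):
    # Step 2: vary surface form within each topic.
    styles = ["word problem", "fill-in", "multiple choice"]
    return [Item(prompt=f"{it.prompt} ({random.choice(styles)})",
                 difficulty=it.difficulty) for it in items]

def complexify(items, extra_steps=1):
    # Step 3: raise reasoning depth. Per the paper, this helped math
    # but hurt legal reasoning, so it should be applied per domain.
    return [Item(prompt=it.prompt, difficulty=it.difficulty + extra_steps)
            for it in items]

def quality_check(items, max_difficulty=3):
    # Step 4: filter out items that fail validity/difficulty checks.
    return [it for it in items if it.difficulty <= max_difficulty]

random.seed(0)
data = quality_check(complexify(local_diversify(
    global_diversify(["arithmetic", "algebra"], 4))))
```

The point of the sketch is the ordering: diversity is fixed first at the mechanism level, and complexification is a separate, optional knob, which is why it can be turned off for domains where it hurts.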
coleam00/Archon is a TypeScript open-source workflow harness that makes AI coding deterministic and repeatable through YAML-defined development processes. It has hit 18.8k GitHub stars and is trending weekly; the latest release is v0.3.6 (April 12, 2026), with 1,265 commits on the dev branch. It ships 17 default workflows covering issue fixes, feature development, PR reviews, and refactoring. Core features: isolated execution (each run gets its own git worktree for parallel, conflict-free processing), composable workflows (mix deterministic nodes like bash/tests/git with AI-powered steps like planning/code-gen/review), multi-platform interfaces (CLI, Web UI, Slack, Telegram, Discord, GitHub webhooks), and human gates (interactive approval steps). MIT licensed; requires Bun, Claude Code, and the GitHub CLI.
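The isolated-execution idea is worth unpacking: `git worktree` gives each run a fresh checkout that shares the repository's object store, so parallel runs never contend for the same working directory. A minimal sketch of that pattern (Archon itself is TypeScript and drives this through its workflow engine; the function name and cleanup policy here are assumptions):

```python
import subprocess
import tempfile
from pathlib import Path

def run_isolated(repo: str, branch: str, cmd: list[str]) -> int:
    """Run `cmd` inside a throwaway git worktree of `repo` on a new
    `branch`, then tear the worktree down. Sketch of Archon-style
    isolated execution, not Archon's actual implementation."""
    workdir = Path(tempfile.mkdtemp(prefix="run-")) / "wt"
    # `git worktree add -b <branch> <path>` creates an independent
    # checkout that shares the repo's objects but has its own index.
    subprocess.run(["git", "-C", repo, "worktree", "add", "-q",
                    "-b", branch, str(workdir)], check=True)
    try:
        return subprocess.run(cmd, cwd=workdir).returncode
    finally:
        subprocess.run(["git", "-C", repo, "worktree", "remove",
                        "--force", str(workdir)], check=True)
```

Because each run lands on its own branch in its own directory, a failed or interrupted AI step cannot corrupt the main checkout, which is what makes the workflows safely repeatable.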