SkillClaw: multi-user LLM agent ecosystems where skills evolve across all users — 276 Hugging Face upvotes in one week
arXiv · Huazhong University + Alibaba
SkillClaw (arXiv 2604.08377, April 9, 2026) introduces a framework where deployed LLM agent skills improve themselves by aggregating real interactions across all users simultaneously. An autonomous evolver identifies recurring behavioral patterns in cross-user trajectories, distills improvements, and propagates them system-wide — so a fix discovered in one user's session benefits everyone. Evaluated on WildClawBench with Qwen3-Max, the system shows measurable gains from limited interaction data. The paper attracted 276 upvotes on Hugging Face in its first week — among the highest engagement of any April 2026 AI paper.
Today's AI agents are static after deployment — every user starts from scratch. SkillClaw's collective evolution model is a step toward agents that genuinely learn from deployment at scale, similar to how human institutions transmit knowledge. If this generalises beyond the benchmark, it changes how agent platforms are designed: the value compounds with every user interaction rather than decaying.
Impact scorecard: 7.29/10
- Stakes: 8.0
- Novelty: 9.0
- Authority: 6.0
- Coverage: 5.0
- Concreteness: 6.0
- Social: 9.0
- FUD risk: 3.0
Coverage: 3 outlets · 0 tier-1
- arXiv, Hugging Face Papers (276 upvotes)
- Reddit: 0 upvotes (r/MachineLearning, r/singularity)
Trust check: medium
arXiv preprint (2604.08377), listed as work in progress. Strong Hugging Face community signal (276 upvotes) but no peer review yet. Benchmark results on WildClawBench are internally evaluated — independent replication is pending. Author affiliations (Huazhong University, Alibaba) are credible, but the framework is early-stage.
Tencent Robotics X and HY Vision Team released HY-Embodied-0.5, a family of open-source foundation models built for real-world robotic agents (arXiv 2604.07430). The 2B model targets edge devices and outperforms state-of-the-art alternatives on 16 of 22 benchmarks; the 32B variant matches Gemini 3.0 Pro on embodied understanding tasks. Both use a Mixture-of-Transformers (MoT) architecture with modality-specific computing paths and latent tokens for fine-grained visual perception — critical for manipulation and navigation. A VLA (Vision-Language-Action) model trained on this foundation enables real-world robot control. Full code and model weights were released publicly on April 7, 2026.
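The core MoT idea — one joint token sequence, but each modality routed through its own parameter path — can be illustrated with a toy router. This is a sketch, not HY-Embodied-0.5's architecture: the `mot_layer` function and the per-modality lambdas are hypothetical stand-ins for modality-specific transformer FFNs, and the shared attention over the joint sequence is elided entirely.

```python
def mot_layer(tokens, expert_paths):
    """Route each token through the parameter path owned by its
    modality. In a real MoT layer, attention would be computed over
    the whole mixed sequence and only the FFN/projection weights
    would be modality-specific."""
    return [(modality, expert_paths[modality](vec)) for modality, vec in tokens]

# Toy per-modality "expert" transforms (stand-ins for learned weights).
expert_paths = {
    "text":   lambda v: [2.0 * x for x in v],
    "latent": lambda v: [x + 0.5 for x in v],  # latent visual tokens
    "action": lambda v: [-x for x in v],
}

# One interleaved sequence mixing all three modalities.
sequence = [("text", [1.0]), ("latent", [1.0]), ("action", [1.0])]
out = mot_layer(sequence, expert_paths)
```

The design point is that tokens stay in one sequence (so cross-modal attention is possible) while compute is specialized per modality — which is what makes the approach attractive for perception-heavy tasks like manipulation and navigation.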
Stanford's 2026 AI Index, released April 16, shows China has nearly erased the US lead: the Arena score gap between the top models collapsed from 1300+ points (May 2023) to just 39 (March 2026). Meanwhile, the number of AI scholars emigrating to the US has dropped 89% since 2017, with the decline accelerating — down 80% in the past year alone. China leads in industrial robot installations (295,000 vs 34,200 US) and AI research citations (20.6% vs 12.6%). US private AI investment reached $285.9B in 2025 vs China's $12.4B — but the money gap isn't translating into capability dominance.
NousResearch's hermes-agent (github.com/NousResearch/hermes-agent) has reached 96,216 stars and 13,481 forks, ranking among the most-starred AI agent projects on GitHub. Unlike most frameworks, it implements a persistent learning loop: the agent creates skills from experience, refines them during use, and searches its own conversation history. It runs across 200+ models (OpenRouter, OpenAI, Anthropic, custom), deploys via Terminal, Telegram, Discord, Slack, WhatsApp or Email, and operates on a $5 VPS or GPU cluster. It includes a cron scheduler for unattended operation and batch trajectory generation for RL training.
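The persistent learning loop that distinguishes hermes-agent — create skills from experience, refine them during use, search your own conversation history — can be sketched as follows. This is a minimal illustration, not the project's actual API: the `PersistentAgent` class, its method names, and the naive substring search are all assumptions for the sake of the example (the real project presumably persists and indexes this state).

```python
from dataclasses import dataclass, field

@dataclass
class PersistentAgent:
    skills: dict = field(default_factory=dict)   # skill name -> procedure
    history: list = field(default_factory=list)  # prior conversation turns

    def record(self, turn: str) -> None:
        # every exchange is kept, so later sessions can mine it
        self.history.append(turn)

    def search_history(self, query: str) -> list:
        # naive substring search over past conversations
        return [t for t in self.history if query in t]

    def learn_skill(self, name: str, procedure: str) -> None:
        # create a skill from experience, or refine it if it exists
        self.skills[name] = procedure

agent = PersistentAgent()
agent.record("user asked to summarize arXiv 2604.08377")
agent.learn_skill("summarize_paper", "fetch abstract, compress to 3 bullets")
# a later session refines the same skill in place
agent.learn_skill("summarize_paper", "fetch abstract + intro, compress to 3 bullets")
```

The loop is what separates this from the stateless frameworks the article contrasts it with: state survives across sessions, so each deployment run can improve the next.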