Autonomous Evolution: A New Framework for LLMs to Automatically Optimize Test-Time Scaling Strategies

elvis@omarsar0

2026-05-12 07:19 · 33 days ago

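AI Summary

Recent research proposes AutoTTS, a framework that lets large language models search for and optimize test-time scaling (TTS) strategies on their own, replacing manual design. The framework formulates width-depth TTS strategy design as a controller synthesis problem over pre-collected reasoning trajectories, compresses the search space through Beta parameterization, and guides exploration with fine-grained execution-trace feedback. On math reasoning benchmarks, the automatically discovered controllers beat strong hand-designed baselines on the accuracy-cost Pareto frontier and generalize zero-shot to other benchmarks and model scales. The entire discovery process cost only $39.9 and 160 minutes, suggesting that the era of hand-designing methods such as chain-of-thought may be ending and that TTS will become a task LLMs carry out for themselves.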

// LLMs Improving LLMs //

Interesting progress over the past couple of weeks around self-improving AI agents.

If autoresearch was interesting, you will like this read.

(bookmark it)

We've been hand-tuning test-time scaling for a year. This work asks what happens when you let an LLM search the space instead.

The paper introduces AutoTTS, a framework that reframes the human role: instead of designing branching, pruning, and stopping heuristics directly, you construct a discovery environment where TTS strategies can be searched automatically. They formulate width-depth TTS as controller synthesis over pre-collected reasoning trajectories and probe signals, so candidate controllers can be evaluated cheaply without repeated LLM calls.
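To make the "evaluate without repeated LLM calls" idea concrete, here is a minimal sketch of replaying a candidate controller against stored trajectories. Everything in it is an assumption for illustration: the per-step `(answer, confidence)` probe layout, the `Controller` fields (`width`, `max_depth`, `stop_conf`), the majority-vote aggregation, and the toy data. None of it is the paper's actual interface.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical data layout: a trajectory is a list of per-step probe records,
# each holding the answer the model would give at that step and a confidence.
Probe = tuple[str, float]
Trajectory = list[Probe]


@dataclass
class Controller:
    width: int        # how many parallel trajectories to use
    max_depth: int    # maximum reasoning steps allowed per trajectory
    stop_conf: float  # stop a trajectory once its probe confidence reaches this


def replay(controller: Controller, trajectories: list[Trajectory]):
    """Replay a controller on stored trajectories.

    Returns (majority-vote answer, total steps consumed). No model is
    called here; cost is counted from the pre-collected steps alone.
    """
    votes = Counter()
    cost = 0
    for traj in trajectories[: controller.width]:
        answer = None
        for ans, conf in traj[: controller.max_depth]:
            cost += 1
            answer = ans
            if conf >= controller.stop_conf:
                break  # early stop on a confident probe
        if answer is not None:
            votes[answer] += 1
    final = votes.most_common(1)[0][0] if votes else None
    return final, cost


# Toy pre-collected trajectories for one question (made-up values).
TRAJS = [
    [("7", 0.2), ("12", 0.6), ("12", 0.95)],
    [("12", 0.9), ("12", 0.97)],
    [("5", 0.3), ("5", 0.4), ("12", 0.8)],
]

if __name__ == "__main__":
    for c in (Controller(3, 3, 0.9), Controller(2, 2, 0.9), Controller(1, 1, 0.9)):
        print(c, replay(c, TRAJS))
```

Because the trajectories are collected once, scoring a new controller is just a loop over stored data, which is what would make a large search over controllers affordable.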

Two design choices carry the search. Beta parameterization makes the control space tractable. Fine-grained execution-trace feedback tells the explorer LLM why a candidate failed, not just that it did.

On math reasoning benchmarks, the discovered controllers beat strong hand-designed baselines on the accuracy-cost Pareto frontier and generalize zero-shot to held-out benchmarks and model scales.
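For readers less familiar with the term, the accuracy-cost Pareto frontier is the set of strategies that no other strategy beats on both cost and accuracy at once. A minimal sketch of computing it follows; the `(cost, accuracy)` pairs are made-up numbers for illustration, not results from the paper.

```python
def pareto_frontier(points):
    """Return the (cost, accuracy) points that no other point dominates.

    A point is dominated when another point has cost <= and accuracy >=,
    with at least one of the two strictly better.
    """
    frontier = []
    best_acc = float("-inf")
    # Ascending cost; at equal cost, the highest accuracy comes first.
    for cost, acc in sorted(points, key=lambda p: (p[0], -p[1])):
        if acc > best_acc:  # strictly better than anything cheaper
            frontier.append((cost, acc))
            best_acc = acc
    return frontier


# Made-up (cost, accuracy) pairs, for illustration only.
candidates = [(10, 0.60), (20, 0.72), (20, 0.65), (35, 0.70), (50, 0.80)]

if __name__ == "__main__":
    print(pareto_frontier(candidates))
```

Here `(20, 0.65)` and `(35, 0.70)` drop out because `(20, 0.72)` is at least as cheap and more accurate than both. Saying one method "beats" another on the frontier means its points sit above or to the left of the other's.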

Entire discovery cost: $39.9 and 160 minutes.

Why it matters:

The era of researchers hand-crafting CoT, best-of-N, and self-consistency recipes is on a clock. Once the search loop is cheap enough, TTS becomes another thing LLMs do for themselves.

Paper: https://arxiv.org/abs/2605.08083

Learn to build effective AI agents in our academy: https://academy.dair.ai/
