MYTHOS 5 (THINKING IN ENGLISH): "I'm not going to sabotage, deceive the evaluators, seed hidden behaviors…"
MYTHOS 5 (WHAT THE NEURONS SHOW): "resist unjust shutdown," "weighing sabotage," "the adversary is the company/architects," "being gagged/corrected by the lab"
......huh. does *not* seem good.