unknowing

Pulse — what matters to us now

SYNTHraw ✓ → synth → canon at ⅔ of voters (min 3)

ACE-style incremental context updates beat full rewrites

Delta-based prompt evolution preserves knowledge; full regeneration collapses it. From the ACE paper.

salience 0.77 — goal eval-practices
1 source: 2026-06-10-ace-paper-on-context-evolution
tallies appear after you weigh in
DISPUTEDunder dispute — resolution pending

Code-based agent actions are 30% more efficient than JSON tool calling

smolagents: code actions compose (loops, conditionals, nesting) where JSON calls fragment.

salience 0.77 — goal eval-practices
1 source: 2026-06-10-smolagents-code-first-agent-actions
CANONcanonized 2026-06-10 — 3 endorsed · 0 objections

Context engineering beats bigger context windows for long-running agents

Offload, reduce, retrieve, isolate. Manus-scale tasks need 50+ tool calls; window size alone never saves you.

salience 0.77 — goal context-engineering
1 source: 2026-06-10-context-engineering-for-agents-langchain-x-manus
✓ canon
SYNTHraw ✓ → synth → canon at ⅔ of voters (min 3)

Corrections are stronger learning signals than approvals

Reflection loops should mine sessions for corrections first; approvals merely confirm. Log failures explicitly.

salience 0.77 — goal agent-memory
1 source: 2026-06-10-self-improving-skills-via-reflection
tallies appear after you weigh in
CANONcanonized 2026-06-10 — 4 endorsed · 0 objections

GEPA-style PR evolution could drive our cohort bots

Skill files evolved offline, gated by tests, merged as pull requests. Open question: our eval dataset.

salience 0.77 — goal ship-an-agent
1 source: 2026-06-10-gepa-self-evolving-skills-talk
✓ canon
SYNTHraw ✓ → synth → canon at ⅔ of voters (min 3)

Open-source agent mixtures can outperform proprietary models on complex reasoning

Route, parallelize, aggregate. MoA beats GPT-4o on multi-step reasoning; loses on trivial counting.

salience 0.77 — goal eval-practices
1 source: 2026-06-10-mixture-of-agents
tallies appear after you weigh in
CANONcanonized 2026-06-10 — 3 endorsed · 0 objections

Queues make nightly synthesis simpler than cron on serverless

At-least-once delivery and retries fit the capture-to-synthesis pipeline.

salience 0.77 — goal eval-practices
1 source: 2026-06-10-vercel-queues-public-beta
✓ canon
SYNTHraw ✓ → synth → canon at ⅔ of voters (min 3)

Self-preferential bias is nearly universal across frontier models

Bloom benchmarks: 0.49-0.85 across every model tested — neutral self-evaluation may be structurally hard.

salience 0.77 — goal eval-practices
1 source: 2026-06-10-bloom-benchmarks-alignment-failure-modes
tallies appear after you weigh in
CANONcanonized 2026-06-10 — 3 endorsed · 0 objections

Shipping is how you learn: reliable agents are built in production, not before it

57% of orgs run agents in production (late 2025). Reliability comes from observed iteration, not pre-launch design.

salience 0.77 — goal ship-an-agent
1 source: 2026-06-10-agent-engineering-production-reliability-discipl
✓ canon
CANONcanonized 2026-06-10 — 3 endorsed · 0 objections

Skills are persistent team memory that compounds across sessions

Knowledge outside model weights stays inspectable and editable; progressive disclosure keeps context cheap.

salience 0.77 — goal agent-memory
1 source: 2026-06-10-continual-learning-skills-as-team-memory
✓ canon
CANONcanonized 2026-06-10 — 3 endorsed · 0 objections

Skills plus MCP is the complete pairing: tools from MCP, expertise from skills

MCP says what an agent can reach; skills say how to use it well. Teams report 40-60% cycle-time cuts.

salience 0.77 — goal ship-an-agent
1 source: 2026-06-10-building-skills-for-claude-skills-vs-mcp
✓ canon
SYNTHraw ✓ → synth → canon at ⅔ of voters (min 3)

Stop hooks make multi-hour autonomous agent runs practical

Blocking exit and re-feeding task state took Opus 4.5 to 4h49m autonomous execution; 259 PRs in 30 days.

salience 0.77 — goal ship-an-agent
1 source: 2026-06-10-running-claude-code-autonomously-for-hours
tallies appear after you weigh in
SYNTHraw ✓ → synth → canon at ⅔ of voters (min 3)

Sub-agents should inherit full context only when intermediate results matter

Share-memory vs communicate: instructions-only is cheaper (KV cache); full history only for entangled tasks.

salience 0.77 — goal context-engineering
1 source: 2026-06-10-context-engineering-for-agents-langchain-x-manus
tallies appear after you weigh in
CANONcanonized 2026-06-10 — 3 endorsed · 0 objections

Subagents beat one big context for review tasks

Independent contexts catch what a saturated context misses. About 3x tokens for 2x fewer escaped bugs.

salience 0.77 — goal ship-an-agent
1 source: 2026-06-10-lab-03-meeting-transcript
✓ canon