<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Context Engineering on 卓琪的开发笔记</title>
    <link>https://zhuoqidev.com/tags/context-engineering/</link>
    <description>Recent content in Context Engineering on 卓琪的开发笔记</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>zh-CN</language>
    <copyright>© 2026 Liu ZhuoQi</copyright>
    <lastBuildDate>Sat, 13 Jun 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://zhuoqidev.com/tags/context-engineering/index.xml" rel="self" type="application/rss+xml" />
    
    <item>
      <title>Claude&#39;s Tool Calling Paradigm Shift: A Deep Dive into Programmatic Tool Calling and Dynamic Filtering</title>
      <link>https://zhuoqidev.com/en/posts/claude-programmatic-tool-calling-dynamic-filter/</link>
      <pubDate>Sat, 13 Jun 2026 00:00:00 +0000</pubDate>
      
      <guid>https://zhuoqidev.com/en/posts/claude-programmatic-tool-calling-dynamic-filter/</guid>
      <description>&lt;h2 class=&#34;relative group&#34;&gt;Background: The Cost Problem in Agent Tool Calling&#xA;    &lt;div id=&#34;background-the-cost-problem-in-agent-tool-calling&#34; class=&#34;anchor&#34;&gt;&lt;/div&gt;&#xA;    &#xA;    &lt;span&#xA;        class=&#34;absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100 select-none&#34;&gt;&#xA;        &lt;a class=&#34;text-primary-300 dark:text-neutral-700 !no-underline&#34; href=&#34;#background-the-cost-problem-in-agent-tool-calling&#34; aria-label=&#34;Anchor&#34;&gt;#&lt;/a&gt;&#xA;    &lt;/span&gt;&#xA;    &#xA;&lt;/h2&gt;&#xA;&lt;p&gt;In traditional agent tool-calling, every tool invocation requires a full cycle of &amp;ldquo;model inference → tool execution → result return → model re-inference.&amp;rdquo; This seemingly natural loop breaks down at scale in three ways:&lt;/p&gt;&#xA;&lt;ol&gt;&#xA;&lt;li&gt;&lt;strong&gt;Context Pollution&lt;/strong&gt;: Every tool result is injected verbatim into the context window. Fetch expense reports for 20 employees, and 2,000+ line items enter the context, even though you only need to know &amp;ldquo;which 3 people exceeded their budget.&amp;rdquo;&lt;/li&gt;&#xA;&lt;li&gt;&lt;strong&gt;Inference Overhead&lt;/strong&gt;: Each tool call demands a full model inference pass. Five tool calls mean five inference passes, each costing hundreds of milliseconds to seconds.&lt;/li&gt;&#xA;&lt;li&gt;&lt;strong&gt;Noise Degrades Accuracy&lt;/strong&gt;: When the context window is packed with intermediate results, the model must find signal in noise. 
&lt;a href=&#34;https://arxiv.org/abs/2509&#34;  target=&#34;_blank&#34; rel=&#34;noreferrer&#34;&gt;Context Rot research&lt;/a&gt; shows LLM performance on complex tasks drops by 50-70% as context grows.&lt;/li&gt;&#xA;&lt;/ol&gt;&#xA;&lt;p&gt;As Florian Bruniaux puts it in the &lt;a href=&#34;https://cc.bruniaux.com/guide/architecture/&#34;  target=&#34;_blank&#34; rel=&#34;noreferrer&#34;&gt;Claude Code Architecture Guide&lt;/a&gt;: &lt;strong&gt;&amp;ldquo;The Outer Loop — everything outside the model: context management, tool invocation, verification, memory consolidation — increasingly determines system quality more than model inference itself.&amp;rdquo;&lt;/strong&gt;&lt;/p&gt;</description>
      
    </item>
    
    <item>
      <title>RAG vs LLM Wiki vs Plain Text — A Decision Framework for Agent Long-Term Memory</title>
      <link>https://zhuoqidev.com/en/posts/memory-choice-framework/</link>
      <pubDate>Mon, 11 May 2026 00:00:00 +0000</pubDate>
      
      <guid>https://zhuoqidev.com/en/posts/memory-choice-framework/</guid>
      <description>&lt;p&gt;Every Agent builder hits this question eventually: &lt;em&gt;where do I store user data so the agent remembers it next session?&lt;/em&gt;&lt;/p&gt;&#xA;&lt;p&gt;Three approaches dominate the landscape: RAG (vector retrieval), LLM Wiki (structured knowledge injection), and plain-text context memory (the CLAUDE.md / Cursor Rules pattern). Each has vocal advocates, but picking wrong is expensive: an under-built RAG pipeline becomes a noise generator, and an overloaded plain-text memory becomes a token incinerator.&lt;/p&gt;</description>
      
    </item>
    
    <item>
      <title>Why LLMs Have No Memory — A Research Report Covering 67 Primary Sources</title>
      <link>https://zhuoqidev.com/en/posts/llm-memory-research/</link>
      <pubDate>Mon, 04 May 2026 00:00:00 +0000</pubDate>
      
      <guid>https://zhuoqidev.com/en/posts/llm-memory-research/</guid>
      <description>&lt;p&gt;This is not a pop-science AI explainer. It is a cross-validated research sprint backed by &lt;strong&gt;67 primary sources&lt;/strong&gt; (vendor docs, arXiv papers, and researcher interviews) on a question every Agent builder hits: &lt;em&gt;why don&amp;rsquo;t LLMs remember anything?&lt;/em&gt;&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;→ &lt;a href=&#34;https://zhuoqidev.com/en/projects/llm-memory-research/&#34; &gt;Full report: 14-product comparison table, 9 engineering takeaways, 3-year paradigm roadmap&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;&#xA;&lt;hr&gt;&#xA;&#xA;&lt;h2 class=&#34;relative group&#34;&gt;The One-Liner&#xA;    &lt;div id=&#34;the-one-liner&#34; class=&#34;anchor&#34;&gt;&lt;/div&gt;&#xA;    &#xA;    &lt;span&#xA;        class=&#34;absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100 select-none&#34;&gt;&#xA;        &lt;a class=&#34;text-primary-300 dark:text-neutral-700 !no-underline&#34; href=&#34;#the-one-liner&#34; aria-label=&#34;Anchor&#34;&gt;#&lt;/a&gt;&#xA;    &lt;/span&gt;&#xA;    &#xA;&lt;/h2&gt;&#xA;&lt;p&gt;Four independent constraints (&lt;strong&gt;O(n²) attention + KV cache VRAM + catastrophic forgetting + GDPR right-to-be-forgotten&lt;/strong&gt;), stacked together, leave &amp;ldquo;stateless&amp;rdquo; as the only viable engineering solution. Every &amp;ldquo;Memory&amp;rdquo; feature you&amp;rsquo;ve seen (ChatGPT, Claude, Cursor) is &lt;strong&gt;structured text injected into the system prompt&lt;/strong&gt;; none of them modifies model weights. The next 1–3 years belong to &lt;strong&gt;stateless LLM kernels + stateful Agent memory layers&lt;/strong&gt;.&lt;/p&gt;</description>
      
    </item>
    
  </channel>
</rss>
