MemGPT: Long-Term Memory Agents

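Why Learn MemGPT

MemGPT (since renamed Letta) addresses one of the biggest pain points of LLMs: the limited context window. It borrows the idea of virtual memory management from operating systems, paging information between the context window and external storage so that an agent's memory is no longer capped by the context size. The agent manages its own memory: it writes important information to long-term storage and retrieves it when needed. This is essential for building long-running conversational assistants and agent systems that must retain knowledge across sessions.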

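Core Concepts

Concept	Plain-language explanation	Purpose
Core Memory	Memory held inside the context window	Works like RAM ("working memory"); directly available to the current conversation
Archival Memory	Memory held in external storage	Works like a disk; stores large volumes of historical information
Recall Memory	Searchable conversation history	Search and retrieval over past messages
Memory Functions	Memory-management tools	Tools the agent calls on its own initiative to manage memory
Inner Thoughts	Private reasoning	The agent's reasoning process (not shown to the user)
Heartbeat	Continuation signal	Lets the agent trigger its own next reasoning step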
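To make the split between the tiers concrete, here is a minimal sketch in plain Python. It is a toy model rather than Letta's actual classes: the `ToyAgentMemory` name, the 200-character budget, and the substring search are all assumptions made for illustration. The point it shows is that only core memory is compiled into the prompt, while archival memory stays outside until it is searched.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAgentMemory:
    """Toy stand-in for the three memory tiers; not Letta's real classes."""

    core_limit: int = 200  # hypothetical character budget for core memory
    core: dict = field(default_factory=lambda: {"persona": "", "human": ""})
    archival: list = field(default_factory=list)  # external store ("disk")
    recall: list = field(default_factory=list)    # full message history

    def core_size(self) -> int:
        return sum(len(text) for text in self.core.values())

    def core_append(self, section: str, content: str) -> bool:
        """Add to core memory only if the budget allows; report success."""
        if self.core_size() + len(content) + 1 > self.core_limit:
            return False
        self.core[section] = (self.core[section] + "\n" + content).strip()
        return True

    def archival_insert(self, content: str) -> None:
        self.archival.append(content)

    def archival_search(self, query: str) -> list:
        """Naive substring match; a real system would use embeddings."""
        return [item for item in self.archival if query.lower() in item.lower()]

    def compile_prompt(self, user_message: str) -> str:
        """Only core memory goes into the prompt; archival stays outside."""
        self.recall.append(user_message)
        blocks = [f"<{name}>\n{text}\n</{name}>" for name, text in self.core.items()]
        return "\n".join(blocks + [f"user: {user_message}"])


memory = ToyAgentMemory()
memory.core_append("human", "Name: Zhang San")
memory.archival_insert("2024-03-15: user is building an e-commerce project")
prompt = memory.compile_prompt("What do you know about me?")
print(prompt)
```

The archived note never appears in `prompt`; the agent would have to call a search function to bring it back into context.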

Installation and Configuration

Installing Letta (the New Name for MemGPT)

pip install letta

# Or install from source
git clone https://github.com/letta-ai/letta.git
cd letta
pip install -e .

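Configuring the LLM

# Run the interactive configuration on first use
letta configure

# Choose an LLM provider
# 1. OpenAI
# 2. Azure
# 3. Local (Ollama/vLLM)

# Or configure through environment variables
export OPENAI_API_KEY=sk-your-key

# Using Ollama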
export LETTA_LLM_ENDPOINT=http://localhost:11434
export LETTA_LLM_MODEL=llama3
export LETTA_EMBEDDING_ENDPOINT=http://localhost:11434
export LETTA_EMBEDDING_MODEL=nomic-embed-text

Starting the Service

# CLI mode
letta run

# Server mode (exposes a REST API)
letta server --port 8283

# Docker
docker run -p 8283:8283 \
  -e OPENAI_API_KEY=sk-your-key \
  letta/letta:latest

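Quick Start

CLI Interaction

letta run

# Create a new agent
> /create --name MyAssistant --persona "You are an assistant with an excellent memory"

# Start chatting
> My name is Zhang San and I am a Python developer
# The agent automatically stores this information in core memory

> I started learning Rust last year
# The agent remembers this and associates it with your profile

> Do you still remember what I do for a living?
# The agent answers from core memory, which is already in its context

Python SDK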

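from letta import create_client

# Note: the SDK surface has changed across Letta releases, so check these
# names against the version you have installed.

# Create a client
client = create_client()

# Create an agent
agent_state = client.create_agent(
    name="assistant",
    system="You are an assistant with persistent memory. Remember everything the user tells you.",
    llm_config={"model": "gpt-4"},
    embedding_config={"model": "text-embedding-3-small"},
)

# Send a message
response = client.send_message(
    agent_id=agent_state.id,
    message="My name is Xiao Ming, I am 25, and I work as a backend developer in Beijing",
    role="user"
)

# Print the reply (skip inner thoughts and tool-call messages)
for msg in response.messages:
    if msg.message_type == "assistant_message":
        print(msg.content)

# Inspect the agent's core memory
memory = client.get_agent_memory(agent_state.id)
print("Core Memory - Human:", memory.core_memory.human)
print("Core Memory - Persona:", memory.core_memory.persona)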

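Memory Operations

# Memory functions the agent can call internally:

# Edit core memory
core_memory_append(section="human", content="The user likes coffee")
core_memory_replace(section="human", 
                    old_content="The user is a beginner", 
                    new_content="The user is an intermediate developer")

# Archival memory operations
archival_memory_insert(content="2024-03-15: the user mentioned working on an e-commerce project")
archival_memory_search(query="e-commerce project", page=0)

# Recall (conversation history) operations
conversation_search(query="the tech stack we discussed last time")
conversation_search_date(start_date="2024-03-01", end_date="2024-03-31")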
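The calls above are emitted by the model and executed by the runtime. A rough sketch of that dispatch step follows, using toy in-memory implementations. The function names and the `section` parameter mirror the listing above, while the `dispatch` helper, the `TOOLS` table, the `page_size` parameter, and the error strings are assumptions for illustration, not Letta's implementation.

```python
import json

# Toy state; a real deployment keeps this in a database.
core_memory = {"persona": "I am a helpful assistant.", "human": "Skill level: beginner"}
archival_memory = []


def core_memory_append(section: str, content: str) -> str:
    core_memory[section] += "\n" + content
    return "OK"


def core_memory_replace(section: str, old_content: str, new_content: str) -> str:
    # Report a readable error so the model can retry with the exact text
    if old_content not in core_memory[section]:
        return f"Error: '{old_content}' not found in section '{section}'"
    core_memory[section] = core_memory[section].replace(old_content, new_content)
    return "OK"


def archival_memory_insert(content: str) -> str:
    archival_memory.append(content)
    return "OK"


def archival_memory_search(query: str, page: int = 0, page_size: int = 5) -> str:
    # Substring match stands in for an embedding search
    hits = [m for m in archival_memory if query.lower() in m.lower()]
    start = page * page_size
    return json.dumps(hits[start:start + page_size], ensure_ascii=False)


TOOLS = {f.__name__: f for f in (
    core_memory_append, core_memory_replace,
    archival_memory_insert, archival_memory_search,
)}


def dispatch(tool_call: dict) -> str:
    """Run one model-emitted call of the form {"name": ..., "arguments": {...}}."""
    func = TOOLS.get(tool_call["name"])
    if func is None:
        return f"Error: unknown function {tool_call['name']}"
    return func(**tool_call["arguments"])


# A scripted sequence of calls a model might emit
calls = [
    {"name": "core_memory_replace", "arguments": {
        "section": "human", "old_content": "beginner", "new_content": "intermediate"}},
    {"name": "core_memory_append", "arguments": {"section": "human", "content": "Likes coffee"}},
    {"name": "archival_memory_insert", "arguments": {
        "content": "2024-03-15: user mentioned an e-commerce project"}},
    {"name": "archival_memory_search", "arguments": {"query": "e-commerce", "page": 0}},
    {"name": "core_memory_replace", "arguments": {
        "section": "human", "old_content": "no such text", "new_content": "x"}},
]
results = [dispatch(call) for call in calls]
for call, result in zip(calls, results):
    print(call["name"], "->", result)
```

Returning errors as strings rather than raising keeps the loop alive, so the model sees the failure as a function result and can correct itself on the next step.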

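Advanced Usage

Customizing the Agent Persona

PERSONA = """
I am DataBot, an expert data analysis assistant.
- I am skilled at SQL, Python data analysis, and visualization
- I remember the user's data needs and preferences
- I use concise, professional language
- I proactively suggest better analysis methods
"""

HUMAN = """
User information (updated as the conversation progresses):
Name: [to be filled in]
Role: [to be filled in]
Frequently used tools: [to be filled in]
Analysis preferences: [to be filled in]
"""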

agent = client.create_agent(
    name="data-bot",
    persona=PERSONA,
    human=HUMAN,
)

Adding Custom Tools

from letta import tool

@tool
def query_database(sql: str) -> str:
    """Execute a read-only SQL query against the analytics database.

    Args:
        sql: The SQL query to execute (SELECT only)

    Returns:
        Query results as a string
    """
    import sqlite3

    # Enforce the documented SELECT-only contract before touching the database
    if not sql.lstrip().lower().startswith("select"):
        return "Error: only SELECT statements are allowed"
    # Open the file read-only so a crafted query cannot modify data
    conn = sqlite3.connect("file:analytics.db?mode=ro", uri=True)
    try:
        result = conn.execute(sql).fetchall()
        return str(result)
    finally:
        conn.close()

@tool
def send_notification(message: str, channel: str = "general") -> str:
    """Send a notification to a Slack channel.

    Args:
        message: The notification message
        channel: Slack channel name

    Returns:
        Confirmation message
    """
    # Placeholder: call the Slack API here; nothing is actually sent yet
    return f"Notification sent to #{channel}"

# Attach the tools to an agent
agent = client.create_agent(
    name="smart-agent",
    tools=[query_database, send_notification],
)

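Multi-Agent Collaboration

# Create agents with different responsibilities
# (web_search, save_notes, read_notes, and write_document are assumed to be
# custom tools defined elsewhere with @tool)
research_agent = client.create_agent(
    name="researcher",
    system="You are a research assistant responsible for collecting and organizing information",
    tools=[web_search, save_notes],
)

writer_agent = client.create_agent(
    name="writer",
    system="You are a writing assistant who drafts content based on research material",
    tools=[read_notes, write_document],
)

# Agents can communicate through shared archival memory,
# or an external workflow can orchestrate them through the API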

REST API Usage

import requests

BASE = "http://localhost:8283/v1"
HEADERS = {"Authorization": "Bearer your-token"}

# Create an agent
resp = requests.post(f"{BASE}/agents", headers=HEADERS, json={
    "name": "api-agent",
    "system": "You are a helpful assistant with persistent memory.",
    "llm_config": {"model": "gpt-4"},
    "embedding_config": {"model": "text-embedding-3-small"}
})
resp.raise_for_status()  # fail early instead of hitting a KeyError on "id"
agent_id = resp.json()["id"]

# Send a message
resp = requests.post(f"{BASE}/agents/{agent_id}/messages", 
    headers=HEADERS,
    json={"message": "Hello, remember that I prefer Python."}
)

# Inspect memory
resp = requests.get(f"{BASE}/agents/{agent_id}/memory", headers=HEADERS)
print(resp.json())

# Search archival memory
resp = requests.get(f"{BASE}/agents/{agent_id}/archival", 
    headers=HEADERS,
    params={"query": "Python", "count": 5}
)

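Optimizing the Memory Strategy

# Core memory management strategy
MEMORY_STRATEGY = """
Memory management rules:
1. Basic user information (name, occupation, preferences) → core_memory.human
2. Important events and decisions → archival_memory_insert
3. Technical preferences and frequently used tools → core_memory.human
4. Key conclusions from the conversation → archival_memory_insert
5. When core_memory is nearly full, move secondary information to archival

Perform periodically:
- Check core_memory usage
- Archive outdated information
- Update the user profile
"""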

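Rules 1 through 4 of the strategy amount to a routing table from the kind of fact to its destination. In practice the model applies these rules from the prompt text, but the mapping can be sketched in code to make it explicit. The category names, the `Destination` enum, and the `route` and `plan_writes` helpers below are hypothetical.

```python
from enum import Enum


class Destination(Enum):
    CORE_HUMAN = "core_memory.human"
    ARCHIVAL = "archival_memory_insert"


# Hypothetical fact categories mirroring rules 1-4 of the strategy
ROUTING = {
    "profile": Destination.CORE_HUMAN,          # rule 1
    "event": Destination.ARCHIVAL,              # rule 2
    "tech_preference": Destination.CORE_HUMAN,  # rule 3
    "conclusion": Destination.ARCHIVAL,         # rule 4
}


def route(category: str) -> Destination:
    """Unknown categories default to archival so core memory stays small."""
    return ROUTING.get(category, Destination.ARCHIVAL)


def plan_writes(facts):
    """Group (category, text) pairs by the memory tier they should go to."""
    plan = {dest: [] for dest in Destination}
    for category, text in facts:
        plan[route(category)].append(text)
    return plan


plan = plan_writes([
    ("profile", "Name: Xiao Ming"),
    ("event", "2024-03-15: decided to migrate to PostgreSQL"),
    ("tech_preference", "Prefers Python"),
    ("small_talk", "Mentioned the weather"),
])
for dest, items in plan.items():
    print(dest.value, items)
```

Defaulting unknown categories to archival is a design choice of this sketch: it keeps the small in-context tier reserved for facts that are needed on every turn.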
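Frequently Asked Questions

Q: Can memories be lost?

No. Core memory and archival memory are both persisted in a database (SQLite by default; PostgreSQL can be configured), so an agent's memory survives a restart intact.

Q: Can I use a local model?

Yes, but the model must support function calling. Recommended:
- Llama 3 8B/70B (via Ollama)
- Mixtral 8x7B
- The Qwen2 series

Q: What happens when core memory is full?

The agent manages this itself: it moves less important information to the archive with archival_memory_insert, then frees space with core_memory_replace.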
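The archive-then-free sequence can be sketched as a small eviction loop. In the real system the model judges importance itself; here a hypothetical numeric importance score stands in for that judgment, and `evict_until_fits` and the 40-character limit are likewise assumptions made for illustration.

```python
def evict_until_fits(core, archival, limit):
    """Move the least important entries to archival until core fits `limit`.

    core: list of (importance, text) pairs, mutated in place.
    archival: list of str that receives the evicted text.
    Returns the number of entries moved.
    """
    def size():
        return sum(len(text) for _, text in core)

    moved = 0
    while core and size() > limit:
        victim = min(core, key=lambda entry: entry[0])  # lowest importance first
        core.remove(victim)
        archival.append(victim[1])  # archive before the space is reclaimed
        moved += 1
    return moved


core = [
    (3, "Name: Xiao Ming"),
    (1, "Had coffee on Monday"),
    (2, "Lives in Beijing"),
    (3, "Backend developer"),
]
archival = []
moved = evict_until_fits(core, archival, limit=40)
print(moved, [text for _, text in core], archival)
```

Because each entry is appended to `archival` before it leaves `core`, nothing is lost: the information simply moves from the in-context tier to the searchable one.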

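Q: How does this differ from ordinary RAG?

RAG: retrieval is passive; the surrounding system decides when to retrieve
MemGPT/Letta: the agent actively manages its memory and decides for itself when to store and when to retrieve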
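The difference is mainly one of control flow, which a toy comparison can illustrate. This is a sketch under stated assumptions: `scripted_policy` is a hypothetical stand-in for the model's own judgment, and the keyword `search` stands in for an embedding lookup.

```python
def search(store, query):
    """Naive keyword match; stands in for an embedding search."""
    words = [w for w in query.lower().split() if len(w) > 3]
    return [doc for doc in store if any(w in doc.lower() for w in words)]


def rag_turn(store, message):
    """Passive: the pipeline retrieves on every turn, whatever the message is."""
    return {"searches": 1, "retrieved": search(store, message), "stored": []}


def agent_turn(store, message, policy):
    """Active: `policy` (a stand-in for the LLM) chooses the memory actions."""
    outcome = {"searches": 0, "retrieved": [], "stored": []}
    for action, payload in policy(message):
        if action == "search":
            outcome["searches"] += 1
            outcome["retrieved"] += search(store, payload)
        elif action == "insert":
            store.append(payload)
            outcome["stored"].append(payload)
    return outcome


def scripted_policy(message):
    """Hypothetical rules in place of model judgment."""
    actions = []
    if message.lower().startswith("remember"):
        actions.append(("insert", message))
    if message.endswith("?"):
        actions.append(("search", message.rstrip("?")))
    return actions


store = ["User prefers Python"]
small_talk = agent_turn(store, "Nice weather today", scripted_policy)
saved = agent_turn(store, "Remember that I use Rust at work", scripted_policy)
asked = agent_turn(store, "Which language do I prefer?", scripted_policy)
print(small_talk, saved, asked, sep="\n")
```

The RAG path always performs a search and never writes, whereas the agent path skips memory entirely for small talk, writes when asked to remember something, and searches only when a question calls for it.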

Q: How do I back up agent data?

import json

# Export the agent state
agent_data = client.export_agent(agent_id)
with open("agent_backup.json", "w") as f:
    json.dump(agent_data, f)

# Import it again
client.import_agent("agent_backup.json")

References

GitHub (Letta): https://github.com/letta-ai/letta
Documentation: https://docs.letta.com/
Paper: https://arxiv.org/abs/2310.08560
API Reference: https://docs.letta.com/api-reference
Discord: https://discord.gg/letta