Tinkering with AI Agents (OpenClaw / Hermes Agent)

An AI agent is an intelligent system that perceives its environment, reasons about it, and acts autonomously to achieve a specific goal.

OpenClaw and Hermes Agent are currently two of the most closely watched agent projects in the open-source community.

OpenClaw (小龙虾, "crayfish")

What is OpenClaw?
OpenClaw is a self-hosted gateway that connects your favorite chat apps and channel surfaces — built-in channels plus bundled or external channel plugins such as Discord, Google Chat, iMessage, Matrix, Microsoft Teams, Signal, Slack, Telegram, WhatsApp, Zalo, and more — to AI coding agents like Pi. You run a single Gateway process on your own machine (or a server), and it becomes the bridge between your messaging apps and an always-available AI assistant.

Installation

To make it easy to wipe everything and start over, I deployed with Docker, following the official documentation.

# Create the root directory
mkdir -p ~/.openclaw

# Start the openclaw container
docker run -d \
  --name openclaw \
  --restart unless-stopped \
  --memory="3g" \
  --memory-swap="6g" \
  --memory-reservation="1g" \
  --shm-size="2gb" \
  -e TZ=Asia/Shanghai \
  -v ~/.openclaw:/home/node/.openclaw \
  -p 3000:3000 \
  ghcr.io/openclaw/openclaw:latest

# Run the openclaw onboarding/configuration
docker exec -it openclaw openclaw onboard

# Install the WeChat (Weixin) plugin
docker exec -it -u node openclaw npx -y @tencent-weixin/openclaw-weixin-cli@latest install

Problems

Installing the WeChat plugin failed: the process aborted at the very last step, just as it was about to print the QR code.

The machine's monitoring showed that memory was almost completely used up.

This is a Lighthouse host with 2 cores and 4 GB of RAM, which is a fairly low spec.

I had an AI look at the error; it suggested configuring swap space so the system would not run out of memory.

# 1. Create a 2 GB swap file (change 2G to 4G if needed)
sudo fallocate -l 2G /swapfile

# 2. Set the correct permissions (readable and writable by root only, for security)
sudo chmod 600 /swapfile

# 3. Format the file as swap space
sudo mkswap /swapfile

# 4. Enable the swap file
sudo swapon /swapfile

# 5. Verify that it worked
sudo swapon --show

# 6. Enable the swap file automatically at boot
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
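To check from a script whether swap is active and how much of it is in use, the kernel's /proc/meminfo can be parsed. The following is a minimal sketch, assuming a Linux host; the function names parse_meminfo and swap_summary are my own and not part of any tool mentioned here.

```python
from pathlib import Path
from typing import Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo-style text into {field name: value in kB}."""
    values: Dict[str, int] = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only "Name:   12345 kB" style lines with a numeric value.
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def swap_summary(info: Dict[str, int]) -> str:
    """Summarize swap usage in MiB from parsed meminfo values."""
    total = info.get("SwapTotal", 0)
    free = info.get("SwapFree", 0)
    if total == 0:
        return "no swap configured"
    return f"swap: {(total - free) // 1024} MiB used of {total // 1024} MiB"


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():  # Linux only
        print(swap_summary(parse_meminfo(meminfo.read_text())))
```

Running it before and after a memory-heavy step such as the plugin install gives a rough idea of whether the swap file is actually being used.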

The cron jobs I configured often failed to run on schedule.

gateway connect failed: Error: gateway closed (1000): 
◇  
Gateway not reachable. Is it running and accessible?
Gateway target: ws://127.0.0.1:18789
Source: local loopback
Config: /home/node/.openclaw/openclaw.json
Bind: loopback

On inspection, the gateway sometimes fails to connect. The problem comes and goes, and I suspect it is again caused by the machine's low spec.
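One way to tell whether the gateway is down or merely slow is to probe its port repeatedly. The sketch below only tests TCP reachability of the 127.0.0.1:18789 target from the error output; it does not speak the gateway's WebSocket protocol. Because the gateway binds to loopback, it has to run in the same network namespace as the gateway. The function names are hypothetical helpers of my own.

```python
import socket
import time
from typing import List


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, ...
        return False


def probe(host: str, port: int, attempts: int = 5, delay: float = 1.0) -> List[bool]:
    """Probe the port several times and return one result per attempt."""
    results = []
    for i in range(attempts):
        results.append(port_is_open(host, port))
        if i < attempts - 1:
            time.sleep(delay)
    return results


if __name__ == "__main__":
    # Host and port are taken from the "Gateway target" line in the error output.
    print(probe("127.0.0.1", 18789))
```

A mix of True and False results would point to an intermittently unavailable gateway rather than a wrong address or port.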

Hermes Agent

What is Hermes Agent?
It’s not a coding copilot tethered to an IDE or a chatbot wrapper around a single API. It’s an autonomous agent that gets more capable the longer it runs. It lives wherever you put it — a $5 VPS, a GPU cluster, or serverless infrastructure (Daytona, Modal) that costs nearly nothing when idle. Talk to it from Telegram while it works on a cloud VM you never SSH into yourself. It’s not tied to your laptop.

Judging from the documentation, its hardware requirements appear to be somewhat lower.

Installation

Again, I deployed with Docker.

# Create the root directory
mkdir -p ~/.hermes

# Start a container to configure the model and messaging channel
docker run -it --rm \
  -v ~/.hermes:/opt/data \
  nousresearch/hermes-agent setup

# Start the gateway
docker run -d \
  --name hermes \
  --restart unless-stopped \
  -v ~/.hermes:/opt/data \
  nousresearch/hermes-agent gateway run

For the model provider I chose OpenRouter, entered the API key, and selected the free model nvidia/nemotron-3-super-120b-a12b:free.

For the messaging channel I chose weixin with the pairing mode, then scanned the QR code to bind the account.

Once the gateway is running, the WeChat Clawbot receives a pairing message similar to the following:

Hi~ I don't recognize you yet!

Here's your pairing code: `RLU2WUYX`

Ask the bot owner to run:
`hermes pairing approve weixin RLU2WUYX`

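Run that last command inside the hermes container (for example through docker exec); after that, the agent can be controlled through the WeChat Clawbot.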

Problems

I configured a free model, yet the OpenRouter logs showed occasional calls to the Gemini 3 Flash Preview model, which incur charges.

After reading through the code, I finally found the cause: Hermes Agent has two kinds of models, a main model and auxiliary models.

Hermes uses lightweight “auxiliary” models for side tasks like image analysis, web page summarization, and browser screenshot analysis. By default, these use Gemini Flash via auto-detection — you don’t need to configure anything.

Taking vision (image analysis) as the example:

In config.yaml, the vision section defaults to the following, with provider set to auto:

auxiliary:
  # Image analysis (vision_analyze tool + browser screenshots)
  vision:
    provider: "auto"           # "auto", "openrouter", "nous", "codex", "main", etc.
    model: ""                  # e.g. "openai/gpt-4o", "google/gemini-2.5-flash"
    base_url: ""               # Custom OpenAI-compatible endpoint (overrides provider)
    api_key: ""                # API key for base_url (falls back to OPENAI_API_KEY)
    timeout: 120               # seconds — LLM API call timeout; vision payloads need generous timeout
    download_timeout: 30       # seconds — image HTTP download; increase for slow connections

The tool's entry point is tools/vision_tools.py:

def _handle_vision_analyze(args: Dict[str, Any], **kw: Any) -> Awaitable[str]:
    image_url = args.get("image_url", "")
    question = args.get("question", "")
    full_prompt = (
        "Fully describe and explain everything about this image, then answer the "
        f"following question:\n\n{question}"
    )
    # Read the vision model from the environment (env file)
    model = os.getenv("AUXILIARY_VISION_MODEL", "").strip() or None
    return vision_analyze_tool(image_url, full_prompt, model)

# Register the tool
registry.register(
    name="vision_analyze",
    toolset="vision",
    schema=VISION_ANALYZE_SCHEMA,
    handler=_handle_vision_analyze,
    check_fn=check_vision_requirements,
    is_async=True,
    emoji="👁️",
)

registry.register registers the vision_analyze tool with _handle_vision_analyze as its handler. The handler reads the AUXILIARY_VISION_MODEL variable from the env file as the vision model and finally calls vision_analyze_tool.

    """
    Analyze an image from a URL or local file path using vision AI.
    
    This tool accepts either an HTTP/HTTPS URL or a local file path. For URLs,
    it downloads the image first. In both cases, the image is converted to base64
    and processed using Gemini 3 Flash Preview via OpenRouter API.
    
    The user_prompt parameter is expected to be pre-formatted by the calling
    function (typically model_tools.py) to include both full description
    requests and specific questions.
    
    Args:
        image_url (str): The URL or local file path of the image to analyze.
                         Accepts http://, https:// URLs or absolute/relative file paths.
        user_prompt (str): The pre-formatted prompt for the vision model
        model (str): The vision model to use (default: google/gemini-3-flash-preview)
    
    Returns:
        str: JSON string containing the analysis results with the following structure:
             {
                 "success": bool,
                 "analysis": str (defaults to error message if None)
             }
    
    Raises:
        Exception: If download fails, analysis fails, or API key is not set
        
    Note:
        - For URLs, temporary images are stored in ./temp_vision_images/ and cleaned up
        - For local file paths, the file is used directly and NOT deleted
        - Supports common image formats (JPEG, PNG, GIF, WebP, etc.)
    """

According to the parameter documentation of vision_analyze_tool, model defaults to google/gemini-3-flash-preview.

If the env file specifies a model, that model is used, and the function then calls async_call_llm.

Note that no provider argument is passed here.

async_call_llm works as follows:

First, look at the implementation of _resolve_task_provider_model.

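When a task is specified, the corresponding settings are read from config.yaml. For the model, the env file takes precedence; if it is empty there, the value from config.yaml is used.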

If config.yaml specifies a provider, that provider is used; otherwise auto is returned.
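The precedence described above can be sketched as follows. This is an illustration of the rule, not the Hermes implementation: the function name resolve_task_settings is hypothetical, and generalizing the env variable name to AUXILIARY_<TASK>_MODEL is an assumption based only on the AUXILIARY_VISION_MODEL variable seen earlier.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_task_settings(
    task: str,
    config: dict,
    env: Mapping[str, str] = os.environ,
) -> Tuple[str, Optional[str]]:
    """Return (provider, model) for an auxiliary task.

    Model: env variable first, then config.yaml, else None.
    Provider: config.yaml value if set, else "auto".
    """
    task_cfg = (config.get("auxiliary") or {}).get(task) or {}
    # Assumed naming scheme; only AUXILIARY_VISION_MODEL appears in the code above.
    env_model = env.get(f"AUXILIARY_{task.upper()}_MODEL", "").strip()
    cfg_model = str(task_cfg.get("model") or "").strip()
    provider = str(task_cfg.get("provider") or "").strip().lower() or "auto"
    return provider, (env_model or cfg_model or None)


if __name__ == "__main__":
    default_cfg = {"auxiliary": {"vision": {"provider": "auto", "model": ""}}}
    # With the default config and no env override, nothing pins the model.
    print(resolve_task_settings("vision", default_cfg, env={}))
```

With the default configuration this yields the provider auto and no model, which is the combination that leaves the model choice to the provider's built-in default.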

Next, look at the implementation of resolve_vision_provider_client.

_read_main_provider and _read_main_model read the main provider and main model from config.yaml.

def _read_main_model() -> str:
    """Read the user's configured main model from config.yaml.

    config.yaml model.default is the single source of truth for the active
    model. Environment variables are no longer consulted.
    """
    try:
        from hermes_cli.config import load_config
        cfg = load_config()
        model_cfg = cfg.get("model", {})
        if isinstance(model_cfg, str) and model_cfg.strip():
            return model_cfg.strip()
        if isinstance(model_cfg, dict):
            default = model_cfg.get("default", "")
            if isinstance(default, str) and default.strip():
                return default.strip()
    except Exception:
        pass
    return ""


def _read_main_provider() -> str:
    """Read the user's configured main provider from config.yaml.

    Returns the lowercase provider id (e.g. "alibaba", "openrouter") or ""
    if not configured.
    """
    try:
        from hermes_cli.config import load_config
        cfg = load_config()
        model_cfg = cfg.get("model", {})
        if isinstance(model_cfg, dict):
            provider = model_cfg.get("provider", "")
            if isinstance(provider, str) and provider.strip():
                return provider.strip().lower()
    except Exception:
        pass
    return ""

_VISION_AUTO_PROVIDER_ORDER = (
    "openrouter",
    "nous",
)

In auto mode, the two providers openrouter and nous are tried in that order.

Now look at the implementation of _resolve_strict_vision_backend:

def _resolve_strict_vision_backend(provider: str) -> Tuple[Optional[Any], Optional[str]]:
    provider = _normalize_vision_provider(provider)
    if provider == "openrouter":
        return _try_openrouter()
    if provider == "nous":
        return _try_nous(vision=True)
    if provider == "openai-codex":
        return _try_codex()
    if provider == "anthropic":
        return _try_anthropic()
    if provider == "custom":
        return _try_custom_endpoint()
    return None, None

I configured openrouter, so the relevant function is _try_openrouter.

def _try_openrouter() -> Tuple[Optional[OpenAI], Optional[str]]:
    pool_present, entry = _select_pool_entry("openrouter")
    if pool_present:
        or_key = _pool_runtime_api_key(entry)
        if not or_key:
            return None, None
        base_url = _pool_runtime_base_url(entry, OPENROUTER_BASE_URL) or OPENROUTER_BASE_URL
        logger.debug("Auxiliary client: OpenRouter via pool")
        return OpenAI(api_key=or_key, base_url=base_url,
                       default_headers=_OR_HEADERS), _OPENROUTER_MODEL

    or_key = os.getenv("OPENROUTER_API_KEY")
    if not or_key:
        return None, None
    logger.debug("Auxiliary client: OpenRouter")
    return OpenAI(api_key=or_key, base_url=OPENROUTER_BASE_URL,
                   default_headers=_OR_HEADERS), _OPENROUTER_MODEL

Note that the model returned here, and therefore passed to the API call, is the _OPENROUTER_MODEL constant.

# Default auxiliary models per provider
_OPENROUTER_MODEL = "google/gemini-3-flash-preview"
_NOUS_MODEL = "google/gemini-3-flash-preview"
_NOUS_FREE_TIER_VISION_MODEL = "xiaomi/mimo-v2-omni"
_NOUS_FREE_TIER_AUX_MODEL = "xiaomi/mimo-v2-pro"
_NOUS_DEFAULT_BASE_URL = "https://inference-api.nousresearch.com/v1"
_ANTHROPIC_DEFAULT_BASE_URL = "https://api.anthropic.com"
_AUTH_JSON_PATH = get_hermes_home() / "auth.json"

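In auto mode with no model specified, when the main provider is openrouter, the model used is google/gemini-3-flash-preview. That is why calls to the paid google/gemini-3-flash-preview model still showed up even though I had configured a free model.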
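The outcome can be reproduced with a small simulation. This is a simplified sketch under my own assumptions, not the actual Hermes code: each backend is modelled as a callable returning (client, default_model), the first backend that yields a client wins, and an explicitly requested model overrides the default. The names resolve_auto and AUTO_ORDER are hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple

# Same order as _VISION_AUTO_PROVIDER_ORDER shown earlier.
AUTO_ORDER = ("openrouter", "nous")

Backend = Callable[[], Tuple[Optional[object], Optional[str]]]


def resolve_auto(
    backends: Dict[str, Backend],
    requested_model: Optional[str] = None,
) -> Tuple[Optional[str], Optional[object], Optional[str]]:
    """Try providers in order; return (provider, client, model) for the first hit."""
    for name in AUTO_ORDER:
        backend = backends.get(name)
        if backend is None:
            continue
        client, default_model = backend()
        if client is not None:
            # Without an explicit model, the provider's default model is used.
            return name, client, requested_model or default_model
    return None, None, None


if __name__ == "__main__":
    # Stand-ins: an OpenRouter key is present, Nous is not configured.
    backends = {
        "openrouter": lambda: (object(), "google/gemini-3-flash-preview"),
        "nous": lambda: (None, None),
    }
    provider, _client, model = resolve_auto(backends)
    print(provider, model)
```

Under these assumptions, simply having an OpenRouter API key configured is enough for auto mode to pick OpenRouter with its default auxiliary model, regardless of which main model was chosen.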

The fix is to change provider: "auto" to provider: "main" in the config, so that the auxiliary model also uses the main model's configuration.
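Since vision is only one of several auxiliary tasks, it may be worth checking the whole auxiliary section for entries still set to auto. The following sketch works on a config that has already been loaded into a dict (for example with a YAML parser); tasks_using_auto is a hypothetical helper, and the task names other than vision in the sample are made up for illustration.

```python
from typing import List


def tasks_using_auto(config: dict) -> List[str]:
    """List auxiliary tasks whose provider is "auto" (or not set at all)."""
    aux = config.get("auxiliary") or {}
    found = []
    for task, settings in aux.items():
        if not isinstance(settings, dict):
            continue
        # A missing provider is treated as "auto", matching the fallback described above.
        provider = str(settings.get("provider") or "auto").strip().lower()
        if provider == "auto":
            found.append(task)
    return sorted(found)


if __name__ == "__main__":
    sample = {
        "auxiliary": {
            "vision": {"provider": "main"},
            "example_task_a": {"provider": "auto"},  # hypothetical task name
            "example_task_b": {},                    # hypothetical task name
        }
    }
    print(tasks_using_auto(sample))
```

Any task it reports can still fall back to the provider's default auxiliary model and may therefore generate charges.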

Summary

Having used both agents, I found Hermes Agent noticeably faster and less demanding on hardware.

After I migrated the cron jobs from OpenClaw to Hermes Agent, the problem of jobs not running was resolved as well.
