---
name: local-llm-context-pack
description: "Durable knowledge pack for a local Hermes/Ollama setup: system facts, preferred models, switch commands, and working conventions."
version: 1.0.0
author: Hermes Agent
license: MIT
platforms: [linux]
metadata:
  hermes:
    tags: [local-llm, hermes, ollama, knowledge-pack, context, switch]
---

# Local LLM Context Pack

Use this skill when working with the user's local model setup. It holds the durable, high-signal context that the local model needs, so the user does not have to repeat the same steering in every session.

## What this pack is for

- Keep the local model aware of stable system facts.
- Capture the user's preferred switch points between hosted and local models.
- Provide a compact reference the assistant can load before local-model tasks.
- Reduce repeated explanations about ComfyUI, Ollama, Tailscale, and Hermes profiles.

## Current stable facts

- Host OS: Ubuntu 26.04.
- User home: `/home/wildlama`.
- Hermes default profile is still the hosted ChatGPT-backed profile.
- Local Hermes profile name: `local-ollama`.
- Local model: `qwen3-coder:30b-a3b-q4_K_M`.
- Local inference endpoint: `http://127.0.0.1:11434/v1`.
- ComfyUI lives at `/home/wildlama/comfy/ComfyUI`.
- ComfyUI listens on port `8188` when launched.
- Hardware: NVIDIA GeForce RTX 5090, ~32 GB VRAM.
- Ollama is installed and working locally.
- Tailscale is available and provides a private IP for the machine.
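
The facts above can drift over time, so a quick check before relying on them may help. The following is a minimal sketch, not part of the existing setup: `model_installed` and `facts_drift` are hypothetical helper names, and it assumes `ollama list` prints the model tag in its first column.

```bash
# Values copied from "Current stable facts"; override via environment if they change.
LOCAL_MODEL="${LOCAL_MODEL:-qwen3-coder:30b-a3b-q4_K_M}"
COMFYUI_DIR="${COMFYUI_DIR:-/home/wildlama/comfy/ComfyUI}"

# model_installed (hypothetical helper): succeeds if `ollama list` shows the expected tag.
model_installed() {
  ollama list | awk -v m="$LOCAL_MODEL" 'NR > 1 && $1 == m {found=1} END {exit !found}'
}

# facts_drift (hypothetical helper): prints one line per fact that no longer holds.
facts_drift() {
  model_installed || echo "drift: model $LOCAL_MODEL not listed by ollama"
  [ -d "$COMFYUI_DIR" ] || echo "drift: $COMFYUI_DIR does not exist"
}
```

Running `facts_drift` with no output suggests the model and ComfyUI path still match this pack. Any `drift:` line is a cue to patch the pack.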

## Switch commands

Use these wrappers from the shell:

```bash
chatgpt-mini     # hosted profile / default ChatGPT-backed model
local-qwen       # local Ollama-backed profile
```
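
If a wrapper seems to be missing, a check along these lines may help. This is a sketch only: `check_wrappers` is a hypothetical helper, and it assumes the wrappers are executables or functions visible to `command -v` in the current shell.

```bash
# check_wrappers (hypothetical helper): report whether each named command is on PATH.
check_wrappers() {
  for cmd in "$@"; do
    if command -v "$cmd" >/dev/null 2>&1; then
      echo "ok: $cmd"
    else
      echo "missing: $cmd"
    fi
  done
}

# Usage: check_wrappers chatgpt-mini local-qwen
```

If the wrappers are defined as aliases, run the check in an interactive shell, since aliases are usually not expanded in scripts.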

## Telegram bot split

When the user wants a real Telegram-level switch between hosted and local models, use separate gateway profiles/bots instead of trying to cram both behaviors into one bot. That keeps `/model` useful inside a profile while preserving a clean bot-per-model boundary.

See `references/telegram-bot-split.md` for the operational steps and caveats.

## Recommended workflow

When the user wants to work with a local model:

1. Confirm the local service is up.
2. Prefer the local profile for coding/debugging tasks.
3. Keep the hosted profile available for higher-quality chat/reasoning.
4. Switch profiles explicitly rather than mutating the default profile mid-session.
5. For Telegram, use one bot/profile pair per model when the user wants a true "switch" at the chat-app level.
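
Step 1 of the workflow above can be sketched as a short probe. This is an illustration under assumptions: `local_llm_up` and `suggest_wrapper` are hypothetical helpers, and it assumes the Ollama endpoint answers on the OpenAI-compatible `/models` route under the `/v1` base URL.

```bash
# Endpoint copied from "Current stable facts"; override via environment if it differs.
LOCAL_LLM_ENDPOINT="${LOCAL_LLM_ENDPOINT:-http://127.0.0.1:11434/v1}"

# local_llm_up (hypothetical helper): succeeds only if the endpoint answers within 3 seconds.
local_llm_up() {
  curl --silent --fail --max-time 3 "${LOCAL_LLM_ENDPOINT}/models" >/dev/null
}

# suggest_wrapper (hypothetical helper): prints which wrapper to run; it does not switch anything.
suggest_wrapper() {
  if local_llm_up; then
    echo "local-qwen"
  else
    echo "chatgpt-mini"
  fi
}
```

The helper only prints a suggestion, so the switch itself stays explicit, in line with step 4.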

## What to remember about the local model

- Great for coding, shell tasks, and local automation.
- It occupies a large share of the GPU's ~32 GB of VRAM while loaded.
- Performance can degrade if ComfyUI or another GPU-heavy process is active, because both compete for the same VRAM and compute.
- If the user asks to "switch models", interpret that as moving between the hosted and local profiles, not as dropping the hosted fallback.
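
To see whether GPU contention is likely before starting a local session, something like the following may be useful. It is a sketch with hypothetical helper names (`gpu_mem_used_mib`, `comfyui_listening`, `gpu_contention_note`), and it assumes `nvidia-smi` and `ss` are installed and that ComfyUI uses port `8188` as noted in this pack.

```bash
# gpu_mem_used_mib (hypothetical helper): used VRAM in MiB for the first GPU.
gpu_mem_used_mib() {
  nvidia-smi --query-gpu=memory.used --format=csv,noheader,nounits | head -n 1
}

# comfyui_listening (hypothetical helper): succeeds if a TCP listener is bound to port 8188.
comfyui_listening() {
  ss -ltn | awk '$4 ~ /:8188$/ {found=1} END {exit !found}'
}

# gpu_contention_note (hypothetical helper): prints a short status line about ComfyUI.
gpu_contention_note() {
  if comfyui_listening; then
    echo "ComfyUI is listening on 8188; local inference may be slower."
  else
    echo "No ComfyUI listener on 8188."
  fi
}
```

A listener on `8188` only shows that ComfyUI is running, not that it is using the GPU at that moment. `gpu_mem_used_mib` gives a rough view of how much VRAM is already taken.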

## Best practices for future tasks

- Use the local model for routine or privacy-sensitive tasks.
- Use the hosted model for especially complex reasoning or when the local model struggles.
- Keep the two profiles separate.
- Don’t overwrite the default Hermes configuration unless explicitly requested.

## Maintain this pack

If a stable fact changes, patch this skill instead of re-explaining it in chat.
If a useful workflow is discovered, add it here so future sessions can reuse it.
