# 7/24 Office -- Self-Evolving AI Agent System

A production AI agent built in ~10,000 lines of pure Python with zero framework dependencies. No LangChain, no LlamaIndex, no CrewAI -- just the standard library plus a few small packages.

36 tools. 20 files. Modular architecture. Runs 24/7.

Built solo with AI co-development tools. In production 24/7 across multiple users.
## What's New in v2.0

- Modular tool architecture -- Split from a monolithic `tools.py` into 7 domain modules
- Group chat support -- Independent container for group conversations with @-mention gating
- AI Mirror -- Behavioral profile reports (`soul_report`) + future-self dialogue mode (`future_self`)
- Nudge system -- Structural behavior correction: auto-detects when the LLM has tools but doesn't use them
- Dynamic tool filtering -- 5 context profiles (voice/scheduler/group/diagnostic/default) to reduce token waste
- Budget-aware system prompt -- Token budget tracking during system prompt assembly
- Inactivity guard -- Auto-skips cron tasks for dormant users (3-day threshold)
- Circuit breaker -- Disables a tool after 3 consecutive failures in a session
- Interactive visualization -- ECharts-based HTML pages via `render_page` (line/bar/pie/radar/table/report)
- Container reconciliation -- Router auto-rebuilds missing containers from the routing table on startup
- Exponential backoff retry -- Messaging API calls retry 3x with 2/4/8s delays
- Session auto-archiving -- Daily black-box recording of all conversations
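The exponential backoff retry policy above (3 retries with 2/4/8s delays) amounts to roughly the following -- a minimal sketch, not the project's actual API; `send_with_retry` and its parameters are illustrative names:

```python
import time

def send_with_retry(send, payload, retries=3, sleep=time.sleep):
    """Call send(payload); on failure retry up to `retries` times with 2/4/8s delays."""
    for attempt in range(retries + 1):
        try:
            return send(payload)
        except Exception:
            if attempt == retries:
                raise  # out of retries: surface the last error
            sleep(2 ** (attempt + 1))  # delays: 2s, 4s, 8s
```

The `sleep` parameter is injected so the policy can be unit-tested without real waiting.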
## Features

- Tool Use Loop -- OpenAI-compatible function calling with automatic retry, up to 20 iterations per conversation
- Three-Layer Memory -- Session history + LLM-compressed long-term memory + LanceDB vector retrieval
- MCP/Plugin System -- Connect external MCP servers via JSON-RPC (stdio or HTTP), hot-reload without restart
- Runtime Tool Creation -- The agent can write, save, and load new Python tools at runtime (`create_tool`)
- Self-Repair -- Daily self-check, session health diagnostics, error log analysis, auto-notification on failure
- Cron Scheduling -- One-shot and recurring tasks, persistent across restarts, timezone-aware, inactivity guard
- Multi-Tenant Router -- Docker-based auto-provisioning, one container per user, health-checked, reconciliation
- Multimodal -- Image/video/file/voice/link handling, ASR (speech-to-text), vision via base64
- Web Search -- Multi-engine (Tavily, Bocha, GitHub, HuggingFace) with auto-routing and dual-engine default
- Video Processing -- Trim (with intelligent silence detection), add BGM, AI video generation via API
- Messaging Integration -- Pluggable messaging platform with debounce, message splitting, streaming media upload
- Group Chat -- Independent container, @-mention gating, context buffer (last 20 messages), speaker identification
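The Tool Use Loop above can be sketched as follows -- a simplified illustration using OpenAI-style message dicts; the `call_llm` function, the `TOOLS` registry, and the `get_time` tool are placeholders, not the project's actual names:

```python
import json

TOOLS = {"get_time": lambda **kw: "12:00"}  # hypothetical tool registry

def tool_use_loop(call_llm, messages, max_iters=20):
    """Feed tool results back to the model until it answers in plain text."""
    for _ in range(max_iters):
        reply = call_llm(messages)      # one OpenAI-style assistant message
        calls = reply.get("tool_calls")
        if not calls:                   # no tool requested: final answer
            return reply.get("content")
        messages.append(reply)
        for call in calls:
            fn = call["function"]
            result = TOOLS[fn["name"]](**json.loads(fn["arguments"]))
            messages.append({"role": "tool",
                             "tool_call_id": call["id"],
                             "content": str(result)})
    return "(tool budget exhausted)"
```

The 20-iteration cap bounds runaway tool chains while still allowing multi-step plans.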
## Architecture

```
              +-----------------+
              |    Messaging    |
              |    Platform     |
              +--------+--------+
                       |
              +--------v--------+
              |    router.py    |  Multi-tenant routing
              |  Auto-provision |  Container reconciliation
              |  Group routing  |  Health checking
              +--------+--------+
                       |
              +--------v--------+
              |   xiaowang.py   |  Entry point
              |   HTTP server   |  Callback handling
              |    Debounce     |  Media download/ASR
              |  Group support  |  Inactivity tracking
              +--------+--------+
                       |
              +--------v--------+
              |     llm.py      |  Tool Use Loop (core)
              |  Budget-aware   |  Session management
              |  system prompt  |  Nudge integration
              |   Multimodal    |  Memory injection
              +--------+--------+
                       |
   +-------------+-----+-------+-------------+-----------+
   |             |             |             |           |
+--v-------+  +--v-------+  +--v-------+  +--v-------+   |
| tools_   |  | tools_   |  | tools_   |  | tools_   |   |
| messaging|  | admin    |  | search   |  | video    |   |
| send/file|  | exec/diag|  | web/mem  |  | trim/bgm |   |
| schedule |  | plugin   |  | recall   |  | generate |   |
+----------+  | MCP      |  +----------+  +----------+   |
              +----+-----+                               |
                   |                               +-----v-------+
           +-------+-------+                       |  nudge.py   |
           |               |                       |  5 rules    |
       +---v-----+     +---v-----+                 |  auto-hint  |
       | tools_  |     | tools_  |                 +-------------+
       | page    |     | mirror  |
       | ECharts |     | soul rpt|
       | 6 types |     | future  |
       +---------+     | self    |
                       +---------+

       +--------------+--------------+
       |              |              |
+------v------+  +----v-------+  +---v----------+
|  memory.py  |  |scheduler.py|  | archive.py   |
|  3-layer    |  | cron+once  |  | daily black  |
|  compress   |  | inactivity |  | box recorder |
|  deduplicate|  | guard      |  +--------------+
|  retrieve   |  +------------+
+------+------+
       |
+------v------+
|mcp_client.py|  JSON-RPC over stdio/HTTP
|             |  Auto-reconnect + Hot-reload
+-------------+
```
## Memory System

### Layer 1: Session (short-term)

- Last 40 messages per session, stored as JSON files
- Overflow triggers compression
- Sessions over 100KB are auto-archived

### Layer 2: Compressed (long-term)

- An LLM extracts structured facts from evicted messages
- Deduplication via cosine similarity (threshold: 0.92)
- Stored as vectors in LanceDB

### Layer 3: Retrieval (active recall)

- User message -> embedding -> vector search
- Top-K relevant memories injected into the system prompt
- Budget-aware injection (tracks token usage)
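Layer 2's deduplication check can be sketched roughly as follows -- a minimal illustration over plain numpy vectors; the actual logic in `memory.py` operates on LanceDB rows and may differ in detail:

```python
import numpy as np

def is_duplicate(candidate, stored, threshold=0.92):
    """True if `candidate` is within cosine similarity `threshold` of any stored vector."""
    c = candidate / np.linalg.norm(candidate)  # normalize once
    for v in stored:
        if float(c @ (v / np.linalg.norm(v))) >= threshold:
            return True  # near-duplicate fact: skip insertion
    return False
```

A 0.92 threshold catches paraphrases of the same fact while still admitting genuinely new but related memories.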
## Tool List (36 built-in)

| Category | Module | Tools |
|---|---|---|
| Core | tools_admin | `exec`, `message` |
| Files | tools_admin | `read_file`, `write_file`, `edit_file`, `list_files` |
| Scheduling | tools_messaging | `schedule`, `list_schedules`, `remove_schedule` |
| Media Send | tools_messaging | `send_image`, `send_file`, `send_video`, `send_link`, `send_location`, `send_namecard` |
| Video | tools_video | `trim_video` (auto-cut silence), `add_bgm`, `generate_video` |
| Search | tools_search | `web_search` (Tavily+Bocha dual-engine), `search_nearby` (geocoding+POI), `search_memory`, `recall` |
| Visualization | tools_page | `render_page` (line/bar/pie/radar/table/report via ECharts) |
| AI Mirror | tools_mirror | `soul_report` (behavioral profile HTML), `future_self` (dialogue mode) |
| Diagnostics | tools_admin | `self_check`, `diagnose`, `task_history`, `code_audit`, `asr_check`, `daily_digest` |
| Memory | tools_admin | `compact_memory`, `compact_guides` |
| Plugins | tools_admin | `create_tool`, `list_custom_tools`, `remove_tool` |
| MCP | tools_admin | `reload_mcp` |
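Not all of these tools are exposed on every turn: the dynamic tool filtering mentioned above trims the list per context. A minimal sketch of the idea -- the profile contents here are invented for illustration, not the project's actual mappings:

```python
# Hypothetical context profiles: each maps to the tool subset sent to the model.
PROFILES = {
    "voice":      {"message", "schedule", "web_search"},
    "scheduler":  {"schedule", "list_schedules", "remove_schedule"},
    "group":      {"message", "web_search", "recall"},
    "diagnostic": {"self_check", "diagnose", "task_history"},
}

def tools_for(profile, all_tools):
    """Return only the tool schemas relevant to the current context profile."""
    allowed = PROFILES.get(profile)
    if allowed is None:              # "default" profile: expose everything
        return list(all_tools)
    return [t for t in all_tools if t["name"] in allowed]
```

Shrinking the tool list shrinks the function-calling schema sent with every request, which directly saves tokens.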
## Module Structure

| File | Lines | Responsibility |
|---|---|---|
| `xiaowang.py` | ~1040 | Entry: config, HTTP server, callbacks, debounce, ASR, group support |
| `llm.py` | ~1260 | LLM API + tool use loop + budget-aware system prompt + nudge integration |
| `tools.py` | ~37 | Orchestration layer (imports domain modules) |
| `tools_base.py` | ~314 | Registry + `@tool` decorator + dynamic filtering + circuit breaker |
| `tools_messaging.py` | ~550 | Message/file/schedule/location/namecard tools |
| `tools_admin.py` | ~860 | Exec/file ops/diagnostics/plugins/MCP management |
| `tools_mirror.py` | ~716 | AI Mirror: `soul_report` + `future_self` |
| `tools_page.py` | ~470 | Interactive HTML page generation (ECharts) |
| `tools_search.py` | ~293 | Multi-engine web search + memory search |
| `tools_video.py` | ~394 | Video trim/BGM/generation |
| `messaging.py` | ~447 | Messaging platform API wrapper + CDN upload/download |
| `memory.py` | ~1100 | Three-layer memory (session + compressed + vector) |
| `scheduler.py` | ~652 | Cron + one-shot scheduling + inactivity guard |
| `router.py` | ~500+ | Multi-tenant Docker router + auto-provisioning + reconciliation |
| `mcp_client.py` | ~342 | MCP protocol client (JSON-RPC, zero new deps) |
| `nudge.py` | ~190 | Nudge system: detects tool misuse, auto-injects hints |
| `archive.py` | ~204 | Daily session archiving (black box recorder) |
| `audit.py` | ~448 | Automated 11-check code audit |
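The registry pattern in `tools_base.py` -- a `@tool` decorator plus a per-session circuit breaker -- can be sketched like this. A minimal illustration under assumed names (`REGISTRY`, `FAILS`, `call_tool`), not the actual implementation:

```python
REGISTRY = {}
FAILS = {}   # (session, tool) -> consecutive failure count

def tool(fn):
    """Register a function as an agent tool (minimal @tool decorator)."""
    REGISTRY[fn.__name__] = fn
    return fn

def call_tool(session, name, **kwargs):
    """Run a tool; after 3 consecutive failures it is disabled for the session."""
    key = (session, name)
    if FAILS.get(key, 0) >= 3:
        return f"{name} disabled for this session (circuit breaker open)"
    try:
        result = REGISTRY[name](**kwargs)
        FAILS[key] = 0          # success resets the breaker
        return result
    except Exception as e:
        FAILS[key] = FAILS.get(key, 0) + 1
        return f"{name} failed: {e}"

@tool
def echo(text):
    return text
```

Scoping the breaker per session keeps one user's flaky environment from disabling a tool for everyone else.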
## Quick Start

### Option 1: Direct Run

```bash
git clone https://github.com/wangziqi06/724-office.git
cd 724-office
cp config.example.json config.json
# Edit config.json with your API keys
pip install croniter lancedb websocket-client pilk numpy httpx beautifulsoup4 pydub jieba fpdf2
mkdir -p workspace/memory workspace/files
python3 xiaowang.py
```
### Option 2: Docker Deployment (Recommended)

```bash
# Copy Dockerfile.example -> Dockerfile
# Copy docker-compose.example.yml -> docker-compose.yml
# Edit .env with your credentials
docker compose build
docker compose up -d
```

The agent starts an HTTP server on port 8080. Point your messaging platform webhook at `http://YOUR_SERVER:8080/`.
## Configuration

See `config.example.json` for the full configuration structure. Key sections:

- `models` -- LLM providers (any OpenAI-compatible API) with fallback chain
- `messaging` -- Messaging platform credentials and endpoints
- `memory` -- Three-layer memory system settings (embedding API, similarity threshold)
- `asr` -- Speech-to-text API credentials
- `video_api` -- AI video generation API
- `mcp_servers` -- MCP server connections (stdio or HTTP transport)
- `page_base_url` -- Base URL for generated visualization pages
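For orientation, a `config.json` might look roughly like this. The top-level section names come from the list above, but the keys inside each section are illustrative assumptions -- treat `config.example.json` as the source of truth:

```json
{
  "models": [
    {"name": "primary", "base_url": "https://api.example.com/v1", "api_key": "sk-..."}
  ],
  "memory": {"similarity_threshold": 0.92},
  "mcp_servers": [
    {"name": "example", "transport": "stdio", "command": "python3 my_mcp_server.py"}
  ],
  "page_base_url": "https://example.com/pages"
}
```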
## Design Principles

- Zero framework dependency -- Every line is visible and debuggable. No magic. No hidden abstractions.
- Modular tools -- Adding a capability = adding one `@tool`-decorated function in the appropriate domain module.
- Edge-deployable -- Designed to run on a Jetson Orin Nano (8GB RAM, ARM64). RAM budget under 2GB.
- Self-evolving -- The agent can create new tools at runtime, diagnose its own issues, and notify the owner.
- Structural behavior correction -- Don't fix agent mistakes with prompts; add nudges, hooks, and validation.
- Build for deletion -- Every module should be cleanly removable when the model gets smarter.
- Context is the scarcest resource -- Token budget is the core design constraint, not compute.
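The "structural behavior correction" principle can be illustrated with a toy nudge rule -- the keyword-to-tool mapping and message shape here are hypothetical; the real rules live in `nudge.py`:

```python
# Assumed mapping from request keywords to the tool that should handle them.
KEYWORD_TOOLS = {"remind": "schedule", "search": "web_search"}

def nudge(user_msg, reply):
    """Return a hint to inject when a matching tool was available but unused."""
    if reply.get("tool_calls"):
        return None  # the model already used a tool: nothing to correct
    for kw, tool_name in KEYWORD_TOOLS.items():
        if kw in user_msg.lower():
            return f"Hint: use the `{tool_name}` tool instead of answering in text."
    return None
```

The point is that the correction is code, not prompt text: it fires deterministically whenever the structural condition is met.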
## License

MIT