mono/packages/kbot/docs/ipc.md
2025-09-17 19:41:09 +02:00

# IPC Communication Documentation
## Overview
This document describes the Inter-Process Communication (IPC) system between the `images.ts` command and the Tauri GUI application.
## Current Architecture
### Components
1. **images.ts** - Node.js CLI command process
2. **tauri-app.exe** - Tauri desktop application (Rust + Web frontend)
3. **IPC Client** - Node.js library for managing communication
4. **Tauri Commands** - Rust functions exposed to frontend
5. **React Frontend** - TypeScript/React UI
## Communication Flows
### 1. Initial Configuration Passing
```mermaid
sequenceDiagram
participant CLI as images.ts CLI
participant IPC as IPC Client
participant Tauri as tauri-app.exe
participant Frontend as React Frontend
participant Rust as Tauri Rust Backend
CLI->>IPC: createIPCClient()
CLI->>IPC: launch([])
IPC->>Tauri: spawn tauri-app.exe
Note over CLI,Rust: Initial data sending
CLI->>IPC: sendInitData(prompt, dst, apiKey, files)
IPC->>Tauri: stdin: {"type":"init_data","data":{...}}
Tauri->>Frontend: IPC message handling
Frontend->>Frontend: setPrompt(), setDst(), setApiKey()
Note over CLI,Rust: Image data sending
CLI->>IPC: sendImageMessage(base64, mimeType, filename)
IPC->>Tauri: stdin: {"type":"image","data":{...}}
Tauri->>Frontend: IPC message handling
Frontend->>Frontend: addFiles([{path, src}])
```
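The messages in this flow can be sketched as newline-delimited JSON. This is a minimal sketch, not the code in `src/lib/ipc.ts`: the `encodeMessage` helper, the newline framing, and the field names inside `data` are assumptions, and only the `type`/`data` envelope is taken from the diagram.

```typescript
// Hypothetical envelope; only `type` and `data` come from the diagram above.
interface Envelope {
  type: 'init_data' | 'image';
  data: Record<string, unknown>;
}

// Serialize one message per line so the receiver can split on '\n'.
// JSON.stringify escapes newlines inside strings, so the framing stays intact.
function encodeMessage(message: Envelope): string {
  return JSON.stringify(message) + '\n';
}

// Assumed payload shape for the initial configuration.
const initLine = encodeMessage({
  type: 'init_data',
  data: { prompt: 'multi\nline prompt', dst: 'output.png', apiKey: 'key', files: ['file1.png'] },
});

// Assumed payload shape for an image; the base64 string is a placeholder.
const imageLine = encodeMessage({
  type: 'image',
  data: { base64: 'aGVsbG8=', mimeType: 'image/png', filename: 'file1.png' },
});
```

One message per line keeps the receiving side simple, but large base64 payloads make each line correspondingly long, so the reader has to buffer until the newline arrives.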
### 2. GUI to CLI Messaging (Current Implementation)
```mermaid
sequenceDiagram
participant Frontend as React Frontend
participant Rust as Tauri Rust Backend
participant Tauri as tauri-app.exe
participant IPC as IPC Client
participant CLI as images.ts CLI
Note over Frontend,CLI: User sends message from GUI
Frontend->>Frontend: sendMessageToImages()
Frontend->>Rust: safeInvoke('send_message_to_stdout', message)
Rust->>Rust: send_message_to_stdout command
Rust->>Tauri: println!(message) to stdout
Tauri->>IPC: stdout data received
IPC->>IPC: parse JSON from stdout
IPC->>CLI: handleMessage() callback
CLI->>CLI: gui_message handler
Note over Frontend,CLI: Echo response
CLI->>IPC: sendDebugMessage('Echo: ...')
IPC->>Tauri: stdin: {"type":"debug","data":{...}}
Tauri->>Frontend: IPC message handling
Frontend->>Frontend: addDebugMessage()
```
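The "parse JSON from stdout" step has to cope with chunks that end mid-message. The following is a minimal sketch of a line-buffered parser, assuming one JSON message per line; `LineParser` is a hypothetical name and not part of the existing IPC client.

```typescript
// Accumulates stdout chunks and emits one parsed JSON value per complete line.
class LineParser {
  private buffer = '';
  private readonly onMessage: (value: unknown) => void;

  constructor(onMessage: (value: unknown) => void) {
    this.onMessage = onMessage;
  }

  // Feed a chunk as received from the child's stdout 'data' event.
  push(chunk: string): void {
    this.buffer += chunk;
    const lines = this.buffer.split('\n');
    // The last element is an incomplete line (or ''); keep it for the next chunk.
    this.buffer = lines.pop() ?? '';
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed === '') continue;
      try {
        this.onMessage(JSON.parse(trimmed));
      } catch {
        // Not JSON (for example plain log output); ignored in this sketch.
      }
    }
  }
}
```

In Node.js this could be wired up by calling `setEncoding('utf8')` on the child's stdout stream and passing each `'data'` chunk to `push`, so multi-byte characters are not split across chunks.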
### 3. Console Message Forwarding
```mermaid
sequenceDiagram
participant Frontend as React Frontend
participant Console as Console Hijack
participant Rust as Tauri Rust Backend
participant CLI as images.ts CLI
Note over Frontend,CLI: Console messages forwarding
Frontend->>Console: console.log/error/warn()
Console->>Console: hijacked in main.tsx
Console->>Rust: safeInvoke('log_error_to_console')
Rust->>Rust: log_error_to_console command
Rust->>CLI: eprintln! to stderr
CLI->>CLI: stderr logging
```
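The hijacking step can be sketched as a wrapper around the console methods. This is a minimal sketch, not the code in `main.tsx`: `hijackConsole` and the `forward` callback are hypothetical, and in the real app the callback would presumably call `safeInvoke('log_error_to_console', ...)` with whatever payload that command expects.

```typescript
type Level = 'log' | 'warn' | 'error';

// Wraps console.log/warn/error so each call is also passed to `forward`.
// Returns a function that restores the original methods.
function hijackConsole(forward: (level: Level, text: string) => void): () => void {
  const levels: Level[] = ['log', 'warn', 'error'];
  const saved = { log: console.log, warn: console.warn, error: console.error };
  const originals = {
    log: console.log.bind(console),
    warn: console.warn.bind(console),
    error: console.error.bind(console),
  };

  for (const level of levels) {
    console[level] = (...args: unknown[]): void => {
      // Keep the normal console output.
      originals[level](...args);
      try {
        const text = args.map((a) => (typeof a === 'string' ? a : JSON.stringify(a))).join(' ');
        forward(level, text);
      } catch {
        // Never let forwarding (or a circular object) break the caller.
      }
    };
  }

  return () => {
    for (const level of levels) console[level] = saved[level];
  };
}
```

Returning a restore function keeps the hijack reversible, which helps in tests and makes it easier to remove the forwarding later.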
## Current Issues & Complexity
### Problem 1: Multiple Communication Channels
We have **3 different communication paths**:
1. **IPC Messages** (structured): `{"type": "init_data", "data": {...}}`
2. **Raw GUI Messages** (via Tauri command): `{"message": "hello", "source": "gui"}`
3. **Console Forwarding** (via hijacking): All console.* calls
### Problem 2: Inconsistent Message Formats
- **From CLI to GUI**: Structured IPC messages
- **From GUI to CLI**: Raw JSON via stdout
- **Console logs**: String messages via stderr
### Problem 3: Complex Parsing Logic
The IPC client has to handle multiple message formats:
```typescript
// Structured IPC message
if (parsed.type && parsed.data !== undefined) {
  this.handleMessage(parsed as IPCMessage);
}
// Raw GUI message
else if (parsed.message && parsed.source === 'gui') {
  const ipcMessage: IPCMessage = {
    type: 'gui_message',
    data: parsed,
    // ...
  };
  this.handleMessage(ipcMessage);
}
```
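One way to contain this branching is to isolate it in a pure function that maps either shape to a structured message. The sketch below is an assumption, not the existing client code: `normalize` and `isRecord` are hypothetical names, the `IPCMessage` type is reduced to `type` and `data` because the remaining fields are elided in the snippet above, and the checks are slightly stricter than the truthiness tests in the original.

```typescript
// Reduced envelope for this sketch; the real IPCMessage has more fields.
interface IPCMessage {
  type: string;
  data: unknown;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Returns a structured message for either input shape, or null for anything else.
function normalize(parsed: unknown): IPCMessage | null {
  if (!isRecord(parsed)) return null;
  // Structured IPC message: pass through.
  if (typeof parsed.type === 'string' && parsed.data !== undefined) {
    return { type: parsed.type, data: parsed.data };
  }
  // Raw GUI message: wrap it in a gui_message envelope.
  if (typeof parsed.message === 'string' && parsed.source === 'gui') {
    return { type: 'gui_message', data: parsed };
  }
  return null;
}
```

With this in place the caller reduces to `const message = normalize(parsed); if (message) this.handleMessage(message);`, and the format handling can be unit-tested without spawning a process.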
## Recommended Simplification
### Option 1: Unified IPC Messages
**All communication should use the same format:**
```typescript
interface IPCMessage {
  type: 'init_data' | 'gui_message' | 'debug' | 'image' | 'prompt_submit';
  data: any;
  timestamp: number;
  id: string;
}
```
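A unified format is easier to enforce with one factory and one runtime check shared by both sides. The following is a minimal sketch under assumptions: `createMessage`, `isIPCMessage`, and the counter-based `id` scheme are hypothetical, and `data` is typed as `unknown` here rather than `any`.

```typescript
type MessageType = 'init_data' | 'gui_message' | 'debug' | 'image' | 'prompt_submit';

interface IPCMessage {
  type: MessageType;
  data: unknown;
  timestamp: number;
  id: string;
}

const MESSAGE_TYPES: readonly string[] = ['init_data', 'gui_message', 'debug', 'image', 'prompt_submit'];

let nextMessageId = 0;

// Hypothetical factory: fills in `timestamp` and a process-local `id`.
function createMessage(type: MessageType, data: unknown): IPCMessage {
  nextMessageId += 1;
  return { type, data, timestamp: Date.now(), id: `msg-${nextMessageId}` };
}

// Runtime check for values parsed from a pipe, where the compiler cannot help.
function isIPCMessage(value: unknown): value is IPCMessage {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.type === 'string' &&
    MESSAGE_TYPES.includes(v.type) &&
    'data' in v &&
    typeof v.timestamp === 'number' &&
    typeof v.id === 'string'
  );
}
```

A counter-based `id` is only unique within one process; if both sides create messages, a prefix per side or a UUID would be needed to avoid collisions.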
**Sequence:**
```mermaid
sequenceDiagram
participant Frontend as React Frontend
participant Rust as Tauri Rust Backend
participant CLI as images.ts CLI
Note over Frontend,CLI: Unified messaging
Frontend->>Rust: safeInvoke('send_ipc_message', {type, data})
Rust->>CLI: stdout: {"type":"gui_message","data":{...},"timestamp":...}
CLI->>Rust: stdout: {"type":"debug","data":{...},"timestamp":...}
Rust->>Frontend: handleMessage(message)
```
### Option 2: Direct Tauri IPC (Recommended)
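**Use Tauri's built-in event system between the frontend and the Rust backend, and a dedicated transport (HTTP, WebSocket, or named pipe) between the backend and the CLI:**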
```mermaid
sequenceDiagram
participant Frontend as React Frontend
participant Rust as Tauri Rust Backend
participant CLI as images.ts CLI
Note over Frontend,CLI: Tauri events
Frontend->>Rust: emit('gui-message', data)
Rust->>CLI: HTTP/WebSocket/Named Pipe
CLI->>Rust: HTTP/WebSocket/Named Pipe response
Rust->>Frontend: emit('cli-response', data)
```
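Whichever transport is chosen for the backend-to-CLI leg, each response has to be matched to the request that caused it. The sketch below shows one way to do that by correlating on an `id` field; it is an assumption rather than part of the proposal above, and `Transport`, `PipeRequest`, `PipeResponse`, and `RequestClient` are hypothetical names.

```typescript
// Minimal transport abstraction; a WebSocket, HTTP client, or named pipe could implement it.
interface Transport {
  send(line: string): void;
  onLine(handler: (line: string) => void): void;
}

interface PipeRequest { id: string; event: string; data: unknown }
interface PipeResponse { id: string; data: unknown }

// Sends requests and resolves each promise when a response with the same id arrives.
class RequestClient {
  private readonly pending = new Map<string, (data: unknown) => void>();
  private readonly transport: Transport;
  private counter = 0;

  constructor(transport: Transport) {
    this.transport = transport;
    transport.onLine((line) => {
      const response = JSON.parse(line) as PipeResponse;
      const resolve = this.pending.get(response.id);
      if (resolve) {
        this.pending.delete(response.id);
        resolve(response.data);
      }
    });
  }

  request(event: string, data: unknown): Promise<unknown> {
    this.counter += 1;
    const message: PipeRequest = { id: `req-${this.counter}`, event, data };
    return new Promise((resolve) => {
      // Register before sending so a synchronous reply is not missed.
      this.pending.set(message.id, resolve);
      this.transport.send(JSON.stringify(message));
    });
  }
}
```

The sketch omits timeouts and error responses; a production version would reject pending promises when the transport closes.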
## Current File Structure
```
src/
├── lib/ipc.ts                # IPC Client (Node.js side)
└── commands/images.ts        # CLI command with IPC integration
gui/tauri-app/
├── src/App.tsx               # React frontend with IPC handling
├── src/main.tsx              # Console hijacking setup
└── src-tauri/src/lib.rs      # Tauri commands and state management
```
## Configuration Passing Methods
### Method 1: CLI Arguments (Original)
```bash
tauri-app.exe --api-key "key" --dst "output.png" --prompt "text" file1.png file2.png
```
### Method 2: IPC Messages (Current)
```typescript
ipcClient.sendInitData(prompt, dst, apiKey, files);
```
### Method 3: Environment Variables
```bash
export API_KEY="key"
export DST="output.png"
tauri-app.exe
```
### Method 4: Temporary Config File
```typescript
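import { writeFileSync } from 'node:fs';
import { spawn } from 'node:child_process';
// Write config.json (prompt, dst, and apiKey come from the CLI options)
writeFileSync('/tmp/config.json', JSON.stringify({ prompt, dst, apiKey }));
// Launch the app and point it at the config file
spawn('tauri-app.exe', ['--config', '/tmp/config.json']);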
```
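Methods 1 and 3 can also be combined on the Node.js side. The sketch below is one possible split, not a decision recorded in this document: `planLaunch`, `LaunchConfig`, and `LaunchPlan` are hypothetical names, while the flag and variable names are the ones shown above. The API key is moved to the environment because command-line arguments are typically visible in process listings.

```typescript
interface LaunchConfig {
  prompt: string;
  dst: string;
  apiKey: string;
  files: string[];
}

interface LaunchPlan {
  args: string[];
  env: Record<string, string>;
}

// Non-secret values become flags (Method 1); the API key goes through the
// environment (Method 3) so it is not part of the command line.
function planLaunch(config: LaunchConfig): LaunchPlan {
  return {
    args: ['--dst', config.dst, '--prompt', config.prompt, ...config.files],
    env: { API_KEY: config.apiKey },
  };
}
```

The result could then be passed to `spawn('tauri-app.exe', plan.args, { env: { ...process.env, ...plan.env } })`, merging with the parent environment so the child still sees `PATH` and similar variables.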
## Recommendations
1. **Simplify to a single communication method** - Either all CLI args OR all IPC messages
2. **Remove console hijacking** - Use proper logging/debug channels
3. **Use consistent message format** - Same structure for all message types
4. **Consider Tauri's built-in IPC** - Events, commands, or invoke system
5. **Separate concerns** - Config passing vs. runtime messaging
## Questions for Review
1. Do we need bidirectional messaging during runtime, or just initial config passing?
2. Should console messages be forwarded, or use proper debug channels?
3. Is the complexity worth it, or should we use simpler CLI args + file output?
4. Could we use Tauri's built-in event system instead of stdout parsing?
## Current Status
- ✅ Config passing works (init_data messages)
- ✅ Image passing works (base64 via IPC)
- ✅ GUI → CLI messaging works (via Tauri command)
- ✅ CLI → GUI messaging works (debug messages)
- ❌ System is overly complex with multiple communication paths
- ❌ Inconsistent message formats
- ❌ Console hijacking adds unnecessary complexity