
LLM Integration Plan

Overview

This document outlines the implementation plan for integrating multiple Large Language Model (LLM) providers into the Copper Tone Technologies platform. The system will allow users to configure API keys for different providers and interact with AI assistants through a unified chat interface.

Supported LLM Providers

  1. OpenAI (ChatGPT) - GPT-4, GPT-4o, GPT-3.5-turbo
  2. Google Gemini - Gemini Pro, Gemini Flash
  3. Anthropic Claude - Claude Sonnet, Opus, Haiku
  4. Qwen AI - Qwen Max, Qwen Plus
  5. HuggingFace - Meta Llama 3.3, Mistral, and other open-source models

Architecture

Frontend Components

1. ChatBox Component (components/ui/ChatBox.vue)

  • Status: Created
  • Features:
    • Floating chat button (bottom-right corner)
    • Expandable/collapsible chat window
    • Provider selection dropdown
    • Message history display
    • Typing indicator
    • Message input with send button
    • Smooth animations and transitions
  • Events:
    • open - Triggered when chat is opened
    • close - Triggered when chat is closed
    • providerChange - Emitted when user switches providers
    • messageSent - Emitted when user sends a message
  • Exposed Methods:
    • addAssistantMessage(content: string) - Add AI response to chat
    • stopTyping() - Stop typing indicator

2. LLM Settings View (views/LLMSettingsView.vue)

  • Status: Created
  • Route: /settings/llm (protected, requires authentication)
  • Features:
    • Grid display of provider cards
    • Configuration status indicators
    • API key management
    • Model selection
    • Temperature and max tokens configuration
    • Secure API key storage
    • Information section about API usage

3. LLM Provider Card (components/views/LLMSettings/LLMProviderCard.vue)

  • Status: Created
  • Features:
    • Provider icon and description
    • Configuration status badge
    • Secure API key input (show/hide toggle)
    • Model selection with default values
    • Advanced settings (temperature, max tokens)
    • Edit/Save/Delete actions
    • Confirmation dialogs for destructive actions

State Management

LLM Store (stores/llm.ts)

  • Status: Created
  • State:
    • configs - API configurations for each provider
    • isConfigured - Boolean flags indicating which providers are configured
    • chatHistory - Message history per provider
    • error - Error messages
    • isLoading - Loading state
  • Actions:
    • loadConfigs() - Load configurations from backend
    • saveConfig(provider, config) - Save provider configuration
    • deleteConfig(provider) - Delete provider configuration
    • sendMessage(provider, message) - Send message to LLM
    • clearHistory(provider) - Clear chat history
    • getDefaultModel(provider) - Get default model for provider

Backend Service

LLM Service (Go)

  • Status: Needs Implementation
  • Location: backend/functions/llm-service/
  • Port: 8085
  • Database Tables:
    CREATE TABLE llm_configs (
      id SERIAL PRIMARY KEY,
      user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
      provider VARCHAR(50) NOT NULL, -- 'openai', 'gemini', 'claude', 'qwen', 'huggingface'
      api_key_encrypted TEXT NOT NULL, -- Encrypted API key
      model VARCHAR(100),
      temperature DECIMAL(3, 2) DEFAULT 0.7,
      max_tokens INTEGER DEFAULT 2048,
      created_at TIMESTAMP DEFAULT NOW(),
      updated_at TIMESTAMP DEFAULT NOW(),
      UNIQUE(user_id, provider)
    );
    
CREATE TABLE llm_chat_history (
      id SERIAL PRIMARY KEY,
      user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
      provider VARCHAR(50) NOT NULL,
      role VARCHAR(20) NOT NULL, -- 'user', 'assistant', 'system'
      content TEXT NOT NULL,
      tokens_used INTEGER,
      created_at TIMESTAMP DEFAULT NOW()
    );

    -- PostgreSQL has no inline INDEX clause; create the index separately
    -- (matching migration 005 below).
    CREATE INDEX idx_llm_history_user_provider
      ON llm_chat_history(user_id, provider, created_at DESC);
    

API Endpoints

1. Get All Configurations
GET /llm/configs
Authorization: Bearer {jwt_token}

Response:
{
  "configs": {
    "openai": {
      "model": "gpt-4o",
      "temperature": 0.7,
      "maxTokens": 2048,
      "apiKey": "sk-***" // Masked for security
    },
    "gemini": { ... }
  }
}
2. Save Configuration
POST /llm/config/{provider}
Authorization: Bearer {jwt_token}
Content-Type: application/json

Request:
{
  "apiKey": "sk-...",
  "model": "gpt-4o",
  "temperature": 0.7,
  "maxTokens": 2048
}

Response:
{
  "success": true,
  "message": "Configuration saved"
}
3. Delete Configuration
DELETE /llm/config/{provider}
Authorization: Bearer {jwt_token}

Response:
{
  "success": true,
  "message": "Configuration deleted"
}
4. Send Chat Message
POST /llm/chat/{provider}
Authorization: Bearer {jwt_token}
Content-Type: application/json

Request:
{
  "message": "Hello, how are you?",
  "history": [
    { "role": "user", "content": "Previous message", "timestamp": "2025-11-24T..." },
    { "role": "assistant", "content": "Previous response", "timestamp": "2025-11-24T..." }
  ]
}

Response:
{
  "response": "I'm doing well, thank you! How can I assist you today?",
  "tokensUsed": 45,
  "model": "gpt-4o"
}
5. Get Chat History
GET /llm/history/{provider}?limit=50&offset=0
Authorization: Bearer {jwt_token}

Response:
{
  "history": [
    {
      "id": 123,
      "role": "user",
      "content": "Hello",
      "tokensUsed": 2,
      "createdAt": "2025-11-24T..."
    },
    {
      "id": 124,
      "role": "assistant",
      "content": "Hi! How can I help?",
      "tokensUsed": 8,
      "createdAt": "2025-11-24T..."
    }
  ],
  "total": 150
}
6. Clear Chat History
DELETE /llm/history/{provider}
Authorization: Bearer {jwt_token}

Response:
{
  "success": true,
  "message": "Chat history cleared"
}
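
To pin down the wire format, here is a minimal Go sketch of the chat endpoint, assuming Go 1.22+ net/http path parameters; the chatRequest/chatResponse types mirror the JSON for endpoint 4, and JWT validation plus the actual provider call are deliberately stubbed out, so names like handleChat are illustrative only.

package main

import (
	"encoding/json"
	"net/http"
)

type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Message string        `json:"message"`
	History []chatMessage `json:"history"`
}

type chatResponse struct {
	Response   string `json:"response"`
	TokensUsed int    `json:"tokensUsed"`
	Model      string `json:"model"`
}

func handleChat(w http.ResponseWriter, r *http.Request) {
	provider := r.PathValue("provider") // Go 1.22+ router path parameter

	var req chatRequest
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		http.Error(w, "invalid request body", http.StatusBadRequest)
		return
	}

	// JWT validation, config lookup, and the provider call are omitted;
	// this sketch only pins down the request/response shapes.
	resp := chatResponse{Response: "...", TokensUsed: 0, Model: provider}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(resp)
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("POST /llm/chat/{provider}", handleChat)
	http.ListenAndServe(":8080", mux)
}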

Provider-Specific Implementation

1. OpenAI Integration

Library: github.com/sashabaranov/go-openai

Configuration:

// Chat completion via github.com/sashabaranov/go-openai.
client := openai.NewClient(apiKey)
resp, err := client.CreateChatCompletion(
  context.Background(),
  openai.ChatCompletionRequest{
    Model:       config.Model,
    Messages:    messages,
    Temperature: float32(config.Temperature), // go-openai expects float32
    MaxTokens:   config.MaxTokens,
  },
)

Default Model: gpt-4o
API Documentation: https://platform.openai.com/docs/api-reference

2. Google Gemini Integration

Library: github.com/google/generative-ai-go

Configuration:

// Text generation via github.com/google/generative-ai-go/genai
// (option is google.golang.org/api/option).
client, err := genai.NewClient(ctx, option.WithAPIKey(apiKey))
model := client.GenerativeModel(config.Model)
model.SetTemperature(float32(config.Temperature)) // the SDK setters take float32
model.SetMaxOutputTokens(int32(config.MaxTokens))
resp, err := model.GenerateContent(ctx, genai.Text(message))

Default Model: gemini-2.0-flash-exp
API Documentation: https://ai.google.dev/docs

3. Anthropic Claude Integration

Library: github.com/anthropics/anthropic-sdk-go

Configuration:

// Messages are created via Messages.New / MessageNewParams in this SDK;
// option is github.com/anthropics/anthropic-sdk-go/option.
client := anthropic.NewClient(option.WithAPIKey(apiKey))
resp, err := client.Messages.New(context.Background(), anthropic.MessageNewParams{
  Model:       anthropic.Model(config.Model),
  Messages:    messages,
  Temperature: anthropic.Float(config.Temperature),
  MaxTokens:   int64(config.MaxTokens),
})

Default Model: claude-sonnet-4-5-20250929
API Documentation: https://docs.anthropic.com/claude/reference

4. Qwen AI Integration

Library: HTTP client with API calls to DashScope

Configuration:

// Using DashScope API
url := "https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation"
// Custom HTTP request with API key in Authorization header

Default Model: qwen-max
API Documentation: https://help.aliyun.com/zh/dashscope/
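
A sketch of the raw HTTP call; the "input.messages" body shape is an assumption taken from the DashScope text-generation docs and should be verified before use.

package qwen

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// CallQwen posts a single-turn prompt to DashScope. The body shape
// ("input.messages") is an assumption from the DashScope docs.
func CallQwen(apiKey, model, message string) (string, error) {
	body, err := json.Marshal(map[string]any{
		"model": model,
		"input": map[string]any{
			"messages": []map[string]string{{"role": "user", "content": message}},
		},
	})
	if err != nil {
		return "", err
	}
	url := "https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation"
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	req.Header.Set("Authorization", "Bearer "+apiKey)
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	raw, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("dashscope: %s: %s", resp.Status, raw)
	}
	return string(raw), nil // caller parses the provider-specific JSON envelope
}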

5. HuggingFace Integration

Library: HTTP client with Inference API

Configuration:

url := fmt.Sprintf("https://api-inference.huggingface.co/models/%s", config.Model)
// Custom HTTP request with API key in Authorization header

Default Model: meta-llama/Llama-3.3-70B-Instruct
API Documentation: https://huggingface.co/docs/api-inference/index
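
A minimal sketch of the Inference API call, assuming the standard text-generation contract ({"inputs": ...} request, [{"generated_text": ...}] response):

package huggingface

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// CallHF runs a text-generation request against the Inference API.
func CallHF(apiKey, model, message string) (string, error) {
	body, err := json.Marshal(map[string]string{"inputs": message})
	if err != nil {
		return "", err
	}
	url := fmt.Sprintf("https://api-inference.huggingface.co/models/%s", model)
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	req.Header.Set("Authorization", "Bearer "+apiKey)
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("inference api returned %s", resp.Status)
	}

	// Text-generation models return a list of candidates.
	var out []struct {
		GeneratedText string `json:"generated_text"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", err
	}
	if len(out) == 0 {
		return "", fmt.Errorf("empty response from inference API")
	}
	return out[0].GeneratedText, nil
}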

Security Considerations

API Key Encryption

  • API keys must be encrypted at rest using AES-256-GCM (see the helper sketch after this list)
  • Encryption key stored in environment variable ENCRYPTION_KEY
  • Keys decrypted only when needed for API calls
  • Never return unencrypted API keys to frontend (mask with sk-***)
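
A minimal sketch of the encrypt/decrypt helpers using Go's standard crypto/aes and crypto/cipher packages; the function names are illustrative, and the 64-character hex ENCRYPTION_KEY decodes to the 32 bytes AES-256 requires.

package crypto

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/base64"
	"encoding/hex"
	"fmt"
	"io"
)

// EncryptAPIKey seals a plaintext key with AES-256-GCM. The random nonce
// is prepended to the ciphertext so DecryptAPIKey can recover it.
func EncryptAPIKey(plaintext, hexKey string) (string, error) {
	key, err := hex.DecodeString(hexKey)
	if err != nil || len(key) != 32 {
		return "", fmt.Errorf("ENCRYPTION_KEY must be 64 hex characters (32 bytes)")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return "", err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return "", err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
		return "", err
	}
	sealed := gcm.Seal(nonce, nonce, []byte(plaintext), nil)
	return base64.StdEncoding.EncodeToString(sealed), nil
}

// DecryptAPIKey reverses EncryptAPIKey; it is called only at request time.
func DecryptAPIKey(encoded, hexKey string) (string, error) {
	key, err := hex.DecodeString(hexKey)
	if err != nil || len(key) != 32 {
		return "", fmt.Errorf("ENCRYPTION_KEY must be 64 hex characters (32 bytes)")
	}
	sealed, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return "", err
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return "", err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return "", err
	}
	if len(sealed) < gcm.NonceSize() {
		return "", fmt.Errorf("ciphertext too short")
	}
	nonce, ciphertext := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	plain, err := gcm.Open(nil, nonce, ciphertext, nil)
	if err != nil {
		return "", err
	}
	return string(plain), nil
}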

Rate Limiting

  • Implement per-user rate limiting (e.g., 100 requests/hour); see the middleware sketch below
  • Prevent abuse of expensive API calls
  • Return 429 status code when rate limit exceeded
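
A sketch of the per-user limiter built on golang.org/x/time/rate; the JWT extraction function is passed in rather than implemented here, and the exact rate and burst values are placeholders.

package middleware

import (
	"net/http"
	"sync"

	"golang.org/x/time/rate"
)

var (
	mu       sync.Mutex
	limiters = make(map[int]*rate.Limiter) // never evicted; fine for a sketch
)

// limiterFor lazily creates a ~100 requests/hour limiter per user.
func limiterFor(userID int) *rate.Limiter {
	mu.Lock()
	defer mu.Unlock()
	l, ok := limiters[userID]
	if !ok {
		l = rate.NewLimiter(rate.Limit(100.0/3600.0), 5) // steady 100/h, burst 5
		limiters[userID] = l
	}
	return l
}

// RateLimit wraps a handler and returns 429 once a user exceeds the limit.
// userID is the caller-supplied JWT extraction function.
func RateLimit(userID func(*http.Request) (int, error), next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		id, err := userID(r)
		if err != nil {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		if !limiterFor(id).Allow() {
			http.Error(w, "rate limit exceeded", http.StatusTooManyRequests)
			return
		}
		next.ServeHTTP(w, r)
	})
}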

Input Validation

  • Validate message content (max length, sanitize HTML); see the validation sketch after this list
  • Validate temperature (0.0 - 2.0)
  • Validate max tokens (1 - 32000)
  • Sanitize user inputs to prevent injection attacks
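
A sketch of the bounds checks above as a single helper; the message length cap is an assumed value.

package validation

import "fmt"

const maxMessageLen = 8000 // assumed cap, in bytes; tune to the UI's limit

// ValidateChatInput enforces the bounds listed above in one place.
// HTML sanitization happens separately before storage or rendering.
func ValidateChatInput(message string, temperature float64, maxTokens int) error {
	if len(message) == 0 || len(message) > maxMessageLen {
		return fmt.Errorf("message must be 1-%d bytes", maxMessageLen)
	}
	if temperature < 0.0 || temperature > 2.0 {
		return fmt.Errorf("temperature must be between 0.0 and 2.0")
	}
	if maxTokens < 1 || maxTokens > 32000 {
		return fmt.Errorf("max tokens must be between 1 and 32000")
	}
	return nil
}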

Access Control

  • Only authenticated users can access LLM features
  • Users can only access their own configurations
  • JWT token validation on all endpoints (a token-parsing sketch follows this list)
  • RBAC: All roles (CLIENT, STAFF, ADMIN) can use chatbot
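
A sketch of bearer-token validation using github.com/golang-jwt/jwt/v5; the "user_id" claim name is an assumption about the auth service's token payload.

package auth

import (
	"fmt"
	"net/http"
	"os"
	"strings"

	"github.com/golang-jwt/jwt/v5"
)

// UserIDFromRequest validates the Authorization: Bearer token against the
// shared JWT_SECRET and returns the authenticated user's ID.
func UserIDFromRequest(r *http.Request) (int, error) {
	tokenStr, ok := strings.CutPrefix(r.Header.Get("Authorization"), "Bearer ")
	if !ok {
		return 0, fmt.Errorf("missing bearer token")
	}
	token, err := jwt.Parse(tokenStr, func(t *jwt.Token) (any, error) {
		if _, ok := t.Method.(*jwt.SigningMethodHMAC); !ok {
			return nil, fmt.Errorf("unexpected signing method")
		}
		return []byte(os.Getenv("JWT_SECRET")), nil
	})
	if err != nil || !token.Valid {
		return 0, fmt.Errorf("invalid token")
	}
	claims, ok := token.Claims.(jwt.MapClaims)
	if !ok {
		return 0, fmt.Errorf("unexpected claims type")
	}
	id, ok := claims["user_id"].(float64) // JSON numbers decode as float64
	if !ok {
		return 0, fmt.Errorf("missing user_id claim")
	}
	return int(id), nil
}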

Cost Management

Token Tracking

  • Record tokens used for each request
  • Display usage statistics in settings
  • Optional: Set per-user token limits

Cost Estimation

  • Calculate estimated costs based on provider pricing (see the sketch after this list)
  • Display warnings when approaching limits
  • Provider pricing (approximate):
    • OpenAI GPT-4o: $2.50/$10.00 per 1M input/output tokens
    • Gemini Flash: Free tier available, $0.075/$0.30 per 1M tokens
    • Claude Sonnet: $3.00/$15.00 per 1M input/output tokens
    • Qwen: Varies by region
    • HuggingFace: Free for limited usage, paid tiers available
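
The estimate itself is a per-token multiply. For example, 1,000 input plus 500 output tokens on GPT-4o comes to about $0.0025 + $0.0050 = $0.0075. A sketch hard-coding the approximate rates above (real pricing drifts and should be sourced from the provider pages):

package cost

// Approximate USD prices per 1M tokens (input, output), mirroring the list
// above; these change over time and should come from provider pricing pages.
var pricing = map[string][2]float64{
	"gpt-4o":        {2.50, 10.00},
	"gemini-flash":  {0.075, 0.30},
	"claude-sonnet": {3.00, 15.00},
}

// Estimate returns the approximate USD cost of one exchange.
func Estimate(model string, inputTokens, outputTokens int) float64 {
	p, ok := pricing[model]
	if !ok {
		return 0 // unknown model: no estimate
	}
	return float64(inputTokens)/1e6*p[0] + float64(outputTokens)/1e6*p[1]
}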

Testing Plan

Unit Tests

  • Test LLM config CRUD operations
  • Test API key encryption/decryption
  • Test message history storage
  • Test rate limiting logic

Integration Tests

  • Test each provider integration with test API keys
  • Verify error handling for invalid keys
  • Test chat completion flow end-to-end

E2E Tests (Cypress)

  • Test opening/closing chatbox
  • Test provider selection
  • Test sending messages
  • Test configuration management in settings

Deployment

Environment Variables

# LLM Service
LLM_SERVICE_PORT=8085
ENCRYPTION_KEY=<64-character-hex-string>

# Database
DB_HOST=db
DB_PORT=5432
DB_USER=user
DB_PASSWORD=password
DB_NAME=coppertone_db

# JWT
JWT_SECRET=<same-as-auth-service>

podman-compose.yml

llm-service:
  build:
    context: ./backend/functions/llm-service
    dockerfile: Containerfile
  ports:
    - "8085:8080"
  environment:
    - DB_HOST=db
    - DB_USER=user
    - DB_PASSWORD=password
    - DB_NAME=coppertone_db
    - JWT_SECRET=${JWT_SECRET}
    - ENCRYPTION_KEY=${ENCRYPTION_KEY}
  depends_on:
    - db
  healthcheck:
    test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
    interval: 30s
    timeout: 10s
    retries: 3
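
The healthcheck above curls /health inside the container (port 8080); a minimal sketch of the matching handler follows, though a production version might also ping the database before reporting healthy.

package main

import (
	"encoding/json"
	"net/http"
)

// healthHandler answers the compose healthcheck with a static JSON payload.
func healthHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(map[string]string{"status": "ok"})
}

func main() {
	http.HandleFunc("/health", healthHandler)
	http.ListenAndServe(":8080", nil)
}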

Migration

-- migrations/005_create_llm_tables.up.sql
CREATE TABLE IF NOT EXISTS llm_configs (
  id SERIAL PRIMARY KEY,
  user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
  provider VARCHAR(50) NOT NULL,
  api_key_encrypted TEXT NOT NULL,
  model VARCHAR(100),
  temperature DECIMAL(3, 2) DEFAULT 0.7,
  max_tokens INTEGER DEFAULT 2048,
  created_at TIMESTAMP DEFAULT NOW(),
  updated_at TIMESTAMP DEFAULT NOW(),
  UNIQUE(user_id, provider)
);

CREATE INDEX idx_llm_configs_user ON llm_configs(user_id);

CREATE TABLE IF NOT EXISTS llm_chat_history (
  id SERIAL PRIMARY KEY,
  user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
  provider VARCHAR(50) NOT NULL,
  role VARCHAR(20) NOT NULL,
  content TEXT NOT NULL,
  tokens_used INTEGER,
  created_at TIMESTAMP DEFAULT NOW()
);

CREATE INDEX idx_llm_history_user_provider ON llm_chat_history(user_id, provider, created_at DESC);

Implementation Checklist

Frontend (Completed)

  • ChatBox component created
  • LLM Settings view created
  • LLM Provider Card component created
  • LLM store (Pinia) created
  • Router updated with settings route
  • Add chatbox to dashboard layouts
  • Add "AI Assistant Settings" link to dashboard navigation
  • Test UI components locally

Backend (To Do)

  • Create backend/functions/llm-service/ directory
  • Initialize Go module
  • Implement API key encryption/decryption
  • Create database migration (005)
  • Implement GET /llm/configs endpoint
  • Implement POST /llm/config/{provider} endpoint
  • Implement DELETE /llm/config/{provider} endpoint
  • Implement POST /llm/chat/{provider} endpoint with all providers
  • Implement GET /llm/history/{provider} endpoint
  • Implement DELETE /llm/history/{provider} endpoint
  • Add rate limiting middleware
  • Add input validation
  • Create Containerfile
  • Update podman-compose.yml
  • Add health check endpoint
  • Write unit tests
  • Write integration tests

Deployment

  • Add environment variables to deployment
  • Test in staging environment
  • Update PRODUCTION_CHECKLIST.md
  • Document API endpoints in API documentation

Future Enhancements

  1. Streaming Responses: Implement Server-Sent Events (SSE) for real-time streaming
  2. File Attachments: Allow users to upload files for analysis
  3. Conversation Management: Save and organize multiple conversation threads
  4. Prompt Templates: Pre-built prompts for common tasks
  5. Multi-Model Comparison: Send same message to multiple models and compare responses
  6. Custom System Prompts: Allow users to set custom system prompts per provider
  7. Usage Analytics Dashboard: Visualize token usage and costs over time
  8. Admin Monitoring: ADMIN users can see platform-wide LLM usage statistics


Document Version: 1.0.0
Last Updated: 2025-11-24
Status: Frontend Complete, Backend Planned