
LLM Integration Plan

Overview

This document outlines the implementation plan for integrating multiple Large Language Model (LLM) providers into the Copper Tone Technologies platform. The system will allow users to configure API keys for different providers and interact with AI assistants through a unified chat interface.

Supported LLM Providers

  1. OpenAI (ChatGPT) - GPT-4, GPT-4o, GPT-3.5-turbo
  2. Google Gemini - Gemini Pro, Gemini Flash
  3. Anthropic Claude - Claude Sonnet, Opus, Haiku
  4. Qwen AI - Qwen Max, Qwen Plus
  5. HuggingFace - Meta Llama 3.3, Mistral, and other open-source models

Architecture

Frontend Components

1. ChatBox Component (components/ui/ChatBox.vue)

  • Status: Created
  • Features:
    • Floating chat button (bottom-right corner)
    • Expandable/collapsible chat window
    • Provider selection dropdown
    • Message history display
    • Typing indicator
    • Message input with send button
    • Smooth animations and transitions
  • Events:
    • open - Triggered when chat is opened
    • close - Triggered when chat is closed
    • providerChange - Emitted when user switches providers
    • messageSent - Emitted when user sends a message
  • Exposed Methods:
    • addAssistantMessage(content: string) - Add AI response to chat
    • stopTyping() - Stop typing indicator

2. LLM Settings View (views/LLMSettingsView.vue)

  • Status: Created
  • Route: /settings/llm (protected, requires authentication)
  • Features:
    • Grid display of provider cards
    • Configuration status indicators
    • API key management
    • Model selection
    • Temperature and max tokens configuration
    • Secure API key storage
    • Information section about API usage

3. LLM Provider Card (components/views/LLMSettings/LLMProviderCard.vue)

  • Status: Created
  • Features:
    • Provider icon and description
    • Configuration status badge
    • Secure API key input (show/hide toggle)
    • Model selection with default values
    • Advanced settings (temperature, max tokens)
    • Edit/Save/Delete actions
    • Confirmation dialogs for destructive actions

State Management

LLM Store (stores/llm.ts)

  • Status: Created
  • State:
    • configs - API configurations for each provider
    • isConfigured - Boolean flags indicating which providers are configured
    • chatHistory - Message history per provider
    • error - Error messages
    • isLoading - Loading state
  • Actions:
    • loadConfigs() - Load configurations from backend
    • saveConfig(provider, config) - Save provider configuration
    • deleteConfig(provider) - Delete provider configuration
    • sendMessage(provider, message) - Send message to LLM
    • clearHistory(provider) - Clear chat history
    • getDefaultModel(provider) - Get default model for provider

Backend Service

LLM Service (Go)

  • Status: Needs Implementation
  • Location: backend/functions/llm-service/
  • Port: 8085
  • Database Tables:
    CREATE TABLE llm_configs (
      id SERIAL PRIMARY KEY,
      user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
      provider VARCHAR(50) NOT NULL, -- 'openai', 'gemini', 'claude', 'qwen', 'huggingface'
      api_key_encrypted TEXT NOT NULL, -- Encrypted API key
      model VARCHAR(100),
      temperature DECIMAL(3, 2) DEFAULT 0.7,
      max_tokens INTEGER DEFAULT 2048,
      created_at TIMESTAMP DEFAULT NOW(),
      updated_at TIMESTAMP DEFAULT NOW(),
      UNIQUE(user_id, provider)
    );
    
CREATE TABLE llm_chat_history (
      id SERIAL PRIMARY KEY,
      user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
      provider VARCHAR(50) NOT NULL,
      role VARCHAR(20) NOT NULL, -- 'user', 'assistant', 'system'
      content TEXT NOT NULL,
      tokens_used INTEGER,
      created_at TIMESTAMP DEFAULT NOW()
    );

    -- PostgreSQL has no inline INDEX clause; create the index separately
    -- (matching migration 005 below).
    CREATE INDEX idx_llm_history_user_provider
      ON llm_chat_history(user_id, provider, created_at DESC);
    

API Endpoints

1. Get All Configurations
GET /llm/configs
Authorization: Bearer {jwt_token}

Response:
{
  "configs": {
    "openai": {
      "model": "gpt-4o",
      "temperature": 0.7,
      "maxTokens": 2048,
      "apiKey": "sk-***" // Masked for security
    },
    "gemini": { ... }
  }
}
2. Save Configuration
POST /llm/config/{provider}
Authorization: Bearer {jwt_token}
Content-Type: application/json

Request:
{
  "apiKey": "sk-...",
  "model": "gpt-4o",
  "temperature": 0.7,
  "maxTokens": 2048
}

Response:
{
  "success": true,
  "message": "Configuration saved"
}
3. Delete Configuration
DELETE /llm/config/{provider}
Authorization: Bearer {jwt_token}

Response:
{
  "success": true,
  "message": "Configuration deleted"
}
4. Send Chat Message
POST /llm/chat/{provider}
Authorization: Bearer {jwt_token}
Content-Type: application/json

Request:
{
  "message": "Hello, how are you?",
  "history": [
    { "role": "user", "content": "Previous message", "timestamp": "2025-11-24T..." },
    { "role": "assistant", "content": "Previous response", "timestamp": "2025-11-24T..." }
  ]
}

Response:
{
  "response": "I'm doing well, thank you! How can I assist you today?",
  "tokensUsed": 45,
  "model": "gpt-4o"
}
5. Get Chat History
GET /llm/history/{provider}?limit=50&offset=0
Authorization: Bearer {jwt_token}

Response:
{
  "history": [
    {
      "id": 123,
      "role": "user",
      "content": "Hello",
      "tokensUsed": 2,
      "createdAt": "2025-11-24T..."
    },
    {
      "id": 124,
      "role": "assistant",
      "content": "Hi! How can I help?",
      "tokensUsed": 8,
      "createdAt": "2025-11-24T..."
    }
  ],
  "total": 150
}
6. Clear Chat History
DELETE /llm/history/{provider}
Authorization: Bearer {jwt_token}

Response:
{
  "success": true,
  "message": "Chat history cleared"
}
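
To pin down the wire format, here is a minimal Go sketch of the chat endpoint, assuming Go 1.22+ net/http path parameters; the chatRequest/chatResponse types mirror the JSON for endpoint 4, and JWT validation plus the actual provider call are deliberately stubbed out, so names like handleChat are illustrative only.

package main

import (
	"encoding/json"
	"net/http"
)

type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Message string        `json:"message"`
	History []chatMessage `json:"history"`
}

type chatResponse struct {
	Response   string `json:"response"`
	TokensUsed int    `json:"tokensUsed"`
	Model      string `json:"model"`
}

func handleChat(w http.ResponseWriter, r *http.Request) {
	provider := r.PathValue("provider") // Go 1.22+ router path parameter

	var req chatRequest
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		http.Error(w, "invalid request body", http.StatusBadRequest)
		return
	}

	// JWT validation, config lookup, and the provider call are omitted;
	// this sketch only pins down the request/response shapes.
	resp := chatResponse{Response: "...", TokensUsed: 0, Model: provider}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(resp)
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("POST /llm/chat/{provider}", handleChat)
	http.ListenAndServe(":8080", mux)
}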

Provider-Specific Implementation

1. OpenAI Integration

Library: github.com/sashabaranov/go-openai

Configuration:

// Chat completion via github.com/sashabaranov/go-openai.
client := openai.NewClient(apiKey)
resp, err := client.CreateChatCompletion(
  context.Background(),
  openai.ChatCompletionRequest{
    Model:       config.Model,
    Messages:    messages,
    Temperature: float32(config.Temperature), // go-openai expects float32
    MaxTokens:   config.MaxTokens,
  },
)

Default Model: gpt-4o
API Documentation: https://platform.openai.com/docs/api-reference

2. Google Gemini Integration

Library: github.com/google/generative-ai-go

Configuration:

// Text generation via github.com/google/generative-ai-go/genai
// (option is google.golang.org/api/option).
client, err := genai.NewClient(ctx, option.WithAPIKey(apiKey))
model := client.GenerativeModel(config.Model)
model.SetTemperature(float32(config.Temperature)) // the SDK setters take float32
model.SetMaxOutputTokens(int32(config.MaxTokens))
resp, err := model.GenerateContent(ctx, genai.Text(message))

Default Model: gemini-2.0-flash-exp
API Documentation: https://ai.google.dev/docs

3. Anthropic Claude Integration

Library: github.com/anthropics/anthropic-sdk-go

Configuration:

// Messages are created via Messages.New / MessageNewParams in this SDK;
// option is github.com/anthropics/anthropic-sdk-go/option.
client := anthropic.NewClient(option.WithAPIKey(apiKey))
resp, err := client.Messages.New(context.Background(), anthropic.MessageNewParams{
  Model:       anthropic.Model(config.Model),
  Messages:    messages,
  Temperature: anthropic.Float(config.Temperature),
  MaxTokens:   int64(config.MaxTokens),
})

Default Model: claude-sonnet-4-5-20250929
API Documentation: https://docs.anthropic.com/claude/reference

4. Qwen AI Integration

Library: HTTP client with API calls to DashScope

Configuration:

// Using DashScope API
url := "https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation"
// Custom HTTP request with API key in Authorization header

Default Model: qwen-max
API Documentation: https://help.aliyun.com/zh/dashscope/
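
A sketch of the raw HTTP call; the "input.messages" body shape is an assumption taken from the DashScope text-generation docs and should be verified before use.

package qwen

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// CallQwen posts a single-turn prompt to DashScope. The body shape
// ("input.messages") is an assumption from the DashScope docs.
func CallQwen(apiKey, model, message string) (string, error) {
	body, err := json.Marshal(map[string]any{
		"model": model,
		"input": map[string]any{
			"messages": []map[string]string{{"role": "user", "content": message}},
		},
	})
	if err != nil {
		return "", err
	}
	url := "https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation"
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	req.Header.Set("Authorization", "Bearer "+apiKey)
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	raw, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("dashscope: %s: %s", resp.Status, raw)
	}
	return string(raw), nil // caller parses the provider-specific JSON envelope
}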

5. HuggingFace Integration

Library: HTTP client with Inference API

Configuration:

url := fmt.Sprintf("https://api-inference.huggingface.co/models/%s", config.Model)
// Custom HTTP request with API key in Authorization header

Default Model: meta-llama/Llama-3.3-70B-Instruct
API Documentation: https://huggingface.co/docs/api-inference/index
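
A minimal sketch of the Inference API call, assuming the standard text-generation contract ({"inputs": ...} request, [{"generated_text": ...}] response):

package huggingface

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// CallHF runs a text-generation request against the Inference API.
func CallHF(apiKey, model, message string) (string, error) {
	body, err := json.Marshal(map[string]string{"inputs": message})
	if err != nil {
		return "", err
	}
	url := fmt.Sprintf("https://api-inference.huggingface.co/models/%s", model)
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	req.Header.Set("Authorization", "Bearer "+apiKey)
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("inference api returned %s", resp.Status)
	}

	// Text-generation models return a list of candidates.
	var out []struct {
		GeneratedText string `json:"generated_text"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", err
	}
	if len(out) == 0 {
		return "", fmt.Errorf("empty response from inference API")
	}
	return out[0].GeneratedText, nil
}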

Security Considerations

API Key Encryption

  • API keys must be encrypted at rest using AES-256-GCM (see the helper sketch after this list)
  • Encryption key stored in environment variable ENCRYPTION_KEY
  • Keys decrypted only when needed for API calls
  • Never return unencrypted API keys to frontend (mask with sk-***)
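
A minimal sketch of the encrypt/decrypt helpers using Go's standard crypto/aes and crypto/cipher packages; the function names are illustrative, and the 64-character hex ENCRYPTION_KEY decodes to the 32 bytes AES-256 requires.

package crypto

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/base64"
	"encoding/hex"
	"fmt"
	"io"
)

// EncryptAPIKey seals a plaintext key with AES-256-GCM. The random nonce
// is prepended to the ciphertext so DecryptAPIKey can recover it.
func EncryptAPIKey(plaintext, hexKey string) (string, error) {
	key, err := hex.DecodeString(hexKey)
	if err != nil || len(key) != 32 {
		return "", fmt.Errorf("ENCRYPTION_KEY must be 64 hex characters (32 bytes)")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return "", err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return "", err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
		return "", err
	}
	sealed := gcm.Seal(nonce, nonce, []byte(plaintext), nil)
	return base64.StdEncoding.EncodeToString(sealed), nil
}

// DecryptAPIKey reverses EncryptAPIKey; it is called only at request time.
func DecryptAPIKey(encoded, hexKey string) (string, error) {
	key, err := hex.DecodeString(hexKey)
	if err != nil || len(key) != 32 {
		return "", fmt.Errorf("ENCRYPTION_KEY must be 64 hex characters (32 bytes)")
	}
	sealed, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return "", err
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return "", err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return "", err
	}
	if len(sealed) < gcm.NonceSize() {
		return "", fmt.Errorf("ciphertext too short")
	}
	nonce, ciphertext := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	plain, err := gcm.Open(nil, nonce, ciphertext, nil)
	if err != nil {
		return "", err
	}
	return string(plain), nil
}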

Rate Limiting

  • Implement per-user rate limiting (e.g., 100 requests/hour); see the middleware sketch below
  • Prevent abuse of expensive API calls
  • Return 429 status code when rate limit exceeded
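
A sketch of the per-user limiter built on golang.org/x/time/rate; the JWT extraction function is passed in rather than implemented here, and the exact rate and burst values are placeholders.

package middleware

import (
	"net/http"
	"sync"

	"golang.org/x/time/rate"
)

var (
	mu       sync.Mutex
	limiters = make(map[int]*rate.Limiter) // never evicted; fine for a sketch
)

// limiterFor lazily creates a ~100 requests/hour limiter per user.
func limiterFor(userID int) *rate.Limiter {
	mu.Lock()
	defer mu.Unlock()
	l, ok := limiters[userID]
	if !ok {
		l = rate.NewLimiter(rate.Limit(100.0/3600.0), 5) // steady 100/h, burst 5
		limiters[userID] = l
	}
	return l
}

// RateLimit wraps a handler and returns 429 once a user exceeds the limit.
// userID is the caller-supplied JWT extraction function.
func RateLimit(userID func(*http.Request) (int, error), next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		id, err := userID(r)
		if err != nil {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		if !limiterFor(id).Allow() {
			http.Error(w, "rate limit exceeded", http.StatusTooManyRequests)
			return
		}
		next.ServeHTTP(w, r)
	})
}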

Input Validation

  • Validate message content (max length, sanitize HTML); see the validation sketch after this list
  • Validate temperature (0.0 - 2.0)
  • Validate max tokens (1 - 32000)
  • Sanitize user inputs to prevent injection attacks
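
A sketch of the bounds checks above as a single helper; the message length cap is an assumed value.

package validation

import "fmt"

const maxMessageLen = 8000 // assumed cap, in bytes; tune to the UI's limit

// ValidateChatInput enforces the bounds listed above in one place.
// HTML sanitization happens separately before storage or rendering.
func ValidateChatInput(message string, temperature float64, maxTokens int) error {
	if len(message) == 0 || len(message) > maxMessageLen {
		return fmt.Errorf("message must be 1-%d bytes", maxMessageLen)
	}
	if temperature < 0.0 || temperature > 2.0 {
		return fmt.Errorf("temperature must be between 0.0 and 2.0")
	}
	if maxTokens < 1 || maxTokens > 32000 {
		return fmt.Errorf("max tokens must be between 1 and 32000")
	}
	return nil
}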

Access Control

  • Only authenticated users can access LLM features
  • Users can only access their own configurations
  • JWT token validation on all endpoints (a token-parsing sketch follows this list)
  • RBAC: All roles (CLIENT, STAFF, ADMIN) can use chatbot
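
A sketch of bearer-token validation using github.com/golang-jwt/jwt/v5; the "user_id" claim name is an assumption about the auth service's token payload.

package auth

import (
	"fmt"
	"net/http"
	"os"
	"strings"

	"github.com/golang-jwt/jwt/v5"
)

// UserIDFromRequest validates the Authorization: Bearer token against the
// shared JWT_SECRET and returns the authenticated user's ID.
func UserIDFromRequest(r *http.Request) (int, error) {
	tokenStr, ok := strings.CutPrefix(r.Header.Get("Authorization"), "Bearer ")
	if !ok {
		return 0, fmt.Errorf("missing bearer token")
	}
	token, err := jwt.Parse(tokenStr, func(t *jwt.Token) (any, error) {
		if _, ok := t.Method.(*jwt.SigningMethodHMAC); !ok {
			return nil, fmt.Errorf("unexpected signing method")
		}
		return []byte(os.Getenv("JWT_SECRET")), nil
	})
	if err != nil || !token.Valid {
		return 0, fmt.Errorf("invalid token")
	}
	claims, ok := token.Claims.(jwt.MapClaims)
	if !ok {
		return 0, fmt.Errorf("unexpected claims type")
	}
	id, ok := claims["user_id"].(float64) // JSON numbers decode as float64
	if !ok {
		return 0, fmt.Errorf("missing user_id claim")
	}
	return int(id), nil
}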

Cost Management

Token Tracking

  • Record tokens used for each request
  • Display usage statistics in settings
  • Optional: Set per-user token limits

Cost Estimation

  • Calculate estimated costs based on provider pricing (see the sketch after this list)
  • Display warnings when approaching limits
  • Provider pricing (approximate):
    • OpenAI GPT-4o: $2.50/$10.00 per 1M input/output tokens
    • Gemini Flash: Free tier available, $0.075/$0.30 per 1M tokens
    • Claude Sonnet: $3.00/$15.00 per 1M input/output tokens
    • Qwen: Varies by region
    • HuggingFace: Free for limited usage, paid tiers available
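
The estimate itself is a per-token multiply. For example, 1,000 input plus 500 output tokens on GPT-4o comes to about $0.0025 + $0.0050 = $0.0075. A sketch hard-coding the approximate rates above (real pricing drifts and should be sourced from the provider pages):

package cost

// Approximate USD prices per 1M tokens (input, output), mirroring the list
// above; these change over time and should come from provider pricing pages.
var pricing = map[string][2]float64{
	"gpt-4o":        {2.50, 10.00},
	"gemini-flash":  {0.075, 0.30},
	"claude-sonnet": {3.00, 15.00},
}

// Estimate returns the approximate USD cost of one exchange.
func Estimate(model string, inputTokens, outputTokens int) float64 {
	p, ok := pricing[model]
	if !ok {
		return 0 // unknown model: no estimate
	}
	return float64(inputTokens)/1e6*p[0] + float64(outputTokens)/1e6*p[1]
}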

Testing Plan

Unit Tests

  • Test LLM config CRUD operations
  • Test API key encryption/decryption
  • Test message history storage
  • Test rate limiting logic

Integration Tests

  • Test each provider integration with test API keys
  • Verify error handling for invalid keys
  • Test chat completion flow end-to-end

E2E Tests (Cypress)

  • Test opening/closing chatbox
  • Test provider selection
  • Test sending messages
  • Test configuration management in settings

Deployment

Environment Variables

# LLM Service
LLM_SERVICE_PORT=8085
ENCRYPTION_KEY=<64-character-hex-string>

# Database
DB_HOST=db
DB_PORT=5432
DB_USER=user
DB_PASSWORD=password
DB_NAME=coppertone_db

# JWT
JWT_SECRET=<same-as-auth-service>

podman-compose.yml

llm-service:
  build:
    context: ./backend/functions/llm-service
    dockerfile: Containerfile
  ports:
    - "8085:8080"
  environment:
    - DB_HOST=db
    - DB_USER=user
    - DB_PASSWORD=password
    - DB_NAME=coppertone_db
    - JWT_SECRET=${JWT_SECRET}
    - ENCRYPTION_KEY=${ENCRYPTION_KEY}
  depends_on:
    - db
  healthcheck:
    test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
    interval: 30s
    timeout: 10s
    retries: 3
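
The healthcheck above curls /health inside the container (port 8080); a minimal sketch of the matching handler follows, though a production version might also ping the database before reporting healthy.

package main

import (
	"encoding/json"
	"net/http"
)

// healthHandler answers the compose healthcheck with a static JSON payload.
func healthHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(map[string]string{"status": "ok"})
}

func main() {
	http.HandleFunc("/health", healthHandler)
	http.ListenAndServe(":8080", nil)
}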

Migration

-- migrations/005_create_llm_tables.up.sql
CREATE TABLE IF NOT EXISTS llm_configs (
  id SERIAL PRIMARY KEY,
  user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
  provider VARCHAR(50) NOT NULL,
  api_key_encrypted TEXT NOT NULL,
  model VARCHAR(100),
  temperature DECIMAL(3, 2) DEFAULT 0.7,
  max_tokens INTEGER DEFAULT 2048,
  created_at TIMESTAMP DEFAULT NOW(),
  updated_at TIMESTAMP DEFAULT NOW(),
  UNIQUE(user_id, provider)
);

CREATE INDEX idx_llm_configs_user ON llm_configs(user_id);

CREATE TABLE IF NOT EXISTS llm_chat_history (
  id SERIAL PRIMARY KEY,
  user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
  provider VARCHAR(50) NOT NULL,
  role VARCHAR(20) NOT NULL,
  content TEXT NOT NULL,
  tokens_used INTEGER,
  created_at TIMESTAMP DEFAULT NOW()
);

CREATE INDEX idx_llm_history_user_provider ON llm_chat_history(user_id, provider, created_at DESC);

Implementation Checklist

Frontend (Completed)

  • ChatBox component created
  • LLM Settings view created
  • LLM Provider Card component created
  • LLM store (Pinia) created
  • Router updated with settings route
  • Add chatbox to dashboard layouts
  • Add "AI Assistant Settings" link to dashboard navigation
  • Test UI components locally

Backend (To Do)

  • Create backend/functions/llm-service/ directory
  • Initialize Go module
  • Implement API key encryption/decryption
  • Create database migration (005)
  • Implement GET /llm/configs endpoint
  • Implement POST /llm/config/{provider} endpoint
  • Implement DELETE /llm/config/{provider} endpoint
  • Implement POST /llm/chat/{provider} endpoint with all providers
  • Implement GET /llm/history/{provider} endpoint
  • Implement DELETE /llm/history/{provider} endpoint
  • Add rate limiting middleware
  • Add input validation
  • Create Containerfile
  • Update podman-compose.yml
  • Add health check endpoint
  • Write unit tests
  • Write integration tests

Deployment

  • Add environment variables to deployment
  • Test in staging environment
  • Update PRODUCTION_CHECKLIST.md
  • Document API endpoints in API documentation

Future Enhancements

  1. Streaming Responses: Implement Server-Sent Events (SSE) for real-time streaming
  2. File Attachments: Allow users to upload files for analysis
  3. Conversation Management: Save and organize multiple conversation threads
  4. Prompt Templates: Pre-built prompts for common tasks
  5. Multi-Model Comparison: Send same message to multiple models and compare responses
  6. Custom System Prompts: Allow users to set custom system prompts per provider
  7. Usage Analytics Dashboard: Visualize token usage and costs over time
  8. Admin Monitoring: ADMIN users can see platform-wide LLM usage statistics


Document Version: 1.0.0
Last Updated: 2025-11-24
Status: Frontend Complete, Backend Planned