Vaayne 75766dbfdc feat: Add POC command page structure and routing
- Created CommandPocPage.tsx with basic layout structure
- Added POC-specific TypeScript interfaces and types
- Implemented basic UI components: PocHeader, PocMessageList, PocMessageBubble, PocCommandInput, PocStatusBar
- Added /command-poc route to Router.tsx
- Set up component folder structure following PRD specifications

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-03 11:43:02 +08:00

Product Requirements Document (PRD)

Cherry Studio AI Agent Command Interface

1. Overview

Product Name: Cherry Studio AI Agent Command Interface
Version: 1.0
Date: July 30, 2025

Vision: Create a conversational AI Agent interface in Cherry Studio that lets users execute shell commands through natural language, backed by reliable communication between the renderer and main processes.

2. Scope & Objectives

This PRD focuses on two core areas:

2.1 Core Implementation Scope

  • Renderer ↔ Main Process Communication: Robust IPC communication for command execution
  • Shell Command Execution: Safe and efficient shell command processing in the main process
  • Real-time Output Streaming: Live command output display integrated into chat interface
  • AI Agent Integration: Natural language command interpretation and execution workflow

2.2 UI/UX Design Scope

  • Conversational Interface Design: Chat-like UI that fits Cherry Studio's design language
  • Command Agent Experience: AI-powered command interpretation and execution feedback
  • Interactive Output Display: Rich formatting of command results within chat messages
  • Responsive Design: Consistent chat experience across different window sizes and layouts

3. Technical Requirements

3.1 Core Implementation Requirements

3.1.1 IPC Communication Architecture

Requirement: Establish bidirectional communication between renderer and main processes for AI Agent command execution

Technical Specifications:

  • Agent Command Request Flow: Renderer → Main Process

    interface AgentCommandRequest {
      id: string
      messageId: string  // Chat message ID for correlation
      command: string
      workingDirectory?: string
      timeout?: number
      environment?: Record<string, string>
      context?: string   // Additional context from chat conversation
    }
    
  • Agent Output Streaming Flow: Main Process → Renderer

    interface AgentCommandOutput {
      id: string
      messageId: string  // Chat message ID for correlation
      type: 'stdout' | 'stderr' | 'exit' | 'error' | 'progress'
      data: string
      exitCode?: number
      timestamp: number
    }
    
  • IPC Channel Names:

    • agent-command-execute (Renderer → Main)
    • agent-command-output (Main → Renderer)
    • agent-command-interrupt (Renderer → Main)
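
IPC payloads reach the main process as untyped data, so the AgentCommandRequest interface only helps at compile time. The following is a minimal sketch of a runtime guard under that assumption. `isAgentCommandRequest` is a hypothetical helper that this PRD does not specify, and the exact checks (non-empty command, positive timeout) are illustrative choices.

```typescript
// Mirrors the AgentCommandRequest interface defined in this section.
interface AgentCommandRequest {
  id: string
  messageId: string
  command: string
  workingDirectory?: string
  timeout?: number
  environment?: Record<string, string>
  context?: string
}

const isOptionalString = (v: unknown): boolean =>
  v === undefined || typeof v === 'string'

// Hypothetical guard for payloads received on agent-command-execute.
function isAgentCommandRequest(value: unknown): value is AgentCommandRequest {
  if (typeof value !== 'object' || value === null) return false
  const v = value as Record<string, unknown>

  if (typeof v.id !== 'string' || typeof v.messageId !== 'string') return false
  if (typeof v.command !== 'string' || v.command.trim() === '') return false
  if (!isOptionalString(v.workingDirectory) || !isOptionalString(v.context)) return false

  // Assumed rule: a timeout, when present, must be a finite positive number.
  const timeout = v.timeout
  if (timeout !== undefined) {
    if (typeof timeout !== 'number' || !Number.isFinite(timeout) || timeout <= 0) return false
  }

  // Environment values must all be strings.
  const env = v.environment
  if (env !== undefined) {
    if (typeof env !== 'object' || env === null || Array.isArray(env)) return false
    if (!Object.values(env).every((x) => typeof x === 'string')) return false
  }
  return true
}
```

A handler for the execute channel could call this guard first and answer with an 'error' output when it returns false, rather than passing malformed data to the command service.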

3.1.2 Main Process Agent Command Service

Requirement: Create a new AgentCommandService in the main process

Technical Specifications:

  • Service Location: src/main/services/AgentCommandService.ts

  • Core Methods:

    class AgentCommandService {
      executeCommand(request: AgentCommandRequest): Promise<void>
      interruptCommand(commandId: string): Promise<void>
      getRunningCommands(): string[]
      setWorkingDirectory(path: string): void
      formatCommandOutput(output: string, type: string): string
    }
    
  • Process Management:

    • Use Node.js child_process.spawn() for command execution
    • Support real-time stdout/stderr streaming to chat interface
    • Handle process interruption via chat commands
    • Maintain working directory state per agent session
    • Format output for better chat display (tables, JSON, etc.)
  • Error Handling:

    • Command not found errors with helpful suggestions
    • Permission denied errors with explanations
    • Timeout handling with progress updates
    • Process termination with cleanup notifications
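
One way to produce the helpful error messages listed above is to translate the shell's exit status into a short note for the chat. The sketch below assumes commands run through a POSIX shell, where exit codes 126 and 127 conventionally mean "found but not executable" and "command not found". `explainExit` is a hypothetical helper, the wording of the messages is illustrative, and cmd.exe on Windows uses different codes that are not covered here.

```typescript
// Hypothetical helper: turn a POSIX shell exit status into a chat-friendly note.
// `exitCode` and `signal` correspond to the two arguments Node.js passes to a
// child process 'close' listener.
function explainExit(exitCode: number | null, signal: string | null): string {
  if (signal !== null) {
    return `Command was terminated by signal ${signal}.`
  }
  if (exitCode === null) {
    return 'Command ended without reporting an exit code.'
  }
  switch (exitCode) {
    case 0:
      return 'Command completed successfully.'
    case 126:
      return 'Command was found but could not be executed. Check file permissions.'
    case 127:
      return 'Command not found. Check the spelling and whether it is installed and on PATH.'
    default:
      return `Command exited with code ${exitCode}.`
  }
}
```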

3.1.3 Renderer Process Integration

Requirement: Implement AI Agent command functionality in the renderer process

Technical Specifications:

  • Service Location: src/renderer/src/services/AgentCommandService.ts
  • Component Integration: Agent chat page and command execution components
  • State Management: Chat session state, command history, output formatting
  • Message Correlation: Link command outputs to specific chat messages

3.2 Performance Requirements

  • Command Response Time: < 100ms for command initiation
  • Output Streaming Latency: < 50ms for real-time output display
  • Memory Management: Efficient handling of large command outputs (>10MB)
  • Concurrent Commands: Support up to 5 simultaneous command executions
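
The limit of 5 simultaneous commands could be enforced with a small slot tracker in the main process. This is a sketch only: `CommandSlots` and its method names are hypothetical, and rejecting the sixth command outright (instead of queueing it) is an assumption the requirement does not settle.

```typescript
const MAX_CONCURRENT_COMMANDS = 5 // limit taken from the requirement above

// Hypothetical tracker for running command IDs.
class CommandSlots {
  private readonly running = new Set<string>()
  private readonly limit: number

  constructor(limit: number = MAX_CONCURRENT_COMMANDS) {
    this.limit = limit
  }

  // Returns true if the command may start, false if all slots are taken.
  tryAcquire(commandId: string): boolean {
    if (this.running.has(commandId)) return true
    if (this.running.size >= this.limit) return false
    this.running.add(commandId)
    return true
  }

  // Call when the command exits, errors, or is interrupted.
  release(commandId: string): void {
    this.running.delete(commandId)
  }

  get runningCount(): number {
    return this.running.size
  }
}
```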

3.3 Security Requirements

  • Command Validation: Basic validation for dangerous commands
  • Working Directory Restrictions: Respect file system permissions
  • Environment Variable Handling: Secure handling of environment variables
  • Process Isolation: Commands run with application user privileges
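
"Basic validation for dangerous commands" could start as a small denylist checked before execution. The sketch below is illustrative only: the three patterns are assumptions, they are far from exhaustive, and pattern matching of this kind is easy to bypass, so it should be read as a guard against accidents and not as a security boundary. `validateCommand` is a hypothetical name.

```typescript
interface ValidationResult {
  allowed: boolean
  reason?: string
}

// Illustrative patterns only; a real list would need review per platform.
const DANGEROUS_PATTERNS: Array<{ pattern: RegExp; reason: string }> = [
  // rm with a recursive flag aimed at the filesystem root, e.g. "rm -rf /"
  { pattern: /\brm\s+-[a-z]*r[a-z]*\s+\/(\s|$)/i, reason: 'recursive delete of the filesystem root' },
  // mkfs, mkfs.ext4, ...
  { pattern: /\bmkfs(\.\w+)?\b/i, reason: 'formats a filesystem' },
  // dd writing directly to a device node
  { pattern: /\bdd\b.*\bof=\/dev\//i, reason: 'writes raw data to a device' }
]

function validateCommand(command: string): ValidationResult {
  for (const { pattern, reason } of DANGEROUS_PATTERNS) {
    if (pattern.test(command)) return { allowed: false, reason }
  }
  return { allowed: true }
}
```

A blocked command would typically be answered in chat with the reason and a request for explicit confirmation, which fits the confirmation flow planned for Phase 3.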

4. UI/UX Design Requirements

4.1 Design Principles

Target Audience: Senior frontend engineers and UI designers
Design Goals: Create an intuitive, conversational AI Agent interface that enhances developer productivity through natural language command execution

4.1.1 Visual Design Requirements

  • Design System Integration: Follow Cherry Studio's existing chat design patterns
  • Theme Support: Light/dark theme compatibility
  • Typography: Mix of regular chat font and monospace for command outputs
  • Color Scheme: Distinct styling for user messages, agent responses, and command outputs
  • Message Bubbles: Clear visual distinction between conversation and command execution

4.1.2 Layout Requirements

Primary Layout Structure (Chat Interface):

┌─────────────────────────────────────────┐
│ Agent Header (name + status + controls) │
├─────────────────────────────────────────┤
│                                         │
│        Chat Messages Area               │
│     (user messages + agent replies      │
│      + command outputs)                 │
│                                         │
├─────────────────────────────────────────┤
│ Message Input (natural language)        │
└─────────────────────────────────────────┘

Responsive Considerations:

  • Minimum width: 320px (mobile)
  • Optimal width: 600-800px (desktop)
  • Message bubbles adapt to content width
  • Command outputs can expand full width

4.1.3 Component Specifications

Agent Header Component:

  • Agent name and avatar
  • Working directory indicator
  • Active command status (running/idle)
  • Session controls (clear chat, export logs)

Chat Messages Component:

  • User Messages: Standard chat bubbles for natural language input
  • Agent Responses: AI responses explaining commands or asking for clarification
  • Command Execution Messages: Special formatting for:
    • Command being executed (with syntax highlighting)
    • Real-time output streaming (scrollable, copyable)
    • Execution status (success/error/interrupted)
    • Formatted results (tables, JSON, file listings)

Message Input Component:

  • Natural language input field
  • Send button with loading state during command execution
  • Suggestion chips for common requests
  • Support for follow-up questions and command modifications

4.2 User Experience Requirements

4.2.1 Interaction Patterns

Conversational Flow:

  • User types natural language requests ("list files in src directory")
  • Agent interprets and confirms command before execution
  • Real-time command output appears in chat
  • User can ask follow-up questions or modify commands

Keyboard Shortcuts:

  • Enter: Send message/command
  • Ctrl+Enter: Force command execution without confirmation
  • Ctrl+K: Interrupt running command
  • Ctrl+L: Clear chat history
  • ↑/↓: Navigate message input history
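
The shortcut table above can be kept in one place as a pure function that maps a key event to an action. This is a sketch: `resolveShortcut` and the action names are hypothetical, only the `key` and `ctrlKey` fields of a keyboard event are read, and platform conventions such as Cmd on macOS are not handled.

```typescript
type ShortcutAction =
  | 'send'
  | 'force-execute'
  | 'interrupt'
  | 'clear-chat'
  | 'history-prev'
  | 'history-next'

// The subset of a DOM KeyboardEvent that the mapping needs.
interface KeyInfo {
  key: string
  ctrlKey: boolean
}

function resolveShortcut(e: KeyInfo): ShortcutAction | null {
  // Letter keys may arrive as 'k' or 'K' depending on Shift/Caps Lock.
  const key = e.key.length === 1 ? e.key.toLowerCase() : e.key

  if (e.ctrlKey) {
    if (key === 'Enter') return 'force-execute'
    if (key === 'k') return 'interrupt'
    if (key === 'l') return 'clear-chat'
    return null
  }
  if (key === 'Enter') return 'send'
  if (key === 'ArrowUp') return 'history-prev'
  if (key === 'ArrowDown') return 'history-next'
  return null
}
```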

Mouse Interactions:

  • Click on command outputs to copy
  • Click on file paths to open in Cherry Studio
  • Hover over commands for quick actions (copy, re-run, modify)

4.2.2 Feedback & Status Indicators

Visual Feedback Requirements:

  • Agent Thinking: Typing indicator while processing user request
  • Command Execution: Progress indicator and real-time output streaming
  • Execution Status: Success/error/warning indicators in message bubbles
  • Working Directory: Persistent display in agent header
  • Command History: Visual indication of previous commands in chat

4.2.3 Accessibility Requirements

  • Keyboard Navigation: Full chat functionality accessible via keyboard
  • Screen Reader Support: Proper ARIA labels for chat messages and command outputs
  • High Contrast: Support for high contrast themes in all message types
  • Focus Management: Logical tab order through chat interface

4.3 Advanced UX Features (Future Considerations)

  • Command Suggestions: AI-powered suggestions based on current context
  • Smart Output Formatting: Automatic formatting for JSON, tables, logs, etc.
  • File Integration: Deep integration with Cherry Studio's file management
  • Session Memory: Agent remembers context across chat sessions
  • Multi-step Workflows: Support for complex, multi-command operations

5. Implementation Approach

5.1 Development Phases

Phase 1: Core Infrastructure (2-3 weeks)

  • Implement AgentCommandService in main process
  • Establish IPC communication for chat-command flow
  • Basic command execution and output streaming to chat interface

Phase 2: AI Agent Chat Interface (3-4 weeks)

  • Design and implement conversational chat components
  • Create command execution message types and formatting
  • Integrate natural language command interpretation
  • Implement real-time output streaming in chat bubbles

Phase 3: Enhanced Agent Features (2-3 weeks)

  • Add command confirmation and clarification flows
  • Implement smart output formatting (tables, JSON, etc.)
  • Add working directory management in chat context
  • Integrate with Cherry Studio's existing AI infrastructure

5.2 Integration Points

  • Router Integration: Add /agent or /command-agent route to src/renderer/src/Router.tsx
  • Navigation: Add agent icon to Cherry Studio's main navigation
  • AI Core Integration: Leverage existing AI infrastructure for command interpretation
  • Settings Integration: Agent preferences in application settings
  • Chat System: Reuse existing chat components and patterns from Cherry Studio

6. Success Metrics

6.1 Technical Metrics

  • Command execution success rate: >99%
  • Average command response time: <100ms
  • Output streaming latency: <50ms
  • Zero memory leaks during extended usage

6.2 User Experience Metrics

  • User adoption rate within first month
  • Average chat session duration
  • Natural language command interpretation accuracy
  • Command execution success rate through conversational interface
  • User feedback scores on AI Agent usability and helpfulness

7. Dependencies & Constraints

7.1 Technical Dependencies

  • Node.js child_process module
  • Electron IPC capabilities
  • Cherry Studio's existing service architecture
  • React/TypeScript frontend stack
  • Cherry Studio's AI Core infrastructure
  • Existing chat components and design system

7.2 Platform Constraints

  • Cross-platform compatibility (Windows, macOS, Linux)
  • Shell availability on target platforms
  • File system permission handling

8. Proof of Concept (POC) Implementation

8.1 POC Objectives

Primary Goal: Validate the core concept of chat-based command execution with minimal implementation complexity.

Key Validation Points:

  • User experience of command execution through chat interface
  • Technical feasibility of IPC communication for real-time output streaming
  • Performance characteristics of command output display in chat bubbles
  • Cross-platform compatibility of basic shell command execution

8.2 POC Scope & Limitations

8.2.1 Included Features

  • Direct Command Execution: Users type shell commands directly (no AI interpretation)
  • Real-time Output Streaming: Command output appears live in chat bubbles
  • Basic Chat Interface: Simple message list with input field
  • Command History: Navigate previous commands with arrow keys
  • Cross-platform Support: Works on Windows, macOS, and Linux
  • Process Management: Start/stop command execution

8.2.2 Excluded Features (Future Work)

  • AI natural language interpretation of commands
  • Command confirmation or clarification flows
  • Advanced output formatting (tables, JSON highlighting)
  • Security validation and command filtering
  • Session persistence between app restarts
  • Multiple concurrent command execution
  • Working directory management UI
  • Integration with Cherry Studio's AI core

8.3 Technical Architecture

8.3.1 Component Structure

src/renderer/src/pages/command-poc/
├── CommandPocPage.tsx              # Main container component
├── components/
│   ├── PocHeader.tsx              # Header with working directory
│   ├── PocMessageList.tsx         # Scrollable message container
│   ├── PocMessageBubble.tsx       # Individual message display
│   ├── PocCommandInput.tsx        # Command input with history
│   └── PocStatusBar.tsx           # Command execution status
├── hooks/
│   ├── usePocMessages.ts          # Message state management
│   ├── usePocCommand.ts           # Command execution logic
│   └── useCommandHistory.ts       # Input history navigation
└── types.ts                       # POC-specific TypeScript interfaces

8.3.2 Data Structures

interface PocMessage {
  id: string
  type: 'user-command' | 'output' | 'error' | 'system'
  content: string
  timestamp: number
  commandId?: string  // Links output to originating command
  isComplete: boolean // For streaming messages
}

interface PocCommandExecution {
  id: string
  command: string
  startTime: number
  endTime?: number
  exitCode?: number
  isRunning: boolean
}

8.3.3 IPC Communication

// Renderer → Main Process
interface PocExecuteCommandRequest {
  id: string
  command: string
  workingDirectory: string
}

// Main Process → Renderer
interface PocCommandOutput {
  commandId: string
  type: 'stdout' | 'stderr' | 'exit' | 'error'
  data: string
  exitCode?: number
}

// IPC Channels
const IPC_CHANNELS = {
  EXECUTE_COMMAND: 'poc-execute-command',
  COMMAND_OUTPUT: 'poc-command-output',
  INTERRUPT_COMMAND: 'poc-interrupt-command'
}
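
The renderer needs to fold each PocCommandOutput event into the message list, including marking the output bubble complete when the process exits. A minimal sketch of that step as a pure function follows. `applyOutput` is a hypothetical name, and the sketch assumes the request `id` equals the `id` of the user-command message, so that an output message's `commandId` matches the event's `commandId`.

```typescript
// Copied from sections 8.3.2 and 8.3.3 so the sketch is self-contained.
interface PocMessage {
  id: string
  type: 'user-command' | 'output' | 'error' | 'system'
  content: string
  timestamp: number
  commandId?: string
  isComplete: boolean
}

interface PocCommandOutput {
  commandId: string
  type: 'stdout' | 'stderr' | 'exit' | 'error'
  data: string
  exitCode?: number
}

// Returns a new message list with the event applied to the matching output message.
function applyOutput(messages: PocMessage[], output: PocCommandOutput): PocMessage[] {
  return messages.map((msg): PocMessage => {
    if (msg.type === 'user-command' || msg.commandId !== output.commandId) return msg
    switch (output.type) {
      case 'stdout':
      case 'stderr':
        // Streamed chunks are appended in arrival order.
        return { ...msg, content: msg.content + output.data }
      case 'error':
        // Spawn or execution failure: show the bubble as an error and stop the spinner.
        return { ...msg, type: 'error', content: msg.content + output.data, isComplete: true }
      case 'exit':
        return { ...msg, isComplete: true }
    }
  })
}
```

In a React hook this would be used as `setMessages(prev => applyOutput(prev, output))` inside the listener for the output channel.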

8.4 Implementation Details

8.4.1 Main Process Implementation

File: src/main/poc/commandExecutor.ts

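import { spawn } from 'child_process'
import type { ChildProcess } from 'child_process'

class PocCommandExecutor {
  private activeProcesses = new Map<string, ChildProcess>()
  // Forwards output to the renderer. Assumed to wrap something like
  // webContents.send(IPC_CHANNELS.COMMAND_OUTPUT, output).
  private send: (output: PocCommandOutput) => void

  constructor(send: (output: PocCommandOutput) => void) {
    this.send = send
  }

  executeCommand(request: PocExecuteCommandRequest): void {
    const shell = process.platform === 'win32' ? 'cmd' : 'bash'
    const args = process.platform === 'win32' ? ['/c'] : ['-c']

    const child = spawn(shell, [...args, request.command], {
      cwd: request.workingDirectory
    })

    this.activeProcesses.set(request.id, child)

    // Stream output handling
    child.stdout.on('data', (data: Buffer) => {
      this.sendOutput(request.id, 'stdout', data.toString())
    })

    child.stderr.on('data', (data: Buffer) => {
      this.sendOutput(request.id, 'stderr', data.toString())
    })

    // Spawn failures (for example a missing shell or invalid cwd) are reported here
    child.on('error', (err) => {
      this.sendOutput(request.id, 'error', err.message)
      this.activeProcesses.delete(request.id)
    })

    child.on('close', (code) => {
      // code is null when the process was terminated by a signal
      this.sendOutput(request.id, 'exit', '', code ?? undefined)
      this.activeProcesses.delete(request.id)
    })
  }

  private sendOutput(commandId: string, type: PocCommandOutput['type'], data: string, exitCode?: number): void {
    this.send({ commandId, type, data, exitCode })
  }
}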
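
The executor tracks active processes but does not show how the interrupt channel is served. The sketch below illustrates one possible shape for that lookup-and-kill step. `ProcessRegistry` and `Killable` are hypothetical names; `Killable` stands in for the single `kill` method of a Node.js ChildProcess so the logic can be exercised without spawning anything. Sending SIGINT is an assumption: on Windows, Node.js terminates the process abruptly regardless of the signal, and children started by the shell may need separate handling.

```typescript
// Stand-in for the part of a Node.js ChildProcess that interruption needs.
interface Killable {
  kill(signal?: string): boolean
}

class ProcessRegistry {
  private readonly active = new Map<string, Killable>()

  register(commandId: string, proc: Killable): void {
    this.active.set(commandId, proc)
  }

  // Call from the process 'close' handler.
  unregister(commandId: string): void {
    this.active.delete(commandId)
  }

  // Returns false when the command is unknown or the signal could not be sent.
  // The entry is left in place so the 'close' handler stays the single point
  // of cleanup.
  interrupt(commandId: string): boolean {
    const proc = this.active.get(commandId)
    if (!proc) return false
    return proc.kill('SIGINT')
  }
}
```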

8.4.2 Renderer Process Implementation

State Management Strategy:

import { useState } from 'react'
import { v4 as uuid } from 'uuid' // assumed ID generator; any unique-ID helper works

const usePocMessages = () => {
  const [messages, setMessages] = useState<PocMessage[]>([])
  const [activeCommand, setActiveCommand] = useState<string | null>(null)

  // Adds the command bubble plus an empty output bubble that streaming fills in
  const addUserCommand = (command: string) => {
    const commandMessage: PocMessage = {
      id: uuid(),
      type: 'user-command',
      content: command,
      timestamp: Date.now(),
      isComplete: true
    }

    const outputMessage: PocMessage = {
      id: uuid(),
      type: 'output',
      content: '',
      timestamp: Date.now(),
      commandId: commandMessage.id,
      isComplete: false
    }

    setMessages(prev => [...prev, commandMessage, outputMessage])
    return outputMessage.id
  }

  // Appends a streamed chunk to the output bubble with the given ID
  const appendOutput = (messageId: string, data: string) => {
    setMessages(prev => prev.map(msg =>
      msg.id === messageId
        ? { ...msg, content: msg.content + data }
        : msg
    ))
  }

  return { messages, activeCommand, setActiveCommand, addUserCommand, appendOutput }
}

Output Streaming with Buffering:

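import { useRef } from 'react'

// appendOutput is passed in (for example from usePocMessages).
// A single buffer is enough because the POC runs one command at a time.
const useOutputBuffer = (appendOutput: (messageId: string, data: string) => void) => {
  const bufferRef = useRef<string>('')
  const timeoutRef = useRef<ReturnType<typeof setTimeout> | null>(null)

  const bufferOutput = (data: string, messageId: string) => {
    bufferRef.current += data

    // Flush at most once every 100ms. A pending timer is left running instead of
    // being reset, so continuous output cannot postpone the flush indefinitely.
    if (timeoutRef.current !== null) return
    timeoutRef.current = setTimeout(() => {
      appendOutput(messageId, bufferRef.current)
      bufferRef.current = ''
      timeoutRef.current = null
    }, 100)
  }

  return { bufferOutput }
}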

8.4.3 UI Components

Message Bubble Component:

const PocMessageBubble: React.FC<{ message: PocMessage }> = ({ message }) => {
  const isUserCommand = message.type === 'user-command'
  
  return (
    <MessageContainer isUser={isUserCommand}>
      {isUserCommand ? (
        <CommandBubble>
          <CommandPrefix>$</CommandPrefix>
          <CommandText>{message.content}</CommandText>
        </CommandBubble>
      ) : (
        <OutputBubble>
          <pre>{message.content}</pre>
          {!message.isComplete && <LoadingDots />}
        </OutputBubble>
      )}
    </MessageContainer>
  )
}

Command Input with History:

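import React, { useState } from 'react'

interface PocCommandInputProps {
  onSendCommand: (command: string) => void
}

const PocCommandInput: React.FC<PocCommandInputProps> = ({ onSendCommand }) => {
  const [input, setInput] = useState('')
  const { addToHistory, navigateHistory } = useCommandHistory()

  const handleKeyDown = (e: React.KeyboardEvent<HTMLInputElement>) => {
    switch (e.key) {
      case 'Enter':
        if (input.trim()) {
          onSendCommand(input.trim())
          addToHistory(input.trim())
          setInput('')
        }
        break
      case 'ArrowUp':
        e.preventDefault() // keep the caret from jumping to the start of the input
        setInput(navigateHistory('up'))
        break
      case 'ArrowDown':
        e.preventDefault()
        setInput(navigateHistory('down'))
        break
    }
  }

  return (
    <input
      value={input}
      onChange={(e) => setInput(e.target.value)}
      onKeyDown={handleKeyDown}
    />
  )
}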
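
The input component relies on useCommandHistory, which this document lists but does not define. A minimal sketch of the navigation logic as a plain class follows; a hook could hold one instance in a ref. `CommandHistory` is a hypothetical name, and the behavior (skipping consecutive duplicates, returning an empty string when stepping past the newest entry) is an assumption modeled on typical shell history.

```typescript
class CommandHistory {
  private entries: string[] = []
  // Index into entries; entries.length means "fresh, empty input line".
  private cursor = 0

  add(command: string): void {
    // Skip consecutive duplicates so repeated commands occupy one slot.
    if (this.entries[this.entries.length - 1] !== command) {
      this.entries.push(command)
    }
    this.cursor = this.entries.length
  }

  navigate(direction: 'up' | 'down'): string {
    if (direction === 'up') {
      this.cursor = Math.max(0, this.cursor - 1)
    } else {
      this.cursor = Math.min(this.entries.length, this.cursor + 1)
    }
    // Past the newest entry (or with no history) the input is cleared.
    return this.entries[this.cursor] ?? ''
  }
}
```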

8.5 Cross-Platform Considerations

8.5.1 Shell Detection

const getShellConfig = () => {
  switch (process.platform) {
    case 'win32':
      return { shell: 'cmd', args: ['/c'] }
    case 'darwin':
    case 'linux':
      return { shell: 'bash', args: ['-c'] }
    default:
      return { shell: 'sh', args: ['-c'] }
  }
}

8.5.2 Path Handling

const normalizeWorkingDirectory = (path: string) => {
  return process.platform === 'win32' 
    ? path.replace(/\//g, '\\')
    : path.replace(/\\/g, '/')
}

8.6 Performance Optimizations

8.6.1 Virtual Scrolling

const PocMessageList: React.FC<{ messages: PocMessage[] }> = ({ messages }) => {
  const [visibleRange, setVisibleRange] = useState({ start: 0, end: 50 })
  
  // Only render visible messages for large message lists
  const visibleMessages = messages.slice(
    visibleRange.start, 
    visibleRange.end
  )
  
  return (
    <VirtualScrollContainer onScroll={handleScroll}>
      {visibleMessages.map(message => (
        <PocMessageBubble key={message.id} message={message} />
      ))}
    </VirtualScrollContainer>
  )
}
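
The component above slices by visibleRange but does not show how that range follows the scroll position. A sketch of the calculation is given below under a simplifying assumption: every row has the same fixed height. Chat bubbles vary in height, so a real implementation would need measured heights or a virtualization library. `computeVisibleRange` and its parameters are hypothetical.

```typescript
interface VisibleRange {
  start: number // inclusive index
  end: number   // exclusive index, suitable for Array.prototype.slice
}

// overscan = extra rows rendered above and below the viewport to reduce flicker.
function computeVisibleRange(
  scrollTop: number,
  viewportHeight: number,
  rowHeight: number,
  totalCount: number,
  overscan: number = 5
): VisibleRange {
  const first = Math.floor(scrollTop / rowHeight)
  const visibleCount = Math.ceil(viewportHeight / rowHeight)
  const start = Math.max(0, first - overscan)
  const end = Math.min(totalCount, first + visibleCount + overscan)
  return { start, end }
}
```

A scroll handler could call this with the container's scrollTop and clientHeight and pass the result to setVisibleRange.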

8.6.2 Output Truncation

const MAX_OUTPUT_LENGTH = 1024 * 1024 // 1MB per message
const MAX_TOTAL_MESSAGES = 1000

const truncateIfNeeded = (content: string) => {
  if (content.length > MAX_OUTPUT_LENGTH) {
    return content.slice(0, MAX_OUTPUT_LENGTH) + '\n\n[Output truncated...]'
  }
  return content
}
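
MAX_TOTAL_MESSAGES is declared above but not yet applied. One way to use it is to drop the oldest messages once the list grows past the cap, as sketched below. `capMessages` is a hypothetical helper, and discarding from the front of the list is an assumed policy.

```typescript
const MAX_TOTAL_MESSAGES = 1000 // same value as in the snippet above

// Keeps only the newest `limit` items; returns the input unchanged when under the cap.
function capMessages<T>(messages: T[], limit: number = MAX_TOTAL_MESSAGES): T[] {
  return messages.length > limit
    ? messages.slice(messages.length - limit)
    : messages
}
```

In the message hook this could wrap the state update, for example `setMessages(prev => capMessages([...prev, next]))`.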

8.7 Testing Strategy

8.7.1 Manual Test Cases

  1. Basic Commands:

    • ls -la / dir (directory listing)
    • pwd / cd (working directory)
    • echo "Hello World" (simple output)
  2. Streaming Output:

    • ping -c 5 google.com (timed output; on Windows use ping -n 5 google.com)
    • find . -name "*.ts" (large output)
    • npm install (mixed stdout/stderr)
  3. Error Scenarios:

    • nonexistentcommand (command not found)
    • cat /root/protected (permission denied)
    • Long-running command interruption
  4. Cross-Platform:

    • Test on Windows, macOS, and Linux
    • Verify shell detection works correctly
    • Check path handling differences

8.7.2 Performance Tests

  • Large Output: Commands generating >100MB output
  • Rapid Output: Commands with high-frequency output
  • Memory Usage: Monitor memory consumption during long sessions
  • UI Responsiveness: Ensure UI remains responsive during command execution

8.8 Success Criteria

8.8.1 Functional Requirements

  • Users can execute shell commands through chat interface
  • Command output streams in real-time to chat bubbles
  • Command history navigation works with arrow keys
  • Cross-platform compatibility (Windows/macOS/Linux)
  • Process interruption works reliably

8.8.2 Performance Requirements

  • Command execution starts within 100ms of user sending
  • Output streaming latency < 200ms
  • UI remains responsive with outputs up to 10MB
  • Memory usage remains stable during extended use

8.8.3 User Experience Requirements

  • Chat interface feels natural and intuitive
  • Clear visual distinction between commands and output
  • Loading indicators provide appropriate feedback
  • Auto-scroll behavior works as expected

8.9 Implementation Timeline

Phase 1: Core Infrastructure (Day 1)

  • Set up POC page structure and routing
  • Implement basic IPC communication
  • Create simple command execution in main process

Phase 2: Basic UI (Day 2)

  • Build message display components
  • Implement command input with history
  • Add basic styling and layout

Phase 3: Streaming & Polish (Day 3)

  • Implement real-time output streaming
  • Add loading states and status indicators
  • Test cross-platform compatibility

Phase 4: Testing & Refinement (Day 4)

  • Comprehensive manual testing
  • Performance optimization
  • Bug fixes and UX improvements

Total Estimated Time: 4 days

8.10 Migration Path to Production

The POC provides a foundation for the full production implementation:

  1. Component Reusability: POC components can be enhanced rather than rewritten
  2. Architecture Validation: IPC patterns proven in POC extend to production
  3. User Feedback: POC enables early user testing and feedback collection
  4. Performance Baseline: POC establishes performance expectations
  5. Cross-platform Foundation: Platform compatibility issues resolved early

This PRD provides a focused scope for implementing a robust AI Agent command interface that enhances Cherry Studio's development capabilities through natural language interaction, while maintaining high standards for both technical implementation and user experience design.