Vaayne 75766dbfdc feat: Add POC command page structure and routing

- Created CommandPocPage.tsx with basic layout structure
- Added POC-specific TypeScript interfaces and types
- Implemented basic UI components: PocHeader, PocMessageList, PocMessageBubble, PocCommandInput, PocStatusBar
- Added /command-poc route to Router.tsx
- Set up component folder structure following PRD specifications

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-08-03 11:43:02 +08:00

Product Requirements Document (PRD)

Cherry Studio AI Agent Command Interface

1. Overview

Product Name: Cherry Studio AI Agent Command Interface
Version: 1.0
Date: July 30, 2025

Vision: Create a conversational AI Agent interface in Cherry Studio that lets users execute shell commands through natural language, backed by reliable communication between the renderer and main processes and live feedback on each command's execution.

2. Scope & Objectives

This PRD focuses on two core areas:

2.1 Core Implementation Scope

Renderer ↔ Main Process Communication: Robust IPC communication for command execution
Shell Command Execution: Safe and efficient shell command processing in the main process
Real-time Output Streaming: Live command output display integrated into chat interface
AI Agent Integration: Natural language command interpretation and execution workflow

2.2 UI/UX Design Scope

Conversational Interface Design: Chat-like UI that fits Cherry Studio's design language
Command Agent Experience: AI-powered command interpretation and execution feedback
Interactive Output Display: Rich formatting of command results within chat messages
Responsive Design: Consistent chat experience across different window sizes and layouts

3. Technical Requirements

3.1 Core Implementation Requirements

3.1.1 IPC Communication Architecture

Requirement: Establish bidirectional communication between renderer and main processes for AI Agent command execution

Technical Specifications:

Agent Command Request Flow: Renderer → Main Process

interface AgentCommandRequest {
  id: string
  messageId: string  // Chat message ID for correlation
  command: string
  workingDirectory?: string
  timeout?: number
  environment?: Record<string, string>
  context?: string   // Additional context from chat conversation
}

Agent Output Streaming Flow: Main Process → Renderer

interface AgentCommandOutput {
  id: string
  messageId: string  // Chat message ID for correlation
  type: 'stdout' | 'stderr' | 'exit' | 'error' | 'progress'
  data: string
  exitCode?: number
  timestamp: number
}

IPC Channel Names:
- agent-command-execute (Renderer → Main)
- agent-command-output (Main → Renderer)
- agent-command-interrupt (Renderer → Main)

3.1.2 Main Process Agent Command Service

Requirement: Create a new AgentCommandService in the main process

Technical Specifications:

Service Location: src/main/services/AgentCommandService.ts

Core Methods:

class AgentCommandService {
  executeCommand(request: AgentCommandRequest): Promise<void>
  interruptCommand(commandId: string): Promise<void>
  getRunningCommands(): string[]
  setWorkingDirectory(path: string): void
  formatCommandOutput(output: string, type: string): string
}

Process Management:
- Use Node.js child_process.spawn() for command execution
- Support real-time stdout/stderr streaming to chat interface
- Handle process interruption via chat commands
- Maintain working directory state per agent session
- Format output for better chat display (tables, JSON, etc.)

Error Handling:
- Command not found errors with helpful suggestions
- Permission denied errors with explanations
- Timeout handling with progress updates
- Process termination with cleanup notifications
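
The error-handling cases above could be turned into chat-friendly explanations along these lines. This is a sketch under stated assumptions: `describeFailure` and `SpawnFailure` are hypothetical names, the `ENOENT`/`EACCES` codes are those carried by Node.js spawn errors, and the 126/127 exit codes follow POSIX shell conventions (Windows `cmd` exit codes are not covered here).

```typescript
// Shape of the errno-style errors emitted by child_process on spawn failure.
interface SpawnFailure {
  code?: string // e.g. 'ENOENT', 'EACCES'
}

// Hypothetical helper: returns an explanation for the chat, or null on success.
function describeFailure(
  command: string,
  failure: SpawnFailure | null,
  exitCode: number | null
): string | null {
  if (failure !== null) {
    // The shell itself could not be started.
    if (failure.code === 'ENOENT') return 'The shell or working directory could not be found.'
    if (failure.code === 'EACCES') return 'Permission denied when starting the shell.'
    return `The command could not be started (${failure.code ?? 'unknown error'}).`
  }
  // Node.js reports a null exit code when the process was ended by a signal.
  if (exitCode === null) return 'The command was terminated by a signal.'
  if (exitCode === 0) return null
  const program = command.trim().split(/\s+/)[0] ?? command
  // POSIX shells use 127 for "command not found" and 126 for "found but not executable".
  if (exitCode === 127) return `Command not found: "${program}". Check the spelling or your PATH.`
  if (exitCode === 126) return `"${program}" was found but could not be executed (permission denied?).`
  return `The command exited with code ${exitCode}.`
}
```

Because commands run through a shell, a missing program surfaces as exit code 127 from the shell rather than as a spawn error, which is why both paths are handled.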

3.1.3 Renderer Process Integration

Requirement: Implement AI Agent command functionality in the renderer process

Technical Specifications:

Service Location: src/renderer/src/services/AgentCommandService.ts
Component Integration: Agent chat page and command execution components
State Management: Chat session state, command history, output formatting
Message Correlation: Link command outputs to specific chat messages

3.2 Performance Requirements

Command Response Time: < 100ms for command initiation
Output Streaming Latency: < 50ms for real-time output display
Memory Management: Efficient handling of large command outputs (>10MB)
Concurrent Commands: Support up to 5 simultaneous command executions
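
One way to enforce the limit of 5 simultaneous commands is a small slot tracker consulted before each spawn. The sketch below is illustrative only; `CommandSlots` is a hypothetical class, not part of the specified `AgentCommandService` API.

```typescript
// Hypothetical helper: tracks running command IDs and enforces a concurrency cap.
class CommandSlots {
  private readonly running = new Set<string>()
  private readonly limit: number

  constructor(limit: number = 5) {
    this.limit = limit
  }

  // Returns false if the ID is already running or the cap has been reached.
  tryAcquire(commandId: string): boolean {
    if (this.running.has(commandId) || this.running.size >= this.limit) return false
    this.running.add(commandId)
    return true
  }

  // Frees the slot; safe to call more than once for the same ID.
  release(commandId: string): void {
    this.running.delete(commandId)
  }

  activeIds(): string[] {
    return Array.from(this.running)
  }
}
```

A service could call `tryAcquire` in `executeCommand`, report an `error` output when it returns `false`, and call `release` from the process's close handler.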

3.3 Security Requirements

Command Validation: Basic validation for dangerous commands
Working Directory Restrictions: Respect file system permissions
Environment Variable Handling: Secure handling of environment variables
Process Isolation: Commands run with application user privileges
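
"Basic validation for dangerous commands" could start as a small pattern denylist checked before execution. The patterns below are illustrative assumptions rather than a vetted list, and `checkCommand` is a hypothetical helper. A denylist like this is easy to bypass, so it should be read as a guard against accidents, not as a security boundary.

```typescript
interface CommandCheck {
  allowed: boolean
  reason?: string
}

// Illustrative patterns only; a real deployment would need a reviewed policy.
const DANGEROUS_PATTERNS: Array<{ pattern: RegExp; reason: string }> = [
  { pattern: /\brm\s+(?:-\w+\s+)+(?:\/|~)\s*$/, reason: 'recursive delete of root or home directory' },
  { pattern: /\bmkfs(\.\w+)?\b/, reason: 'filesystem formatting' },
  { pattern: /\bdd\b.*\bof=\/dev\//, reason: 'raw write to a device' },
  { pattern: /:\(\)\s*\{.*\};\s*:/, reason: 'fork bomb' }
]

// Hypothetical helper: returns whether the command may run, with a reason if not.
function checkCommand(command: string): CommandCheck {
  const trimmed = command.trim()
  for (const { pattern, reason } of DANGEROUS_PATTERNS) {
    if (pattern.test(trimmed)) return { allowed: false, reason }
  }
  return { allowed: true }
}
```

A blocked command could be surfaced in the chat as a warning that asks the user to confirm explicitly rather than being silently dropped.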

4. UI/UX Design Requirements

4.1 Design Principles

Target Audience: Senior frontend engineers and UI designers
Design Goals: Create an intuitive, conversational AI Agent interface that enhances developer productivity through natural language command execution

4.1.1 Visual Design Requirements

Design System Integration: Follow Cherry Studio's existing chat design patterns
Theme Support: Light/dark theme compatibility
Typography: Mix of regular chat font and monospace for command outputs
Color Scheme: Distinct styling for user messages, agent responses, and command outputs
Message Bubbles: Clear visual distinction between conversation and command execution

4.1.2 Layout Requirements

Primary Layout Structure (Chat Interface):

┌─────────────────────────────────────┐
│ Agent Header (name + status + controls) │
├─────────────────────────────────────┤
│                                     │
│        Chat Messages Area           │
│     (user messages + agent replies  │
│      + command outputs)             │
│                                     │
├─────────────────────────────────────┤
│ Message Input (natural language)    │
└─────────────────────────────────────┘

Responsive Considerations:

Minimum width: 320px (narrow window)
Optimal width: 600-800px (desktop)
Message bubbles adapt to content width
Command outputs can expand full width

4.1.3 Component Specifications

Agent Header Component:

Agent name and avatar
Working directory indicator
Active command status (running/idle)
Session controls (clear chat, export logs)

Chat Messages Component:

User Messages: Standard chat bubbles for natural language input
Agent Responses: AI responses explaining commands or asking for clarification
Command Execution Messages: Special formatting for:
- Command being executed (with syntax highlighting)
- Real-time output streaming (scrollable, copyable)
- Execution status (success/error/interrupted)
- Formatted results (tables, JSON, file listings)

Message Input Component:

Natural language input field
Send button with loading state during command execution
Suggestion chips for common requests
Support for follow-up questions and command modifications

4.2 User Experience Requirements

4.2.1 Interaction Patterns

Conversational Flow:

User types natural language requests ("list files in src directory")
Agent interprets and confirms command before execution
Real-time command output appears in chat
User can ask follow-up questions or modify commands

Keyboard Shortcuts:

Enter: Send message/command
Ctrl+Enter: Force command execution without confirmation
Ctrl+K: Interrupt running command
Ctrl+L: Clear chat history
↑/↓: Navigate message input history
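
The shortcut table above can be kept in one pure function so that it is easy to test apart from any React component. This is a sketch; `ChatAction`, `KeyPress`, and `resolveShortcut` are hypothetical names, and the `key` values follow the standard `KeyboardEvent.key` strings.

```typescript
type ChatAction =
  | 'send'
  | 'force-execute'
  | 'interrupt'
  | 'clear'
  | 'history-prev'
  | 'history-next'

// Minimal subset of a keyboard event needed to resolve a shortcut.
interface KeyPress {
  key: string
  ctrlKey: boolean
}

// Hypothetical helper: maps a key press to a chat action, or null if unbound.
function resolveShortcut(press: KeyPress): ChatAction | null {
  // Normalize single letters so Ctrl+K and Ctrl+k behave the same.
  const key = press.key.length === 1 ? press.key.toLowerCase() : press.key
  if (press.ctrlKey) {
    if (key === 'Enter') return 'force-execute'
    if (key === 'k') return 'interrupt'
    if (key === 'l') return 'clear'
    return null
  }
  if (key === 'Enter') return 'send'
  if (key === 'ArrowUp') return 'history-prev'
  if (key === 'ArrowDown') return 'history-next'
  return null
}
```

An input component's `onKeyDown` handler could pass `{ key: e.key, ctrlKey: e.ctrlKey }` and switch on the result.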

Mouse Interactions:

Click on command outputs to copy
Click on file paths to open in Cherry Studio
Hover over commands for quick actions (copy, re-run, modify)

4.2.2 Feedback & Status Indicators

Visual Feedback Requirements:

Agent Thinking: Typing indicator while processing user request
Command Execution: Progress indicator and real-time output streaming
Execution Status: Success/error/warning indicators in message bubbles
Working Directory: Persistent display in agent header
Command History: Visual indication of previous commands in chat

4.2.3 Accessibility Requirements

Keyboard Navigation: Full chat functionality accessible via keyboard
Screen Reader Support: Proper ARIA labels for chat messages and command outputs
High Contrast: Support for high contrast themes in all message types
Focus Management: Logical tab order through chat interface

4.3 Advanced UX Features (Future Considerations)

Command Suggestions: AI-powered suggestions based on current context
Smart Output Formatting: Automatic formatting for JSON, tables, logs, etc.
File Integration: Deep integration with Cherry Studio's file management
Session Memory: Agent remembers context across chat sessions
Multi-step Workflows: Support for complex, multi-command operations

5. Implementation Approach

5.1 Development Phases

Phase 1: Core Infrastructure (2-3 weeks)

Implement AgentCommandService in main process
Establish IPC communication for chat-command flow
Basic command execution and output streaming to chat interface

Phase 2: AI Agent Chat Interface (3-4 weeks)

Design and implement conversational chat components
Create command execution message types and formatting
Integrate natural language command interpretation
Implement real-time output streaming in chat bubbles

Phase 3: Enhanced Agent Features (2-3 weeks)

Add command confirmation and clarification flows
Implement smart output formatting (tables, JSON, etc.)
Add working directory management in chat context
Integrate with Cherry Studio's existing AI infrastructure

5.2 Integration Points

Router Integration: Add /agent or /command-agent route to src/renderer/src/Router.tsx
Navigation: Add agent icon to Cherry Studio's main navigation
AI Core Integration: Leverage existing AI infrastructure for command interpretation
Settings Integration: Agent preferences in application settings
Chat System: Reuse existing chat components and patterns from Cherry Studio

6. Success Metrics

6.1 Technical Metrics

Command execution success rate: >99%
Average command response time: <100ms
Output streaming latency: <50ms
Zero memory leaks during extended usage

6.2 User Experience Metrics

User adoption rate within first month
Average chat session duration
Natural language command interpretation accuracy
Command execution success rate through conversational interface
User feedback scores on AI Agent usability and helpfulness

7. Dependencies & Constraints

7.1 Technical Dependencies

Node.js child_process module
Electron IPC capabilities
Cherry Studio's existing service architecture
React/TypeScript frontend stack
Cherry Studio's AI Core infrastructure
Existing chat components and design system

7.2 Platform Constraints

Cross-platform compatibility (Windows, macOS, Linux)
Shell availability on target platforms
File system permission handling

8. Proof of Concept (POC) Implementation

8.1 POC Objectives

Primary Goal: Validate the core concept of chat-based command execution with minimal implementation complexity.

Key Validation Points:

User experience of command execution through chat interface
Technical feasibility of IPC communication for real-time output streaming
Performance characteristics of command output display in chat bubbles
Cross-platform compatibility of basic shell command execution

8.2 POC Scope & Limitations

8.2.1 Included Features

✅ Direct Command Execution: Users type shell commands directly (no AI interpretation)
✅ Real-time Output Streaming: Command output appears live in chat bubbles
✅ Basic Chat Interface: Simple message list with input field
✅ Command History: Navigate previous commands with arrow keys
✅ Cross-platform Support: Works on Windows, macOS, and Linux
✅ Process Management: Start/stop command execution

8.2.2 Excluded Features (Future Work)

❌ AI natural language interpretation of commands
❌ Command confirmation or clarification flows
❌ Advanced output formatting (tables, JSON highlighting)
❌ Security validation and command filtering
❌ Session persistence between app restarts
❌ Multiple concurrent command execution
❌ Working directory management UI
❌ Integration with Cherry Studio's AI core

8.3 Technical Architecture

8.3.1 Component Structure

src/renderer/src/pages/command-poc/
├── CommandPocPage.tsx              # Main container component
├── components/
│   ├── PocHeader.tsx              # Header with working directory
│   ├── PocMessageList.tsx         # Scrollable message container
│   ├── PocMessageBubble.tsx       # Individual message display
│   ├── PocCommandInput.tsx        # Command input with history
│   └── PocStatusBar.tsx           # Command execution status
├── hooks/
│   ├── usePocMessages.ts          # Message state management
│   ├── usePocCommand.ts           # Command execution logic
│   └── useCommandHistory.ts       # Input history navigation
└── types.ts                       # POC-specific TypeScript interfaces

8.3.2 Data Structures

interface PocMessage {
  id: string
  type: 'user-command' | 'output' | 'error' | 'system'
  content: string
  timestamp: number
  commandId?: string  // Links output to originating command
  isComplete: boolean // For streaming messages
}

interface PocCommandExecution {
  id: string
  command: string
  startTime: number
  endTime?: number
  exitCode?: number
  isRunning: boolean
}

8.3.3 IPC Communication

// Renderer → Main Process
interface PocExecuteCommandRequest {
  id: string
  command: string
  workingDirectory: string
}

// Main Process → Renderer
interface PocCommandOutput {
  commandId: string
  type: 'stdout' | 'stderr' | 'exit' | 'error'
  data: string
  exitCode?: number
}

// IPC Channels
const IPC_CHANNELS = {
  EXECUTE_COMMAND: 'poc-execute-command',
  COMMAND_OUTPUT: 'poc-command-output',
  INTERRUPT_COMMAND: 'poc-interrupt-command'
}
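
To show how these events might drive the message list, here is a sketch of a pure function that folds one `PocCommandOutput` event into the `PocMessage` array. `applyCommandOutput` is a hypothetical name, and the choice to turn an `error` event into an `error`-typed message is an assumption rather than something the PRD specifies.

```typescript
interface PocMessage {
  id: string
  type: 'user-command' | 'output' | 'error' | 'system'
  content: string
  timestamp: number
  commandId?: string
  isComplete: boolean
}

interface PocCommandOutput {
  commandId: string
  type: 'stdout' | 'stderr' | 'exit' | 'error'
  data: string
  exitCode?: number
}

// Hypothetical helper: returns a new array with the matching output message updated.
function applyCommandOutput(messages: PocMessage[], output: PocCommandOutput): PocMessage[] {
  return messages.map((msg): PocMessage => {
    // Only the still-streaming output bubble of this command is affected.
    if (msg.type !== 'output' || msg.commandId !== output.commandId) return msg
    if (output.type === 'exit') return { ...msg, isComplete: true }
    if (output.type === 'error') {
      return { ...msg, type: 'error', content: msg.content + output.data, isComplete: true }
    }
    // stdout and stderr chunks are appended in arrival order.
    return { ...msg, content: msg.content + output.data }
  })
}
```

Keeping this logic pure means it can be unit-tested without Electron and reused inside a `setMessages(prev => applyCommandOutput(prev, output))` call.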

8.4 Implementation Details

8.4.1 Main Process Implementation

File: src/main/poc/commandExecutor.ts

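import { spawn, type ChildProcess } from 'child_process'
import type { WebContents } from 'electron'

class PocCommandExecutor {
  private activeProcesses = new Map<string, ChildProcess>()

  // Assumption: output is forwarded to the renderer through its WebContents
  constructor(private readonly webContents: WebContents) {}

  executeCommand(request: PocExecuteCommandRequest) {
    const shell = process.platform === 'win32' ? 'cmd' : 'bash'
    const args = process.platform === 'win32' ? ['/c'] : ['-c']

    const child = spawn(shell, [...args, request.command], {
      cwd: request.workingDirectory
    })

    this.activeProcesses.set(request.id, child)

    // Stream output handling
    child.stdout.on('data', (data: Buffer) => {
      this.sendOutput(request.id, 'stdout', data.toString())
    })

    child.stderr.on('data', (data: Buffer) => {
      this.sendOutput(request.id, 'stderr', data.toString())
    })

    // Emitted when the shell itself cannot be started (e.g. invalid cwd)
    child.on('error', (err) => {
      this.sendOutput(request.id, 'error', err.message)
      this.activeProcesses.delete(request.id)
    })

    // code is null when the process was terminated by a signal
    child.on('close', (code) => {
      this.sendOutput(request.id, 'exit', '', code ?? undefined)
      this.activeProcesses.delete(request.id)
    })
  }

  // Sends the default termination signal (SIGTERM) to the shell process
  interruptCommand(id: string) {
    this.activeProcesses.get(id)?.kill()
  }

  private sendOutput(commandId: string, type: PocCommandOutput['type'], data: string, exitCode?: number) {
    this.webContents.send(IPC_CHANNELS.COMMAND_OUTPUT, { commandId, type, data, exitCode })
  }
}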

8.4.2 Renderer Process Implementation

State Management Strategy:

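import { useState } from 'react'
import { v4 as uuid } from 'uuid'

const usePocMessages = () => {
  const [messages, setMessages] = useState<PocMessage[]>([])
  const [activeCommand, setActiveCommand] = useState<string | null>(null)

  // Adds the command bubble plus an empty output bubble that streaming will fill
  const addUserCommand = (command: string) => {
    const commandMessage: PocMessage = {
      id: uuid(),
      type: 'user-command',
      content: command,
      timestamp: Date.now(),
      isComplete: true
    }

    const outputMessage: PocMessage = {
      id: uuid(),
      type: 'output',
      content: '',
      timestamp: Date.now(),
      commandId: commandMessage.id,
      isComplete: false
    }

    setMessages(prev => [...prev, commandMessage, outputMessage])
    setActiveCommand(commandMessage.id)
    return outputMessage.id
  }

  const appendOutput = (messageId: string, data: string) => {
    setMessages(prev => prev.map(msg =>
      msg.id === messageId
        ? { ...msg, content: msg.content + data }
        : msg
    ))
  }

  // Marks the streaming message as finished once the command exits
  const completeOutput = (messageId: string) => {
    setMessages(prev => prev.map(msg =>
      msg.id === messageId ? { ...msg, isComplete: true } : msg
    ))
    setActiveCommand(null)
  }

  return { messages, activeCommand, addUserCommand, appendOutput, completeOutput }
}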

Output Streaming with Buffering:

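// appendOutput is the function returned by usePocMessages
const useOutputBuffer = (appendOutput: (messageId: string, data: string) => void) => {
  // One pending buffer per message, so concurrent outputs are never mixed
  const buffersRef = useRef(new Map<string, string>())
  const timeoutRef = useRef<ReturnType<typeof setTimeout> | null>(null)

  const flush = () => {
    buffersRef.current.forEach((data, messageId) => appendOutput(messageId, data))
    buffersRef.current.clear()
    timeoutRef.current = null
  }

  const bufferOutput = (data: string, messageId: string) => {
    const pending = buffersRef.current.get(messageId) ?? ''
    buffersRef.current.set(messageId, pending + data)

    // Schedule at most one flush per 100ms window. Resetting the timer on every
    // chunk (a debounce) would hold back output for as long as chunks keep arriving.
    if (timeoutRef.current === null) {
      timeoutRef.current = setTimeout(flush, 100)
    }
  }

  return { bufferOutput, flush }
}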

8.4.3 UI Components

Message Bubble Component:

const PocMessageBubble: React.FC<{ message: PocMessage }> = ({ message }) => {
  const isUserCommand = message.type === 'user-command'
  
  return (
    <MessageContainer isUser={isUserCommand}>
      {isUserCommand ? (
        <CommandBubble>
          <CommandPrefix>$</CommandPrefix>
          <CommandText>{message.content}</CommandText>
        </CommandBubble>
      ) : (
        <OutputBubble>
          <pre>{message.content}</pre>
          {!message.isComplete && <LoadingDots />}
        </OutputBubble>
      )}
    </MessageContainer>
  )
}

Command Input with History:

interface PocCommandInputProps {
  onSendCommand: (command: string) => void
}

const PocCommandInput: React.FC<PocCommandInputProps> = ({ onSendCommand }) => {
  const [input, setInput] = useState('')
  const { addToHistory, navigateHistory } = useCommandHistory()

  const handleKeyDown = (e: React.KeyboardEvent<HTMLInputElement>) => {
    switch (e.key) {
      case 'Enter':
        if (input.trim()) {
          onSendCommand(input.trim())
          addToHistory(input.trim())
          setInput('')
        }
        break
      case 'ArrowUp':
        e.preventDefault()
        setInput(navigateHistory('up'))
        break
      case 'ArrowDown':
        e.preventDefault()
        setInput(navigateHistory('down'))
        break
    }
  }

  return (
    <input
      value={input}
      onChange={(e) => setInput(e.target.value)}
      onKeyDown={handleKeyDown}
      placeholder="Enter a shell command"
    />
  )
}
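
The input component relies on `useCommandHistory`, which this document does not define. One possible shape for the logic behind it is sketched below as a plain class so that it can be tested without React; `CommandHistory` and its behavior at the boundaries (staying on the oldest entry, returning an empty string past the newest) are assumptions.

```typescript
// Hypothetical history model backing a useCommandHistory hook.
class CommandHistory {
  private entries: string[] = []
  // Index into entries; entries.length means "past the newest entry" (empty input).
  private cursor = 0

  add(command: string): void {
    // Skip consecutive duplicates, as most shells do.
    if (this.entries[this.entries.length - 1] !== command) {
      this.entries.push(command)
    }
    this.cursor = this.entries.length
  }

  navigate(direction: 'up' | 'down'): string {
    if (direction === 'up') {
      this.cursor = Math.max(0, this.cursor - 1)
    } else {
      this.cursor = Math.min(this.entries.length, this.cursor + 1)
    }
    return this.entries[this.cursor] ?? ''
  }
}
```

A hook could hold one instance in a `useRef` and expose `add` and `navigate` as `addToHistory` and `navigateHistory`.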

8.5 Cross-Platform Considerations

8.5.1 Shell Detection

const getShellConfig = () => {
  switch (process.platform) {
    case 'win32':
      return { shell: 'cmd', args: ['/c'] }
    case 'darwin':
    case 'linux':
      return { shell: 'bash', args: ['-c'] }
    default:
      return { shell: 'sh', args: ['-c'] }
  }
}

8.5.2 Path Handling

const normalizeWorkingDirectory = (path: string) => {
  return process.platform === 'win32' 
    ? path.replace(/\//g, '\\')
    : path.replace(/\\/g, '/')
}

8.6 Performance Optimizations

8.6.1 Virtual Scrolling

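interface PocMessageListProps {
  messages: PocMessage[]
}

const WINDOW_SIZE = 50
const ROW_HEIGHT = 48 // placeholder: assumes fixed-height rows

const PocMessageList: React.FC<PocMessageListProps> = ({ messages }) => {
  const [visibleRange, setVisibleRange] = useState({ start: 0, end: WINDOW_SIZE })

  // Recompute the rendered window from the scroll offset
  const handleScroll = (e: React.UIEvent<HTMLElement>) => {
    const start = Math.floor(e.currentTarget.scrollTop / ROW_HEIGHT)
    setVisibleRange({ start, end: start + WINDOW_SIZE })
  }

  // Only render visible messages for large message lists
  const visibleMessages = messages.slice(
    visibleRange.start,
    visibleRange.end
  )

  // VirtualScrollContainer is assumed to reserve space for off-screen rows
  return (
    <VirtualScrollContainer onScroll={handleScroll}>
      {visibleMessages.map(message => (
        <PocMessageBubble key={message.id} message={message} />
      ))}
    </VirtualScrollContainer>
  )
}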
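
The range calculation behind virtual scrolling can be isolated as a pure function. The sketch below assumes fixed-height rows, which is a simplification because chat bubbles vary in height; `computeVisibleRange` and the `overscan` parameter are hypothetical.

```typescript
interface VisibleRange {
  start: number
  end: number // exclusive, matching Array.prototype.slice
}

// Hypothetical helper: which rows to render for a given scroll position.
function computeVisibleRange(
  scrollTop: number,
  viewportHeight: number,
  rowHeight: number,
  totalRows: number,
  overscan: number = 5
): VisibleRange {
  const first = Math.floor(scrollTop / rowHeight)
  const visibleCount = Math.ceil(viewportHeight / rowHeight)
  // Render a few extra rows on both sides to avoid flicker while scrolling.
  const start = Math.max(0, first - overscan)
  const end = Math.min(totalRows, first + visibleCount + overscan)
  return { start, end }
}
```

For variable-height bubbles, a production version would more likely rely on measured heights or an existing virtualization library than on this arithmetic.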

8.6.2 Output Truncation

const MAX_OUTPUT_LENGTH = 1024 * 1024 // 1MB per message
const MAX_TOTAL_MESSAGES = 1000

const truncateIfNeeded = (content: string) => {
  if (content.length > MAX_OUTPUT_LENGTH) {
    return content.slice(0, MAX_OUTPUT_LENGTH) + '\n\n[Output truncated...]'
  }
  return content
}
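
`MAX_TOTAL_MESSAGES` is declared above but not used. A sketch of how both limits might be applied during streaming follows; `capMessages` and `appendWithLimit` are hypothetical helpers. Note that `String.length` counts UTF-16 code units rather than bytes, so a "1MB" limit expressed this way is approximate.

```typescript
const MAX_TOTAL_MESSAGES = 1000
const TRUNCATION_NOTICE = '\n\n[Output truncated...]'

// Keeps only the newest `limit` messages.
function capMessages<T>(messages: T[], limit: number = MAX_TOTAL_MESSAGES): T[] {
  return messages.length > limit ? messages.slice(messages.length - limit) : messages
}

// Appends a streamed chunk, truncating once and ignoring chunks after that.
function appendWithLimit(current: string, chunk: string, maxLength: number): string {
  if (current.endsWith(TRUNCATION_NOTICE)) return current // already truncated
  const combined = current + chunk
  return combined.length > maxLength
    ? combined.slice(0, maxLength) + TRUNCATION_NOTICE
    : combined
}
```

Checking for the notice before appending prevents the notice from being repeated on every chunk that arrives after the limit is hit.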

8.7 Testing Strategy

8.7.1 Manual Test Cases

Basic Commands:
- ls -la / dir (directory listing)
- pwd / cd (working directory)
- echo "Hello World" (simple output)

Streaming Output:
- ping -c 5 google.com (timed output; use ping -n 5 google.com on Windows)
- find . -name "*.ts" (large output)
- npm install (mixed stdout/stderr)

Error Scenarios:
- nonexistentcommand (command not found)
- cat /root/protected (permission denied)
- Long-running command interruption

Cross-Platform:
- Test on Windows, macOS, and Linux
- Verify shell detection works correctly
- Check path handling differences

8.7.2 Performance Tests

Large Output: Commands generating >100MB output
Rapid Output: Commands with high-frequency output
Memory Usage: Monitor memory consumption during long sessions
UI Responsiveness: Ensure UI remains responsive during command execution

8.8 Success Criteria

8.8.1 Functional Requirements

✅ Users can execute shell commands through chat interface
✅ Command output streams in real-time to chat bubbles
✅ Command history navigation works with arrow keys
✅ Cross-platform compatibility (Windows/macOS/Linux)
✅ Process interruption works reliably

8.8.2 Performance Requirements

✅ Command execution starts within 100ms of user sending
✅ Output streaming latency < 200ms
✅ UI remains responsive with outputs up to 10MB
✅ Memory usage remains stable during extended use

8.8.3 User Experience Requirements

✅ Chat interface feels natural and intuitive
✅ Clear visual distinction between commands and output
✅ Loading indicators provide appropriate feedback
✅ Auto-scroll behavior works as expected

8.9 Implementation Timeline

Phase 1: Core Infrastructure (Day 1)

Set up POC page structure and routing
Implement basic IPC communication
Create simple command execution in main process

Phase 2: Basic UI (Day 2)

Build message display components
Implement command input with history
Add basic styling and layout

Phase 3: Streaming & Polish (Day 3)

Implement real-time output streaming
Add loading states and status indicators
Test cross-platform compatibility

Phase 4: Testing & Refinement (Day 4)

Comprehensive manual testing
Performance optimization
Bug fixes and UX improvements

Total Estimated Time: 4 days

8.10 Migration Path to Production

The POC provides a foundation for the full production implementation:

Component Reusability: POC components can be enhanced rather than rewritten
Architecture Validation: IPC patterns proven in POC extend to production
User Feedback: POC enables early user testing and feedback collection
Performance Baseline: POC establishes performance expectations
Cross-platform Foundation: Platform compatibility issues resolved early

This PRD provides a focused scope for implementing a robust AI Agent command interface that enhances Cherry Studio's development capabilities through natural language interaction, while maintaining high standards for both technical implementation and user experience design.
