# Agent Service Refactoring Plan

## Objective

The goal is to completely rewrite the agent execution flow for both the backend (`src/main/services/agent/`) and the frontend (`src/renderer/src/pages/cherry-agent/`). We will move from a model that can run arbitrary shell commands to a more secure, specialized model that **only** executes the `agent.py` script to process user prompts. This ensures that user input is always treated as data for the agent, never as a command to be executed by the shell.

- @agent.py is the agent script.
- @agent.log is an example of the agent's output.

## High-Level Plan

The complete rewrite will involve these key areas:

1. **Introduce a dedicated `AgentExecutionService`:** This new service on the main process will be the single point of control for running the Python agent.
2. **Secure the Command Executor:** We will modify the existing `commandExecutor.ts` to prevent shell injection vulnerabilities by no longer using a shell to wrap the command.
3. **Update Session Management:** The database schema and logic will be updated to handle the `session_id` generated by `agent.py`, allowing for conversation continuity.
4. **Rewrite Frontend Components:** All UI components will be updated to work with the new prompt-based flow instead of command execution.
5. **Adapt IPC & Communication:** The communication between the renderer and the main process will be updated to pass prompts instead of raw commands.

---

## Detailed Implementation Steps

### 1. Backend Refactoring (`src/main/services/agent`)

#### A. Create `AgentExecutionService.ts`

This new service will orchestrate the agent's execution.

- **File:** `src/main/services/agent/AgentExecutionService.ts`
- **Purpose:** To bridge the gap between incoming user prompts and the execution of the `agent.py` script.
- **Key Method:** `public async runAgent(sessionId: string, prompt: string): Promise<void>`
  - This method will use `AgentService` to fetch the session and its associated agent details (instructions, working directory, etc.).
  - It will determine the path to the `python` executable and the `agent.py` script. The path to `agent.py` should be a constant relative to the application root, so that it can never be influenced by user input.
  - It will construct the argument list for `agent.py` based on the fetched data:
    - `--prompt`: The user's input `prompt`.
    - `--system-prompt`: The agent's `instructions`.
    - `--cwd`: The session's `accessible_paths[0]`.
    - `--session-id`: The `claude_session_id` stored in our session record (see "Database and Data Model Changes" below). If it's the first turn, this argument is omitted.
  - It will then call the refactored `pocCommandExecutor` to run the script.
  - On the first run, it will parse the script's `stdout` to capture the newly created `claude_session_id` and update the database.

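The argument construction described above could look roughly like the following. This is a minimal sketch, not the service's actual code: the `AgentRunContext` interface and the `buildAgentArgs` helper are hypothetical names introduced for illustration, and only the four flags listed above are taken from the plan.

```typescript
// Hypothetical shape: only the fields this sketch needs from the session and agent.
interface AgentRunContext {
  instructions: string // the agent's instructions, passed as --system-prompt
  accessiblePaths: string[] // the session's accessible_paths
  claudeSessionId?: string // absent until agent.py has reported one
}

// Builds the argument vector for agent.py. Every value is its own array
// element, so it reaches the script as data and is never parsed by a shell.
function buildAgentArgs(scriptPath: string, ctx: AgentRunContext, prompt: string): string[] {
  const cwd = ctx.accessiblePaths[0]
  if (!cwd) {
    throw new Error('Session has no accessible path to use as --cwd')
  }
  const args = [scriptPath, '--prompt', prompt, '--system-prompt', ctx.instructions, '--cwd', cwd]
  // First turn: no Claude session exists yet, so --session-id is omitted.
  if (ctx.claudeSessionId) {
    args.push('--session-id', ctx.claudeSessionId)
  }
  return args
}

const firstTurnArgs = buildAgentArgs(
  'agent.py',
  { instructions: 'Be brief.', accessiblePaths: ['/tmp/project'] },
  'list files; rm -rf /'
)
console.log(firstTurnArgs)
```

Note that the hostile-looking prompt stays a single array element; it only becomes dangerous if something later joins the array into a shell string.
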
#### B. Refactor `commandExecutor.ts`

To enhance security, we will change how commands are executed.

- **File:** `src/main/services/agent/commandExecutor.ts`
- **Change:** Modify `executeCommand` to avoid using a shell (`bash -c`, `cmd /c`).
- **New Signature (suggestion):** `executeCommand(id: string, executable: string, args: string[], workingDirectory: string)`
- **Implementation:**
  - The `spawn` function from `child_process` will be called directly with the executable and its arguments: `spawn(executable, args, { cwd: workingDirectory, ... })`.
  - This bypasses the shell completely, so shell metacharacters in the arguments are never interpreted and cannot be used for command injection. The `getShellCommand` method will no longer be needed for this workflow.

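A shell-free `executeCommand` could be sketched as below. To keep the example free of imports, the spawn function is injected as a parameter, which is an assumption of this sketch and differs from the suggested signature; in the main process the value passed would be `spawn` from `node:child_process`. `SpawnLike`, `StartedCommand`, and `fakeSpawn` are hypothetical names.

```typescript
// Structural type mirroring the way `spawn` is called below. Declared locally
// (an assumption of this sketch) so the example needs no imports.
type SpawnLike = (
  executable: string,
  args: readonly string[],
  options: { cwd: string; shell: false }
) => unknown

interface StartedCommand {
  id: string
  child: unknown
}

function executeCommand(
  spawnFn: SpawnLike,
  id: string,
  executable: string,
  args: readonly string[],
  workingDirectory: string
): StartedCommand {
  // No `bash -c` / `cmd /c` wrapper: the executable receives `args` verbatim.
  // `shell: false` is already Node's default for spawn; it is spelled out for clarity.
  const child = spawnFn(executable, args, { cwd: workingDirectory, shell: false })
  return { id, child }
}

// Stand-in for `spawn` that records what it was asked to run.
const recorded: string[][] = []
const fakeSpawn: SpawnLike = (executable, args) => {
  recorded.push([executable, ...args])
  return null
}

executeCommand(fakeSpawn, 'run-1', 'python', ['agent.py', '--prompt', 'hi && whoami'], '/tmp')
console.log(recorded[0])
```

Injecting the spawn function also makes the executor easy to unit-test without starting real processes, which is the only reason it is done here.
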
#### C. Update IPC Handling (`src/main/index.ts`)

Communication from the frontend needs to be adapted.

- **Action:** Create a new, dedicated IPC channel, for example, `IpcChannel.Agent_Run`.
- **Payload:** This channel will accept a structured object: `{ sessionId: string, prompt: string }`.
- **Handler:** The main process handler for this channel will simply call `agentExecutionService.runAgent(sessionId, prompt)`. The existing `IpcChannel.Poc_CommandOutput` can be reused to stream the log output back to the UI.

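Because IPC payloads arrive from the renderer as untyped data, the handler would likely want to check the payload shape before calling `runAgent`. The type guard below is a minimal sketch of that check; `AgentRunPayload` and `isAgentRunPayload` are hypothetical names, and rejecting empty prompts is an assumption rather than something the plan specifies.

```typescript
interface AgentRunPayload {
  sessionId: string
  prompt: string
}

// Narrows an unknown IPC payload to AgentRunPayload, rejecting anything
// that is not an object with two non-empty string fields.
function isAgentRunPayload(value: unknown): value is AgentRunPayload {
  if (typeof value !== 'object' || value === null) return false
  const { sessionId, prompt } = value as { sessionId?: unknown; prompt?: unknown }
  return (
    typeof sessionId === 'string' &&
    sessionId.length > 0 &&
    typeof prompt === 'string' &&
    prompt.trim().length > 0
  )
}

console.log(isAgentRunPayload({ sessionId: 's1', prompt: 'Summarize the README' }))
console.log(isAgentRunPayload({ sessionId: 's1', command: 'rm -rf /' }))
```
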
### 2. Database and Data Model Changes

To manage the lifecycle of agent conversations, we need to track the session ID from `agent.py`.

- **File:** `src/main/services/agent/queries.ts`
  - **Action:** Add a new nullable field `claude_session_id TEXT` to the `sessions` table schema.

- **File:** `src/main/services/agent/types.ts`
  - **Action:** Add the optional `claude_session_id?: string` field to the `SessionEntity` and `SessionResponse` interfaces.

- **File:** `src/main/services/agent/AgentService.ts`
  - **Action:** Update the `createSession`, `updateSession`, and `getSessionById` methods to handle the new `claude_session_id` field.
  - Add a new method like `updateSessionClaudeId(sessionId: string, claudeSessionId: string)` to be called by the `AgentExecutionService`.

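As a rough illustration of the data model change, the sketch below pairs the new optional field with a mapping from a nullable database column. It assumes a SQLite-style database (suggested by the `TEXT` column type but not confirmed by the plan); the trimmed-down `SessionEntity`, the `SessionRow` type, the migration string, and `toSessionEntity` are all hypothetical.

```typescript
// Assumed subset of the session record; the real entity carries more fields.
interface SessionEntity {
  id: string
  claude_session_id?: string
}

// Assumed raw row shape: unset columns come back from the database as NULL.
interface SessionRow {
  id: string
  claude_session_id: string | null
}

// Assumed SQLite-style statement for upgrading an existing sessions table.
const ADD_CLAUDE_SESSION_ID = 'ALTER TABLE sessions ADD COLUMN claude_session_id TEXT'

// Maps NULL to an absent field so callers can simply test for presence.
function toSessionEntity(row: SessionRow): SessionEntity {
  return row.claude_session_id === null
    ? { id: row.id }
    : { id: row.id, claude_session_id: row.claude_session_id }
}

console.log(ADD_CLAUDE_SESSION_ID)
console.log(toSessionEntity({ id: 's1', claude_session_id: null }))
```

Sessions created before the migration would then look exactly like first-turn sessions, so no `--session-id` is passed for them.
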
### 3. Frontend Refactoring (`src/renderer`)

Finally, we'll update the UI to send prompts instead of commands.

- **File:** `src/renderer/src/hooks/usePocCommand.ts` (to be renamed/refactored as `useAgentCommand.ts`)
  - **Action:** Complete rewrite of the command execution logic. Instead of sending a command string, it will now invoke the new IPC channel: `window.api.agent.run(sessionId, prompt)`.
  - **New Interface:** The hook will expose methods for prompt submission rather than command execution.

- **File:** `src/renderer/src/pages/cherry-agent/CherryAgentPage.tsx`
  - **Action:** Rewrite the main page component to work with prompt-based flow.
    - The text from the command input will now be treated as the `prompt`.
    - The function will call the refactored hook with the current session ID and the prompt: `agentCommandHook.run(agentManagement.currentSession.id, prompt)`.
    - The `workingDirectory` will no longer be passed from the frontend, as it's now part of the session data managed by the backend.

- **Component Updates:** All components in `src/renderer/src/pages/cherry-agent/components/` will need updates:
  - **`EnhancedCommandInput.tsx`:** Rename to `EnhancedPromptInput.tsx` and update to handle prompt submission instead of command execution.
  - **`PocMessageBubble.tsx` and `PocMessageList.tsx`:** Update to display prompt/response pairs instead of command/output pairs.
  - **Session management components:** Update to work with new session schema including `claude_session_id`.

## New Data Flow

The execution flow will be transformed as follows:

- **Before:**
  `UI Input -> (command string) -> IPC -> ShellCommandExecutor -> Spawns Shell -> Executes Command`

- **After:**
  `UI Input -> (prompt string) -> IPC({sessionId, prompt}) -> AgentExecutionService -> Constructs Args -> commandExecutor -> Spawns 'python' with args -> Executes agent.py`

## Security & Error Handling Improvements

### Security Enhancements
- **Path validation**: Ensure the `agent.py` path is validated and cannot be manipulated
- **Argument sanitization**: Validate all arguments passed to `agent.py` to prevent injection
- **No shell execution**: Direct process spawning eliminates shell injection vulnerabilities
- **Resource limits**: Consider implementing timeouts and resource constraints for agent processes

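Argument validation could start with a few cheap structural checks, as in the sketch below. The `validateAgentArgument` helper and the `MAX_ARG_LENGTH` limit are hypothetical; the plan does not say which checks are actually applied.

```typescript
// Hypothetical upper bound; pick a value that suits the application.
const MAX_ARG_LENGTH = 100_000

// Returns the value unchanged if it is acceptable, otherwise throws.
function validateAgentArgument(name: string, value: string): string {
  if (value.length === 0) {
    throw new Error(`${name} must not be empty`)
  }
  if (value.length > MAX_ARG_LENGTH) {
    throw new Error(`${name} exceeds ${MAX_ARG_LENGTH} characters`)
  }
  // A NUL byte cannot be represented inside a process argument.
  if (value.includes('\0')) {
    throw new Error(`${name} must not contain NUL bytes`)
  }
  return value
}

console.log(validateAgentArgument('prompt', 'Explain this repository'))
```

One further caveat, assuming `agent.py` uses a conventional option parser: a prompt that begins with `--` might be mistaken for a flag, so the way values are attached to their options may also deserve a check.
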
### Error Handling & Recovery
- **Agent script validation**: Verify `agent.py` exists and is accessible before execution
- **Process monitoring**: Handle agent crashes, timeouts, and unexpected terminations
- **Session recovery**: Graceful handling of orphaned sessions and Claude session mismatches
- **Structured error responses**: Clear error messaging for different failure scenarios

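One way to make error responses structured is a discriminated union with one variant per failure scenario. The sketch below is illustrative only: the `AgentRunError` type, its variant names, and the message wording are assumptions based on the scenarios listed above.

```typescript
// One variant per failure scenario named in the list above (hypothetical names).
type AgentRunError =
  | { kind: 'script_missing'; scriptPath: string }
  | { kind: 'timeout'; timeoutMs: number }
  | { kind: 'crashed'; exitCode: number | null }
  | { kind: 'session_mismatch'; sessionId: string }

// Turns a structured error into a message suitable for the UI or the log.
function describeAgentError(error: AgentRunError): string {
  switch (error.kind) {
    case 'script_missing':
      return `Agent script not found at ${error.scriptPath}`
    case 'timeout':
      return `Agent timed out after ${error.timeoutMs} ms`
    case 'crashed':
      return `Agent exited unexpectedly (exit code ${error.exitCode ?? 'unknown'})`
    case 'session_mismatch':
      return `Claude session for ${error.sessionId} could not be resumed`
  }
}

console.log(describeAgentError({ kind: 'timeout', timeoutMs: 5000 }))
```

Because the `switch` covers every `kind`, adding a new variant later makes the compiler flag any place that forgot to handle it.
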
### Observability
- **Structured logging**: Comprehensive logging throughout the agent execution pipeline
- **Performance tracking**: Monitor agent execution times and resource usage
- **Health checks**: Periodic validation of agent system functionality

## Migration Strategy

### Backward Compatibility
- **Database migration**: Handle existing sessions without `claude_session_id`
- **Component migration**: Gradual update of UI components to new prompt-based interface
- **Testing strategy**: Comprehensive testing of both old and new flows during transition

### Rollout Plan
1. **Backend first**: Implement new `AgentExecutionService` with feature flag
2. **Database schema**: Add `claude_session_id` field with migration
3. **Frontend components**: Update components one by one
4. **IPC integration**: Connect new frontend to new backend
5. **Cleanup**: Remove old command execution code once migration is complete