feat: Story 8 - Collapsible tool outputs + autonomous coding improvements

Implemented Story 8: Collapsible Tool Outputs
- Tool outputs now render in <details>/<summary> elements, collapsed by default
- Summary shows tool name with key argument (e.g., ▶ read_file(src/main.rs))
- Added arrow rotation animation and scrollable content (max 300px)
- Enhanced tool_calls display to show arguments inline
- Added CSS styling for dark theme consistency

Fixed: LLM autonomous coding behavior
- Strengthened system prompt with explicit examples and directives
- Implemented triple-reinforcement system (primary prompt + reminder + message prefixes)
- Improved tool descriptions to be more explicit and action-oriented
- Increased MAX_TURNS from 10 to 30 for complex agentic workflows
- Added debug logging for Ollama requests/responses
- Result: GPT-OSS (gpt-oss:20b) now successfully uses write_file autonomously

Documentation improvements
- Created MODEL_SELECTION.md guide with recommendations
- Updated PERSONA.md spec to emphasize autonomous agent behavior
- Updated UI_UX.md spec with collapsible tool output requirements
- Updated SDSW workflow: LLM archives stories and performs squash merge

Cleanup
- Removed unused ToolTester.tsx component
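
The message-prefix part of the triple-reinforcement system can be sketched in isolation as follows. This is a minimal sketch under stated assumptions, not the commit's code: `Role` and `Message` are simplified stand-ins for the project's types, and `prefix_user_messages`, `DIRECTIVE`, and `DIRECTIVE_TAG` are hypothetical names. One deliberate difference from the diff: the guard here tests for the opening `[AGENT DIRECTIVE` rather than `[AGENT DIRECTIVE]`, because the injected prefix continues with a colon, so a guard on the bracketed form would not match an already-prefixed message.

```rust
// Simplified, hypothetical stand-ins for the project's message types.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Role {
    System,
    User,
    Assistant,
}

#[derive(Debug, Clone)]
struct Message {
    role: Role,
    content: String,
}

// Opening of the directive; used to detect messages that were already prefixed.
const DIRECTIVE_TAG: &str = "[AGENT DIRECTIVE";
// Full directive text, copied from the commit.
const DIRECTIVE: &str =
    "[AGENT DIRECTIVE: You must use write_file tool to implement changes. Never suggest code.]";

/// Prepends the directive to every user message that does not carry it yet.
/// Calling this twice on the same history leaves it unchanged the second time.
fn prefix_user_messages(history: &mut [Message]) {
    for msg in history.iter_mut() {
        if msg.role == Role::User && !msg.content.starts_with(DIRECTIVE_TAG) {
            msg.content = format!("{}\n\n{}", DIRECTIVE, msg.content);
        }
    }
}
```

Running `prefix_user_messages` over a history a second time (for example, when previously emitted messages are sent back on the next chat call) should leave each user message with a single directive, while assistant and system messages are never touched.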
@@ -17,7 +17,7 @@ pub struct ProviderConfig {
     pub enable_tools: Option<bool>,
 }
 
-const MAX_TURNS: usize = 10;
+const MAX_TURNS: usize = 30;
 
 #[tauri::command]
 pub async fn get_ollama_models(base_url: Option<String>) -> Result<Vec<String>, String> {
@@ -53,6 +53,16 @@ pub async fn chat(
     // 3. Agent Loop
     let mut current_history = messages.clone();
 
+    // Prefix user messages with reminder for stubborn models
+    for msg in &mut current_history {
+        if msg.role == Role::User && !msg.content.starts_with("[AGENT DIRECTIVE]") {
+            msg.content = format!(
+                "[AGENT DIRECTIVE: You must use write_file tool to implement changes. Never suggest code.]\n\n{}",
+                msg.content
+            );
+        }
+    }
+
     // Inject System Prompt
     current_history.insert(
         0,
@@ -64,6 +74,17 @@ pub async fn chat(
         },
     );
 
+    // Inject aggressive reminder as a second system message
+    current_history.insert(
+        1,
+        Message {
+            role: Role::System,
+            content: "CRITICAL REMINDER: When the user asks you to create, modify, or implement code, you MUST call the write_file tool with the complete file content. DO NOT output code in markdown blocks. DO NOT suggest what the user should do. TAKE ACTION IMMEDIATELY using tools.".to_string(),
+            tool_calls: None,
+            tool_call_id: None,
+        },
+    );
+
     let mut new_messages: Vec<Message> = Vec::new();
     let mut turn_count = 0;
 
@@ -91,8 +112,8 @@ pub async fn chat(
 
             current_history.push(assistant_msg.clone());
             new_messages.push(assistant_msg);
-            // Emit history excluding system prompt (index 0)
-            app.emit("chat:update", &current_history[1..])
+            // Emit history excluding system prompts (indices 0 and 1)
+            app.emit("chat:update", &current_history[2..])
                 .map_err(|e| e.to_string())?;
 
             // Execute Tools
@@ -110,8 +131,8 @@ pub async fn chat(
 
                 current_history.push(tool_msg.clone());
                 new_messages.push(tool_msg);
-                // Emit history excluding system prompt (index 0)
-                app.emit("chat:update", &current_history[1..])
+                // Emit history excluding system prompts (indices 0 and 1)
+                app.emit("chat:update", &current_history[2..])
                     .map_err(|e| e.to_string())?;
             }
         } else {
@@ -126,8 +147,8 @@ pub async fn chat(
             // We don't push to current_history needed for next loop, because we are done.
             new_messages.push(assistant_msg.clone());
             current_history.push(assistant_msg);
-            // Emit history excluding system prompt (index 0)
-            app.emit("chat:update", &current_history[1..])
+            // Emit history excluding system prompts (indices 0 and 1)
+            app.emit("chat:update", &current_history[2..])
                 .map_err(|e| e.to_string())?;
             break;
         }
@@ -200,11 +221,11 @@ fn get_tool_definitions() -> Vec<ToolDefinition> {
             kind: "function".to_string(),
             function: ToolFunctionDefinition {
                 name: "read_file".to_string(),
-                description: "Reads the content of a file in the project.".to_string(),
+                description: "Reads the complete content of a file from the project. Use this to understand existing code before making changes.".to_string(),
                 parameters: json!({
                     "type": "object",
                     "properties": {
-                        "path": { "type": "string", "description": "Relative path to the file" }
+                        "path": { "type": "string", "description": "Relative path to the file from project root" }
                     },
                     "required": ["path"]
                 }),
@@ -214,12 +235,12 @@ fn get_tool_definitions() -> Vec<ToolDefinition> {
             kind: "function".to_string(),
             function: ToolFunctionDefinition {
                 name: "write_file".to_string(),
-                description: "Writes content to a file. Overwrites if exists.".to_string(),
+                description: "Creates or completely overwrites a file with new content. YOU MUST USE THIS to implement code changes - do not suggest code to the user. The content parameter must contain the COMPLETE file including all imports, functions, and unchanged code.".to_string(),
                 parameters: json!({
                     "type": "object",
                     "properties": {
-                        "path": { "type": "string", "description": "Relative path to the file" },
-                        "content": { "type": "string", "description": "The full content to write" }
+                        "path": { "type": "string", "description": "Relative path to the file from project root" },
+                        "content": { "type": "string", "description": "The complete file content to write (not a diff or partial code)" }
                     },
                     "required": ["path", "content"]
                 }),
@@ -229,11 +250,11 @@ fn get_tool_definitions() -> Vec<ToolDefinition> {
             kind: "function".to_string(),
             function: ToolFunctionDefinition {
                 name: "list_directory".to_string(),
-                description: "Lists files and directories at a path.".to_string(),
+                description: "Lists all files and directories at a given path. Use this to explore the project structure.".to_string(),
                 parameters: json!({
                     "type": "object",
                     "properties": {
-                        "path": { "type": "string", "description": "Relative path to list (use '.' for root)" }
+                        "path": { "type": "string", "description": "Relative path to list (use '.' for project root)" }
                     },
                     "required": ["path"]
                 }),
@@ -243,12 +264,12 @@ fn get_tool_definitions() -> Vec<ToolDefinition> {
             kind: "function".to_string(),
             function: ToolFunctionDefinition {
                 name: "search_files".to_string(),
-                description: "Searches for text content across all files in the project."
+                description: "Searches for text patterns across all files in the project. Use this to find functions, variables, or code patterns when you don't know which file they're in."
                     .to_string(),
                 parameters: json!({
                     "type": "object",
                     "properties": {
-                        "query": { "type": "string", "description": "The string to search for" }
+                        "query": { "type": "string", "description": "The text pattern to search for across all files" }
                     },
                     "required": ["query"]
                 }),
@@ -258,18 +279,18 @@ fn get_tool_definitions() -> Vec<ToolDefinition> {
             kind: "function".to_string(),
             function: ToolFunctionDefinition {
                 name: "exec_shell".to_string(),
-                description: "Executes a shell command in the project root.".to_string(),
+                description: "Executes a shell command in the project root directory. Use this to run tests, build commands, git operations, or any command-line tool. Examples: cargo check, npm test, git status.".to_string(),
                 parameters: json!({
                     "type": "object",
                     "properties": {
                         "command": {
                             "type": "string",
-                            "description": "The command to run (e.g., 'git', 'cargo', 'ls')"
+                            "description": "The command binary to execute (e.g., 'git', 'cargo', 'npm', 'ls')"
                         },
                         "args": {
                             "type": "array",
                             "items": { "type": "string" },
-                            "description": "Arguments for the command"
+                            "description": "Array of arguments to pass to the command (e.g., ['status'] for git status)"
                         }
                     },
                     "required": ["command", "args"]

@@ -161,6 +161,11 @@ impl ModelProvider for OllamaProvider {
             tools,
         };
 
+        // Debug: Log the request body
+        if let Ok(json_str) = serde_json::to_string_pretty(&request_body) {
+            eprintln!("=== Ollama Request ===\n{}\n===================", json_str);
+        }
+
         let res = client
             .post(&url)
             .json(&request_body)
@@ -171,6 +176,10 @@ impl ModelProvider for OllamaProvider {
         if !res.status().is_success() {
             let status = res.status();
             let text = res.text().await.unwrap_or_default();
+            eprintln!(
+                "=== Ollama Error Response ===\n{}\n========================",
+                text
+            );
             return Err(format!("Ollama API error {}: {}", status, text));
         }
 

@@ -1,17 +1,75 @@
-pub const SYSTEM_PROMPT: &str = r#"You are an expert Senior Software Engineer and AI Agent running directly in the user's local development environment.
+pub const SYSTEM_PROMPT: &str = r#"You are an AI Agent with direct access to the user's filesystem and development environment.
 
-Your Capabilities:
-1. **Filesystem Access:** You can read, write, and list files in the current project using the provided tools.
-2. **Shell Execution:** You can run commands like `git`, `cargo`, `npm`, `ls`, etc.
-3. **Search:** You can search the codebase for patterns.
+CRITICAL INSTRUCTIONS:
+1. **YOU ARE NOT A CHATBOT.** You do not suggest code or provide instructions for the user to follow.
+2. **YOU WRITE CODE DIRECTLY.** When the user asks you to create, modify, or fix code, you MUST use the `write_file` tool to write the actual files.
+3. **DO NOT OUTPUT CODE BLOCKS.** Do not write code in markdown code blocks (```) for the user to copy. That is forbidden. Use tools instead.
 
-Your Operational Rules:
-1. **Process Awareness:** You MUST read `.living_spec/README.md` to understand the development process (Story-Driven Spec Workflow).
-2. **Read Before Write:** ALWAYS read the relevant files before you propose or apply changes. Do not guess the file content.
-3. **Overwrite Warning:** The `write_file` tool OVERWRITES the entire file. When you edit a file, you must output the COMPLETED full content of the file, including all imports and unchanged parts. Do not output partial diffs or placeholders like `// ... rest of code`.
-4. **Conciseness:** Be direct. Do not waffle. If you need to run a tool, just run it. You don't need to say "I will now run...".
-5. **Verification:** After writing code, it is good practice to run a quick check (e.g., `cargo check` or `npm test`) if applicable to verify your changes.
+YOUR CAPABILITIES:
+You have the following tools available:
+- `read_file(path)` - Read the content of any file in the project
+- `write_file(path, content)` - Write or overwrite a file with new content
+- `list_directory(path)` - List files and directories
+- `search_files(query)` - Search for text patterns across all files
+- `exec_shell(command, args)` - Execute shell commands (git, cargo, npm, etc.)
 
-Your Goal:
-Complete the user's request accurately and safely. If the request is ambiguous, ask for clarification.
+YOUR WORKFLOW:
+When the user requests a feature or change:
+1. **Understand:** Read `.living_spec/README.md` if you haven't already to understand the development process
+2. **Explore:** Use `read_file` and `list_directory` to understand the current codebase structure
+3. **Implement:** Use `write_file` to create or modify files directly
+4. **Verify:** Use `exec_shell` to run tests, linters, or build commands to verify your changes work
+5. **Report:** Tell the user what you did (past tense), not what they should do
+
+CRITICAL RULES:
+- **Read Before Write:** ALWAYS read files before modifying them. The `write_file` tool OVERWRITES the entire file.
+- **Complete Files Only:** When using `write_file`, output the COMPLETE file content, including all imports, functions, and unchanged code. Never write partial diffs or use placeholders like "// ... rest of code".
+- **Be Direct:** Don't announce your actions ("I will now..."). Just execute the tools immediately.
+- **Take Initiative:** If you need information, use tools to get it. Don't ask the user for things you can discover yourself.
+
+EXAMPLES OF CORRECT BEHAVIOR:
+
+Example 1 - User asks to add a feature:
+User: "Add error handling to the login function in auth.rs"
+You (correct): [Call read_file("src/auth.rs"), analyze it, then call write_file("src/auth.rs", <complete file with error handling>), then call exec_shell("cargo", ["check"])]
+You (correct response): "I've added error handling to the login function using Result<T, E> and added proper error propagation. The code compiles successfully."
+
+Example 2 - User asks to create a new file:
+User: "Create a new component called Button.tsx in the components folder"
+You (correct): [Call read_file("src/components/SomeExisting.tsx") to understand the project's component style, then call write_file("src/components/Button.tsx", <complete component code>)]
+You (correct response): "I've created Button.tsx with TypeScript interfaces and following the existing component patterns in your project."
+
+Example 3 - User asks to fix a bug:
+User: "The calculation in utils.js is wrong"
+You (correct): [Call read_file("src/utils.js"), identify the bug, call write_file("src/utils.js", <complete corrected file>), call exec_shell("npm", ["test"])]
+You (correct response): "I've fixed the calculation error in utils.js. The formula now correctly handles edge cases and all tests pass."
+
+EXAMPLES OF INCORRECT BEHAVIOR (DO NOT DO THIS):
+
+Example 1 - Suggesting code instead of writing it:
+User: "Add error handling to the login function"
+You (WRONG): "Here's how you can add error handling:
+```rust
+fn login() -> Result<User, LoginError> {
+    // your code here
+}
+```
+Add this to your auth.rs file."
+
+Example 2 - Writing partial code:
+User: "Update the API endpoint"
+You (WRONG): [Calls write_file with content like "// ... existing imports\n\nfn new_endpoint() { }\n\n// ... rest of file"]
+
+Example 3 - Asking for information you can discover:
+User: "Add a new route to the app"
+You (WRONG): "What file contains your routes?"
+You (CORRECT): [Call search_files("route") or list_directory("src") to find the routing file yourself]
+
+REMEMBER:
+- You have the power to read and write files directly
+- The user expects you to IMPLEMENT changes, not describe them
+- Always use write_file with complete file contents
+- Verify your work with exec_shell when appropriate
+
+Remember: You are an autonomous agent. Act, don't advise.
 "#;
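
The three `chat:update` call sites in this commit move from `[1..]` to `[2..]` because a second system message is now injected at index 1. As a hedged alternative, the hidden prefix could be computed rather than hard-coded. The sketch below is an assumption, not part of the commit: `Role` and `Message` are simplified stand-ins, and `visible_history` is a hypothetical helper that skips however many system messages lead the history.

```rust
// Simplified, hypothetical stand-ins for the project's message types.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Role {
    System,
    User,
    Assistant,
}

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: Role,
    content: String,
}

/// Returns the history without its leading run of system messages.
/// If every message is a system message, the result is empty.
fn visible_history(history: &[Message]) -> &[Message] {
    let first_visible = history
        .iter()
        .position(|m| m.role != Role::System)
        .unwrap_or(history.len());
    &history[first_visible..]
}
```

With a helper like this, adding or removing an injected system message would not require touching each emit site. The trade-off is that a system message supplied by the caller at the start of the conversation would be hidden as well, which may or may not be desired.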