The Agent Testing interface provides a safe, controlled environment to test your agent’s responses, validate playbook configurations, and ensure tools are working correctly before deploying changes to production.
Navigate to 🧪 Agent Testing in the sidebar when viewing any agent.
The testing interface provides several configuration options:
Session: test-session-[timestamp]-[random]
Status: active
Messages: 0
Duration: 0s
Language: en
Site: [selected site]
[New Session]
Session Features:
The testing chat interface includes:
You: I’m looking for shoes
Agent: I’d be happy to help you find shoes. What type of shoes are you looking for?
2. **Intent Recognition Test**
You: I want to check my order status
Agent: I can help you check your order status. Could you please provide your order number?
You: It’s ORD-12345
Agent: Let me look up that order for you… [Tool: Order Status Check]
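When reviewing a transcript like this one, a short script can confirm that the expected tool was triggered. The sketch below assumes the conversation has been copied out as plain text and that tool invocations appear as `[Tool: <name>]` markers, as in the example. The function name `tools_invoked` is hypothetical and not part of the product.

```python
import re

# Matches markers such as "[Tool: Order Status Check]".
# The marker format is assumed from the example conversation.
TOOL_MARKER = re.compile(r"\[Tool:\s*([^\]]+)\]")


def tools_invoked(transcript: str) -> list[str]:
    """Return the tool names mentioned in a transcript, in order of appearance."""
    return [name.strip() for name in TOOL_MARKER.findall(transcript)]


transcript = """\
You: I want to check my order status
Agent: I can help you check your order status. Could you please provide your order number?
You: It's ORD-12345
Agent: Let me look up that order for you... [Tool: Order Status Check]
"""

# The order-status turn should have triggered exactly one tool.
assert tools_invoked(transcript) == ["Order Status Check"]
```

An empty result for a turn that should have used a tool suggests the intent was not recognized or the playbook did not route to the tool.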
### Tool Integration Testing
Test tool functionality within conversations:
You: Find me red running shoes under $100
Agent: I’ll search for red running shoes under $100.
🔧 Tool Call: Product Search API
Input: { "query": "red running shoes", "maxPrice": 100, "color": "red" }
✅ Tool Response: Product Search API
Output: { "products": […], "count": 5 }
Agent: I found 5 red running shoes under $100. Here are the options…
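To verify parameter passing, compare the tool input shown in the conversation with the values you expected the agent to extract. This is a minimal sketch, assuming the tool input can be copied from the testing interface as JSON text. The helper `check_tool_input` is hypothetical.

```python
import json


def check_tool_input(raw_input: str, expected: dict) -> list[str]:
    """Compare a captured tool-call input (JSON text) with expected parameters.

    Returns a list of human-readable mismatches; an empty list means a match.
    """
    actual = json.loads(raw_input)
    problems = []
    for key, want in expected.items():
        if key not in actual:
            problems.append(f"missing parameter: {key}")
        elif actual[key] != want:
            problems.append(f"{key}: expected {want!r}, got {actual[key]!r}")
    return problems


# Input copied from the example conversation.
captured = '{"query": "red running shoes", "maxPrice": 100, "color": "red"}'

# The price limit and color from the user's message should both be present.
assert check_tool_input(captured, {"maxPrice": 100, "color": "red"}) == []

# A deliberately wrong expectation shows what a mismatch report looks like.
print(check_tool_input(captured, {"maxPrice": 50, "size": 42}))
```

Only the keys listed in `expected` are checked, so extra parameters the agent adds do not cause a failure.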
### Playbook Flow Testing
Validate complex playbook logic:
1. **Test Main Path**
- Follow expected user journey
- Verify all steps execute
- Check tool integrations
2. **Test Alternative Paths**
- Try different user inputs
- Verify conditional logic
- Test error scenarios
3. **Test Edge Cases**
- Unusual inputs
- Missing information
- System errors
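The three path types above can be kept as a small table of cases and replayed after each playbook change. The following is only a sketch: `stub_agent` is a hypothetical stand-in for the agent under test, and its replies are made up for illustration. In practice you would paste in the replies observed in the testing interface.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PathCase:
    kind: str          # "main", "alternative", or "edge"
    user_input: str
    must_contain: str  # substring expected somewhere in the agent reply


def stub_agent(message: str) -> str:
    """Hypothetical stand-in for the agent under test."""
    text = message.strip()
    if not text:
        return "Could you tell me what you are looking for?"
    if "order" in text.lower():
        return "Could you please provide your order number?"
    return f"Searching for: {text}"


def run_cases(agent: Callable[[str], str], cases: list[PathCase]) -> list[str]:
    """Run every case and return a description of each failure."""
    failures = []
    for case in cases:
        reply = agent(case.user_input)
        if case.must_contain.lower() not in reply.lower():
            failures.append(f"[{case.kind}] {case.user_input!r} -> {reply!r}")
    return failures


cases = [
    PathCase("main", "red running shoes", "searching"),
    PathCase("alternative", "check my order", "order number"),
    PathCase("edge", "   ", "what you are looking for"),  # missing information
]

assert run_cases(stub_agent, cases) == []
```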
### Multi-turn Conversation Testing
Test context retention across multiple turns:
Turn 1: User asks about products
Turn 2: Agent shows options
Turn 3: User asks for more details
Turn 4: Agent remembers context and elaborates
Turn 5: User makes selection
Turn 6: Agent confirms and proceeds
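The property being tested is that a follow-up such as "more details" is answered using what was said earlier, not treated as a fresh request. The sketch below illustrates that check with `StubSession`, a hypothetical class that mimics a session with memory. It is not the product's session API, and its replies are invented for the example.

```python
from typing import Optional


class StubSession:
    """Hypothetical test session that remembers the topic of earlier turns."""

    def __init__(self) -> None:
        self.history: list[tuple[str, str]] = []
        self.topic: Optional[str] = None

    def send(self, message: str) -> str:
        lowered = message.lower()
        if "shoes" in lowered:
            self.topic = "shoes"
            reply = "Here are some options for shoes."
        elif "more details" in lowered and self.topic:
            # Uses the remembered topic instead of asking again.
            reply = f"Here are more details about the {self.topic} I showed you."
        elif "more details" in lowered:
            reply = "More details about what?"
        else:
            reply = "How can I help?"
        self.history.append((message, reply))
        return reply


# With context: the follow-up is resolved against the earlier topic.
session = StubSession()
session.send("I'm looking for shoes")
assert "shoes" in session.send("Can you give me more details?")

# Without context: the same message in a new session cannot be resolved.
fresh = StubSession()
assert fresh.send("Can you give me more details?") == "More details about what?"
```

Running the same follow-up in a new session is a useful control: if the reply is identical with and without the earlier turns, context is not being used.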
## Testing Strategies
### Functional Testing
1. **Playbook Validation**
- Test each playbook individually
- Verify start playbook works
- Check playbook transitions
2. **Tool Verification**
- Test each tool in isolation
- Verify parameter passing
- Check error handling
3. **Integration Testing**
- Test complete workflows
- Verify tool chains
- Check data flow
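Some playbook validation can be done before any conversation is run, for example checking that the start playbook exists and that every transition points to a defined playbook. The sketch below assumes the playbook configuration can be represented as a mapping from playbook name to the playbooks it can hand off to. That structure and all names in it are hypothetical.

```python
from collections import deque

# Hypothetical configuration: playbook name -> playbooks it can transition to.
playbooks = {
    "greeting": ["product_search", "order_status"],
    "product_search": ["checkout"],
    "order_status": [],
    "checkout": [],
    "returns": [],  # defined but never reached from the start playbook
}


def validate_playbooks(config: dict[str, list[str]], start: str) -> list[str]:
    """Report a missing start playbook, dangling transitions, and unreachable playbooks."""
    if start not in config:
        return [f"start playbook not defined: {start}"]
    issues = []
    for name, targets in config.items():
        for target in targets:
            if target not in config:
                issues.append(f"{name} -> {target}: target not defined")
    # Breadth-first walk from the start playbook to find what is reachable.
    seen = {start}
    queue = deque([start])
    while queue:
        for target in config[queue.popleft()]:
            if target in config and target not in seen:
                seen.add(target)
                queue.append(target)
    issues.extend(f"unreachable from start: {name}" for name in config if name not in seen)
    return issues


print(validate_playbooks(playbooks, "greeting"))
# prints ['unreachable from start: returns']
```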
### Performance Testing
1. **Response Time**
- Monitor agent response speed
- Check tool execution time
- Identify bottlenecks
2. **Concurrent Testing**
- Multiple test sessions
- Load simulation
- Resource monitoring
3. **Stress Testing**
- Long conversations
- Rapid messages
- Complex queries
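Response time and concurrent load can be measured with a small harness. This is a rough sketch using only the Python standard library. `stub_agent` is a hypothetical placeholder that sleeps to simulate latency, and would be replaced by whatever call reaches your agent.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def stub_agent(message: str) -> str:
    """Hypothetical stand-in for a call to the agent; sleeps to simulate latency."""
    time.sleep(0.01)
    return f"echo: {message}"


def timed_call(message: str) -> float:
    """Return the wall-clock seconds taken by one agent call."""
    started = time.perf_counter()
    stub_agent(message)
    return time.perf_counter() - started


def run_load(messages: list[str], workers: int) -> dict:
    """Send messages from several threads at once and summarize latency."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        latencies = list(pool.map(timed_call, messages))
    return {
        "count": len(latencies),
        "mean_s": statistics.mean(latencies),
        "max_s": max(latencies),
    }


# 20 messages spread across 5 concurrent workers.
summary = run_load([f"message {i}" for i in range(20)], workers=5)
print(summary)
```

Comparing `mean_s` and `max_s` across runs with different `workers` values gives a first indication of where bottlenecks appear under load.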
### User Experience Testing
1. **Conversation Quality**
- Natural language flow
- Appropriate responses
- Helpful guidance
2. **Error Handling**
- Graceful failures
- Clear error messages
- Recovery options
3. **Context Management**
- Information retention
- Relevant responses
- Logical flow
## Test Scenarios
### Common Test Cases
1. **Greeting and Introduction**
- Test: “Hello”
- Expected: Warm greeting with capabilities mention
2. **Product Search**
- Test: “Show me blue shirts”
- Expected: Tool call to search, display results
3. **Order Management**
- Test: “Check order 12345”
- Expected: Order lookup, status display
4. **Error Handling**
- Test: “Check order INVALID”
- Expected: Polite error message, guidance
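These cases can be recorded as data so the same checks are repeated after every change. The sketch below is one possible encoding, not a product feature. The `Scenario` and `Observed` classes, the keyword lists, and the sample reply are all assumptions made for illustration. A tester fills in `Observed` from what the testing interface shows.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Observed:
    """What the tester saw for one message: the reply and any tool calls shown."""
    reply: str
    tools: list[str] = field(default_factory=list)


@dataclass
class Scenario:
    name: str
    message: str
    expect_tool: Optional[bool]  # True/False, or None when either is acceptable
    reply_keywords: list[str]    # at least one must appear in the reply


# Keyword lists are illustrative guesses; adjust them to your agent's wording.
SCENARIOS = [
    Scenario("Greeting and Introduction", "Hello", False, ["hello", "hi", "welcome"]),
    Scenario("Product Search", "Show me blue shirts", True, ["shirt"]),
    Scenario("Order Management", "Check order 12345", True, ["order"]),
    Scenario("Error Handling", "Check order INVALID", None, ["sorry", "unable", "couldn't"]),
]


def evaluate(scenario: Scenario, observed: Observed) -> bool:
    """Return True when tool usage and reply wording both match expectations."""
    tool_ok = scenario.expect_tool is None or scenario.expect_tool == bool(observed.tools)
    reply = observed.reply.lower()
    return tool_ok and any(word in reply for word in scenario.reply_keywords)


product_search = SCENARIOS[1]
# A reply with a tool call passes; the same reply without one fails.
assert evaluate(product_search, Observed("I found 3 blue shirts.", ["Product Search API"]))
assert not evaluate(product_search, Observed("I found 3 blue shirts.", []))
```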
### Advanced Test Cases
1. **Multi-Intent Handling**
- Test: “I want to buy shoes and check my order”
- Expected: Handle both intents appropriately
2. **Context Switching**
- Test: Start with product search, switch to support
- Expected: Smooth transition between contexts
3. **Clarification Requests**
- Test: Provide ambiguous input
- Expected: Agent asks clarifying questions
Before deploying changes:
Document your tests:
Generate reports showing: