Kibo.AgenticAdmin

# Agent Testing

## Overview

The Agent Testing interface provides a safe, controlled environment to test your agent’s responses, validate playbook configurations, and ensure tools are working correctly before deploying changes to production.

## Accessing Agent Testing

Navigate to 🧪 Agent Testing in the sidebar when viewing any agent.

## Testing Interface Components

### Configuration Panel

The testing interface provides several configuration options:

1. **Language Code**
   - Select the language for testing
   - Dropdown with available languages
   - Default: Agent’s primary language

2. **Max Message Length**
   - Character limit for messages
   - Default: 2000 characters
   - Adjustable based on needs

3. **Session Timeout**
   - Minutes before session expires
   - Default: 30 minutes
   - Prevents resource waste

4. **Site Context**
   - Select brand/site context
   - Affects agent behavior
   - Matches production environments
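
When test settings are kept in scripts or notes, it can help to model the panel as a small data structure. The sketch below is illustrative only: the class name `AgentTestConfig`, its field names, and the `en` fallback for the primary language are assumptions rather than part of the product. Only the defaults of 2000 characters and 30 minutes come from the list above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class AgentTestConfig:
    """Hypothetical mirror of the configuration panel; not a product API."""

    language_code: str = "en"            # assumed stand-in for the agent's primary language
    max_message_length: int = 2000       # documented default, in characters
    session_timeout_minutes: int = 30    # documented default
    site_context: str = ""               # brand/site selected for the session

    def validate(self) -> list[str]:
        """Return human-readable problems; an empty list means the config is usable."""
        problems = []
        if not self.language_code:
            problems.append("language_code must not be empty")
        if self.max_message_length <= 0:
            problems.append("max_message_length must be positive")
        if self.session_timeout_minutes <= 0:
            problems.append("session_timeout_minutes must be positive")
        return problems
```

Calling `AgentTestConfig().validate()` returns an empty list, since the defaults are all valid.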

### Session Management

Session: test-session-[timestamp]-[random]
Status: active
Messages: 0
Duration: 0s
Language: en
Site: [selected site]

[New Session]

Session Features:

- Unique session ID for tracking
- Real-time status updates
- Message counter
- Duration tracking
- One-click session reset
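
The session fields shown above can be approximated in a few lines, which is useful when writing notes or scripts that refer to a session. This is a rough sketch, not the interface's implementation: the millisecond timestamp, the 8-character hex suffix, and the names `new_session_id` and `SessionState` are all assumptions, since the exact encoding of `[timestamp]` and `[random]` is not documented here.

```python
import secrets
import time
from dataclasses import dataclass, field


def new_session_id(now_ms=None):
    """Build an ID shaped like test-session-[timestamp]-[random].

    Assumption: timestamp is Unix time in milliseconds and the random
    part is 8 hex characters; the real format may differ.
    """
    timestamp = int(time.time() * 1000) if now_ms is None else now_ms
    return f"test-session-{timestamp}-{secrets.token_hex(4)}"


@dataclass
class SessionState:
    """Hypothetical model of the status block (status, messages, duration)."""

    language: str = "en"
    site: str = ""
    session_id: str = field(default_factory=new_session_id)
    status: str = "active"
    messages: int = 0
    started_at: float = field(default_factory=time.monotonic)

    def record_message(self):
        """Increment the message counter by one."""
        self.messages += 1

    def duration_seconds(self, now=None):
        """Whole seconds elapsed since the session started."""
        current = time.monotonic() if now is None else now
        return int(current - self.started_at)
```

In this model, the New Session button corresponds to creating a fresh `SessionState()`, which gets a new ID and zeroed counters.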

### Message Interface

The testing chat interface includes:

1. **Message Input Area**
   - Multi-line text support
   - Shift+Enter for new lines
   - Character counter (0/2000)

2. **Send Button**
   - Enabled when a message is entered
   - Disabled when the input is empty
   - Keyboard shortcut: Enter

3. **Conversation Display**
   - Shows full conversation history
   - User messages and agent responses
   - Tool invocations and results
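
The input rules above (character counter, Send enabled only for non-empty input, Enter to send, Shift+Enter for a new line) can be written as three small pure functions. This is a sketch of the described behavior, not the interface's code: the function names are hypothetical, and two details are assumptions, namely that whitespace-only input counts as empty and that input over the limit disables sending.

```python
def char_counter(draft, limit=2000):
    """Text for the counter under the input box, e.g. '0/2000'."""
    return f"{len(draft)}/{limit}"


def send_enabled(draft, limit=2000):
    """True when the draft is non-empty and within the limit (both rules assumed)."""
    return 0 < len(draft.strip()) and len(draft) <= limit


def handle_enter(draft, shift_held, limit=2000):
    """Return (new_draft, message_to_send); message_to_send is None if nothing is sent."""
    if shift_held:
        # Shift+Enter inserts a line break instead of sending.
        return draft + "\n", None
    if send_enabled(draft, limit):
        # Plain Enter sends the draft and clears the input.
        return "", draft
    # Plain Enter on an empty or over-limit draft does nothing.
    return draft, None
```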

## Testing Workflows

### Basic Conversation Testing

1. **Simple Query Test**

```
You: Hello
Agent: Hello! Welcome to [Brand]. How can I help you today?

You: I'm looking for shoes
Agent: I'd be happy to help you find shoes. What type of shoes are you looking for?
```

2. **Intent Recognition Test**

```
You: I want to check my order status
Agent: I can help you check your order status. Could you please provide your order number?

You: It's ORD-12345
Agent: Let me look up that order for you...
[Tool: Order Status Check]
```

### Tool Integration Testing

Test tool functionality within conversations:

```
You: Find me red running shoes under $100

Agent: I'll search for red running shoes under $100.

🔧 Tool Call: Product Search API
Input: {
  "query": "red running shoes",
  "maxPrice": 100,
  "color": "red"
}

✅ Tool Response: Product Search API
Output: {
  "products": [...],
  "count": 5
}

Agent: I found 5 red running shoes under $100. Here are the options...
```

### Playbook Flow Testing

Validate complex playbook logic:

1. **Test Main Path**
   - Follow expected user journey
   - Verify all steps execute
   - Check tool integrations

2. **Test Alternative Paths**
   - Try different user inputs
   - Verify conditional logic
   - Test error scenarios

3. **Test Edge Cases**
   - Unusual inputs
   - Missing information
   - System errors

### Multi-turn Conversation Testing

Test context retention across multiple turns:

```
Turn 1: User asks about products
Turn 2: Agent shows options
Turn 3: User asks for more details
Turn 4: Agent remembers context and elaborates
Turn 5: User makes selection
Turn 6: Agent confirms and proceeds
```

## Testing Strategies

### Functional Testing

1. **Playbook Validation**
   - Test each playbook individually
   - Verify start playbook works
   - Check playbook transitions

2. **Tool Verification**
   - Test each tool in isolation
   - Verify parameter passing
   - Check error handling

3. **Integration Testing**
   - Test complete workflows
   - Verify tool chains
   - Check data flow

### Performance Testing

1. **Response Time**
   - Monitor agent response speed
   - Check tool execution time
   - Identify bottlenecks

2. **Concurrent Testing**
   - Multiple test sessions
   - Load simulation
   - Resource monitoring

3. **Stress Testing**
   - Long conversations
   - Rapid messages
   - Complex queries

### User Experience Testing

1. **Conversation Quality**
   - Natural language flow
   - Appropriate responses
   - Helpful guidance

2. **Error Handling**
   - Graceful failures
   - Clear error messages
   - Recovery options

3. **Context Management**
   - Information retention
   - Relevant responses
   - Logical flow

## Test Scenarios

### Common Test Cases

1. **Greeting and Introduction**

```
Test: "Hello"
Expected: Warm greeting with capabilities mention
```

2. **Product Search**

```
Test: "Show me blue shirts"
Expected: Tool call to search, display results
```

3. **Order Management**

```
Test: "Check order 12345"
Expected: Order lookup, status display
```

4. **Error Handling**

```
Test: "Check order INVALID"
Expected: Polite error message, guidance
```

### Advanced Test Cases

1. **Multi-Intent Handling**

```
Test: "I want to buy shoes and check my order"
Expected: Handle both intents appropriately
```

2. **Context Switching**

```
Test: Start with product search, switch to support
Expected: Smooth transition between contexts
```

3. **Clarification Requests**

```
Test: Provide ambiguous input
Expected: Agent asks clarifying questions
```
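
The Test/Expected pairs above lend themselves to a data-driven regression script. The following is a minimal sketch under stated assumptions: `fake_agent` is a local stub so the example runs offline, `AgentReply`, `Scenario`, and `run_scenarios` are hypothetical names, and each expectation is reduced to "which tool was called" plus "a phrase that must appear in the reply". No product API is used; in practice the stub would be replaced by whatever mechanism you use to send a message to the agent under test.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AgentReply:
    text: str
    tool_called: Optional[str] = None  # name of the tool the agent invoked, if any


@dataclass
class Scenario:
    name: str
    message: str
    expected_tool: Optional[str]  # None means no tool call is expected
    expected_phrase: str          # must appear in the reply (case-insensitive)


def fake_agent(message: str) -> AgentReply:
    """Stand-in for a real agent call; returns canned replies for the sketch."""
    lowered = message.lower()
    if "order" in lowered:
        order_id = message.split()[-1]
        if order_id.isdigit():
            return AgentReply(f"Order {order_id} has shipped.", "Order Status Check")
        return AgentReply("Sorry, I could not find that order. Please check the number.")
    if "show me" in lowered:
        return AgentReply("Here are some options.", "Product Search API")
    return AgentReply("Hello! How can I help you today?")


def run_scenarios(agent: Callable[[str], AgentReply], scenarios: list) -> dict:
    """Map each scenario name to True (pass) or False (fail)."""
    results = {}
    for s in scenarios:
        reply = agent(s.message)
        results[s.name] = (
            reply.tool_called == s.expected_tool
            and s.expected_phrase.lower() in reply.text.lower()
        )
    return results


SCENARIOS = [
    Scenario("greeting", "Hello", None, "help"),
    Scenario("product search", "Show me blue shirts", "Product Search API", "options"),
    Scenario("order lookup", "Check order 12345", "Order Status Check", "12345"),
    Scenario("invalid order", "Check order INVALID", None, "sorry"),
]
```

Running `run_scenarios(fake_agent, SCENARIOS)` yields one pass/fail entry per scenario, so a change that breaks a previously passing case shows up as a single `False` in the result.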

## Best Practices

### Testing Checklist

Before deploying changes:

### Documentation

Document your tests:

1. **Test Plan**
   - Scenarios to test
   - Expected outcomes
   - Success criteria

2. **Test Results**
   - Actual outcomes
   - Issues found
   - Resolution steps

3. **Regression Tests**
   - Key scenarios
   - Automated where possible
   - Regular execution

### Iterative Testing

1. **Start Simple**
   - Basic functionality first
   - Add complexity gradually
   - Build confidence

2. **Test Often**
   - After each change
   - Before deployments
   - Regular regression

3. **Learn from Production**
   - Replicate production issues
   - Test fixes thoroughly
   - Update test cases

## Troubleshooting

### Common Testing Issues

1. **Session Timeouts**
   - Extend timeout setting
   - Start new session
   - Check for inactivity

2. **Tool Failures**
   - Verify tool configuration
   - Check test environment
   - Review permissions

3. **Unexpected Responses**
   - Check active playbooks
   - Verify start playbook
   - Review recent changes

### Debug Techniques

1. **Verbose Mode**
   - Enable detailed logging
   - Review execution flow
   - Identify failure points

2. **Isolated Testing**
   - Test components separately
   - Simplify scenarios
   - Build up complexity

3. **Comparison Testing**
   - Test before/after changes
   - Compare environments
   - Identify differences

## Advanced Features

### Test Automation

1. **Scripted Tests**
   - Automated test sequences
   - Repeatable scenarios
   - Regression testing

2. **Test Data Management**
   - Test data sets
   - Environment variables
   - Mock responses

3. **Performance Monitoring**
   - Response time tracking
   - Resource utilization
   - Bottleneck identification
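
Mock responses and response-time tracking can be combined in a scripted test. The sketch below is one possible shape, not a product feature: `MockTool`, `timed`, and `search_step` are hypothetical helpers, and the canned product list is placeholder data. The tool name and parameters mirror the Product Search API example shown earlier in this guide.

```python
import time


class MockTool:
    """Returns a canned response and records every call for later inspection."""

    def __init__(self, name, canned_response):
        self.name = name
        self.canned_response = canned_response
        self.calls = []

    def __call__(self, **params):
        self.calls.append(params)
        return self.canned_response


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def search_step(tool, query, max_price, color):
    """Hypothetical agent step: call the search tool and summarize its output."""
    output = tool(query=query, maxPrice=max_price, color=color)
    return f"I found {output['count']} {query} under ${max_price}."


# Placeholder data: five dummy products so 'count' matches the list length.
product_search = MockTool(
    "Product Search API",
    {"products": [{"id": i} for i in range(5)], "count": 5},
)
reply, elapsed = timed(search_step, product_search, "red running shoes", 100, "red")
```

After the run, `product_search.calls` holds the exact parameters the step passed to the tool, which makes parameter-passing checks straightforward, and `elapsed` gives a per-step timing that can be compared across runs.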

### Environment Management

1. **Test Environments**
   - Isolated testing spaces
   - Production-like setup
   - Safe experimentation

2. **Configuration Profiles**
   - Different test configs
   - Quick switching
   - Scenario-based setup

3. **Data Isolation**
   - Separate test data
   - No production impact
   - Clean test state

## Integration with Development

### CI/CD Integration

- Automated testing in pipelines
- Pre-deployment validation
- Post-deployment verification

### Version Control

- Test case versioning
- Configuration tracking
- Change history

### Collaboration

- Shared test results
- Team testing sessions
- Knowledge sharing

## Reporting

### Test Reports

Generate reports showing:

- Test execution summary
- Pass/fail rates
- Performance metrics
- Issue tracking
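
If test results are recorded by hand or exported from a script, the summary figures above are simple to compute. This sketch assumes each result is a dictionary with `name`, `passed`, and `seconds` keys; that record shape and the function name `summarize` are illustrative choices, not an export format defined by the product.

```python
from statistics import mean


def summarize(results):
    """Compute pass/fail counts, pass rate (percent), and average duration.

    results: list of dicts with 'name' (str), 'passed' (bool), 'seconds' (float).
    """
    total = len(results)
    passed = sum(1 for r in results if r["passed"])
    return {
        "total": total,
        "passed": passed,
        "failed": total - passed,
        "pass_rate": round(100 * passed / total, 1) if total else 0.0,
        "avg_seconds": round(mean(r["seconds"] for r in results), 3) if total else 0.0,
        "failures": [r["name"] for r in results if not r["passed"]],
    }


# Invented sample data for illustration only.
sample = [
    {"name": "greeting", "passed": True, "seconds": 0.4},
    {"name": "order lookup", "passed": True, "seconds": 1.2},
    {"name": "invalid order", "passed": False, "seconds": 0.8},
]
summary = summarize(sample)
```

The `failures` list doubles as a starting point for issue tracking, since it names exactly which scenarios need follow-up.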

### Analytics

- Testing trends
- Common failures
- Improvement areas
- Success metrics

### Continuous Improvement

- Regular review cycles
- Test optimization
- Coverage expansion
- Quality metrics