Core Testing Tools
SPOUT's core testing suite provides comprehensive validation of all core modules and model configurations. The framework supports both single-model and multi-model test runs.
Core Test Runner
The core test runner validates all SPOUT modules against a single model:
# Windows
./test_core.bat
# Unix
./test_core.sh
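On Unix systems, the script may need to be marked executable before its first run:
chmod +x test_core.sh
./test_core.sh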
Modules Tested
The test suite evaluates all core modules:
- reduce
- expand
- enhance
- search
- mutate
- generate
- iterate
- translate
- converse
- parse
- evaluate
- imagine
Test Process
For each module, the runner (see the sketch after this list):
- Records initial test state
- Executes module tests
- Captures all outputs
- Calculates pass/fail rates
- Generates detailed logs
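In shell terms, the loop looks roughly like the following. This is a minimal sketch, not the shipped script: the run_module_test.sh helper and the exact log layout are assumptions for illustration.
#!/bin/bash
# Sketch of the core test loop; one pass/fail per module for brevity.
MODULES="reduce expand enhance search mutate generate iterate translate converse parse evaluate imagine"
LOG="tests/core_test-$(date +%Y-%m-%d_%H-%M-%S).txt"

pass=0; total=0
for module in $MODULES; do
  total=$((total + 1))
  echo "--- $module ---" >> "$LOG"
  # Hypothetical helper: runs one module's tests, exits non-zero on failure
  if ./run_module_test.sh "$module" >> "$LOG" 2>&1; then
    pass=$((pass + 1))
  fi
done
echo "Overall Pass Rate: $((100 * pass / total))% ($pass/$total)" >> "$LOG"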
Multi-Model Test Runner (Gamut)
The gamut test runner executes core tests across all active models:
# Windows
./test_gamut.bat
# Unix
./test_gamut.sh
Features
- Reads active models from models.ini
- Automatically switches between models
- Tracks per-model performance
- Calculates aggregate statistics
- Generates comprehensive reports
Configuration
Models are configured in spout/config/models.ini:
gpt-3.5-turbo=1
gpt-4=1
claude-2=0 # Disabled model
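Only entries set to 1 are treated as active. A quick way to list them from the shell (a sketch; it assumes one key=value pair per line, as in the example above):
# Print the names of all active (=1) models
awk -F'=' '$2+0 == 1 {print $1}' spout/config/models.ini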
Test Results
Core Test Output
Results are saved to tests/core_test-[timestamp].txt:
Test Results - 2024-03-21_14-30-22
===================
Overall Pass Rate: 95% (57/60)
Module Results:
reduce: 100% (5/5)
expand: 90% (9/10)
enhance: 95% (19/20)
...
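For scripted checks, the headline number can be pulled from the newest results file. A sketch, assuming the filename pattern and output format shown above:
# Find the most recent core test log and print its overall pass rate
latest=$(ls -t tests/core_test-*.txt | head -n 1)
grep 'Overall Pass Rate' "$latest"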
Gamut Test Output
Results are saved to tests/gamut_summary-[timestamp].txt:
Model Performance Summary - 2024-03-21_14-30-22
=================================
Total Models Tested: 3
Total Duration: 0h 15m 45s
Average Duration: 5m 15s
Average Pass Rate: 92%
Models Tested Successfully:
- gpt-3.5-turbo (90%)
- gpt-4 (95%)
- claude-2 (91%)
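The aggregate figures follow from the per-model lines: the average pass rate above is (90 + 95 + 91) / 3 = 92%. A sketch that recomputes it from a summary file, assuming the format shown:
# Average the percentages from lines like "- gpt-4 (95%)"
awk -F'[(%]' '/^- / { sum += $2; n++ } END { if (n) printf "Average Pass Rate: %d%%\n", sum / n }' tests/gamut_summary-*.txt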
Performance Metrics
Individual Tests
Each test captures:
- Execution time
- Response validity
- Error handling
- Token usage
- Response formatting
Aggregate Metrics
The summary includes:
- Overall pass rate
- Per-module statistics
- Total execution time
- Average response time
- Model comparisons
Best Practices
Regular Testing
- Run core tests after updates
- Test new models thoroughly
- Monitor performance trends
- Track error patterns
- Document issues
Test Management
- Archive test results
- Review performance regularly
- Compare model behaviors
- Track long-term trends
- Document anomalies
Error Analysis
- Review failed tests
- Check error patterns
- Validate error handling
- Monitor timeout rates
- Track recovery behavior
Run the gamut test suite when:
- Adding new models
- Updating model configurations
- Making significant changes
- Validating deployments
Core tests are designed to validate both functional correctness and error handling. Failed tests may indicate issues with model configuration, API access, or core module functionality.