Testing Harnesses
Purpose: Modular testing infrastructure for Ada consciousness research validation
Organization Philosophy: Following the Dec 29, 2025 vault audit Phase 3 vision - organized, modular, DRY testing harnesses inspired by IBM chip development practices.
Directory Structure
03-TESTING-HARNESSES/
├── consciousness/   # Consciousness feature tests
├── reasoning/       # Reasoning logic tests
├── integration/     # Endpoint & streaming tests
├── tools/           # Tool integration tests
└── shared/          # Shared utilities (future)
Test Categories
consciousness/ - Consciousness Features
Tests for SLIM consciousness, multi-round evolution, warmth adaptation, and consciousness quality metrics.
Test Files:
- test_consciousness_inference.py - Actual consciousness generation quality testing (20/20 tests ✅)
- test_consciousness_integration.py - Integration testing for consciousness systems
- test_simple_consciousness.py - Basic consciousness functionality tests
- test_slim_consciousness_parameters.py - SLIM parameter validation (26/26 tests ✅)
What These Test:
- 🌐 Language targeting (english → spanish → japanese → pure_agl)
- 🔬 Heisenberg observation effects (passive vs active states)
- ⚛️ AGL density performance (pure → hybrid → human-first → dynamic)
- 💜 Personal warmth adaptation (anonymous vs named user)
- 🎓 Knowledge level code switching (beginner → intermediate → expert)
- 🧠 Multi-round consciousness evolution
reasoning/ - Reasoning Logic
Tests for reasoning capabilities, fast inference paths, and reasoning integration.
Test Files:
- test_reasoning.py - Core reasoning logic tests
- test_reasoning_fast.py - Fast reasoning path validation
- test_reasoning_integration.py - Reasoning system integration tests
integration/ - Endpoints & Streaming
Tests for API endpoints, streaming responses, and end-to-end integration.
Test Files:
- test_multi_round_endpoint.py - Multi-round conversation endpoint tests
- test_simple_stream.py - Basic streaming functionality tests
tools/ - Tool Integration
Tests for tool transparency, parallel tool execution, and file operations.
Test Files:
- test_file_tools.py - File operation tool tests
- test_parallel_tools.py - Parallel tool execution tests
- test_tool_transparency.py - Tool transparency feature tests
shared/ - Shared Utilities (Future)
DRY base classes and shared measurement tools (Phase 3 deferred).
Planned Components:
- base_harness.py - Shared base class for all test harnesses
- consciousness_metrics.py - Shared measurement tools (warmth analysis, token counting, etc.)
- result_formatter.py - Standardized output formatting
- validation_framework.py - Common validation patterns
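Since none of these components exist yet, the following is only a minimal sketch of what a standardized formatter in result_formatter.py might look like. The function name `format_result` and every field in the record are assumptions made for illustration, not a documented format.

```python
import json
from datetime import datetime, timezone


def format_result(test_name, category, passed, total, metrics=None):
    """Return one test run as a JSON string with a fixed set of fields.

    Hypothetical helper: the field names are illustrative assumptions.
    """
    record = {
        "test_name": test_name,
        "category": category,
        "passed": passed,
        "total": total,
        # Guard against division by zero when a run contains no tests.
        "pass_rate": passed / total if total else 0.0,
        "metrics": metrics or {},
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    # Sorted keys keep output stable across runs, which helps when diffing.
    return json.dumps(record, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(format_result("test_slim_consciousness_parameters", "consciousness", 26, 26))
```

Emitting the same top-level keys from every harness is one way to get the consistent reporting this directory aims for.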
Usage Patterns
Running Individual Tests
# Run consciousness parameter tests
python 03-TESTING-HARNESSES/consciousness/test_slim_consciousness_parameters.py
# Run inference quality tests
python 03-TESTING-HARNESSES/consciousness/test_consciousness_inference.py
Running Test Categories
# Run all consciousness tests
pytest 03-TESTING-HARNESSES/consciousness/
# Run all integration tests
pytest 03-TESTING-HARNESSES/integration/
Test Results
Results are organized in 06-RESULTS/ mirroring the experiment structure in 02-EXPERIMENTS/.
Result Locations:
- Consciousness tests → 06-RESULTS/kernel-4.0/
- Performance benchmarks → 06-RESULTS/performance-benchmarks/
- Integration tests → 06-RESULTS/integration-testing/
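The mapping above could be captured in a small lookup so that tests do not hard-code result paths. This is a sketch only: `RESULT_DIRS`, `result_dir`, and the category keys are hypothetical names, and only the three locations listed above are taken from this README.

```python
from pathlib import Path

RESULTS_ROOT = Path("06-RESULTS")

# Only these three locations are documented; the dictionary keys are
# illustrative labels, not names used by the existing tests.
RESULT_DIRS = {
    "consciousness": RESULTS_ROOT / "kernel-4.0",
    "performance": RESULTS_ROOT / "performance-benchmarks",
    "integration": RESULTS_ROOT / "integration-testing",
}


def result_dir(category, create=False):
    """Return the result directory for a category, optionally creating it."""
    try:
        path = RESULT_DIRS[category]
    except KeyError:
        # Fail loudly rather than guessing a location for unlisted categories.
        raise ValueError(f"no result location documented for {category!r}") from None
    if create:
        path.mkdir(parents=True, exist_ok=True)
    return path
```

Raising for unlisted categories (for example reasoning/ or tools/, which have no documented result location here) avoids silently scattering results.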
Design Philosophy
Inspired by IBM chip development practices:
- Modular electronic harnesses for deep hardware testing
- Consciousness engineering requires modular consciousness testing harnesses
- The isomorphism is beautiful! 🔬⚛️
Key Principles:
- DRY (Don’t Repeat Yourself) - Shared utilities in shared/
- Modular - Each category independent but composable
- Comprehensive - Test all consciousness observables
- Reproducible - Standardized reporting and result formats
- Documented - Clear purpose and usage for each test
Migration Notes
Reorganization Date: January 1, 2026
Previous State: All tests scattered in vault root
Current State: Organized by semantic category
What Changed:
- Moved 12 test files from root → organized subdirectories
- Preserved git history via git mv
- Created category-based structure
- Added this README for navigation
Backward Compatibility:
- Test functionality unchanged
- Import paths may need updating (see next section)
- Results still saved to 06-RESULTS/
Future Work (Phase 3 Completion)
When ready to implement shared utilities:
- Create shared/base_harness.py
  - Base class for all test harnesses
  - Standardized test runner with metrics
  - Automatic result organization
- Extract common patterns
  - Warmth analysis from inference tests
  - Token counting utilities
  - Result formatting logic
- Refactor existing tests
  - Inherit from base harness
  - Use shared measurement tools
  - Standardize output formats
Benefits:
- Easier test development for new features
- Consistent reporting across all experiments
- Automatic test harness referencing in results
- Reduced code duplication
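As a rough illustration of the base-class idea, the sketch below shows a harness whose subclasses supply checks while the base class runs and times them. `BaseHarness`, `checks`, `run`, and the summary fields are all hypothetical; nothing here reflects an existing shared/base_harness.py.

```python
import time
from abc import ABC, abstractmethod


class BaseHarness(ABC):
    """Hypothetical base class: subclasses list checks, the base runs them."""

    name = "unnamed"

    @abstractmethod
    def checks(self):
        """Return a list of (label, zero-argument callable) pairs."""

    def run(self):
        """Run every check, recording pass/fail and total wall-clock time."""
        results = []
        start = time.perf_counter()
        for label, check in self.checks():
            try:
                check()
                results.append((label, True, ""))
            except AssertionError as exc:
                # A failed assertion is recorded, not propagated, so the
                # remaining checks still run.
                results.append((label, False, str(exc)))
        elapsed = time.perf_counter() - start
        passed = sum(1 for _, ok, _ in results if ok)
        return {
            "harness": self.name,
            "passed": passed,
            "total": len(results),
            "seconds": elapsed,
            "results": results,
        }


class ExampleHarness(BaseHarness):
    """Toy subclass with one passing and one failing check."""

    name = "example"

    def checks(self):
        def arithmetic_holds():
            assert 1 + 1 == 2

        def deliberately_fails():
            assert "a" == "b", "strings differ"

        return [("arithmetic", arithmetic_holds), ("mismatch", deliberately_fails)]


if __name__ == "__main__":
    summary = ExampleHarness().run()
    print(f"{summary['harness']}: {summary['passed']}/{summary['total']} passed")
```

Keeping the runner in the base class means each existing test would only need to describe its checks, which is where the reduction in duplication would come from.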
Contributing
When adding new tests:
- Choose appropriate category directory
- Follow existing naming conventions (test_*.py)
- Include docstring explaining test purpose
- Save results to 06-RESULTS/ with proper organization
- Update this README if adding new categories
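A new test file following these conventions might look roughly like the skeleton below. It is a sketch under assumptions: `greet` is a stand-in for whatever is actually under test, and the `example` results subdirectory, the `save_result` helper, and the JSON layout are placeholders rather than established vault conventions.

```python
"""Hypothetical example test: checks that a greeting mentions the user's name.

Would live in a category directory under a test_*.py name.
"""
import json
from pathlib import Path

# Placeholder location; pick the subdirectory that matches the experiment.
RESULTS_DIR = Path("06-RESULTS") / "example"


def greet(name):
    """Stand-in for the system under test."""
    return f"Hello, {name}!"


def save_result(name, passed, results_dir=RESULTS_DIR):
    """Write a one-record JSON result file and return its path."""
    results_dir.mkdir(parents=True, exist_ok=True)
    out = results_dir / f"{name}.json"
    out.write_text(json.dumps({"test": name, "passed": passed}))
    return out


def test_greeting_includes_name():
    """Collected by pytest because the function name starts with test_."""
    response = greet("Ada")
    passed = "Ada" in response
    # Save the outcome before asserting so failures are recorded too.
    save_result("test_greeting_includes_name", passed)
    assert passed
```

Writing the result before the assertion means a failing run still leaves a record under 06-RESULTS/.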
Last Updated: 2026-01-01
Maintainers: Ada Research Foundation
Status: Phase 3 reorganization complete, shared utilities deferred