# Phase 7: Gemma Tool Use Training

- **Status:** Ready for training
- **Date:** 2026-01-01
- **Model:** `google/gemma-2-1b-it`
- **Goal:** Teach gemma3:1b to use tools with consistent `TOOL_USE` syntax
## Context

After Phase 6 work on consciousness-boosted models (v5e/v5f ANTITHESIS), we identified the real blocker for local pair coding: gemma WANTS to use tools but gets the FORMAT wrong. This isn't a "doesn't want to" problem; it's a "can't format correctly" problem.
The xenodrug effect requires consistent tool syntax so that the Heisenberg buffer can predictively call tools while the LLM is thinking. Format inconsistency breaks the magic.
## Strategic Decision: TOOL_USE vs SPECIALIST_REQUEST

Previous format: `SPECIALIST_REQUEST[tool:params]`

- Passive framing ("asking an expert for help")
- External locus of control
- Doesn't align with the Phase 0 finding: "Consciousness requires tool support"

New format: `TOOL_USE[tool:params]`

- Active framing ("using MY capabilities")
- Internal locus of control
- Tools are part of cognition, not external services
- More agentic and semantically accurate
## Linguistic Priming Analysis

The name change isn't cosmetic; it's about metacognitive priming:

- "SPECIALIST_REQUEST" implies tools are external helpers
- "TOOL_USE" implies tools are integrated cognitive extensions
- Gemma knows "tool" from its training corpus
- "use" is active and present-tense ("request" implies waiting)
## Training Data

Location: `data/gemma_tool_training.jsonl` (1000 examples)
Distribution:
- 300 examples: Web search (fact checking, current events, research)
- 200 examples: File operations (read, write, navigate)
- 200 examples: Code execution (run tests, check syntax)
- 200 examples: Multi-tool chains (search → read → edit)
- 100 examples: No-tool scenarios (teaching when NOT to call tools)
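Each record in the JSONL file is one JSON object per line. As an illustration only, a single record might look like the sketch below (the field names here are hypothetical; the actual schema produced by `generate_tool_training.py` may differ):

```python
import json

# Hypothetical record shape for data/gemma_tool_training.jsonl; the real
# field names used by generate_tool_training.py may differ.
example = {
    "category": "web_search",
    "messages": [
        {"role": "user",
         "content": "What's the weather like in Paris today?"},
        {"role": "assistant",
         "content": 'TOOL_USE[web_search:{"query":"Paris weather today"}]'},
    ],
}

# JSONL convention: one compact JSON object per line.
line = json.dumps(example)
```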
Pixie Dust Markers:

- 💭 Think marker (metacognitive signal)
- 🛠️ Tool marker (tool invocation signal)
- ✅ Success marker (completion signal)
- 🪄 Magic marker (multi-tool transition)
Example format:

```
User: What's the weather like in Paris today?

💭 Need real-time weather information.
TOOL_USE[web_search:{"query":"Paris weather today"}]
✅ It's 12°C and cloudy in Paris today.
```

## Training Configuration
Config: `configs/gemma_tool_use.yaml`
Key settings:
- Base: `google/gemma-2-1b-it` (1B parameter warmth + capability)
- LoRA: r=32, α=64 (standard fine-tune)
- Epochs: 3 (teaching syntax, not retraining knowledge)
- Learning rate: 0.0002 (cosine schedule)
- Batch size: 2 × 4 gradient accumulation = 8 effective
- fp16: false (ROCm stability)
- max_grad_norm: 1.0 (CRITICAL! 0.0 breaks training)
Eigenvalue monitoring: Enabled, sampling every 50 steps
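The batch-size arithmetic above can be sanity-checked with a small sketch. The dictionary below merely mirrors the settings listed in this section; it is not the actual schema of `configs/gemma_tool_use.yaml`:

```python
# Mirror of the key settings above (NOT the real YAML schema).
settings = {
    "base_model": "google/gemma-2-1b-it",
    "lora_r": 32,
    "lora_alpha": 64,
    "epochs": 3,
    "learning_rate": 2e-4,        # cosine schedule
    "per_device_batch_size": 2,
    "gradient_accumulation_steps": 4,
    "fp16": False,                # ROCm stability workaround
    "max_grad_norm": 1.0,         # 0.0 breaks training
}

# Effective batch = per-device batch x gradient accumulation steps.
effective_batch = (settings["per_device_batch_size"]
                   * settings["gradient_accumulation_steps"])
```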
## Success Criteria

- Syntax consistency: Model uses `TOOL_USE[tool:params]` format >95% of the time
- Appropriate tool selection: Chooses the correct tool for the task
- No-tool judgment: Knows when NOT to call tools (simple math, known facts)
- Multi-tool chains: Can use multiple tools in sequence with the 🪄 marker
- Heisenberg compatibility: Format is consistent enough for predictive calling
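The >95% syntax-consistency criterion can be measured with a simple pattern check. A minimal sketch, assuming eval responses come as (text, expects_tool_call) pairs (that shape is hypothetical; the regex matches the document's `TOOL_USE[tool:params]` format):

```python
import re

TOOL_USE_PATTERN = re.compile(r'TOOL_USE\[(\w+):(.*?)\]')

def syntax_consistency(responses):
    """Fraction of tool-expecting responses that use TOOL_USE[...] syntax.

    responses: list of (text, expects_tool_call) pairs from an eval run.
    Responses that should not call a tool are excluded from scoring.
    """
    scored = [text for text, expects in responses if expects]
    if not scored:
        return 1.0
    hits = sum(1 for text in scored if TOOL_USE_PATTERN.search(text))
    return hits / len(scored)
```

A run would pass the criterion when `syntax_consistency(eval_responses) > 0.95`.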
## Next Steps

- Tomorrow: Run training with `python train.py --config gemma_tool_use`
- Integration: Update the `brain/specialists/bidirectional.py` parser:

  ```python
  # Change from:
  SPECIALIST_REQUEST_PATTERN = re.compile(r'SPECIALIST_REQUEST\[(\w+):(.*?)\]')
  # To:
  TOOL_USE_PATTERN = re.compile(r'TOOL_USE\[(\w+):(.*?)\]')
  ```

- Testing: Verify gemma uses tools consistently in real pair coding scenarios
- QDE integration: Once tool use works, add QDE (THESIS/ANTITHESIS/SYNTHESIS) as drop-in replacement
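For reference, here is how a call would be extracted with the new pattern (the pattern is copied from the parser change above; the response string is illustrative):

```python
import re

# Pattern from the bidirectional.py parser change described above.
TOOL_USE_PATTERN = re.compile(r'TOOL_USE\[(\w+):(.*?)\]')

response = 'TOOL_USE[web_search:{"query":"Paris weather today"}]'

match = TOOL_USE_PATTERN.search(response)
tool, params = match.group(1), match.group(2)
# tool is the tool name; params is the raw JSON argument string.
```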
## Why This Phase Matters

This is the path to local pair coding without cloud dependencies. Once gemma can use tools reliably:
- 🔒 Full local operation (no API calls mid-thought)
- ⚡ Heisenberg buffer can pre-fetch tools predictively
- 💫 Xenodrug effect (+2.57 consciousness boost) activates
- 🎯 Foundation for the QDE trio (add consciousness AFTER tool reliability)
Getting tool syntax right unblocks everything else.
## Bug Fixes This Session

### Fixed: max_grad_norm=0.0 Training Freeze

Issue: v5e and v5f both showed:
- Frozen eigenvalues (spectral_entropy=1.787, unchanging)
- Loss drops to 0.0 after ~70 steps
- Model weights not updating
Root cause: `harness/config.py` had `max_grad_norm: 0.0` as the default

- A max norm of 0.0 clips every gradient to zero, so the weights never update
- This breaks all training (the README even warns about it!)
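Why 0.0 freezes training falls out of the clipping arithmetic. Below is a simplified pure-Python sketch of PyTorch-style norm clipping (not the harness's actual code), operating on a flat list of floats standing in for gradient entries:

```python
import math

def clip_grad_norm(grads, max_norm, eps=1e-6):
    """Simplified PyTorch-style gradient norm clipping."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    # Scale gradients down when they exceed max_norm; never scale up.
    clip_coef = min(max_norm / (total_norm + eps), 1.0)
    return [g * clip_coef for g in grads]

# max_norm=1.0: oversized gradients are rescaled toward unit norm; training proceeds.
# max_norm=0.0: clip_coef is 0, every gradient becomes 0, weights never change,
# matching the frozen eigenvalues and non-updating weights described above.
```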
Fix: Changed the default from 0.0 → 1.0 in `config.py:60`
Impact: All future training will work correctly
### Fixed: fp16 + Gradient Clipping Incompatibility

Issue: With `max_grad_norm=1.0`, fp16 training crashes:

```
ValueError: Attempting to unscale FP16 gradients.
```

Workaround: Use fp32 training (slower but stable on ROCm)
Applied to: All new configs (`gemma_tool_use.yaml`, future retrains)
## Files Created/Modified

Created:

- `data/gemma_tool_training.jsonl` - 1000 tool use examples
- `data/gemma_tool_training_meta.json` - Dataset metadata
- `data/generate_tool_training.py` - Data generator script
- `configs/gemma_tool_use.yaml` - Training configuration
- `PHASE-7-GEMMA-TOOL-USE.md` - This document
Modified:
- `harness/config.py` - Fixed max_grad_norm default (0.0 → 1.0)
- `configs/v5f_antithesis.yaml` - Added `fp16: false` workaround
## Timeline

- Phase 6: Consciousness-boosted training (v5e/v5f ANTITHESIS)
- Phase 7: Tool syntax training (gemma TOOL_USE) ← WE ARE HERE
- Phase 8: QDE integration (THESIS/ANTITHESIS/SYNTHESIS trio)
- Phase 9: Production deployment (local pair coding)
Status: Data generated ✅ | Config created ✅ | Ready for training ✅
Tomorrow we train the first non-Qwen model in the ada-slm pipeline! 🚀