Troubleshooting Guide

Common issues and solutions for the Output Drift framework.

Installation Issues

Python Version Mismatch

Problem: ImportError or syntax errors

Solution:

# Check Python version (must be 3.11+)
python --version

# Use specific version if multiple installed
python3.11 -m venv venv
source venv/bin/activate

Dependency Conflicts

Problem: Package version conflicts

Solution:

# Fresh virtual environment
rm -rf venv
python3 -m venv venv
source venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt


Ollama Issues

Connection Refused

Problem: ConnectionRefusedError: [Errno 61] Connection refused (on Linux the same error is reported as [Errno 111])

Solution:

# Check if Ollama is running
curl http://localhost:11434/api/tags

# If not running, start it
ollama serve

# Pull model if missing
ollama pull qwen2.5:7b-instruct

Model Not Found

Problem: Model 'qwen2.5:7b-instruct' not found

Solution:

# List available models
ollama list

# Pull the model
ollama pull qwen2.5:7b-instruct

# Verify
ollama list | grep qwen

Slow Performance

Problem: Ollama responses taking >10 seconds

Solution:

# Check system resources
top

# Check whether the loaded model runs on GPU or CPU (see the PROCESSOR column)
ollama ps

# Reduce concurrency if CPU-only
python run_evaluation.py --concurrency 4  # instead of 16


watsonx.ai Issues

Authentication Failed

Problem: 401 Unauthorized or Invalid API key

Solution:

# Test credentials
import os
from dotenv import load_dotenv
load_dotenv()

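# Print only a short prefix of the key; an unset key prints as empty instead of raising TypeError
api_key = os.getenv('WATSONX_API_KEY') or ''
print(f"API Key: {api_key[:4]}...")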
print(f"Project ID: {os.getenv('WATSONX_PROJECT_ID')}")
print(f"URL: {os.getenv('WATSONX_URL')}")

# Verify all are set
assert all([os.getenv('WATSONX_API_KEY'),
            os.getenv('WATSONX_PROJECT_ID'),
            os.getenv('WATSONX_URL')]), "Missing credentials"

Check:

  1. .env file exists in repository root
  2. No extra spaces or quotes in .env
  3. API key has not expired
  4. Project ID is correct
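Some of these checks can be automated. The following is a minimal sketch, not part of the framework: `find_env_problems` is a hypothetical helper that assumes a simple `KEY=value` file and flags whitespace around `=`, quoted or empty values, and missing keys.

```python
REQUIRED_KEYS = ("WATSONX_API_KEY", "WATSONX_PROJECT_ID", "WATSONX_URL")


def find_env_problems(text: str, required=REQUIRED_KEYS) -> list[str]:
    """Return human-readable problems found in the contents of a .env file."""
    problems = []
    seen = set()
    for lineno, raw in enumerate(text.splitlines(), start=1):
        stripped = raw.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        if "=" not in raw:
            problems.append(f"line {lineno}: expected KEY=value")
            continue
        key, value = raw.split("=", 1)
        if key != key.strip() or value != value.strip():
            problems.append(f"line {lineno}: whitespace around '='")
        key, value = key.strip(), value.strip()
        if value[:1] in ("'", '"') or value[-1:] in ("'", '"'):
            problems.append(f"line {lineno}: {key} value is quoted")
        if not value:
            problems.append(f"line {lineno}: {key} is empty")
        seen.add(key)
    for key in required:
        if key not in seen:
            problems.append(f"{key} is missing")
    return problems


# Illustrative input only; these values are made up.
sample = 'WATSONX_API_KEY = "abc123"\nWATSONX_PROJECT_ID=my-project\n'
for problem in find_env_problems(sample):
    print(problem)
```

To check a real file, pass its contents, for example `find_env_problems(open(".env").read())`. Expiry of the API key and correctness of the project ID still have to be verified against the provider.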

Rate Limiting

Problem: 429 Too Many Requests

Solution:

# Add delay between requests
import time

for i in range(16):
    response = model.generate_text(prompt, params)
    time.sleep(1)  # 1 second delay

The framework includes built-in retry with exponential backoff for cloud providers (Anthropic, Gemini, watsonx). Transient errors (429 and 500-504) are automatically retried up to 3 times, so the manual delay above is only needed when calling a provider client directly. To reduce load, lower the concurrency:

python run_evaluation.py --concurrency 1  # Sequential requests
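The framework's retry code is not shown in this guide. As an illustration of the general pattern only, the sketch below retries a call on transient status codes with exponentially growing delays. `TransientError`, `call_with_backoff`, and the one-second base delay are hypothetical names and assumptions, not the framework's API.

```python
import time

RETRYABLE_STATUS = {429, *range(500, 505)}  # 429 and 500-504


class TransientError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status: int):
        super().__init__(f"HTTP {status}")
        self.status = status


def call_with_backoff(func, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); on a retryable status, wait base_delay * 2**attempt and retry."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except TransientError as err:
            if err.status not in RETRYABLE_STATUS or attempt == max_retries:
                raise  # non-retryable status, or retries exhausted
            sleep(base_delay * 2 ** attempt)


# Demo: a call that fails twice with 429 and then succeeds.
attempts = []


def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise TransientError(429)
    return "ok"


# A no-op sleep keeps the demo instant; omit it to get real delays.
print(call_with_backoff(flaky, sleep=lambda seconds: None))
```

Passing `sleep` as a parameter is a design choice that makes the delay schedule easy to test without waiting.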


Database Issues

SQLite Not Found

Problem: sqlite3.OperationalError: no such table: transactions

Solution:

# Regenerate database
python data/generate_toy_finance.py

# Verify tables
sqlite3 data/toy_finance.sqlite "SELECT name FROM sqlite_master WHERE type='table';"

Schema Mismatch

Problem: SQL queries fail with schema errors

Solution:

# Check schema
sqlite3 data/toy_finance.sqlite ".schema transactions"

# Expected output:
# CREATE TABLE transactions(
#   id INTEGER PRIMARY KEY,
#   date TEXT,
#   region TEXT,
#   amount REAL,
#   category TEXT
# );
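The same comparison can be scripted. This is a minimal sketch that assumes the expected columns listed above; `check_transactions_schema` is a hypothetical helper, not part of the framework. It reads the actual columns with SQLite's `PRAGMA table_info`.

```python
import sqlite3

# Expected columns and declared types, taken from the schema listed above.
EXPECTED_COLUMNS = {
    "id": "INTEGER",
    "date": "TEXT",
    "region": "TEXT",
    "amount": "REAL",
    "category": "TEXT",
}


def check_transactions_schema(conn: sqlite3.Connection) -> list[str]:
    """Compare the transactions table against EXPECTED_COLUMNS; return mismatches."""
    # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
    rows = conn.execute("PRAGMA table_info(transactions)").fetchall()
    if not rows:
        return ["table 'transactions' does not exist"]
    actual = {name: col_type.upper() for _, name, col_type, *_ in rows}
    problems = []
    for name, col_type in EXPECTED_COLUMNS.items():
        if name not in actual:
            problems.append(f"missing column: {name}")
        elif actual[name] != col_type:
            problems.append(f"column {name}: expected {col_type}, found {actual[name]}")
    return problems


# Demo against an in-memory database that has the expected schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions("
    "id INTEGER PRIMARY KEY, date TEXT, region TEXT, amount REAL, category TEXT)"
)
print(check_transactions_schema(conn))
```

For the real database, connect with `sqlite3.connect("data/toy_finance.sqlite")`. Note that `sqlite3.connect` creates an empty file if the path does not exist, in which case the helper reports the table as missing.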


Drift Detection Issues

False Positives (Detecting Drift When There Isn't Any)

Problem: Tier 1 models showing <100% consistency

Causes:

  1. Non-deterministic seed: Ensure seed=42 is set
  2. Different model versions: Check ollama show model
  3. System prompt variations: Use exact prompts from templates
  4. Whitespace differences: Normalize before comparison

Solution:

# Normalize outputs before comparison
import re

def normalize(text: str) -> str:
    """Remove extra whitespace and normalize case."""
    return re.sub(r'\s+', ' ', text.strip().lower())

output1_norm = normalize(output1)
output2_norm = normalize(output2)
match = (output1_norm == output2_norm)

False Negatives (Not Detecting Real Drift)

Problem: Drift exists but isn't detected

Causes:

  1. n too small: Use n≥16 repeats for reliable detection
  2. Hash collisions: Theoretically possible with SHA-256, but negligible in practice
  3. Semantic drift not captured: Same tokens, different meaning

Solution:

# Use at least n=16 repeats (n=16 is the value used in the paper)
python run_evaluation.py --concurrency 16 --repeats 16
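To make the role of n and of hashing concrete, here is a minimal sketch of a hash-based consistency check. It assumes consistency is the share of runs whose SHA-256 digest equals the most common digest; the framework's exact metric may differ, and `consistency_rate` is a hypothetical name.

```python
import hashlib
from collections import Counter


def consistency_rate(outputs: list[str]) -> float:
    """Fraction of outputs whose SHA-256 digest equals the most common digest."""
    if not outputs:
        raise ValueError("need at least one output")
    digests = [hashlib.sha256(o.encode("utf-8")).hexdigest() for o in outputs]
    _, top_count = Counter(digests).most_common(1)[0]
    return top_count / len(digests)


# Illustrative data: 16 runs, one of which differs by a trailing period.
runs = ["Total: 42"] * 15 + ["Total: 42."]
print(f"{consistency_rate(runs):.2%}")  # prints 93.75%
```

A digest comparison only detects byte-level differences. Two outputs with identical text always match, so drift in meaning that leaves the text unchanged cannot be caught this way, and an output that diverges in only one of sixteen runs is easy to miss with fewer repeats.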


API Errors

OpenAI Client Issues

Problem: openai.APIError or similar

Solution:

# For Ollama, use OpenAI client with custom base_url
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Note: /v1 suffix
    api_key="ollama"  # Ollama doesn't check API key
)

Import Errors

Problem: ModuleNotFoundError: No module named 'harness'

Solution:

# Ensure you're in the repository root
cd /path/to/output-drift-financial-llms

# Activate virtual environment
source venv/bin/activate

# Verify installation
python -c "import harness; print('✅ Import successful')"


Performance Issues

Memory Errors

Problem: MemoryError or system freezing

Solution:

# Reduce concurrency
python run_evaluation.py --concurrency 4  # instead of 16

# Process in batches, or run a single smaller model sequentially
python run_evaluation.py --models qwen2.5:7b-instruct --concurrency 1

Slow Experiments

Problem: Experiments taking >1 hour

Solution:

# Profile time per request
time curl -X POST http://localhost:11434/api/generate \
  -d '{"model": "qwen2.5:7b-instruct", "prompt": "test"}'

# If >5 seconds per request:
# 1. Check CPU/GPU utilization
# 2. Reduce model size
# 3. Increase system resources


Cross-Provider Issues

Outputs Don't Match

Problem: Ollama and watsonx outputs differ significantly

Expected: Tier 1 → Tier 1 should match ≥95%

Debugging:

# Compare raw outputs
print("Ollama:", repr(ollama_output))
print("watsonx:", repr(watsonx_output))

# Check lengths
print(f"Ollama length: {len(ollama_output)}")
print(f"watsonx length: {len(watsonx_output)}")

# Calculate similarity
from rapidfuzz.distance import Levenshtein
sim = 1.0 - Levenshtein.normalized_distance(ollama_output, watsonx_output)
print(f"Similarity: {sim:.1%}")

Common causes:

  1. Different models or model versions (for example, Qwen vs Granite)
  2. Temperature not exactly 0.0
  3. System prompts differ
  4. Different tokenization
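When the similarity score is low, it helps to see where the two outputs first diverge. The following is a minimal sketch in plain Python; `first_divergence` is a hypothetical helper, and the sample strings are made up.

```python
def first_divergence(a: str, b: str, context: int = 20):
    """Return (index, snippet_a, snippet_b) at the first differing character, or None."""
    limit = min(len(a), len(b))
    # Index of the first mismatch; falls back to the shorter length if one is a prefix.
    index = next((i for i in range(limit) if a[i] != b[i]), limit)
    if index == len(a) == len(b):
        return None  # strings are identical
    start = max(0, index - context)
    return index, a[start:index + context], b[start:index + context]


# Illustrative outputs that differ only in number formatting.
result = first_divergence('{"total": 42.0}', '{"total": 42}')
print(result)
```

A divergence near the start often points to differing system prompts or templates, while one deep inside a number or a trailing token is more typical of formatting or tokenization differences.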


Compliance Validation Errors

Schema Validation Fails

Problem: jsonschema.ValidationError

Solution:

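import json
from jsonschema import validate, ValidationError

# `response` is the raw model output (str); `schema` is the JSON Schema (dict)

# Test JSON parsing
try:
    data = json.loads(response)
except json.JSONDecodeError as e:
    print(f"❌ Invalid JSON: {e}")
else:
    print("✅ Valid JSON")

    # Test schema validation only after parsing has succeeded,
    # otherwise `data` is undefined and validate() raises NameError
    try:
        validate(data, schema)
        print("✅ Schema valid")
    except ValidationError as e:
        print(f"❌ Schema invalid: {e.message}")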


Common Error Messages

FileNotFoundError: [Errno 2] No such file or directory: 'traces/'

Solution:

mkdir -p traces

RuntimeError: Found no NVIDIA driver on your system

This is not an error: Ollama falls back to the CPU, which is fine for 7-20B models.

ImportError: cannot import name 'DeterministicRetriever'

Solution:

# Ensure harness/__init__.py exists
ls harness/__init__.py

# Reinstall if missing
pip install -e .


Getting Help

If you're still stuck:

  1. Check Logs:

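    # Ollama logs (there is no "ollama logs" command; the location depends on the platform)
    cat ~/.ollama/logs/server.log      # macOS
    journalctl -u ollama --no-pager    # Linux (systemd service)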
    
    # Python logs
    python run_evaluation.py --verbose
    

  2. GitHub Issues: Open an issue

  3. Email Support: Contact maintainers (see README)

  4. Review Documentation:

     - API Reference
     - Lab Guides
     - Research Paper

Quick Diagnostics Script

Run this to check your environment:

#!/usr/bin/env python3
"""Quick diagnostics for Output Drift framework."""
import sys
import os
import subprocess

print("🔍 Diagnostics\n" + "=" * 60)

# Python version
print(f"Python: {sys.version}")
assert sys.version_info >= (3, 11), "❌ Python 3.11+ required"
print("✅ Python version OK\n")

# Dependencies
try:
    import openai, pandas, matplotlib
    print("✅ Dependencies installed\n")
except ImportError as e:
    print(f"❌ Missing dependency: {e}\n")

# Ollama
try:
    result = subprocess.run(["curl", "-s", "http://localhost:11434/api/tags"],
                          capture_output=True, timeout=5)
    if result.returncode == 0:
        print("✅ Ollama running\n")
    else:
        print("⚠️  Ollama not responding\n")
except Exception as e:
    print(f"❌ Ollama check failed: {e}\n")

# Environment variables
env_vars = ["WATSONX_API_KEY", "WATSONX_PROJECT_ID", "WATSONX_URL"]
for var in env_vars:
    if os.getenv(var):
        print(f"✅ {var} set")
    else:
        print(f"⚠️  {var} not set")

print("\n" + "=" * 60)
print("Diagnostics complete!")

Save as diagnostics.py and run:

python diagnostics.py