# Two-Pass Mode Quick Reference

## Common Issue: Phase 2 (Offline) Results Empty ❌

### Why This Happens

In Two-Pass mode, you may get **online results** but **no offline results**. This occurs when:

- ❌ Speech segment is incomplete (no end detected)
- ❌ Session not properly ended
- ❌ Insufficient silence at end of speech
- ❌ VAD cannot determine speech boundaries

## Quick Fixes ✅

### 1. File Recognition (Easiest)

```python
# ✅ WORKS: Auto-handles everything
result = await client.recognize_file("audio.wav", callback)
print(f"Online: {result.text}")  # Phase 1
# Phase 2 results are delivered to the callback
```

### 2. Streaming Recognition

```python
# ✅ WORKS: Properly end each utterance
session = await client.start_realtime(callback)

# ... send audio chunks ...

# 🔑 KEY: End session after each utterance
await client.end_realtime_session(session)
```
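
One way to make sure the session is always ended, even when sending audio raises an exception, is to wrap the start/end pair in an async context manager. This is a minimal sketch: `utterance_session` and `send_utterance` are hypothetical helper names, and the sketch assumes only the `start_realtime`, `end_realtime_session`, and `send_audio` calls shown above.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def utterance_session(client, callback):
    """Open a realtime session and always end it on exit (hypothetical helper)."""
    session = await client.start_realtime(callback)
    try:
        yield session
    finally:
        # Ending the session is what allows Phase 2 (offline) to run.
        await client.end_realtime_session(session)


async def send_utterance(client, callback, chunks):
    """Send one utterance's worth of audio chunks, then end the session."""
    async with utterance_session(client, callback) as session:
        for chunk in chunks:
            await session.send_audio(chunk)
```

With this wrapper, each call to `send_utterance(client, callback, chunks)` corresponds to exactly one utterance, so no code path can skip `end_realtime_session()`.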

### 3. Silence Detection Pattern

```python
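# ✅ WORKS: Detect silence, restart session
SILENCE_DURATION = 1.5  # seconds of silence that end an utterance
CHUNK_SECONDS = 0.1     # duration of one chunk from get_audio_chunk(); adjust to your capture setup

silent_duration = 0.0   # length of the current run of silence
has_speech = False      # True once the current session has received speech

while recording:
    audio = get_audio_chunk()

    if is_silent(audio):
        silent_duration += CHUNK_SECONDS
        if has_speech and silent_duration > SILENCE_DURATION:
            # End current utterance
            await client.end_realtime_session(session)
            await asyncio.sleep(0.5)

            # Start new utterance
            session = await client.start_realtime(callback)
            silent_duration = 0.0
            has_speech = False
    else:
        silent_duration = 0.0
        has_speech = True
        await session.send_audio(audio)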
```
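
The pattern above relies on an `is_silent()` helper that this guide does not define. A minimal sketch of one possible implementation follows, assuming 16-bit little-endian mono PCM at 16 kHz and a simple RMS energy threshold. The names `rms_level` and `SilenceTracker`, and the threshold of 500, are illustrative assumptions and not part of the client library.

```python
import math
import struct


def rms_level(pcm: bytes) -> float:
    """Root-mean-square amplitude of 16-bit little-endian PCM samples."""
    count = len(pcm) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", pcm[: count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def is_silent(pcm: bytes, threshold: float = 500.0) -> bool:
    """Treat a chunk as silent when its RMS level is below the threshold."""
    return rms_level(pcm) < threshold


class SilenceTracker:
    """Tracks how long the input has been continuously silent, in seconds."""

    def __init__(self, sample_rate: int = 16000, sample_width: int = 2):
        self.bytes_per_second = sample_rate * sample_width
        self.silent_seconds = 0.0

    def update(self, pcm: bytes) -> float:
        """Feed one chunk; returns the current length of the silent run."""
        if is_silent(pcm):
            self.silent_seconds += len(pcm) / self.bytes_per_second
        else:
            self.silent_seconds = 0.0
        return self.silent_seconds
```

A fixed energy threshold is the simplest choice and is sensitive to microphone gain and background noise, so the value usually needs tuning per setup.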

## Configuration Checklist ✅

```python
client = AsyncFunASRClient(
    mode="2pass",           # ✅ Enable two-pass
    enable_vad=True,        # ✅ Enable VAD for boundary detection
    chunk_interval=10,      # ✅ 10ms recommended
    enable_itn=True,        # ✅ Text normalization
    enable_punctuation=True # ✅ Auto punctuation
)
```

## DO vs DON'T

### ✅ DO

```python
# ✅ File processing
result = await client.recognize_file("audio.wav")

# ✅ Utterance-based streaming
session = await client.start_realtime(callback)
# ... process one utterance ...
await client.end_realtime_session(session)

# ✅ Silence-based segmentation
if silence_detected():
    await client.end_realtime_session(session)
    session = await client.start_realtime(callback)
```

### ❌ DON'T

```python
# ❌ Continuous streaming without ending
session = await client.start_realtime(callback)
while True:
    await session.send_audio(audio)
    # Never ends - Phase 2 never triggers!

# ❌ Close connection without ending session
session = await client.start_realtime(callback)
await session.send_audio(audio)
await client.close()  # Missing end_realtime_session()

# ❌ Too-small chunks without VAD
client = AsyncFunASRClient(
    chunk_interval=1,       # Too small
    enable_vad=False        # Can't detect boundaries
)
```

## Debugging Checklist

When Phase 2 results are empty:

- [ ] Is `enable_vad=True`?
- [ ] Are you calling `end_realtime_session()`?
- [ ] Is there 0.5-1 s of silence at the end of speech?
- [ ] Is audio duration > 0.5s?
- [ ] Are you using `recognize_file()` for files?
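
The duration and trailing-silence items in this checklist can be checked offline before blaming the client. The sketch below uses only the Python standard library and assumes a 16-bit PCM WAV file. `check_wav` and its RMS threshold of 500 are hypothetical and not part of the client library.

```python
import math
import struct
import wave


def check_wav(path, min_duration=0.5, tail_seconds=0.5, threshold=500.0):
    """Report duration and whether the last `tail_seconds` of the file are quiet."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM audio")
        rate = wav.getframerate()
        n_frames = wav.getnframes()
        # Read only the tail of the file (or the whole file if it is shorter).
        tail_frames = min(n_frames, int(tail_seconds * rate))
        wav.setpos(n_frames - tail_frames)
        tail = wav.readframes(tail_frames)

    count = len(tail) // 2
    samples = struct.unpack(f"<{count}h", tail[: count * 2])
    rms = math.sqrt(sum(s * s for s in samples) / count) if count else 0.0

    duration = n_frames / rate
    return {
        "duration": duration,
        "long_enough": duration > min_duration,
        "tail_silent": rms < threshold,
    }
```

If `tail_silent` is `False`, the recording ends while speech is still in progress, so padding the end with silence or ending the session explicitly are the options to try first.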

## Result Flow

```
Audio Stream → VAD Detection → Phase 1 (Online) → Phase 2 (Offline)
                    ↓              ↓                    ↓
              Speech Boundary   Fast Result      High-Accuracy Result
                    ↓              ↓                    ↓
              [start, end]    "你好世界"          "你好,世界。"
```

**Phase 2 triggers when any of the following occurs:**
- ✅ VAD detects complete segment `[start, end]`
- ✅ `end_realtime_session()` called
- ✅ `is_speaking=false` sent to server
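
The trigger conditions above can be modelled in a few lines. This is a simplified illustration and not the server's actual code. It assumes that the conditions are alternatives, that an undetected VAD boundary is represented as `-1`, and that the end-of-speech signal is a JSON text message carrying `is_speaking: false`. The function names are hypothetical.

```python
import json


def end_of_speech_message() -> str:
    """JSON control message signalling that the speaker has finished (assumed format)."""
    return json.dumps({"is_speaking": False})


def phase2_ready(vad_segment, is_speaking: bool) -> bool:
    """Simplified model of when the offline pass can run.

    vad_segment is (start_ms, end_ms); -1 marks a boundary not yet detected.
    """
    start, end = vad_segment
    segment_complete = start != -1 and end != -1
    return segment_complete or not is_speaking
```

Under this model, a stream that never reaches an end boundary and never signals end of speech, such as `phase2_ready((120, -1), True)`, never becomes ready, which matches the empty Phase 2 symptom described at the top of this guide.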

## Example: Complete Working Code

```python
import asyncio
from funasr_client import AsyncFunASRClient, SimpleCallback

async def proper_two_pass():
    """✅ Correct two-pass usage"""
    
    # Callbacks for both phases
    def on_partial(result):
        print(f"[Phase 1] {result.text}")
    
    def on_final(result):
        print(f"[Phase 2] {result.text}")  # Will have results!
    
    callback = SimpleCallback(
        on_partial=on_partial,
        on_final=on_final
    )
    
    # Create client
    client = AsyncFunASRClient(
        mode="2pass",
        enable_vad=True
    )
    
    try:
        await client.start()
        
        # Method 1: File (Recommended for files)
        result = await client.recognize_file("audio.wav", callback)
        print(f"Final: {result.text}")
        
        # Method 2: Streaming (For real-time)
        session = await client.start_realtime(callback)
        
        # Simulate sending one utterance.
        # audio_chunks is your own iterable of audio byte chunks (not defined here).
        for chunk in audio_chunks:
            await session.send_audio(chunk)
        
        # 🔑 KEY: End the session
        await client.end_realtime_session(session)
        
    finally:
        await client.close()

asyncio.run(proper_two_pass())
```

## Server Configuration (Optional)

Adjust VAD sensitivity on the server side:

```yaml
# config.yaml
vad:
  max_end_silence_time: 500    # Faster end detection
  speech_noise_thres: 0.6      # Lower threshold
```

## More Help

- 📖 [Detailed Best Practices Guide](TWO_PASS_BEST_PRACTICES_zh.md)
- 📖 [Examples Directory](../examples/)
- 📖 [Architecture Documentation](../../../runtime/docs/)
