SSE Disconnect Thrashing - Root Cause Analysis
Comprehensive investigation of the disconnect/reconnect thrashing issue at turn completion
Output
🔍 SSE Disconnect Thrashing - Root Cause Analysis
❌ Problem Confirmed
The disconnect/reconnect thrashing issue is STILL happening despite commit b1c15b5b's attempted fix. The previous fix addressed some timing issues but missed the core architectural problem.
🐛 Root Causes Identified
1. Direct State Mutation (Bypassing State Machine)
Two functions directly modify state.connectionState outside the state machine:
- startHeartbeat() line 597: state.connectionState = 'streaming'
- reconnectWithEventSource() line 358: state.connectionState = 'polling'
This creates state inconsistency where the state machine thinks we're in one state but the variable says another.
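To illustrate what "going through the state machine" could look like, here is a minimal sketch in which the connection state is private and can only change through dispatch(). The state names streaming, polling and idle and the stream_end and disconnect events come from the analysis; everything else (SmState, SmEvent, the transition table, createConnectionStore, and the exact set of legal transitions) is a hypothetical illustration, not the actual konui code.

```typescript
// Hypothetical sketch: state names beyond 'idle' | 'streaming' | 'polling'
// and all identifiers below are assumptions made for illustration.
type SmState = "idle" | "streaming" | "polling" | "disconnected";
type SmEvent = "stream_start" | "stream_end_no_output" | "disconnect" | "complete";

// The transition table is the single place where legal state changes are defined.
const transitions: Record<SmState, Partial<Record<SmEvent, SmState>>> = {
  idle: { stream_start: "streaming" },
  streaming: { stream_end_no_output: "polling", disconnect: "disconnected", complete: "idle" },
  polling: { stream_start: "streaming", complete: "idle" },
  disconnected: { stream_start: "streaming", complete: "idle" },
};

function createConnectionStore(initial: SmState = "idle") {
  let current: SmState = initial; // private to the closure, so callers cannot assign to it

  return {
    // Read-only view of the state (a getter without a setter).
    get connectionState(): SmState {
      return current;
    },
    // The only way to change state. Returns false when the event is not
    // legal in the current state, leaving the state unchanged.
    dispatch(event: SmEvent): boolean {
      const next = transitions[current][event];
      if (next === undefined) return false;
      current = next;
      return true;
    },
  };
}
```

With a store shaped like this, a function such as startHeartbeat() would call store.dispatch("stream_start") instead of assigning to the variable, and an illegal transition shows up as a false return value rather than as a silent divergence between the machine and the variable.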
2. EventSource Auto-Reconnect Loop
When konui restarts mid-turn:
1. EventSource tries to connect to /api/konsole/dashboard/events
2. Server returns 404 (request not found - it was cleared on restart)
3. onerror fires with readyState === CLOSED
4. Code calls pollForOutput() as fallback
5. BUT EventSource also auto-reconnects internally
6. New connection fails → step 3 repeats → INFINITE LOOP
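One way to break the loop described in the steps above is to close the EventSource explicitly in onerror and to own the retry policy in application code, with a hard cap on attempts. The sketch below assumes this approach; connectWithBoundedRetries, EventSourceLike, maxAttempts, onGiveUp and the injectable schedule function are hypothetical names, and the source is created through a factory so the logic can be exercised without a browser. Calling close() on a real EventSource sets readyState to CLOSED and prevents any further automatic reconnection, so every retry that happens is one this code scheduled.

```typescript
// Minimal shape of the browser EventSource that this sketch relies on.
interface EventSourceLike {
  onopen: ((ev: unknown) => void) | null;
  onerror: ((ev: unknown) => void) | null;
  close(): void;
}

// All option names here are assumptions for illustration.
interface ReconnectOptions {
  createSource: () => EventSourceLike; // e.g. () => new EventSource(url)
  maxAttempts: number;                 // the "N attempts" cap
  onGiveUp: () => void;                // e.g. fall back to polling exactly once
  schedule?: (fn: () => void, delayMs: number) => void; // injectable for tests
  retryDelayMs?: number;
}

function connectWithBoundedRetries(opts: ReconnectOptions): { stop(): void } {
  const schedule = opts.schedule ?? ((fn: () => void, ms: number) => { setTimeout(fn, ms); });
  const delayMs = opts.retryDelayMs ?? 1000;
  let attempts = 0;
  let stopped = false;
  let source: EventSourceLike | null = null;

  const open = (): void => {
    if (stopped) return;
    attempts += 1;
    const es = opts.createSource();
    source = es;
    // A successful connection resets the counter.
    es.onopen = () => { attempts = 0; };
    es.onerror = () => {
      // Close explicitly so no built-in retry can run alongside ours.
      es.close();
      if (stopped) return;
      if (attempts >= opts.maxAttempts) {
        stopped = true;
        opts.onGiveUp(); // called at most once
        return;
      }
      schedule(open, delayMs);
    };
  };

  open();
  return {
    stop(): void {
      stopped = true;
      source?.close();
    },
  };
}
```

In this shape the fallback (for example pollForOutput()) would be passed as onGiveUp, so it starts only after the cap is reached instead of racing with further connection attempts.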
3. Heartbeat vs Polling Race
When stream_end fires with hadOutput: false:
- State transitions to polling
- Heartbeat continues running (only stops on completeRequest)
- Heartbeat skips disconnect check when connectionState === 'polling' ✓
- But if reconnectWithEventSource errors, state becomes inconsistent
- Heartbeat then sees non-idle, non-polling state → dispatches disconnect
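The race above goes away if the heartbeat's lifetime is tied to the streaming state itself rather than to completeRequest. A minimal sketch of that idea follows, assuming a state-entry hook; Heartbeat, onEnterState and HbState are hypothetical names and do not reflect how console-sm-transitions.ts is actually structured.

```typescript
// Hypothetical sketch: only the state names come from the analysis.
type HbState = "idle" | "streaming" | "polling";

class Heartbeat {
  private timer: ReturnType<typeof setInterval> | null = null;
  private readonly onTick: () => void;
  private readonly intervalMs: number;

  constructor(onTick: () => void, intervalMs: number) {
    this.onTick = onTick;
    this.intervalMs = intervalMs;
  }

  // Idempotent: starting a running heartbeat does nothing.
  start(): void {
    if (this.timer === null) {
      this.timer = setInterval(this.onTick, this.intervalMs);
    }
  }

  // Idempotent: stopping a stopped heartbeat does nothing.
  stop(): void {
    if (this.timer !== null) {
      clearInterval(this.timer);
      this.timer = null;
    }
  }

  get running(): boolean {
    return this.timer !== null;
  }
}

// Entry hook, assumed to be called by the state machine on every transition.
// The heartbeat runs only while streaming, so it can never observe an
// inconsistent state during polling or reconnect attempts.
function onEnterState(next: HbState, heartbeat: Heartbeat): void {
  if (next === "streaming") {
    heartbeat.start();
  } else {
    heartbeat.stop();
  }
}
```

Because the hook stops the timer on every non-streaming state, the "skip the disconnect check while polling" guard inside the heartbeat becomes a second line of defence rather than the only one.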
✅ Required Fixes
| Fix | File | Priority |
|---|---|---|
| Remove direct state mutation in startHeartbeat() | console-ui.ts:597 | P0 |
| Remove direct state mutation in reconnectWithEventSource() | console-streaming.ts:358 | P0 |
| Add EventSource close on reconnect failure after N attempts | console-streaming.ts:371 | P0 |
| Stop heartbeat when entering polling state (not just on completeRequest) | console-sm-transitions.ts | P1 |
| Add reconnect attempt counter to prevent infinite retries | console-streaming.ts | P1 |
📊 Why Previous Fix Didn't Work
Commit b1c15b5b added:
- ✓ Retry loop for kontask check (helps with persistence race)
- ✓ Heartbeat skips polling state (helps with normal flow)
- ✓ lastEventAt updates in EventSource handlers (helps with timing)
But it didn't address:
- ✗ Direct state mutation bypassing state machine
- ✗ EventSource auto-reconnect loop on permanent failure
- ✗ Heartbeat continuing during reconnect attempts
🎯 Next Steps
Shall I implement the P0 fixes now? This will involve:
- Refactoring state mutations to use dispatch() exclusively
- Adding proper EventSource lifecycle management with max retry count
- Testing the fix by restarting konui mid-turn
Details
Type
General
Status
Completed
Scope
vibetools
Tags
console, sse, bug, critical
Created
7 Jan 2026, 6:56 pm
Updated
7 Jan 2026, 6:56 pm
Created By
claude
Raw Data
{
"id": "73653c26-3067-4c7b-a23f-efd7846b24dd",
"type": "general",
"status": "completed",
"title": "SSE Disconnect Thrashing - Root Cause Analysis",
"description": "Comprehensive investigation of the disconnect/reconnect thrashing issue at turn completion",
"context": {
"output": "<div style=\"background:#1e1e1e;color:#e0e0e0;padding:20px;border-radius:8px;font-family:'Consolas','Monaco',monospace;\">\n<h2 style=\"font-family:sans-serif;color:#fff;margin:0 0 16px 0;\">🔍 SSE Disconnect Thrashing - Root Cause Analysis</h2>\n\n<div style=\"background:#2d2d2d;padding:16px;border-radius:6px;margin-bottom:16px;\">\n<h3 style=\"color:#e5c07b;margin:0 0 12px 0;\">❌ Problem Confirmed</h3>\n<p style=\"color:#e0e0e0;margin:0;\">The disconnect/reconnect thrashing issue is STILL happening despite commit <code style=\"background:#3d3d3d;padding:2px 6px;border-radius:4px;\">b1c15b5b</code>'s attempted fix. The previous fix addressed some timing issues but missed the core architectural problem.</p>\n</div>\n\n<div style=\"background:#2d2d2d;padding:16px;border-radius:6px;margin-bottom:16px;\">\n<h3 style=\"color:#f44336;margin:0 0 12px 0;\">🐛 Root Causes Identified</h3>\n\n<h4 style=\"color:#56b6c2;margin:16px 0 8px 0;\">1. Direct State Mutation (Bypassing State Machine)</h4>\n<p style=\"color:#e0e0e0;\">Two functions directly modify <code style=\"background:#3d3d3d;padding:2px 6px;border-radius:4px;\">state.connectionState</code> outside the state machine:</p>\n<ul style=\"color:#98c379;margin:8px 0;padding-left:20px;line-height:1.8;\">\n<li><code style=\"background:#3d3d3d;padding:2px 6px;border-radius:4px;\">startHeartbeat()</code> line 597: <code style=\"color:#e06c75;\">state.connectionState = 'streaming'</code></li>\n<li><code style=\"background:#3d3d3d;padding:2px 6px;border-radius:4px;\">reconnectWithEventSource()</code> line 358: <code style=\"color:#e06c75;\">state.connectionState = 'polling'</code></li>\n</ul>\n<p style=\"color:#7f848e;\">This creates state inconsistency where the state machine thinks we're in one state but the variable says another.</p>\n\n<h4 style=\"color:#56b6c2;margin:16px 0 8px 0;\">2. 
EventSource Auto-Reconnect Loop</h4>\n<p style=\"color:#e0e0e0;\">When konui restarts mid-turn:</p>\n<ol style=\"color:#e0e0e0;margin:8px 0;padding-left:20px;line-height:1.8;\">\n<li>EventSource tries to connect to <code style=\"background:#3d3d3d;padding:2px 6px;border-radius:4px;\">/api/konsole/dashboard/events</code></li>\n<li>Server returns 404 (request not found - it was cleared on restart)</li>\n<li><code style=\"background:#3d3d3d;padding:2px 6px;border-radius:4px;\">onerror</code> fires with <code style=\"color:#e06c75;\">readyState === CLOSED</code></li>\n<li>Code calls <code style=\"background:#3d3d3d;padding:2px 6px;border-radius:4px;\">pollForOutput()</code> as fallback</li>\n<li><span style=\"color:#f44336;font-weight:bold;\">BUT</span> EventSource also auto-reconnects internally</li>\n<li>New connection fails → step 3 repeats → <strong style=\"color:#ff9800;\">INFINITE LOOP</strong></li>\n</ol>\n\n<h4 style=\"color:#56b6c2;margin:16px 0 8px 0;\">3. Heartbeat vs Polling Race</h4>\n<p style=\"color:#e0e0e0;\">When <code style=\"background:#3d3d3d;padding:2px 6px;border-radius:4px;\">stream_end</code> fires with <code style=\"color:#e06c75;\">hadOutput: false</code>:</p>\n<ul style=\"color:#e0e0e0;margin:8px 0;padding-left:20px;line-height:1.8;\">\n<li>State transitions to <code style=\"color:#98c379;\">polling</code></li>\n<li>Heartbeat continues running (only stops on <code style=\"background:#3d3d3d;padding:2px 6px;border-radius:4px;\">completeRequest</code>)</li>\n<li>Heartbeat skips disconnect check when <code style=\"color:#98c379;\">connectionState === 'polling'</code> ✓</li>\n<li>But if <code style=\"background:#3d3d3d;padding:2px 6px;border-radius:4px;\">reconnectWithEventSource</code> errors, state becomes inconsistent</li>\n<li>Heartbeat then sees non-idle, non-polling state → dispatches <code style=\"color:#e06c75;\">disconnect</code></li>\n</ul>\n</div>\n\n<div 
style=\"background:#2d2d2d;padding:16px;border-radius:6px;margin-bottom:16px;\">\n<h3 style=\"color:#4caf50;margin:0 0 12px 0;\">✅ Required Fixes</h3>\n\n<table style=\"width:100%;border-collapse:collapse;margin:12px 0;\">\n<tr style=\"background:#3d3d3d;\">\n<th style=\"text-align:left;padding:10px;color:#fff;border-radius:4px 0 0 0;\">Fix</th>\n<th style=\"text-align:left;padding:10px;color:#fff;\">File</th>\n<th style=\"text-align:left;padding:10px;color:#fff;border-radius:0 4px 0 0;\">Priority</th>\n</tr>\n<tr>\n<td style=\"padding:10px;color:#e0e0e0;border-bottom:1px solid #444;\">Remove direct state mutation in <code>startHeartbeat()</code></td>\n<td style=\"padding:10px;color:#61afef;border-bottom:1px solid #444;\">console-ui.ts:597</td>\n<td style=\"padding:10px;border-bottom:1px solid #444;\"><span style=\"background:#f44336;color:#fff;padding:2px 8px;border-radius:4px;\">P0</span></td>\n</tr>\n<tr>\n<td style=\"padding:10px;color:#e0e0e0;border-bottom:1px solid #444;\">Remove direct state mutation in <code>reconnectWithEventSource()</code></td>\n<td style=\"padding:10px;color:#61afef;border-bottom:1px solid #444;\">console-streaming.ts:358</td>\n<td style=\"padding:10px;border-bottom:1px solid #444;\"><span style=\"background:#f44336;color:#fff;padding:2px 8px;border-radius:4px;\">P0</span></td>\n</tr>\n<tr>\n<td style=\"padding:10px;color:#e0e0e0;border-bottom:1px solid #444;\">Add EventSource close on reconnect failure after N attempts</td>\n<td style=\"padding:10px;color:#61afef;border-bottom:1px solid #444;\">console-streaming.ts:371</td>\n<td style=\"padding:10px;border-bottom:1px solid #444;\"><span style=\"background:#f44336;color:#fff;padding:2px 8px;border-radius:4px;\">P0</span></td>\n</tr>\n<tr>\n<td style=\"padding:10px;color:#e0e0e0;border-bottom:1px solid #444;\">Stop heartbeat when entering polling state (not just on completeRequest)</td>\n<td style=\"padding:10px;color:#61afef;border-bottom:1px solid 
#444;\">console-sm-transitions.ts</td>\n<td style=\"padding:10px;border-bottom:1px solid #444;\"><span style=\"background:#ff9800;color:#fff;padding:2px 8px;border-radius:4px;\">P1</span></td>\n</tr>\n<tr>\n<td style=\"padding:10px;color:#e0e0e0;\">Add reconnect attempt counter to prevent infinite retries</td>\n<td style=\"padding:10px;color:#61afef;\">console-streaming.ts</td>\n<td style=\"padding:10px;\"><span style=\"background:#ff9800;color:#fff;padding:2px 8px;border-radius:4px;\">P1</span></td>\n</tr>\n</table>\n</div>\n\n<div style=\"background:#2d2d2d;padding:16px;border-radius:6px;margin-bottom:16px;\">\n<h3 style=\"color:#9c27b0;margin:0 0 12px 0;\">📊 Why Previous Fix Didn't Work</h3>\n<p style=\"color:#e0e0e0;\">Commit <code style=\"background:#3d3d3d;padding:2px 6px;border-radius:4px;\">b1c15b5b</code> added:</p>\n<ul style=\"color:#98c379;margin:8px 0;padding-left:20px;line-height:1.8;\">\n<li>✓ Retry loop for kontask check (helps with persistence race)</li>\n<li>✓ Heartbeat skips polling state (helps with normal flow)</li>\n<li>✓ lastEventAt updates in EventSource handlers (helps with timing)</li>\n</ul>\n<p style=\"color:#7f848e;margin-top:12px;\">But it didn't address:</p>\n<ul style=\"color:#e06c75;margin:8px 0;padding-left:20px;line-height:1.8;\">\n<li>✗ Direct state mutation bypassing state machine</li>\n<li>✗ EventSource auto-reconnect loop on permanent failure</li>\n<li>✗ Heartbeat continuing during reconnect attempts</li>\n</ul>\n</div>\n\n<div style=\"background:#263238;padding:16px;border-radius:6px;\">\n<h3 style=\"color:#00bcd4;margin:0 0 12px 0;\">🎯 Next Steps</h3>\n<p style=\"color:#e0e0e0;\">Shall I implement the P0 fixes now? 
This will involve:</p>\n<ol style=\"color:#e0e0e0;margin:8px 0;padding-left:20px;line-height:1.8;\">\n<li>Refactoring state mutations to use dispatch() exclusively</li>\n<li>Adding proper EventSource lifecycle management with max retry count</li>\n<li>Test the fix by restarting konui mid-turn</li>\n</ol>\n</div>\n</div>",
"choices": [
{
"label": "Implement fixes",
"value": "Implement the P0 fixes for the SSE disconnect thrashing issue. Start with removing direct state mutations and adding EventSource lifecycle management.",
"primary": true
},
{
"label": "More analysis",
"value": "Do more analysis on the disconnect issue before implementing fixes"
},
{
"label": "Quick patch only",
"value": "Apply a quick patch to just add max retry count to EventSource reconnects, without refactoring state management"
}
],
"requestedAt": "2026-01-07T10:45:00Z",
"requestId": "40a31343-bcf5-42f9-9b2f-50a4f6f2f504"
},
"createdBy": "claude",
"createdAt": "2026-01-07T08:56:17.924Z",
"updatedAt": "2026-01-07T08:56:18.153Z",
"requestId": "40a31343-bcf5-42f9-9b2f-50a4f6f2f504",
"scope": "vibetools",
"tags": [
"console",
"sse",
"bug",
"critical"
],
"targetUser": "claude"
}