You know that feeling when you think you've nailed websockets, real-time messaging is working, and the product team is cheering? That was me — until I saw our Node.js app devour memory like it was in an eating contest.
☕ It All Started With One Room
We’d just launched a chat feature using Socket.IO. Each user was added to a "room" based on their conversation. Simple, right?
io.on('connection', (socket) => {
  // The conversation ID arrives as a query parameter on the handshake
  const { chatId } = socket.handshake.query;
  socket.join(chatId);
});
It worked flawlessly in testing. Users connected. Messages streamed in real time. I high-fived myself.
Then QA dropped the bomb:
“Hey… the app’s memory keeps growing over time — even when nobody's online.”
💣 The Memory Leak
I loaded up htop and watched the memory climb… slowly… relentlessly. Every minute, a few MBs gone. No major traffic, just Socket.IO running.
Time to dig in.
🔍 Step 1: Reproduce the Leak
I simulated 1000 client connections and disconnections using a Puppeteer script.
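const { io } = require('socket.io-client');

for (let i = 0; i < 1000; i++) {
  const socket = io('http://localhost:3000', {
    query: { chatId: `room-${i}` }
  });
  // Drop the connection again after half a second
  setTimeout(() => socket.disconnect(), 500);
}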
Expected: Memory stabilizes once sockets disconnect. Reality: Memory kept increasing, never freed.
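To turn "kept increasing" into something you can check rather than eyeball, here is a minimal sketch that samples the V8 heap after each test round and tells you whether it ever levelled off. The helper names (heapUsedMb, isStillGrowing) and the 1 MB tolerance are my own assumptions for illustration, not part of the original script.

```javascript
// Current V8 heap usage in megabytes.
function heapUsedMb() {
  // If node was started with --expose-gc, force a collection first so the
  // reading reflects live objects rather than garbage waiting to be freed.
  if (typeof global.gc === 'function') global.gc();
  return process.memoryUsage().heapUsed / (1024 * 1024);
}

// True when every sample exceeds the previous one by more than toleranceMb,
// i.e. usage never levelled off between rounds.
function isStillGrowing(samplesMb, toleranceMb = 1) {
  if (samplesMb.length < 2) return false;
  for (let i = 1; i < samplesMb.length; i++) {
    if (samplesMb[i] - samplesMb[i - 1] <= toleranceMb) return false;
  }
  return true;
}
```

One way to use it: push heapUsedMb() into an array after each batch of 1000 connects and disconnects, then call isStillGrowing on the array once you have four or five rounds. A healthy server should flatten out after the first round or two.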
📂 Step 2: Where Are the Leaks?
I took a heap snapshot using Chrome DevTools via the --inspect flag.
node --inspect server.js
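If attaching DevTools isn't practical (a remote box, say), Node can also write a snapshot to disk itself through the built-in v8 module. The sketch below is an alternative to what I did, not the method from this post. The helper names (snapshotFile, dumpHeap, installSnapshotSignal) and the choice of SIGUSR2 are assumptions.

```javascript
const v8 = require('v8');
const path = require('path');
const os = require('os');

// Build a timestamped file name, e.g. heap-before-1700000000000.heapsnapshot
function snapshotFile(label, dir = os.tmpdir(), now = Date.now()) {
  return path.join(dir, `heap-${label}-${now}.heapsnapshot`);
}

// Write a snapshot synchronously and return the path it was written to.
// This blocks the event loop and can take a while on a large heap.
function dumpHeap(label) {
  return v8.writeHeapSnapshot(snapshotFile(label));
}

// Assumption: SIGUSR2 is free in your setup (Linux/macOS only).
function installSnapshotSignal() {
  process.on('SIGUSR2', () => {
    console.log('Heap snapshot written to', dumpHeap('signal'));
  });
}
```

Call installSnapshotSignal() once at startup, send the process SIGUSR2 before and after a load test, and open both files in the Memory tab of Chrome DevTools to compare them.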
Findings:
- Thousands of Set and Map entries were hanging around.
- Rooms were still in memory even after disconnection.
I found the culprit.
🔄 Step 3: Rooms Aren’t Garbage Collected
I assumed that when a user disconnected, the room got cleaned up.
WRONG.
Socket.IO does not automatically remove empty rooms. So our memory was storing thousands of empty chat rooms like digital ghosts.
And worse: we weren’t handling disconnections.
socket.on('disconnect', () => {
  socket.leave(chatId); // This does NOTHING. Socket.IO rooms aren't deleted this way.
});
🛠️ Step 4: Fix the Leak
We needed to manually clear up memory.
✅ Solution 1: Monitor rooms
socket.on('disconnect', () => {
  const room = io.sockets.adapter.rooms.get(chatId);
  if (!room || room.size === 0) {
    io.sockets.adapter.rooms.delete(chatId);
  }
});
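❗ But wait: adapter.rooms is internal adapter state, not something to mutate by hand. Its shape has changed across Socket.IO versions (a plain object in v2, a Map from v3 onward), so calling delete() on it is not dependable, and it gets murkier still once you swap in another adapter (like Redis). So we needed a better approach.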
✅ Final Fix: Track connections ourselves
const roomUserCounts = {};

io.on('connection', (socket) => {
  const { chatId } = socket.handshake.query;
  if (!roomUserCounts[chatId]) roomUserCounts[chatId] = 0;
  roomUserCounts[chatId]++;
  socket.join(chatId);

  socket.on('disconnect', () => {
    roomUserCounts[chatId]--;
    if (roomUserCounts[chatId] <= 0) {
      delete roomUserCounts[chatId];
      // Optional: emit 'room-empty' or perform cleanup
    }
  });
});
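The counting logic above is easier to unit test once it is pulled away from the socket handlers. Here is a minimal sketch of that idea. RoomCounter and its onEmpty callback are hypothetical names, not something Socket.IO provides, and this version uses a Map instead of a plain object.

```javascript
// Reference-counts how many sockets are in each room.
class RoomCounter {
  constructor(onEmpty = () => {}) {
    this.counts = new Map(); // chatId -> number of connected sockets
    this.onEmpty = onEmpty;  // called with the chatId when a room empties
  }

  // Register one more socket in the room; returns the new count.
  join(chatId) {
    const next = (this.counts.get(chatId) || 0) + 1;
    this.counts.set(chatId, next);
    return next;
  }

  // Remove one socket; drops the entry and fires onEmpty at zero.
  leave(chatId) {
    if (!this.counts.has(chatId)) return 0; // unknown room: nothing to do
    const next = this.counts.get(chatId) - 1;
    if (next <= 0) {
      this.counts.delete(chatId);
      this.onEmpty(chatId);
      return 0;
    }
    this.counts.set(chatId, next);
    return next;
  }
}
```

In the handlers you would call counter.join(chatId) on connection and counter.leave(chatId) on disconnect. The Map is a deliberate choice: chatId comes from the client, and with a plain object a room named "constructor" or "__proto__" collides with inherited properties and breaks the count.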
Boom. Leak gone. 🧼
📊 Benchmark Results
Before Fix (Simulated 10K connects/disconnects)
- Memory Peak: 850MB
- GC never released over 600MB
- CPU usage high due to GC pressure
After Fix
- Memory Peak: 310MB
- GC cleaned up room data cleanly
- CPU usage stable
🧠 Lessons Learned
- Socket.IO does not auto-clean rooms
  - Empty rooms stay in memory unless manually handled
- Disconnection != Cleanup
  - Handle user tracking explicitly
- Use heap snapshots!
  - You can’t fix what you can’t see
- Simulate real traffic early
  - Unit tests will never reveal long-term memory leaks
- Socket.IO is powerful, but leaky by default
  - Especially on self-hosted apps
🔚 Conclusion
This debugging journey reminded me: realtime systems are harder than they look. The app worked, but it wasn’t scalable. It’s not just about sending messages — it’s about memory, cleanup, and lifecycle management.
🧩 Bonus: Monitor Memory in Production
Use node-memwatch or clinic.js to keep tabs on memory usage:
npm install -g clinic
clinic doctor -- node server.js
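If you'd rather not add a dependency, a rough in-process check is possible with nothing but process.memoryUsage(). This is a minimal sketch, not the API of either tool above. startMemoryMonitor, its option names, and the 500 MB threshold are assumptions you should tune for your own app.

```javascript
function toMb(bytes) {
  return Math.round(bytes / 1024 / 1024);
}

// One-line summary of the numbers worth watching.
function formatMemory(usage = process.memoryUsage()) {
  return `rss=${toMb(usage.rss)}MB heapUsed=${toMb(usage.heapUsed)}MB ` +
    `heapTotal=${toMb(usage.heapTotal)}MB`;
}

// Logs memory on a timer and flags readings above warnRssMb.
// Returns a function that stops the monitor.
function startMemoryMonitor({ intervalMs = 60000, warnRssMb = 500, log = console.log } = {}) {
  const timer = setInterval(() => {
    const usage = process.memoryUsage();
    const prefix = toMb(usage.rss) > warnRssMb ? 'WARN high memory: ' : 'memory: ';
    log(prefix + formatMemory(usage));
  }, intervalMs);
  timer.unref(); // don't keep the process alive just for monitoring
  return () => clearInterval(timer);
}
```

Call startMemoryMonitor() once at boot and ship the log lines wherever the rest of your logs go. A steadily rising heapUsed across hours of flat traffic is the same signal I saw in htop, just recorded.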