Ok, I'm calm now. I got rid of the raw critical sections (mutexes, for you non-Windows people) completely and wrapped them in a CMutex class. The CMutex class has... Oh sod it, I'll just paste it here:
class CMutex
{
public:
    class Lock
    {
    public:
        Lock(CMutex& rMutex) : m_rMutex(rMutex) {rMutex.Enter();}
        ~Lock() {m_rMutex.Leave();}
        Lock(const Lock& rhs) : m_rMutex(rhs.m_rMutex) {Assert(false);}
        Lock& operator=(const Lock& rhs) {UNREFERENCED_PARAMETER(rhs); Assert(false); return *this;}

    private:
        CMutex& m_rMutex;
    };

public:
    void Init()
    {
        InitializeCriticalSection(&m_cs);
#ifdef _DEBUG
        m_dwHoldingThread = 0;
#endif
    }

    void Delete()
    {
        DeleteCriticalSection(&m_cs);
    }

    void Enter()
    {
        EnterCriticalSection(&m_cs);
#ifdef _DEBUG
        m_dwHoldingThread = GetCurrentThreadId();
#endif
    }

    void Leave()
    {
#ifdef _DEBUG
        Assert(GetCurrentThreadId() == m_dwHoldingThread);
        m_dwHoldingThread = 0;
#endif
        LeaveCriticalSection(&m_cs);
    }

private:
    CRITICAL_SECTION m_cs;
#ifdef _DEBUG
    DWORD m_dwHoldingThread;
#endif
};
I'm not even going to explain that; you can read it and work it out, it's not hard. Anyway, I found out where some of the problems were. Apparently EnterCriticalSection lets you enter more than once at a time, so long as it's from the same thread. So if, for instance, my AcceptClient() function enters the critical section, does some stuff, then calls AllocateSocket(), which also enters the critical section, Windows doesn't bat an eyelid. "So what?" I hear you cry. Well, it seems to go totally haywire when you leave that critical section. If you enter it twice and leave it twice, it has a reference count of 1, meaning the next thread that tries to enter it just sits there waiting for it to become zero.
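For reference, the Windows documentation says the owning thread has to call LeaveCriticalSection once for every EnterCriticalSection, so a count that stays stuck at 1 may point to a code path that leaves one time too few (an early return, say). It's also worth noting that the debug check in CMutex as pasted above doesn't survive nesting: the inner Leave() resets m_dwHoldingThread to 0, so the outer Leave() trips the Assert. Below is a rough sketch of the same idea with a depth counter. It is not the class above: std::recursive_mutex stands in for CRITICAL_SECTION so it builds off Windows, and RecursiveMutexSketch, Depth() and FreeForOtherThreads() are made-up names for illustration.

```cpp
#include <cassert>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical stand-in for CMutex. std::recursive_mutex plays the role of
// CRITICAL_SECTION; a depth counter replaces the single "holding thread" check.
class RecursiveMutexSketch {
public:
    void Enter() {
        m_mutex.lock();                          // re-entry from the owner succeeds
        m_holder = std::this_thread::get_id();   // safe: we own the lock here
        ++m_depth;
    }

    void Leave() {
        // Only the owning thread may leave, and only if it actually entered.
        assert(m_depth > 0 && m_holder == std::this_thread::get_id());
        if (--m_depth == 0) {
            m_holder = std::thread::id();        // fully released: no owner
        }
        m_mutex.unlock();
    }

    // Nesting depth; only meaningful when called by the owning thread.
    int Depth() const { return m_depth; }

    // Asks a second thread whether it could take the lock right now.
    bool FreeForOtherThreads() {
        return std::async(std::launch::async, [this] {
            if (!m_mutex.try_lock()) return false;
            m_mutex.unlock();
            return true;
        }).get();
    }

private:
    std::recursive_mutex m_mutex;
    std::thread::id m_holder;
    int m_depth = 0;
};
```

With this, entering twice and leaving once keeps other threads out, and only the second Leave() lets them in. A Lock-style RAII wrapper makes the counts hard to unbalance, since every constructor is paired with a destructor.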
So after crawling through about 10 asserts, I got that working. The [test] login server now seems to be working.
The game server, however, is what this entry's title and the earlier "GAH!!!!" were about. Something is setting my socket to NULL. The socket is only set to anything in 4 places:
The interesting part is that when the login server times out, I call CSocket::Release(), then nullify it. And it's when I call Release() that I dereference the null pointer. HOW?!?!
The next fun thing is that when I set MSVC to watch the pointer and break into the debugger when it changes, the code works. Excellent. It's almost certainly not stack corruption (I have very few arrays, most are STL containers, and I've checked everything about 5 times), so all I can think is that the socket manager is screwing around with it somehow. I'll try killing all the socket manager threads once the login server connects and see if it still happens. If not, one of those threads is screwing around where it shouldn't be. But I'll do that later. I'm going to bed now.
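A bug that vanishes under a data breakpoint is often timing-dependent, which would fit the "another thread is touching it" theory. If that does turn out to be the cause, one common shape for the fix is to detach the shared pointer first and release a private copy, instead of calling Release() and nullifying afterwards. The sketch below is only an illustration under that assumption: SocketSketch and ReleaseAndClear() are hypothetical names, and only Release() comes from the post.

```cpp
#include <atomic>

// Hypothetical stand-in for CSocket; only the Release() name is from the post.
struct SocketSketch {
    int releaseCount = 0;
    void Release() { ++releaseCount; }
};

// Swaps the shared pointer to null in one atomic step, then releases the
// private copy. Returns true if this caller was the one that released it.
bool ReleaseAndClear(std::atomic<SocketSketch*>& shared) {
    SocketSketch* local = shared.exchange(nullptr);
    if (local == nullptr) {
        return false;   // another caller already cleared it; nothing to do
    }
    local->Release();   // no other thread can null our local copy
    return true;
}
```

This only closes the gap between the null check and the Release() call. Any thread that copied the pointer earlier can still use it afterwards, so it does not replace finding out who is writing NULL in the first place.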