This is an email that I sent some time ago (May 2000) to the SWEng-Gamedev mailing list. I thought it was informative enough to put here.
> Now, if I make a "void" export without calling my plugs of
> plug, it still crash. BUT, if I don't dynamically load them,
> everything suddenly works like a charm. That is, using
> LoadLibrary seems to make my program crash in a totally
> random way, even if I load the libraries without even
> using them afterwards.
One problem I had way back, when I had the 3D engine implementations in DLLs, was the Windows message pump. Basically, I had a reference-counted class that was implemented inside the DLL, and it would subclass the app's window so that it could properly handle the Windows messages that DirectDraw and Direct3D need. The trouble started when I tried to detect the window being destroyed externally. I would handle the WM_DESTROY message and release the message pump's reference to the engine class. That went wrong whenever it was the last reference (as happened with certain shutdown sequences of the app). The engine class's destructor would then be called, which in turn called FreeLibrary (LoadLibrary and FreeLibrary work like a reference-counting API for DLLs, too), which is obviously bad. The DLL was unloaded while code inside it, the window procedure itself, was still meant to run, so the program crashed trying to execute code that was no longer in memory.
I don't know if your problem is related, but it's worth a look.
> Never seen that before, and really deeply bored.
Come on! Don't you like the chase?
> But..... sometimes my program randomly crash. Big crash,
> MAX exits without any warning, over. It is quite random: say I
> have a given scene and I export it. It works. I export the same
> scene again. It works. And keeps working until the 30th, 40th or
> more export of the same scene. This is f*****g boring, that
> damn code once crashed after the 204th export of the same
> scene. Can't trace that.
:) Who says you can't?
I remember, a long time ago, in the times of DOS and even before that, most computers had a way to set the color of the area of the screen that lies outside of the frame buffer: the border color. One technique I used then was to change the color of that border at key points throughout my main loop so that, if my program crashed, the border color left on screen after the crash would indicate the portion of my program that caused it. Especially useful to debug interrupt routines and such.
Can't do that any more. But there are still ways. And I think I found a particularly useful one.
Ok. Try this. I developed it and used it successfully to track down a nasty crash bug in Force Commander. The crash always happened after about one hour of play. Consistent. Wouldn't break into the debugger in any usable form. It took us two days of using the whole testing crew at LucasArts to track it, but we did it.
The solution was to use the following little bit of code (sorry about the TABs, I haven't reformatted the code for email):
    // InstrumentedTrace.h
    class CInstrumentTrace {
    #ifndef NO_INSTRUMENTATION
    // Private stuff.
    private:
        bool dataLinked;
        void *data;
        void LinkData();
    // Public interface.
    public:
        __forceinline CInstrumentTrace(): dataLinked(false), data(0) {} // Public constructor.
        __forceinline ~CInstrumentTrace() {} // Public destructor.
        __forceinline void Set(int i, void *_data, int size) {
            if (!dataLinked) LinkData();
            if (data) memcpy((int*)data + i, _data, size);
        }
        __forceinline void Set(int i, int val) {
            if (!dataLinked) LinkData();
            if (data) ((int*)data)[i] = val;
        }
        __forceinline int Get(int i) {
            if (!dataLinked) LinkData();
            if (data) return ((int*)data)[i];
            else return 0;
        }
    #else
    public:
        __forceinline void Set(int i, void *data, int size) {}
        __forceinline void Set(int i, int val) {}
        __forceinline int Get(int i) { return 0; }
    #endif
    };
    extern CInstrumentTrace instrumentTrace;
    // InstrumentedTrace.cpp
    #ifndef NO_INSTRUMENTATION // Matches the #endif below; LinkData() only exists when instrumentation is on.
    void CInstrumentTrace::LinkData()
    {
        dataLinked = true;
        bool created = false;
        // Reuse the mapping if another module or process already created it.
        HANDLE hm = OpenFileMapping(FILE_MAP_WRITE, TRUE, _T("InstrumentTracer"));
        if (hm == NULL) {
            HANDLE h = CreateFile(_T("c:\\ITracer.map"),
                                  GENERIC_WRITE | GENERIC_READ,
                                  FILE_SHARE_WRITE | FILE_SHARE_READ,
                                  NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_TEMPORARY, NULL);
            if (h == (HANDLE)INVALID_HANDLE_VALUE) {
                return;
            }
            hm = CreateFileMapping(h, NULL, PAGE_READWRITE, 0, 4096, _T("InstrumentTracer"));
            if (hm == NULL) {
                return;
            }
            created = true;
        }
        data = MapViewOfFile(hm, FILE_MAP_WRITE, 0, 0, 0);
        // Only the creator clears the page, so existing trace data is not wiped.
        if (data && created) {
            memset(data, 0, 4096);
        }
    }
    #endif
    CInstrumentTrace instrumentTrace;
This is, basically, how it works: First, it creates a memory-mapped file (always "C:\ITracer.map" in my case), 4096 bytes in size (one memory page). Then, it allows you to modify values within that page at will using very fast memory accesses, and then those changes will be flushed out to the file, even if the program crashes. It's like a customized, data-driven, "core dump" for Windows.
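For readers who are not on Windows, here is a rough POSIX analogue of the same idea. It is a sketch under stated assumptions, not a port of the code above: it swaps CreateFileMapping/MapViewOfFile for open/ftruncate/mmap with MAP_SHARED, and the names MapTracePage and kTracePageBytes are made up for illustration. On the systems I know of, writes to a shared mapping stay in the kernel's page cache and still reach the file when the process dies, though not necessarily if the whole machine goes down.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// One memory page, as in the Windows version (hypothetical constant name).
constexpr std::size_t kTracePageBytes = 4096;

// Creates (or truncates) `path`, sizes it to one zero-filled page and maps
// it shared and writable. Returns the page as an int array, or nullptr on
// any failure. Hypothetical helper, not part of the original code.
int *MapTracePage(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return nullptr;

    // Growing an empty file with ftruncate fills it with zero bytes.
    if (ftruncate(fd, static_cast<off_t>(kTracePageBytes)) != 0) {
        close(fd);
        return nullptr;
    }

    void *view = mmap(nullptr, kTracePageBytes, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    close(fd); // The mapping stays valid after the descriptor is closed.
    if (view == MAP_FAILED)
        return nullptr;
    return static_cast<int *>(view);
}
```

A trace line then becomes a plain store such as `page[32] = 3;`, which is as cheap as the Set() calls in the Windows version.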
The way I used it was to insert trace lines in the code. Imagine a function I "suspect" of containing a crash bug:
    void VeryUsefulFunction()
    {
        for (int i = 0; i < number; ++i) {
            DoSomethingUseful();
            DoSomethingBuggy();
            DoSomethingElse();
        }
    }
I'd then flood the function with trace lines, like:
    void VeryUsefulFunction()
    {
        instrumentTrace.Set(32, 1);
        for (int i = 0; i < number; ++i) {
            instrumentTrace.Set(32, 2);
            instrumentTrace.Set(33, i);      // Loop counter.
            instrumentTrace.Set(34, number); // Loop count.
            DoSomethingUseful();
            instrumentTrace.Set(32, 3);
            DoSomethingBuggy();
            instrumentTrace.Set(32, 4);
            DoSomethingElse();
            instrumentTrace.Set(32, 5);
        }
        instrumentTrace.Set(32, 6);
    }
It looks ugly. But now, if DoSomethingBuggy() crashes, the file will contain the number 3 at 32-bit position 32, which indicates that DoSomethingBuggy() never returned. I can then go into that function and add new trace lines that modify, say, position 64. Not only that, I also stored the loop counter in position 33 and the loop count in position 34, which might become useful clues, too.
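After a crash, the positions have to be read back from the file somehow. A hex editor does the job; a tiny dump tool along the following lines is another option. This is only a sketch: the function names ReadTraceSlots and DumpNonZeroSlots are hypothetical, and it assumes the file is read on a machine with the same int size and byte order as the one that wrote it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <string>
#include <vector>

// Reads the whole trace file as an array of 32-bit positions.
// Returns an empty vector if the file cannot be opened.
std::vector<std::int32_t> ReadTraceSlots(const std::string &path)
{
    std::vector<std::int32_t> slots;
    std::ifstream in(path, std::ios::binary);
    if (!in)
        return slots;

    std::int32_t value = 0;
    while (in.read(reinterpret_cast<char *>(&value), sizeof(value)))
        slots.push_back(value);
    return slots;
}

// Prints every non-zero position in the range [first, last).
void DumpNonZeroSlots(const std::vector<std::int32_t> &slots,
                      std::size_t first, std::size_t last)
{
    for (std::size_t i = first; i < last && i < slots.size(); ++i) {
        if (slots[i] != 0)
            std::printf("position %zu = %d\n", i, static_cast<int>(slots[i]));
    }
}
```

For the example above, calling `DumpNonZeroSlots(ReadTraceSlots("c:\\ITracer.map"), 32, 35)` would list the last checkpoint reached, the loop counter and the loop count.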
For the curious, this is the code where the bug was found (WARNING: DirectX-SPECIFIC CODE FOLLOWS):
    void CD3D7TextureManager::DeleteTexture(int idx)
    {
        RONIN_ASSERT(manager.IsIndexValid(idx)) return;
        if (manager.IsIndexValid(idx)) {
            TTextureData &texture = manager[idx];
            if (texture.format != TEXF_NONE) {
                texture.lpDDS->Release();
                texture.lpDDS = NULL;
                texture.format = TEXF_NONE;
                manager.DeleteElement(idx);
            }
        }
    }
This became:
    void CD3D7TextureManager::DeleteTexture(int idx)
    {
        RONIN_ASSERT(manager.IsIndexValid(idx)) return;
        Ronin::instrumentTrace.Set(64, 3);
        if (manager.IsIndexValid(idx)) {
            Ronin::instrumentTrace.Set(64, 4);
            TTextureData &texture = manager[idx];
            Ronin::instrumentTrace.Set(64, 5);
            if (texture.format != TEXF_NONE) {
                Ronin::instrumentTrace.Set(64, 6);
                Ronin::instrumentTrace.Set(68, (int) texture.format);
                Ronin::instrumentTrace.Set(69, (int) texture.lpDDS);
    #ifndef NO_INSTRUMENTATION
                if (texture.lpDDS) {
                    Ronin::instrumentTrace.Set(64, 7);
                    Ronin::instrumentTrace.Set(70, *(int*) texture.lpDDS);
                    Ronin::instrumentTrace.Set(64, 8);
                    Ronin::instrumentTrace.Set(512, &texture.tdesc, sizeof(texture.tdesc));
                    DDSURFACEDESC2 tdesc;
                    memset(&tdesc, 0, sizeof(tdesc));
                    tdesc.dwSize = sizeof(tdesc);
                    texture.lpDDS->GetSurfaceDesc(&tdesc);
                    Ronin::instrumentTrace.Set(64, 9);
                    Ronin::instrumentTrace.Set(768, &tdesc, sizeof(tdesc));
                    Ronin::instrumentTrace.Set(64, 10);
                } else {
                    Ronin::instrumentTrace.Set(70, 0);
                }
    #endif
                texture.lpDDS->Release();
                texture.lpDDS = NULL;
                texture.format = TEXF_NONE;
                Ronin::instrumentTrace.Set(64, 11);
                manager.DeleteElement(idx);
            }
        }
        Ronin::instrumentTrace.Set(64, 12);
    }
Position 64 turned out to be 10 whenever it crashed.
DirectX-SPECIFIC STUFF FOLLOWS: It was crashing inside of the IDirectDrawSurface7::Release() function. Not only that, the lpDDS (position 69) pointer was not NULL, its vtbl (position 70) was the proper one, and the call to GetSurfaceDesc() was always successful and returned the expected data (position 768) compared to the data it returned when creating the texture (position 512), which indicates that the texture is, indeed, valid at this point.
The problem seemed to be inside of DirectX. But, at least, knowing what it was allowed me to work around it.
It's a pain to track problems like this one, or like yours, that happen only after lengthy executions. But with this technique, every crash at least narrows down the cause a little further, even when the debugger can't.
It's also great for debugging interactions between threads and such. Just make each thread modify its own portion of the file, and you'll get information about what all the threads were executing when the crash happened.
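One way to give each thread its own portion is to hand every thread a fixed-size block of positions. The sketch below shows only that partitioning. It is not from the original code: ThreadTrace, kSlotsPerThread and the gPage array are hypothetical, and gPage is a plain in-memory stand-in for the mapped 4096-byte page so that the example stays self-contained.

```cpp
#include <array>
#include <cassert>

constexpr int kPageInts = 4096 / sizeof(int); // Positions in one page.
constexpr int kSlotsPerThread = 16;           // Hypothetical block size.

// Stand-in for the mapped view; the real thing would be the file mapping.
std::array<int, kPageInts> gPage{};

// Gives one thread a private block of kSlotsPerThread positions.
class ThreadTrace {
public:
    explicit ThreadTrace(int threadIndex)
        : base_(threadIndex * kSlotsPerThread) {}

    // Writes `val` to this thread's local position `slot`.
    // Writes outside the thread's block or outside the page are ignored.
    void Set(int slot, int val) {
        if (InRange(slot))
            gPage[base_ + slot] = val;
    }

    // Reads this thread's local position `slot`, or 0 if out of range.
    int Get(int slot) const {
        return InRange(slot) ? gPage[base_ + slot] : 0;
    }

private:
    bool InRange(int slot) const {
        return slot >= 0 && slot < kSlotsPerThread &&
               base_ >= 0 && base_ + slot < kPageInts;
    }
    int base_;
};
```

Each thread would construct its own ThreadTrace with a distinct index and use local position 0 as its checkpoint marker. Since no two threads ever write the same position, no locking is needed on the trace itself.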
All trademarked things I mention here are TM by their respective owners. If you are one of those owners and want to be specifically mentioned, please, contact me and I'll include it.
To contact JCAB: jcab@JCABs-Rumblings.com