While preparing the last dirtyJOE update, I noticed that under some circumstances the Python DLLs were not freed from memory. Even more interesting, this behaviour occurred only in the release-ready version of the application. I tested a few scenarios and figured out that the problem lies in the UPX loader.
I’ll try to explain what exactly happens.
Prerequisites:
- there are two DLLs (preinstalled.dll, mylib.dll) and a main executable (test.exe)
- the import table of mylib.dll contains references to preinstalled.dll
- test.exe loads both libraries dynamically
Scenario:
- test.exe tries to load preinstalled.dll (to check whether it is installed in the system)
- if it succeeds, the LoadCount field in _LDR_DATA_TABLE_ENTRY is incremented (LoadCount = 1)
- test.exe can now try to load mylib.dll (if the first step had not succeeded, the system would show an ugly MessageBox about the missing preinstalled.dll)
- if everything is OK, LoadCount of preinstalled.dll is incremented (LoadCount = 2)
- (…)
- free mylib.dll; LoadCount of preinstalled.dll is decremented (LoadCount = 1)
- free preinstalled.dll; LoadCount is decremented again (LoadCount = 0)
- the system unmaps preinstalled.dll from the application's memory
Pseudocode:
HMODULE hPreInst = LoadLibrary("preinstalled.dll");
if (0 == hPreInst)
{
    return;
}
//LoadCount = 1
HMODULE hMyLib = LoadLibrary("mylib.dll");
if (0 == hMyLib)
{
    FreeLibrary(hPreInst);
    return;
}
//LoadCount = 2
//do some stuff here, it doesn't matter what ;)
FreeLibrary(hMyLib);
//LoadCount = 1
FreeLibrary(hPreInst);
//LoadCount = 0, library is unmapped from the memory
In the situation above everything is clear and works perfectly, until someone packs mylib.dll with UPX (or any other packer that handles imports in a similar way). For stability(?) and compatibility(?) reasons UPX keeps one imported function from every referenced library (except kernel32):
Original imports table | UPX imports table |
Kernel32.dll | Kernel32.dll |
mylib.dll | mylib.dll |
xxxx.dll | xxxx.dll |
The UPX loader is responsible for filling in the proper addresses in the IAT:
    lea     edi, [esi+10000h]
_next_library:
    mov     eax, [edi]
    or      eax, eax
    jz      short _imports_end
    mov     ebx, [edi+4]
    lea     eax, [eax+esi+121B8h]
    add     ebx, esi
    push    eax
    add     edi, 8
    call    dword ptr [esi+121F4h]  ; LoadLibraryA
    xchg    eax, ebp
_next_function:
    mov     al, [edi]
    inc     edi
    or      al, al
    jz      short _next_library
    mov     ecx, edi
    push    edi
    dec     eax
    repne scasb
    push    ebp
    call    dword ptr [esi+121F8h]  ; GetProcAddress
    or      eax, eax
    jz      short _gpa_error
    mov     [ebx], eax
    add     ebx, 4
    jmp     short _next_function
_gpa_error:
    popa
    xor     eax, eax
    retn    0Ch
_imports_end:
Let’s go back to the previous pseudocode:
HMODULE hPreInst = LoadLibrary("preinstalled.dll");
if (0 == hPreInst)
{
    return;
}
//LoadCount = 1
HMODULE hMyLib = LoadLibrary("mylib.dll"); //mylib.dll is packed by UPX now !!!
/*
    Actions taken behind our back:
    - the Windows loader processes the UPX imports table entry for preinstalled.dll, LoadCount = 2
    - the UPX loader processes the original imports table entry for preinstalled.dll, LoadCount = 3
*/
if (0 == hMyLib)
{
    FreeLibrary(hPreInst);
    return;
}
//LoadCount = 3 !!!
//do some stuff here, it doesn't matter what ;)
FreeLibrary(hMyLib);
//LoadCount = 2
FreeLibrary(hPreInst);
//LoadCount = 1, library will stay in memory !!!
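The counting in both versions of the pseudocode can be replayed with a small portable model. This is only a sketch: `Module`, `model_load`, `model_free` and `run_scenario` are hypothetical names invented for illustration, and the model tracks nothing but a counter. It does not call the real Windows loader.

```c
#include <stdbool.h>

/* Toy stand-in for the LoadCount bookkeeping of one module (hypothetical type). */
typedef struct {
    int  load_count;
    bool mapped;
} Module;

/* Models a successful LoadLibrary call: adds one reference. */
static void model_load(Module *m) {
    m->load_count++;
    m->mapped = true;
}

/* Models FreeLibrary: drops one reference and unmaps the module at zero. */
static void model_free(Module *m) {
    if (m->load_count > 0 && --m->load_count == 0) {
        m->mapped = false;
    }
}

/*
 * Replays the scenario for preinstalled.dll and returns its final count.
 * When packed is true, the extra LoadLibraryA call made by the UPX stub
 * is included.
 */
static int run_scenario(bool packed) {
    Module preinstalled = { 0, false };

    model_load(&preinstalled);      /* test.exe: LoadLibrary("preinstalled.dll") */
    model_load(&preinstalled);      /* Windows loader handling mylib.dll's imports */
    if (packed) {
        model_load(&preinstalled);  /* UPX stub: LoadLibraryA for the original imports */
    }

    model_free(&preinstalled);      /* FreeLibrary(hMyLib) gives back one reference */
    model_free(&preinstalled);      /* FreeLibrary(hPreInst) */
    return preinstalled.load_count;
}
```

In this model `run_scenario(false)` ends at 0 and `run_scenario(true)` ends at 1, which matches the LoadCount comments in the two pseudocode listings: unloading the packed mylib.dll gives back only one of the two references taken on its behalf.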
Now, there is one simple question (or maybe not that simple?): why not use GetModuleHandleA instead of LoadLibraryA?
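To make the question concrete: GetModuleHandleA returns a handle to a module that is already loaded without incrementing its reference count, while LoadLibraryA adds a reference every time. The toy model below contrasts the two. All names in it (`Module`, `model_find`, `model_load_library`, `model_get_module_handle`, `count_after_stub`) are hypothetical, the name match is a plain `strcmp` rather than the case-insensitive lookup Windows performs, and the starting count of 2 is taken from the scenario above.

```c
#include <stddef.h>
#include <string.h>

/* Toy module record: a name plus a stand-in for LoadCount (hypothetical type). */
typedef struct {
    const char *name;
    int         load_count;
} Module;

/* Finds a module by exact name; returns NULL when it is not in the table. */
static Module *model_find(Module *table, size_t n, const char *name) {
    for (size_t i = 0; i < n; i++) {
        if (strcmp(table[i].name, name) == 0) {
            return &table[i];
        }
    }
    return NULL;
}

/* Models LoadLibraryA on an already loaded module: adds a reference. */
static Module *model_load_library(Module *table, size_t n, const char *name) {
    Module *m = model_find(table, n, name);
    if (m != NULL) {
        m->load_count++;
    }
    return m;
}

/* Models GetModuleHandleA: same lookup, reference count left untouched. */
static Module *model_get_module_handle(Module *table, size_t n, const char *name) {
    return model_find(table, n, name);
}

/*
 * Models the stub's import step for preinstalled.dll, which at that point
 * is already referenced twice (by test.exe and by the Windows loader).
 * Returns the count after the stub has obtained its handle, or -1.
 */
static int count_after_stub(int use_get_module_handle) {
    Module table[] = { { "preinstalled.dll", 2 } };
    Module *m = use_get_module_handle
        ? model_get_module_handle(table, 1, "preinstalled.dll")
        : model_load_library(table, 1, "preinstalled.dll");
    return m != NULL ? m->load_count : -1;
}
```

With the LoadLibraryA-style call the count goes to 3, with the GetModuleHandleA-style lookup it stays at 2. The lookup variant only works because the one import UPX keeps per library guarantees that the module is already mapped when the stub runs.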
Because nobody cares :), UPX has been like this for years. Btw, a similar thing happens with the UXTHEME.dll system library: load it once, call a few of its APIs, and suddenly you end up with the reference counter set to 3 or so, so you have to call FreeLibrary in a loop to free its memory.
PS. UPX (and other packers) keep static imports because of the TLS handling bug in Windows: the TLS index isn’t allocated for DLLs that aren’t statically linked to the executable file.
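The "FreeLibrary in a loop" workaround mentioned above for UXTHEME.dll can be sketched against a toy counter. `Module`, `model_free` and `free_until_unloaded` are hypothetical names, and the model assumes that a free call on a module with no references left simply reports failure.

```c
/* Toy stand-in for a module with a LoadCount-like counter (hypothetical type). */
typedef struct {
    int load_count;
} Module;

/* Models FreeLibrary: returns 1 on success, 0 when no reference is left. */
static int model_free(Module *m) {
    if (m->load_count <= 0) {
        return 0;
    }
    m->load_count--;
    return 1;
}

/*
 * Keeps freeing until the model reports failure or max_calls is reached.
 * Returns the number of successful calls.
 */
static int free_until_unloaded(Module *m, int max_calls) {
    int calls = 0;
    while (calls < max_calls && model_free(m)) {
        calls++;
    }
    return calls;
}
```

The `max_calls` cap is a design choice of this sketch so the loop always terminates. On a real system such a loop also drops references that other code may still hold, so it is a workaround rather than a fix.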
[…] of the UPX packer, which artificially inflates the reference counter of loaded libraries: UPX “accidentally” increments LoadCount for DLLs. Using Python scripts in the dirtyJOE editor to decrypt hidden strings by […]
This is probably NOT the case with UPX, but some crappy packers do the same thing out of sheer ignorance of their authors. Apparently GetModuleHandle is a rather obscure API compared to LoadLibrary, and most people don’t care about keeping DLLs in memory anyway (99% of the time the DLL is supposed to be in memory for the whole run of a program, loading and unloading dynamic libraries in Windows is quite slow).
The TLS ‘bug’ is a side-effect of forcing kernel32.dll to be loaded during process startup. Previously, the dynamic loading of kernel32.dll would allow any app to use TLS if it loaded something only from kernel32.dll. Now with kernel32.dll loaded already, TLS callbacks aren’t called in that case. You have to import from another DLL instead (doesn’t matter which one).
TLS is called for dynamically-loaded DLLs, too (even kernel32.dll or ntdll.dll), on Vista and later, despite what the docs say.
It’s all in my Anti-Unpacker Tricks papers. :-)
Dear Peter, I wrote about TLS *index*, NOT about issues with TLS Callbacks! Please read carefully what someone writes next time… It’s all in MS KB http://support.microsoft.com/kb/118816/ :)
Bartosz, problems with TLS callbacks or TLS index have the same cause.
It’s simply not supported for dynamic loading prior to Vista.
Please read carefully what I wrote – TLS is called for dynamically-loaded DLLs on Vista.
So on Vista, you’ll see that the index is allocated.
I asked Laszlo about it. It’s nothing to do with TLS (and we should have guessed that because UPX didn’t even support TLS for a long time). It’s because they had trouble with import tables that were mostly or completely empty.