July 4, 2011 (updated November 25, 2016) by ReWolf

Mixing x86 with x64 code

papers, programming, reverse engineering, source code, WoW64, x64
99 Comments

A few months ago I did some small research into the possibility of running native x64 code in 32-bit processes under the WoW64 layer. I also checked it the other way round: running native x86 code inside 64-bit processes. Both are possible, and as far as I could google, some people have already used it.

Unfortunately I wasn’t aware of any of those results when I was doing my research, so I’ll just present my independent insights ;)

UPDATE:
All the mentioned tricks (with the necessary bugfixes and Windows 10 support) are currently part of the WoW64Ext library, which can be found on GitHub: https://github.com/rwfpl/rewolf-wow64ext

x86 <-> x64 Transition

The easiest way to see how the x86 <-> x64 transition is made is to look at any syscall in the 32-bit version of ntdll.dll from an x64 version of Windows:

32-bit ntdll from Win7 x86:

	mov    eax, X
	mov    edx, 7FFE0300h
	call   dword ptr [edx]        ;ntdll.KiFastSystemCall
	retn   Z

32-bit ntdll from Win7 x64:

	mov    eax, X
	mov    ecx, Y
	lea    edx, [esp+4]
	call   dword ptr fs:[0C0h]   ;wow64cpu!X86SwitchTo64BitMode
	add    esp, 4
	ret    Z

As you can see, on 64-bit systems there is a call to fs:[0xC0] (wow64cpu!X86SwitchTo64BitMode) instead of the standard call to ntdll.KiFastSystemCall. wow64cpu!X86SwitchTo64BitMode is implemented as a simple far jump into the 64-bit segment:

	wow64cpu!X86SwitchTo64BitMode:
	748c2320 jmp     0033:748C271E   ;wow64cpu!CpupReturnFromSimulatedCode

That’s all the magic behind switching between x64 and x86 modes on 64-bit versions of Windows. Moreover, it also works in non-WoW64 processes (standard native 64-bit applications), so 32-bit code can be run inside 64-bit applications. Summing up, every process (x86 & x64) running on 64-bit Windows has two code segments available:

cs = 0x23 -> x86 mode
cs = 0x33 -> x64 mode
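To make the far jump a bit more tangible, here is a small portable sketch that decodes the `jmp 0033:748C271E` instruction from raw bytes and splits the selector into its fields. The byte sequence is reconstructed from the listing above (opcode `EA`, 32-bit offset, 16-bit selector); `FarJump`, `SelectorFields`, `decodeFarJump` and `splitSelector` are names made up for this example. Note that the selector value alone does not say whether the segment is 32-bit or 64-bit; that is decided by the descriptor it points to.

```cpp
#include <cstddef>
#include <cstdint>

// Decoded "jmp ptr16:32" instruction (illustrative type, not from the article).
struct FarJump {
    std::uint32_t offset;    // target offset inside the new code segment
    std::uint16_t selector;  // value that ends up in CS
};

// Fields packed into a segment selector.
struct SelectorFields {
    std::uint16_t index;   // descriptor table index (bits 3..15)
    bool          useLdt;  // table indicator (bit 2): false = GDT, true = LDT
    std::uint8_t  rpl;     // requested privilege level (bits 0..1)
};

// Decodes "EA <offset32> <selector16>"; both fields are little-endian.
// Returns false if the buffer is too short or does not start with 0xEA.
bool decodeFarJump(const std::uint8_t* code, std::size_t size, FarJump& out)
{
    if (code == nullptr || size < 7 || code[0] != 0xEA)
        return false;

    out.offset = static_cast<std::uint32_t>(code[1])
               | static_cast<std::uint32_t>(code[2]) << 8
               | static_cast<std::uint32_t>(code[3]) << 16
               | static_cast<std::uint32_t>(code[4]) << 24;
    out.selector = static_cast<std::uint16_t>(code[5] | (code[6] << 8));
    return true;
}

SelectorFields splitSelector(std::uint16_t selector)
{
    SelectorFields f;
    f.rpl    = static_cast<std::uint8_t>(selector & 3);
    f.useLdt = (selector & 4) != 0;
    f.index  = static_cast<std::uint16_t>(selector >> 3);
    return f;
}

// Bytes of "jmp 0033:748C271E", reconstructed from the listing above.
const std::uint8_t kSwitchTo64BitMode[] = { 0xEA, 0x1E, 0x27, 0x8C, 0x74, 0x33, 0x00 };
```

Decoding `kSwitchTo64BitMode` yields offset 0x748C271E and selector 0x33. Both 0x23 and 0x33 request privilege level 3 from the GDT and differ only in the descriptor index (4 versus 6).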

Running x64 code inside 32-bits process

First, I prepared a few macros that will be used to mark the beginning and end of the 64-bit code:

#define EM(a) __asm __emit (a)

#define X64_Start_with_CS(_cs) \
{ \
	EM(0x6A) EM(_cs)                     /*  push   _cs                   */ \
	EM(0xE8) EM(0) EM(0) EM(0) EM(0)     /*  call   $+5                   */ \
	EM(0x83) EM(4) EM(0x24) EM(5)        /*  add    dword [esp], 5        */ \
	EM(0xCB)                             /*  retf                         */ \
}

#define X64_End_with_CS(_cs) \
{ \
	EM(0xE8) EM(0) EM(0) EM(0) EM(0)     /*  call   $+5                   */ \
	EM(0xC7) EM(0x44) EM(0x24) EM(4)     /*                               */ \
	EM(_cs) EM(0) EM(0) EM(0)            /*  mov    dword [rsp + 4], _cs  */ \
	EM(0x83) EM(4) EM(0x24) EM(0xD)      /*  add    dword [rsp], 0xD      */ \
	EM(0xCB)                             /*  retf                         */ \
}

#define X64_Start() X64_Start_with_CS(0x33)
#define X64_End() X64_End_with_CS(0x23)

The CPU is switched into x64 mode immediately after the X64_Start() macro executes, and back to x86 mode right after the X64_End() macro. The above macros are position independent thanks to the call $+5 / far return combination, which computes the target address at runtime.
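The constants 5 and 0xD in the X64_Start() and X64_End() macros above are easy to check by hand. The sketch below lays out the emitted bytes (copied from the macros) and computes where execution resumes after the retf; `x64StartBytes`, `x64EndBytes` and `resumeOffset` are helper names invented for this illustration. In both cases the resume point should be the first byte after the sequence.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Bytes emitted by X64_Start_with_CS(cs), copied from the macro above.
std::vector<std::uint8_t> x64StartBytes(std::uint8_t cs)
{
    return {
        0x6A, cs,                 // push cs            (offset 0)
        0xE8, 0, 0, 0, 0,         // call $+5           (offset 2)
        0x83, 0x04, 0x24, 0x05,   // add dword [esp], 5 (offset 7)
        0xCB                      // retf               (offset 11)
    };
}

// Bytes emitted by X64_End_with_CS(cs), copied from the macro above.
std::vector<std::uint8_t> x64EndBytes(std::uint8_t cs)
{
    return {
        0xE8, 0, 0, 0, 0,                       // call $+5               (offset 0)
        0xC7, 0x44, 0x24, 0x04, cs, 0, 0, 0,    // mov dword [rsp+4], cs  (offset 5)
        0x83, 0x04, 0x24, 0x0D,                 // add dword [rsp], 0xD   (offset 13)
        0xCB                                    // retf                   (offset 17)
    };
}

// Offset, relative to the start of the sequence, at which execution resumes
// after the retf: "call $+5" saves the address of the instruction that
// follows it, and the "add" then bumps that saved address by `adjust`.
std::size_t resumeOffset(std::size_t callOffset, std::size_t adjust)
{
    const std::size_t callLength = 5;  // 0xE8 + rel32
    return callOffset + callLength + adjust;
}
```

For X64_Start() the call sits at offset 2, so the saved address is offset 7 and adding 5 gives 12, the length of the sequence. For X64_End() the call sits at offset 0, the saved address is offset 5, and adding 0xD gives 18. In 64-bit mode the call saves an 8-byte return address, and the mov overwrites its upper half with the selector, which is the layout the retf then pops (offset first, CS second).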

It would also be useful to be able to call the x64 versions of APIs. I tried to load the x64 version of kernel32.dll, but it is not a trivial task and I failed, so I have to stick with the Native API only. The main problem with the 64-bit version of kernel32.dll is that the x86 version of this library is already loaded, and the x64 kernel32.dll has some additional checks that prevent proper loading. I believe it is possible to achieve this through some nasty hooks that intercept kernel32!BaseDllInitialize, but it is a very complicated task. When I started this research I was working on Windows Vista, and I was able to load (with some hacks) the 64-bit versions of the kernel32 and user32 libraries, but they were not fully functional. Meanwhile I’ve switched to Windows 7, and the method that I used on Vista doesn’t work anymore.

Let’s get back to the topic. To use Native APIs I need to locate the x64 version of ntdll.dll in memory. To accomplish this I parse InLoadOrderModuleList from the _PEB_LDR_DATA structure. The 64-bit _PEB can be obtained from the 64-bit _TEB, and obtaining the 64-bit _TEB is similar to the x86 platform (on x64 I need to use the gs segment instead of fs):

	mov   eax, gs:[0x30]

It can be even simpler, because wow64cpu!CpuSimulate (the function responsible for switching the CPU to x86 mode) moves the gs:[0x30] value into the r12 register, so my version of getTEB64() looks like this:

//to fool M$ inline asm compiler I'm using 2 DWORDs instead of DWORD64
//use of DWORD64 will generate wrong 'pop word ptr[]' and it will break stack
union reg64
{
	DWORD dw[2];
	DWORD64 v;
};

//macro that simplifies pushing x64 registers
#define X64_Push(r) EM(0x48 | ((r) >> 3)) EM(0x50 | ((r) & 7))

WOW64::TEB64* getTEB64()
{
	reg64 reg;
	reg.v = 0;

	X64_Start();
	//R12 register should always contain pointer to TEB64 in WoW64 processes
	X64_Push(_R12);
	//below pop will pop QWORD from stack, as we're in x64 mode now
	__asm pop reg.dw[0]
	X64_End();

	//upper 32 bits should be always 0 in WoW64 processes
	if (reg.dw[1] != 0)
		return 0;

	return (WOW64::TEB64*)reg.dw[0];
}

The WOW64 namespace is defined in the “os_structs.h” file, which is included with the rest of the sample sources linked at the end of this post.
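As a side note, the X64_Push() macro used in getTEB64() is just a two-byte instruction encoder. The sketch below reproduces its arithmetic in portable C++. The register numbers are an assumption (the usual x64 encoding order, since the `_RAX`, `_R12`, ... constants are not listed in the text), and `encodePop` is only my guess at a counterpart for the X64_Pop() macro, built by analogy with the `pop r64` opcode.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Assumed register numbers (standard x64 encoding order); the article's own
// _RAX/_RCX/.../_R12 constants are not shown, so these are illustrative.
const unsigned kRAX = 0;
const unsigned kRCX = 1;
const unsigned kR8  = 8;
const unsigned kR12 = 12;

// Same arithmetic as X64_Push(r): a REX.W prefix (0x48), with REX.B (bit 0)
// set for r8-r15, followed by the one-byte "push r64" opcode 0x50 + low bits.
std::array<std::uint8_t, 2> encodePush(unsigned reg)
{
    assert(reg < 16);
    return {{ static_cast<std::uint8_t>(0x48 | (reg >> 3)),
              static_cast<std::uint8_t>(0x50 | (reg & 7)) }};
}

// Hypothetical counterpart for a pop macro: same prefix, "pop r64" opcode
// 0x58 + low bits. This is an assumption, not code taken from the article.
std::array<std::uint8_t, 2> encodePop(unsigned reg)
{
    assert(reg < 16);
    return {{ static_cast<std::uint8_t>(0x48 | (reg >> 3)),
              static_cast<std::uint8_t>(0x58 | (reg & 7)) }};
}
```

With these numbers, `encodePush(kR12)` gives the bytes 49 54 and `encodePush(kRAX)` gives 48 50, so the high bit of the register number moves into the prefix and the low three bits into the opcode.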

The function responsible for locating the 64-bit ntdll.dll is defined as follows:

DWORD getNTDLL64()
{
	static DWORD ntdll64 = 0;
	if (ntdll64 != 0)
		return ntdll64;

	WOW64::TEB64* teb64 = getTEB64();
	WOW64::PEB64* peb64 = teb64->ProcessEnvironmentBlock;
	WOW64::PEB_LDR_DATA64* ldr = peb64->Ldr;

	printf("TEB: %08X\n", (DWORD)teb64);
	printf("PEB: %08X\n", (DWORD)peb64);
	printf("LDR: %08X\n", (DWORD)ldr);

	printf("Loaded modules:\n");
	WOW64::LDR_DATA_TABLE_ENTRY64* head = \
		(WOW64::LDR_DATA_TABLE_ENTRY64*)ldr->InLoadOrderModuleList.Flink;
	do
	{
		printf("  %ws\n", head->BaseDllName.Buffer);
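		//compare lengths first, so that only an exact name match is accepted
		if (head->BaseDllName.Length == sizeof(L"ntdll.dll") - sizeof(wchar_t) &&
			memcmp(head->BaseDllName.Buffer, L"ntdll.dll",
			   head->BaseDllName.Length) == 0)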
		{
			ntdll64 = (DWORD)head->DllBase;
		}
		head = (WOW64::LDR_DATA_TABLE_ENTRY64*)head->InLoadOrderLinks.Flink;
	}
	while (head != (WOW64::LDR_DATA_TABLE_ENTRY64*)&ldr->InLoadOrderModuleList);
	printf("NTDLL x64: %08X\n", ntdll64);
	return ntdll64;
}
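The list walk in getNTDLL64() can be tried out without any Windows structures. Below is a minimal sketch of the same idea: a circular doubly linked list with a sentinel head, where each entry starts with its list links. `ListEntry`, `ModuleEntry`, `appendModule` and `findModuleBase` are simplified stand-ins invented for this example, not the real loader structures.

```cpp
#include <cstdint>
#include <cwchar>

struct ListEntry {
    ListEntry* Flink;
    ListEntry* Blink;
};

// Simplified stand-in for LDR_DATA_TABLE_ENTRY64: the links come first, so a
// pointer to the links is also a pointer to the entry.
struct ModuleEntry {
    ListEntry      InLoadOrderLinks;
    std::uint64_t  DllBase;      // kept 64-bit on purpose
    const wchar_t* BaseDllName;
};

// Appends `entry` at the tail of the circular list headed by `head`.
void appendModule(ListEntry* head, ModuleEntry* entry)
{
    ListEntry* links = &entry->InLoadOrderLinks;
    links->Flink = head;
    links->Blink = head->Blink;
    head->Blink->Flink = links;
    head->Blink = links;
}

// Walks the list starting after the sentinel `head` (which is not a module
// itself) and returns the base of the first module called `name`, or 0.
std::uint64_t findModuleBase(const ListEntry* head, const wchar_t* name)
{
    for (const ListEntry* it = head->Flink; it != head; it = it->Flink) {
        const ModuleEntry* mod = reinterpret_cast<const ModuleEntry*>(it);
        if (std::wcscmp(mod->BaseDllName, name) == 0)
            return mod->DllBase;
    }
    return 0;
}
```

Compared to the do/while loop above, the for loop also copes with an empty list, and returning a 64-bit base avoids assuming that the module sits below 4GB.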

To fully support calling x64 Native APIs I’ll also need some equivalent of GetProcAddress, which can easily be replaced by ntdll!LdrGetProcedureAddress. The code below is responsible for obtaining the address of LdrGetProcedureAddress:

DWORD getLdrGetProcedureAddress()
{
	BYTE* modBase = (BYTE*)getNTDLL64();
	IMAGE_NT_HEADERS64* inh = \
		(IMAGE_NT_HEADERS64*)(modBase + ((IMAGE_DOS_HEADER*)modBase)->e_lfanew);
	IMAGE_DATA_DIRECTORY& idd = \
		inh->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT];
	if (idd.VirtualAddress == 0)
		return 0;

	IMAGE_EXPORT_DIRECTORY* ied = \
		(IMAGE_EXPORT_DIRECTORY*)(modBase + idd.VirtualAddress);

	DWORD* rvaTable = (DWORD*)(modBase + ied->AddressOfFunctions);
	WORD* ordTable = (WORD*)(modBase + ied->AddressOfNameOrdinals);
	DWORD* nameTable = (DWORD*)(modBase + ied->AddressOfNames);
	//lazy search, there is no need to use binsearch for just one function
	//nameTable and ordTable have NumberOfNames entries (not NumberOfFunctions)
	for (DWORD i = 0; i < ied->NumberOfNames; i++)
	{
		if (strcmp((char*)modBase + nameTable[i], "LdrGetProcedureAddress"))
			continue;
		else
			return (DWORD)(modBase + rvaTable[ordTable[i]]);
	}
	return 0;
}
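The export lookup relies on three parallel tables: names, name ordinals and function RVAs. A minimal portable sketch of that lookup, run against made-up tables instead of a mapped PE image, could look like this (`ExportView` and `findExportRva` are illustrative names, and the view is assumed to be already resolved into plain pointers):

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Simplified, already-resolved view of a PE export directory (illustrative).
struct ExportView {
    const char* const*   names;              // AddressOfNames
    const std::uint16_t* nameOrdinals;       // AddressOfNameOrdinals, parallel to names
    const std::uint32_t* functionRvas;       // AddressOfFunctions, indexed by ordinal
    std::size_t          numberOfNames;      // entries in names / nameOrdinals
    std::size_t          numberOfFunctions;  // entries in functionRvas
};

// Returns the RVA of the export called `name`, or 0 if it is not present.
// Linear search, as in the article; the name table is only as long as
// numberOfNames, which can be smaller than numberOfFunctions.
std::uint32_t findExportRva(const ExportView& exp, const char* name)
{
    for (std::size_t i = 0; i < exp.numberOfNames; ++i) {
        if (std::strcmp(exp.names[i], name) != 0)
            continue;
        const std::uint16_t ordinal = exp.nameOrdinals[i];
        if (ordinal >= exp.numberOfFunctions)
            return 0;  // malformed table, refuse to index out of range
        return exp.functionRvas[ordinal];
    }
    return 0;
}
```

The name index `i` is never used to index the function table directly; it always goes through the ordinal table first.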

As a cherry on top, I’ll present a helper function that enables me to call x64 Native APIs directly from x86 C/C++ code:

DWORD64 X64Call(DWORD func, int argC, ...)
{
	va_list args;
	va_start(args, argC);
	DWORD64 _rcx = (argC > 0) ? argC--, va_arg(args, DWORD64) : 0;
	DWORD64 _rdx = (argC > 0) ? argC--, va_arg(args, DWORD64) : 0;
	DWORD64 _r8 = (argC > 0) ? argC--, va_arg(args, DWORD64) : 0;
	DWORD64 _r9 = (argC > 0) ? argC--, va_arg(args, DWORD64) : 0;
	reg64 _rax;
	_rax.v = 0;

	DWORD64 restArgs = (DWORD64)&va_arg(args, DWORD64);

	//conversion to QWORD for easier use in inline assembly
	DWORD64 _argC = argC;
	DWORD64 _func = func;

	DWORD back_esp = 0;

	__asm
	{
		;//keep original esp in back_esp variable
		mov    back_esp, esp

		;//align esp to 8, without aligned stack some syscalls
		;//may return errors !
		and    esp, 0xFFFFFFF8

		X64_Start();

		;//fill first four arguments
		push   _rcx
		X64_Pop(_RCX);
		push   _rdx
		X64_Pop(_RDX);
		push   _r8
		X64_Pop(_R8);
		push   _r9
		X64_Pop(_R9);

		push   edi

		push   restArgs
		X64_Pop(_RDI);

		push   _argC
		X64_Pop(_RAX);

		;//put rest of arguments on the stack
		test   eax, eax
		jz     _ls_e
		lea    edi, dword ptr [edi + 8*eax - 8]

		_ls:
		test   eax, eax
		jz     _ls_e
		push   dword ptr [edi]
		sub    edi, 8
		sub    eax, 1
		jmp    _ls
		_ls_e:

		;//create stack space for spilling registers
		sub    esp, 0x20

		call   _func

		;//cleanup stack
		push   _argC
		X64_Pop(_RCX);
		lea    esp, dword ptr [esp + 8*ecx + 0x20]

		pop    edi

		;//set return value
		X64_Push(_RAX);
		pop    _rax.dw[0]

		X64_End();

		mov    esp, back_esp
	}
	return _rax.v;
}

The function is a bit long, but there are comments and the whole idea is pretty simple. The first argument is the address of the x64 function that I want to call, and the second argument is the number of arguments that this function takes. The rest of the arguments depend on the function being called; all of them should be cast to DWORD64. A small example of X64Call() usage:

DWORD64 GetProcAddress64(DWORD module, char* funcName)
{
	static DWORD _LdrGetProcedureAddress = 0;
	if (_LdrGetProcedureAddress == 0)
	{
		_LdrGetProcedureAddress = getLdrGetProcedureAddress();
		printf("LdrGetProcedureAddress: %08X\n", _LdrGetProcedureAddress);
		if (_LdrGetProcedureAddress == 0)
			return 0;
	}

	WOW64::ANSI_STRING64 fName = { 0 };
	fName.Buffer = funcName;
	fName.Length = strlen(funcName);
	fName.MaximumLength = fName.Length + 1;
	DWORD64 funcRet = 0;
	X64Call(_LdrGetProcedureAddress, 4,
		(DWORD64)module, (DWORD64)&fName,
		(DWORD64)0, (DWORD64)&funcRet);

	printf("%s: %08X\n", funcName, (DWORD)funcRet);
	return funcRet;
}
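To make the argument handling inside X64Call() easier to follow, here is a portable model of what the assembly does with its arguments: the first four go to RCX, RDX, R8 and R9, the rest are pushed last-to-first, and 0x20 bytes are reserved for register spilling. `CallPlan` and `planX64Call` are names invented for this sketch; it only models the bookkeeping and does not execute any x64 code.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct CallPlan {
    std::uint64_t rcx, rdx, r8, r9;        // first four arguments (0 if absent)
    std::vector<std::uint64_t> stackArgs;  // remaining arguments, in push order
    std::size_t bytesToRelease;            // what the cleanup "lea esp, ..." adds back
};

CallPlan planX64Call(const std::vector<std::uint64_t>& args)
{
    CallPlan plan{};
    std::uint64_t* regs[4] = { &plan.rcx, &plan.rdx, &plan.r8, &plan.r9 };
    for (std::size_t i = 0; i < 4 && i < args.size(); ++i)
        *regs[i] = args[i];

    // Arguments 5..N are pushed starting from the last one, so that the fifth
    // argument ends up at the lowest address, right above the spill area.
    for (std::size_t i = args.size(); i > 4; --i)
        plan.stackArgs.push_back(args[i - 1]);

    const std::size_t spillSpace = 0x20;  // "sub esp, 0x20" in X64Call()
    plan.bytesToRelease = 8 * plan.stackArgs.size() + spillSpace;
    return plan;
}
```

For the four-argument LdrGetProcedureAddress call above nothing is pushed and only the 0x20 bytes are released afterwards; a six-argument call would push the sixth argument, then the fifth, and release 0x30 bytes.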

Running x86 code inside 64-bits process

It is very similar to the previous case, with just one small inconvenience. Because the 64-bit version of the MS C/C++ compiler doesn’t support inline assembly, all tricks have to be done in a separate .asm file. Below are the definitions of the X86_Start and X86_End macros for MASM64:

X86_Start MACRO
	LOCAL  xx, rt
	call   $+5
	xx     equ $
	mov    dword ptr [rsp + 4], 23h
	add    dword ptr [rsp], rt - xx
	retf
	rt:
ENDM

X86_End MACRO
	db 6Ah, 33h			; push  33h
	db 0E8h, 0, 0, 0, 0		; call  $+5
	db 83h, 4, 24h, 5		; add   dword ptr [esp], 5
	db 0CBh				; retf
ENDM

Ending notes

Link to source code used in the article: http://rewolf.pl/stuff/x86tox64.zip


99 Comments

Ange July 4, 2011 at 08:46

you can also make ‘code’ that will be executed as both 32bits and 64bits.
and it’s even better if the opcodes are unique to both sides: http://code.google.com/p/corkami/source/browse/trunk/misc/MakePE/examples/asm/usermode_test.asm?spec=svn577&r=577#1877

Reply
elias July 4, 2011 at 15:08

Cool tricks, nice article! :)

Reply
Arkon July 4, 2011 at 20:32

Nice post!

Reply
m5home July 5, 2011 at 05:48

very good.

Reply
oxff July 5, 2011 at 13:40

Also see Hooking 32bit System Calls under WOW64. :)

Reply
Simon November 1, 2011 at 04:24

Thanks. I will try your code see if it works for me!

Reply
WoW64 Egghunter | Corelan Team November 18, 2011 at 12:03

[…] http://blog.rewolf.pl/blog/?p=102 […]

Reply
Reading memory of x64 process from x86 process January 12, 2012 at 22:54

[…] Probably the only way it can be done is to use hack that I’ve described few months ago (Mixing x86 with x64 code). In that case there will be need to get address of x64 version of NtReadVirtualMemory / […]

Reply
direct code injection x32 -> x64 - Page 2 May 3, 2012 at 21:17

[…] Some author wrote code for using x64 Native Apis in x86: http://blog.rewolf.pl/blog/?p=102#.T6LZV-tYvqV Sadly he couldn't get it working with kernel32 APIs. Reply With Quote […]

Reply

Sorry for my lack of knowledge, but in the 64-bit syscall, what is the Y for? AFAIK X is the syscall index and Y is the number of args, but there’s no 32-bit equivalent for Y? Thanks.

@ahajj
In this case Y is index to the table inside wow64cpu.dll. If I’m correct, it contains different versions of thunks that are responsible for arguments translation. Here is dump of this table:

dq offset TurboDispatchJumpAddressEnd
dq offset Thunk0Arg
dq offset Thunk0ArgReloadState
dq offset Thunk1ArgSp
dq offset Thunk1ArgNSp
dq offset Thunk2ArgNSpNSp
dq offset Thunk2ArgNSpNSpReloadState
dq offset Thunk2ArgSpNSp
dq offset Thunk2ArgSpSp
dq offset Thunk2ArgNSpSp
dq offset Thunk3ArgNSpNSpNSp
dq offset Thunk3ArgSpSpSp
dq offset Thunk3ArgSpNSpNSp
dq offset Thunk3ArgSpNSpNSpReloadState
dq offset Thunk3ArgSpSpNSp
dq offset Thunk3ArgNSpSpNSp
dq offset Thunk3ArgSpNSpSp
dq offset Thunk4ArgNSpNSpNSpNSp
dq offset Thunk4ArgSpSpNSpNSp
dq offset Thunk4ArgSpSpNSpNSpReloadState
dq offset Thunk4ArgSpNSpNSpNSp
dq offset Thunk4ArgSpNSpNSpNSpReloadState
dq offset Thunk4ArgNSpSpNSpNSp
dq offset Thunk4ArgSpSpSpNSp
dq offset QuerySystemTime
dq offset GetCurrentProcessorNumber
dq offset ReadWriteFile
dq offset DeviceIoctlFile
dq offset RemoveIoCompletion
dq offset WaitForMultipleObjects
dq offset WaitForMultipleObjects32
dq offset ThunkNone

waliedassar January 14, 2013 at 14:45

There are more ways to get the address of the corresponding 64-bit TEB.

1) The 64-bit TEB always precedes the corresponding 32-bit TEB by two pages.
AddressOf 64-bit TEB=(addressOf 32-bit TEB) – 0x2000

2) At offset 0xF70 from the start of 32-bit TEB is the address of corresponding 64-bit TEB. On the other hand, at offset 0 from the start of 64-bit TEB is the address of Corresponding 32-bit TEB.
See the “MmCreateTeb” function or have a look at http://pastebin.com/8ZQa2heh
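Both relations are plain address arithmetic. A minimal sketch, assuming they hold on the Windows version in question (the constants are taken from the comment above, not verified here; the function names and the sample address in the usage note are made up):

```cpp
#include <cstdint>

// Constants as stated in the comment above (assumed, version dependent).
const std::uint32_t kTeb64Distance      = 0x2000;  // two 4 KB pages
const std::uint32_t kTeb32OffsetOfTeb64 = 0xF70;   // slot inside the 32-bit TEB

// Method 1: the 64-bit TEB sits two pages below the 32-bit TEB.
std::uint32_t teb64FromTeb32(std::uint32_t teb32Address)
{
    return teb32Address - kTeb64Distance;
}

// Method 2: address of the slot in the 32-bit TEB that holds the 64-bit TEB
// pointer; the caller would still have to read the value stored there.
std::uint32_t teb64PointerSlot(std::uint32_t teb32Address)
{
    return teb32Address + kTeb32OffsetOfTeb64;
}
```

For a hypothetical 32-bit TEB at 0x7EFDD000, the first method gives 0x7EFDB000 and the second one points at 0x7EFDDF70.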

Reply

Is it possible to get the module handle from another process, so that this code would work from a 32-bit process on an x64 OS?

HMODULE GetNtDllModuleHandle(IN DWORD dwProcessId)
{
	HMODULE			hModule = NULL;
	MODULEENTRY32  me32;
 
	// Create a snapshot of modules in the process
	HANDLE hModuleSnap = CreateToolhelp32Snapshot(TH32CS_SNAPMODULE, dwProcessId);
	if(hModuleSnap == INVALID_HANDLE_VALUE)
	{
		_tprintf(_T("ERROR: Failed to retrieve module list.\n\tMake sure you run the 64-bit version on a 64-bit OS.\n"));
		return NULL;
	}
 
	// Iterate through the list to find ntdll.dll
	me32.dwSize = sizeof(me32);
	if(Module32First(hModuleSnap, &me32))
	{
		do
		{
			if(!_tcsicmp(me32.szModule, _T("ntdll.dll")))
			{
				hModule = me32.hModule;
				break;
			}
		}
		while(Module32Next(hModuleSnap, &me32));
	}
	CloseHandle(hModuleSnap);
 
	return hModule;
}

ReWolf February 15, 2013 at 22:10

@Salo
ntdll handle (address) is the same across all running processes, so there is no need to use Toolhelp32 API. On x64 OS every process has 64 bits version of ntdll, which means that 32 bits processes also have it. So if You want address of x86 ntdll, you may just call standard GetModuleHandle API from x86 process and if You want address of x64 ntdll You should use getNTDLL64() described in this post. It should work on x64 versions of XP/Vista/Win7, I’m not sure about Win8 as I’ve some reports that something has changed.

Reply
1. maniac July 17, 2013 at 16:22
  
  @ReWolf
  The address of ntdll.dll in Win8 is above 4GB (for example, mine is at 0x7ff8fbd0000 now), so it’s impossible to store the address in a DWORD (TEB64 is still below 4GB).
  
  So those functions should be changed to handle 64bit address, include getNTDLL64, getLdrGetProcedureAddress, and X64Call.
  
  Reply
  1. ReWolf July 17, 2013 at 18:10
    
    Thanks for the info. As I said earlier, I still didn’t have chance to play with Windows 8, so I’m not aware of the changes made to WoW64 on that platform and I can’t fix this code without access to Win8 x64. I’ll probably fix it sooner or later, but it will require changing my current hardware which doesn’t support hardware virtualization. So, I’m aware that it doesn’t work on Win8, but currently I can’t do much about it.
    
    Reply
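The truncation described in this thread is easy to reproduce in isolation. A minimal sketch, using the address quoted by maniac (`truncateToDword` and `fitsInDword` are names invented for the example):

```cpp
#include <cstdint>

// What happens when a 64-bit module base is stored in a 32-bit DWORD:
// only the low 32 bits survive.
std::uint32_t truncateToDword(std::uint64_t address)
{
    return static_cast<std::uint32_t>(address);
}

// True when the address can be stored in a DWORD without losing anything.
bool fitsInDword(std::uint64_t address)
{
    return (address >> 32) == 0;
}
```

For 0x7ff8fbd0000 the stored value becomes 0x8FBD0000, which points somewhere else entirely, so every place that keeps the ntdll base or a function address in a DWORD needs a 64-bit type instead.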

Gelbaerchen March 13, 2013 at 10:41

Hi ReWolf,

thanks for this detailed article. I’ve a question concerning a APC (asynchronous procedure call) from a x64 driver to a x86 application. My APC is already working from a x86 driver/OS to a x86 application and from a x64 driver/OS to a x64 application. But the mixed version crashes and I think I’d need some piece of x64 code in my x86 application that could be invoked by the APC. This piece of code could switch to 32 bit as mentioned above and afterwards call my “normal” 32 bit callback function. Do you think this could work? How could I embed the x64 code to my 32 bit app and how to call the 32 bit function from there?

Thanks and best regards,
Gelbaerchen

Reply
1. ReWolf March 13, 2013 at 20:15
  
  Unfortunately I don’t have enough knowledge about the ring 0 to answer your question, but I’ve asked friend of mine and he pointed me to PsWrapApcWow64Thread function. So maybe try to look some more information related to this function and hopefully you will find the answer.
  
  Reply
Epic June 1, 2013 at 18:29

Hi,

I am not the best at these but can you tell me how I can inject into a x64 process can you show me a code please.

Reply
1. ReWolf June 2, 2013 at 21:24
  
  I don’t possess such code to copy & paste here, but I’m sure you will find some ready-to-use snippet on the internet. It should also be possible to do it with the wow64ext library, by calling the x64 versions of the NtCreateThread/NtCreateThreadEx syscalls.
  
  Reply
Swaggy August 16, 2013 at 14:33

I was looking back here then I figured out, that jmp to 0x33 would do the job rather than using multiple ASM instructions to go there.

I’m away from my development environment, so would this work?

Reply
1. ReWolf August 16, 2013 at 20:49
  
  If you know the exact address of your x64 code then direct far jump with segment set to 0x33 will work (exactly the same as wow64cpu!X86SwitchTo64BitMode), but if you want seamless transition, then multiple opcodes are a way to go.
  
  Reply
  1. Swaggy August 17, 2013 at 11:37
    
    @ReWolf
    Thanks! Buddy, it really helped me!
    
    I love this blog keep updating it. I am looking through it everyday, please update it. via adding new topics and such.
    
    Reply
Awk September 30, 2013 at 17:33

Hi,

Do you know how to locate CpupReturnFromSimulatedCode for Wow64.dll using this code or can we only use ntdll. If so how to find the address of this function? Using Code.

Thanks in Advanced. Nice Blogging useful for lot of programmers\developers.

Reply
1. ReWolf September 30, 2013 at 17:50
  It is pretty easy:
  
  In assembly:
  
  mov eax, dword ptr fs:[0xC0]
  mov eax, dword ptr ds:[eax + 1]
  
  In C/C++:
  
  #include <intrin.h>
  DWORD simret = *(DWORD*)(__readfsdword(0xC0) + 1);
  
  Reply
  1. Awk September 30, 2013 at 22:43
    
    @ReWolf
    Do you use X86SwitchTo64BitMode to get to CpupReturnFromSimulatedCode, because FS:[0xC0] holds the address of X86SwitchTo64BitMode?
    
    What if it is hooked would the output stay the same or would be be something unexpected?
    
    Reply
  2. Awk September 30, 2013 at 23:20
    
    @ReWolf
    Hi,
    
    Thanks. If X86SwitchTo64BitMode was hooked, would this code still work, because it gets memory from the start of the FS segment?
    If it was hooked, would we get the real address or the phoney address?
    If so, how to unhook X86SwitchTo64BitMode?
    Thanks for quick replies
    Looking forward for replies.
    
    Reply
    1. ReWolf October 1, 2013 at 08:20
      
      If the hook is set by changing fs:[0xC0] address, then this method will not work. However, you can locate wow64cpu.dll in memory and get address of exported CpuSimulate function, then you need to find (it is at the end of CpuSimulate):
      
      41 FF 2E jmp fword ptr [r14]
      
      CpupReturnFromSimulatedCode is placed after that jump (at least on Vista and Win7).
      
      To get CpuSimulate address you can use code from my wow64ext library, just use GetModuleHandle64 and GetProcAddress64.
      
      Reply
      1. Awk October 1, 2013 at 16:51
        
        @ReWolf
        Why not just get address of CpuReturnFromSimulatedCode using GetModuleHandle64 and GetProcAddress64 rather than using CpuSimulate to get it. Is it not available or denied in Wow64.dll?
        
        Also thanks I like your answers.
        
        Reply
        
        ReWolf October 1, 2013 at 17:21
        
        CpupReturnFromSimulatedCode function isn’t exported, we know the name of this function only from PDB symbol file.
        
        Reply
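The search for the `41 FF 2E` byte sequence mentioned in this thread boils down to a plain pattern scan. A minimal sketch over an ordinary byte buffer (`findPattern` is a name made up here, and the buffer in the usage note is fake data, not real wow64cpu.dll code):

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Encoding of "jmp fword ptr [r14]", as quoted in the thread above.
const std::vector<std::uint8_t> kJmpFwordR14 = { 0x41, 0xFF, 0x2E };

// Returns the offset of the first occurrence of `pattern` in `code`,
// or code.size() when the pattern does not occur.
std::size_t findPattern(const std::vector<std::uint8_t>& code,
                        const std::vector<std::uint8_t>& pattern)
{
    auto it = std::search(code.begin(), code.end(),
                          pattern.begin(), pattern.end());
    return static_cast<std::size_t>(it - code.begin());
}
```

Scanning the fake buffer 90 41 FF 2E CC returns offset 1, so the code placed after the jump would start at offset 1 + 3 = 4. On a real system the scan would have to run over the bytes of CpuSimulate, and a three-byte pattern can of course match by accident.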
Awk October 1, 2013 at 21:06
Hi,

Sorry for the reply, but it seems CpuSimulate does not exist in Wow64cpu.dll. I am using Windows 7 x64, so I do have the correct OS.

Anyway this code I tried to use (Compiled under x64 – Release Mode – VS2010 Ultimate):
#include <windows.h>
#include <iostream>
using namespace std;

int main()
{
	LPVOID addr = (LPVOID)GetProcAddress(GetModuleHandleA("Wow64cpu.dll"), "CpuSimulate");
	cout << addr;
	cin.get();
}
Reply
1. ReWolf October 1, 2013 at 21:38
  
  If you compile this code as x64 executable, then there is no wow64cpu.dll in memory, because it is loaded only for x86 executables on x64 OS, so GetModuleHandle will fail. If you compile it as x86 it will also fail, because wow64cpu.dll is 64-bit DLL and it is not accessible through standard 32-bit GetModuleHandle/GetProcAddress. That’s why you need to use wow64ext library, it gives you access to 64bit NTDLL and WOW64 dlls from x86 application. So, to make it working:
  1) Target platform in Visual Studio set to x86
  2) Add wow64ext library to the project (http://code.google.com/p/rewolf-wow64ext/)
  3) Use GetModuleHandle64 and GetProcAddress64 instead of standard GetModuleHandle/GetProcAddress
  
  Reply
  1. Awk October 2, 2013 at 00:23
    
    @ReWolf
    Hmmm…Thanks about that. Although I did your method yet It fails:
    
    [edited, removed a lot of code :)]
    
    Thanks looking forward for a reply.
    
    Reply
    1. ReWolf October 2, 2013 at 18:37
      
      I’ve sent you an e-mail, because code that you’ve pasted got somehow messed up by wordpress.
      
      Reply
Oleg October 9, 2013 at 16:58

Hi, what about Windows 8?

Reply
1. ReWolf October 9, 2013 at 17:07
  
  I still haven’t tested it on Win8, but I’d some reports that it doesn’t work because of some changes. Can’t confirm it though.
  
  Reply
Oleg October 9, 2013 at 18:42

Thanks for reply.
Ok, can i use X64Call for call DWORD64 function address?

Reply
1. ReWolf October 9, 2013 at 19:34
  
  You’ll have to modify ‘func’ parameter to DWORD64 and it should probably work, but as I said, can’t confirm it without access to x64 version of Win8.
  
  Reply
  1. Oleg October 9, 2013 at 19:46
    
    @ReWolf
    I tried – the address is trimmed for some reason
    Address: for example – 0xFFFFFFFFFFFFFF
    Error: Access violation at address 0xFFFFFFFF
    
    Reply
    1. ReWolf October 9, 2013 at 20:10
      
      Unfortunately I’m not sure what could have happened ;/
      
      Reply
Oleg October 9, 2013 at 19:41

And i have another questions:
1) What is the magic macro X64_Start_with_CS – how does it work? If you can in detail.
2) How it associated with call FS:[0C0h] instruction? Can it work without this call?

Reply
1. ReWolf October 9, 2013 at 19:58
  
  ad 1: it just changes value of CS (code segment) register to 0x33. On x64 Windows, this segment (0x33) is marked as x64 segment, thus we can execute x64 code after switch. Immediately after RETF (far return) instruction, CS is changed from 0x23 to 0x33.
  
  ad 2: It works without call fs:[0xC0]. I’m referring to this fs:[0xC0] only at the beginning just to show how I get this ‘CS=0x33 -> x64 mode’. It is all explained in the “x86 <-> x64 Transition” paragraph.
  
  Reply
Awk October 10, 2013 at 22:03

Hi,
Sorry again, but if I hook Wow64ServicesEx normally would the hook be placed or how to place the hook on it then? Can you please provide information, please. Can you snippet on how to just add a 0x64 jmp there to make sure that the hook can work. Please

Reply
1. ReWolf October 10, 2013 at 23:02
  
  I’m not sure if I understand correctly, You want to place hook on Wow64SystemServiceEx (inside wow64.dll) right?
  
  Reply
  1. Awk October 11, 2013 at 17:35
    
    @ReWolf
    Yes, you are correct – do have any idea on how to go on about it or can you show us a basic snippet of hook code patch.
    
    Reply
    1. ReWolf October 11, 2013 at 18:24
      
      You can hook it in the same way as x86 function. The only difference would be that your hook should switch to x86 mode, because this function is x64. So, let’s say for inline hook, you’ll need to store x64 registers (so nobody will mess with them) and execute X64_End(), then you can execute your x86 code and when you’re done, you have to switch back to x64 (X64_Start()) and restore registers.
      
      I’m not sure how wow64 layer will react if you call some x86 API from such hook (it will probably mess internal state of wow64), but for sure you can use x64 NT APIs from it.
      
      I’ll not write any ready-to-use snippet, because I’m short on time these days :)
      
      Reply
      1. Awk October 12, 2013 at 22:04
        
        @ReWolf
        Hi,
        
        Trying to hook x64 Ntdll exports such as NtOpenProcess and this is the code I made but it fails due to data misalignment or something along those lines:
        
        int main()
        {
            DWORD s = GetProcAddress64(GetModuleHandle64(L"ntdll.dll"),"NtOpenProcess");
            cout<<s;
            LPVOID sz = (LPVOID) s;
            LPVOID cake = Callback;
            HANDLE Handle = OpenProcess(PROCESS_ALL_ACCESS,false,GetCurrentProcessId());
            DWORD dwOldProtect = {0};
            VirtualProtect64(Handle,(PVOID*)s,(PULONG)5,PAGE_EXECUTE_READWRITE,&dwOldProtect);
            X64_Start();
            *(BYTE*)(sz) = 0xEB;
            *(DWORD*)((LPBYTE)sz + 1) = ((DWORD)cake - ((DWORD)sz + 5));
            X64_End();
            cin.get();
        }
        
        Thanks! Looking forward for your reply.
        
        Reply
        
        ReWolf October 15, 2013 at 19:25
        
        You’ve placed X64_Start() and X64_End() in wrong place. You should use those macros inside your Callback, so Callback should be defined as __declspec(naked), and at the beginning of Callback you need to put X64_End() to switch to x86 mode. Before call to the original function (probably at the end of Callback) you’ll need to put X64_Start().
        
        Reply
Awk October 16, 2013 at 20:14

Hi,

So did what you said exactly and I get a get a Access Violation. Code:

__declspec(naked) void Callback()
{
	X64_End();
	__asm mov eax, 0 //whNtCreateFile
	X64_Start();
}

int main()
{
	DWORD64 s = GetProcAddress64(GetModuleHandle64(L"wow64cpu.dll"),"CpuSimulate");
	cout<<s;
	LPVOID sz = (LPVOID) s;
	LPVOID cake = Callback;
	HANDLE Handle = OpenProcess(PROCESS_ALL_ACCESS,false,GetCurrentProcessId());
	DWORD dwOldProtect = {0};
	VirtualProtect64(Handle,(PVOID*)s,(PULONG)5,PAGE_EXECUTE_READWRITE,&dwOldProtect);
	*(BYTE*)(s) = 0xEB;
	*(DWORD*)(s) = ((DWORD)cake - ((DWORD)sz + 5));
	cin.get();
}

Do you know why?

Thanks, I appreciate the help.

Reply
1. ReWolf October 16, 2013 at 23:38
  
  Judging from your code, I would suggest you to practice normal x86 hooks first, there are a lot of material on this topic. Above code has many drawbacks:
  
  – 0xEB is short jump, for this kind of hook you would probably want to use long jump 0xE9.
  
  – *(DWORD*)(s) = …; You’re overwriting previous 0xEB, probably you have forgot (s+1).
  
  – Your hook overwrites original code from CpuSimulate, you should store it somewhere and execute before you’ll go back to original CpuSimulate, which means that you will need 64bit length disassembler engine.
  
  As I said before, those hooks aren’t complicated, but to use it, you need to understand and master all concepts behind x86 hooks.
  
  Reply
  1. Awk October 17, 2013 at 17:58
    
    @ReWolf
    
    Hi,
    
    Yes sorry, I do hook in x86 very well, but here I crumbled sorry here I made it better:
    
    __declspec(naked) void Callback()
    {
        X64_End();
        __asm mov eax, 0 //whNtCreateFile
        X64_Start();
    }

    int main()
    {
        DWORD64 s = GetProcAddress64(GetModuleHandle64(L"wow64cpu.dll"),"CpuSimulate");
        cout<<s;
        LPVOID sz = (LPVOID) s;
        LPVOID cake = Callback;
        HANDLE Handle = OpenProcess(PROCESS_ALL_ACCESS,false,GetCurrentProcessId());
        DWORD dwOldProtect = {0};
        VirtualProtect64(Handle,(PVOID*)s,(PULONG)5,PAGE_EXECUTE_READWRITE,&dwOldProtect);
        *(BYTE*)(s) = 0xE9;
        *(DWORD*)(s+1) = ((DWORD)cake - ((DWORD)sz + 5));
        cin.get();
    }
    
    Don’t worry about calling the real one back all I want is the EIP to hit Callback. From there forth I can carry on.
    
    Reply
    1. Awk October 18, 2013 at 18:46
      
      @Awk
      So why is the hook not taking place? Any clue
      
      Reply
      1. ReWolf October 18, 2013 at 21:39
        
        Sorry for late answer, I’ve to analyse it and I’ll give you the results soon.
        
        Reply
    2. ReWolf October 18, 2013 at 22:26
      
      Actually this code works pretty well on my Win7, I just changed VirtualProtect64 to standard VirtualProtectEx, as there is no VirtualProtect64 in my library. I’ve also changed CpuSimulate to TurboDispatchJumpAddressStart, because CpuSimulate seems to be rarely called and I just wanted to verify if the hook works.
      
      Reply
      1. Awk October 19, 2013 at 10:02
        
        @ReWolf
        Hi,
        
        No problem, man it happens at busy times, either way how did your code look?
        
        This is mine – it does not work:
        __declspec(naked) void Callback()
        {
            X64_End();
            __asm mov eax, 0 //whNtCreateFile
            X64_Start();
        }

        int main()
        {
            DWORD64 s = GetProcAddress64(GetModuleHandle64(L"wow64cpu.dll"),"TurboDispatchJumpAddressStart");
            cout<<s;
            LPVOID sz = (LPVOID) s;
            LPVOID cake = Callback;
            HANDLE Handle = OpenProcess(PROCESS_ALL_ACCESS,false,GetCurrentProcessId());
            DWORD dwOldProtect = {0};
            VirtualProtectEx(Handle,(PVOID*)s,(PULONG)5,PAGE_EXECUTE_READWRITE,&dwOldProtect);
            *(BYTE*)(s) = 0xE9;
            *(DWORD*)(s+1) = ((DWORD)cake - ((DWORD)sz + 5));
            cin.get();
        }
        
        Reply
        
        ReWolf October 19, 2013 at 10:16
        
        What exactly doesn’t work ? Have you tried to debug it ? Mine code is currently identical to your and it works.
        
        Reply
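The patch arithmetic discussed in this thread (a 0xE9 byte followed by target - (source + 5)) can be checked separately from any real hooking. A minimal sketch that only builds the five patch bytes, with 32-bit addresses as used in the thread (`makeJmpRel32` is a name invented for this example):

```cpp
#include <array>
#include <cstdint>

// Builds "jmp rel32" (opcode 0xE9) placed at `source` and landing on `target`.
// The displacement is relative to the end of the 5-byte instruction and is
// stored little-endian; unsigned arithmetic wraps modulo 2^32, which gives
// the right bit pattern for backward jumps as well.
std::array<std::uint8_t, 5> makeJmpRel32(std::uint32_t source, std::uint32_t target)
{
    const std::uint32_t rel = target - (source + 5);
    return {{ 0xE9,
              static_cast<std::uint8_t>(rel),
              static_cast<std::uint8_t>(rel >> 8),
              static_cast<std::uint8_t>(rel >> 16),
              static_cast<std::uint8_t>(rel >> 24) }};
}
```

A jump from 0x1000 to 0x2000 encodes as E9 FB 0F 00 00, and the reverse jump as E9 FB EF FF FF. One caveat when the patched code runs in 64-bit mode: the displacement is sign-extended there, so this form only reaches targets within roughly 2GB of the patched address.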
Madog October 17, 2013 at 02:21

Thanks for releasing this library! Its going to come in handy once I can fully understand whats all going on. I have a few questions if you don’t mind answering them.
1.) what is $+5
2.) why is there 4 EMIT(0) after each call to it
3.) why is there 3 EMIT(0) after the EMIT(_cs) in X64_End_with_CS?
4.) when I looked up the opcode 44, it is INC SP but in the comments it reads mov dword [rsp + 4], _cs. I don’t fully understand what is happening.

btw, where did you go to find the opcodes to the instructions? the site im using works for the most part but it seems to sketchy in some spots.

Thanks!

Reply
1. ReWolf October 17, 2013 at 17:06
  
  ad 1: $+5 means ‘current address + 5’, it is pretty standard trick used with ‘call’ to get current position of the code at runtime.
  
  ad 2: there are 4 zeroes, because 0xE8 call has relative addressing, so putting there DWORD equal to 0 means that it will call next instruction (current address + 5).
  
  ad 3: _cs parameter fits into 1 byte, so I need to pad this value to 4 bytes, as the mov opcode used at that place requires DWORD.
  
  ad 4: 0x44 decodes to ‘inc sp’ in 16-bit assembler, on 32-bits it would decode to ‘inc esp’, on 64-bits this 0x44 is REX prefix, used to manipulate various aspects of the opcode.
  
  I usually use the Intel manuals; those are pretty comprehensive books :)
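The answers to points 1 to 3 can be checked with a little arithmetic. The following is only an illustrative sketch in portable C++, not code from the article or from wow64ext; `call_target` and `encode_imm32` are hypothetical helper names. It shows that a 0xE8 call with a zero displacement targets the address right after itself, and that a one-byte value such as 0x23 occupies four little-endian bytes when the instruction expects a DWORD immediate.

```cpp
#include <array>
#include <cstdint>

// Target of a near relative call (opcode 0xE8 followed by a 32-bit displacement).
// The displacement is counted from the end of the 5-byte instruction.
inline std::uint64_t call_target(std::uint64_t call_addr, std::int32_t rel32) {
    return call_addr + 5 + static_cast<std::uint64_t>(static_cast<std::int64_t>(rel32));
}

// Little-endian encoding of a 32-bit immediate, in the order the bytes appear in memory.
inline std::array<std::uint8_t, 4> encode_imm32(std::uint32_t value) {
    return {static_cast<std::uint8_t>(value & 0xFF),
            static_cast<std::uint8_t>((value >> 8) & 0xFF),
            static_cast<std::uint8_t>((value >> 16) & 0xFF),
            static_cast<std::uint8_t>((value >> 24) & 0xFF)};
}
```

For example, `call_target(0x1000, 0)` yields 0x1005, and `encode_imm32(0x23)` yields the bytes 23 00 00 00, which is where the three EMIT(0) after EMIT(_cs) come from.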
  
  1. Madog October 17, 2013 at 18:06
    
    @ReWolf
    Thanks for the quick answers. They helped me understand the code a lot.
    Would you mind posting a link to, or uploading, the book you’re using? None of the Intel PDFs I’m finding have the opcodes in them.
    
  2. Madog October 18, 2013 at 05:21
    
    @ReWolf
    I found a PDF that has the opcodes in it, so never mind the earlier reply.
    I have two more questions. What does rt end up being at line 5? And at line 7? Why not just put 0xD in place of rt - xx?
    LOCAL xx, rt
    call $+5
    xx equ $
    mov dword ptr [rsp + 4], 23h
    add dword ptr [rsp], rt - xx
    retf
    rt:
    And what’s the reason for using opcodes and not the mnemonics?
    db 6Ah, 33h ; push 33h
    db 0E8h, 0, 0, 0, 0 ; call $+5
    db 83h, 4, 24h, 5 ; add dword ptr [esp], 5
    db 0CBh
    
    1. ReWolf October 18, 2013 at 18:07
      
      ad 1: ‘rt - xx’ is just easier, because I don’t need to know the exact size of each opcode in the macro.
      
      ad 2: I’m using opcodes, because it is x86 code at that place, and those macros are for x64 MASM. Actually, for those specific opcodes it doesn’t matter, as the encoding is the same on both x86 and x64, but I hadn’t checked it at the time this post was written.
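Madog’s guess of 0xD can be confirmed by adding up the instruction sizes between the two labels. The sketch below is illustrative only: the helper names are made up for this example, and the byte sequences are the usual encodings of the three instructions from the macro quoted above (an assumption insofar as an assembler is free to pick an equivalent encoding).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Instruction = std::vector<std::uint8_t>;

// Bytes that follow 'call $+5' in the 64-bit -> 32-bit switch stub:
//   C7 44 24 04 23 00 00 00   mov dword ptr [rsp+4], 23h  (opcode, ModRM, SIB, disp8, imm32)
//   83 04 24 0D               add dword ptr [rsp], 0Dh    (opcode, ModRM, SIB, imm8)
//   CB                        retf
inline std::vector<Instruction> switch_to_x86_tail() {
    return {
        {0xC7, 0x44, 0x24, 0x04, 0x23, 0x00, 0x00, 0x00},
        {0x83, 0x04, 0x24, 0x0D},
        {0xCB},
    };
}

// Distance from the return address pushed by 'call $+5' (label xx)
// to the first byte after the last instruction (label rt).
inline std::size_t rt_minus_xx(const std::vector<Instruction>& instructions) {
    std::size_t total = 0;
    for (const Instruction& insn : instructions) total += insn.size();
    return total;
}
```

`rt_minus_xx(switch_to_x86_tail())` gives 13 (0xD). The same sum over the last two instructions of X86_End (`db 83h, 4, 24h, 5` and `db 0CBh`) gives 5, matching the constant used there.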
      
      1. Madog October 18, 2013 at 20:25
        
        @ReWolf
        Thank you for answering my questions. I fully understand exactly what the code is doing and why.
        
Awk October 19, 2013 at 14:36

Hi,

Thanks for this, I got it in the end. The fault was on MY side: all I needed to do was get the EIP to my callback, which I did by adding an extra OpenProcess after the patch. You are right!

Nice work though, you must be a programming genius.

1. Awk October 19, 2013 at 17:39
  
  @Awk
  By the way, I am trying to get the address of whNtCreateFile located in wow64.dll, but it outputs 0. If you read this, however: http://www.ffri.jp/assets/files/research/research_papers/psj10-murakami_EN.pdf
  
  you will see it exists in user mode. Either there is an error in GetProcAddress64 or this code is wrong (I hope it is not my code lol):
  
  DWORD64 s = GetProcAddress64(GetModuleHandle64(L"wow64.dll"),"whNtCreateFile");
  
  1. ReWolf October 19, 2013 at 18:05
    
    This function exists in wow64.dll, but it is not exported, so you can’t get its address without PDB symbols.
    
    1. Awk October 19, 2013 at 23:29
      
      @ReWolf
      Thanks! I really appreciate it. I think I am getting used to this library. This lib is amazing, as it is programming natively.
      
Ahmed July 25, 2014 at 23:39

Hi wolf

How do you debug the DLL with PDB symbols? Which debugger did you use to get the name of the CpupReturnFromSimulatedCode function?

Thank you.

1. ReWolf July 26, 2014 at 09:33
  
  I’m using WinDbg x64; as far as I know, it’s the only debugger that can step through x86<->x64 transitions without any problems.
  
  1. Ahmed August 1, 2014 at 05:24
    
    @ReWolf
    You are right :(
    I wish there were another debugger like OllyDbg that is capable of switching between the two architectures.
    
    I also have a question. I have read somewhere that if we jump to 64-bit CPU mode, we can’t have any 64-bit DLL except ntdll. Is that true? Why can’t we load a new DLL normally with LoadLibrary & GetProcAddress, assuming it’s the 64-bit version?
    
    1. ReWolf August 1, 2014 at 08:26
      
      Actually, you can load an x64 DLL with LdrLoadDll/LdrGetProcedureAddress, but it must not depend on any DLLs other than NTDLL. The main problem here is Kernel32 and User32, as those libraries make some assumptions during initialization. I’ve written pretty much everything about it in one of the paragraphs of this post (look for kernel32!BaseDllInitialize). All in all, most things can be done within x86 boundaries anyway, and those that cannot are usually easy enough to implement with the Native API.
      
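Injecting a DLL into a 64-bit process from Wow64 (a 32-bit process) | 破晓的博客 January 11, 2015 at 14:16

[…] The usual way to inject a DLL into another process is to call the CreateRemoteThread API to create a remote thread in the target process and have that thread call LoadLibraryA or LoadLibraryW (collectively LoadLibrary below), so that the target process loads the specified DLL file. Creating a remote thread with CreateRemoteThread requires passing the address of a thread procedure, and that address must be valid in the target process. Because LoadLibrary is exported by kernel32.dll, processes running on the same system that are both 32-bit or both 64-bit can assume that the address of LoadLibrary is the same in each of them. In addition, the thread procedure of CreateRemoteThread and LoadLibrary take the same number of parameters, and the parameter is a pointer in both cases, so LoadLibrary is usually passed directly as the thread procedure of CreateRemoteThread. VirtualAllocEx is then used to allocate memory in the target process, WriteProcessMemory writes the DLL file path into that memory, and the address of that memory is passed as the argument of the thread procedure (LoadLibrary). On 64-bit Windows, the address of LoadLibrary in a 32-bit process differs from its address in a 64-bit process, so the simple way to inject a DLL into a 64-bit process is to do the injection from a 64-bit process. It would obviously be better, though, if a 32-bit process could inject a DLL into a 64-bit process. After some googling I found this article. Its author worked out a way to execute x64 code in a Wow64 process and packaged it into this library: https://code.google.com/p/rewolf-wow64ext/. This post describes how to use this library to inject a DLL from a 32-bit process into a 64-bit process under Wow64. […]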

InsaneGod March 29, 2015 at 22:31

Your code causes a crash on Windows 10 Technical Preview build 10041 (x64).
Source code used: http://rewolf.pl/stuff/x86tox64.zip
printf("LDR: %08X\n", (DWORD)ldr);
prints out an address that does not have any memory allocated to it (the address is not null; it is just an invalid pointer to some random address).
The “PEB structure” page on MSDN says:
https://msdn.microsoft.com/en-us/library/windows/desktop/aa813706(v=vs.85).aspx
“[This structure may be altered in future versions of Windows.]”
So I am assuming that they changed some things around?
Anyway, I tried to make my own implementation, but ldr is always NULL. Maybe you can tell me what is wrong?

[edit: snippet removed as it was quite big]

1. ReWolf April 6, 2015 at 20:37
  
  This article is quite old and I’m aware that some things don’t work even on Windows 8. However, the wow64ext library is based on this article and is actively maintained, so you’d be better off looking at the wow64ext source code; hopefully you’ll find your answers there. Link to the library:
  http://github.com/rwfpl/rewolf-wow64ext
  
2. uly November 25, 2016 at 20:06
  
  @InsaneGod
  
  No, it has nothing to do with the PEB being changed. I say that because I tested the example here with a different approach for obtaining the TEB address, and it worked fine on Windows 10. So the problem is definitely the method used for obtaining the TEB address (likely R12 no longer stores a pointer to the 64-bit ntdll in WOW64).
  
3. uly November 25, 2016 at 20:09
  
  @InsaneGod
  
  Correction: R12 no longer stores a pointer to the TEB, not ntdll.
  
  1. ReWolf November 25, 2016 at 23:08
    
    Well, I just checked wow64cpu.dll from the newest Windows 10, and gs:[0x30] is still copied to the r12 register, so it is still true that r12 holds the TEB address when the CPU is switched to x86 mode. The code inside x86tox64.zip is not really up to date; it stopped working starting with Windows 8. In case anybody needs a version that works on Windows 8+, I recommend checking the wow64ext library, which is derived from this proof of concept but contains further improvements for newer operating systems. This post is over 5 years old, btw.
    
    1. uly November 29, 2016 at 02:40
      
      @ReWolf
      
      Yeah I was aware that this post is pretty old. I bookmarked it 2-3 years ago for later reading and completely forgot about it until this weekend. :)
      
Than August 17, 2015 at 22:54

Nice article though.
Can you please help me get a sample .asm file for running x86 code inside x64? And how would you call these macros from a .cpp file?

1. ReWolf August 18, 2015 at 09:42
  
  You can’t just call those macros from a .cpp file, as the x86 code must be placed between them. There are two use cases:
  
  – You want to run some x86 code that you wrote: compile the x86 part, convert it to hex and put it in a separate .asm file as DB data between the X86_Start and X86_End macros. That’s because ml64 (MASM64) can’t compile x86 code natively.
  
  – You want to run already compiled x86 code (from some DLL): manually map the DLL into the process (of course only if it doesn’t load any x86 dependencies, so forget about those), prepare wrapper functions in a separate .asm file, use the mentioned macros and just call the desired x86 functions between those macros.
  
  As you can see, it is not a ready-to-use copy&paste example, but I’m sure that you’ll manage.
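For the first use case, the “convert to hex” step can be automated. The following sketch is one hypothetical way to do it (`to_masm_db` is an invented helper, not part of wow64ext): it turns the raw bytes of already compiled x86 code into MASM db lines that can be pasted between X86_Start and X86_End.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Formats raw machine code as MASM 'db' lines, 'per_line' bytes per line.
// Every value gets a leading 0 so that MASM never mistakes a literal
// such as CBh for an identifier.
inline std::string to_masm_db(const std::vector<std::uint8_t>& code, std::size_t per_line = 8) {
    if (per_line == 0) per_line = 1;  // avoid a division by zero below
    std::string out;
    for (std::size_t i = 0; i < code.size(); ++i) {
        char buf[8];
        std::snprintf(buf, sizeof(buf), "0%02Xh", static_cast<unsigned>(code[i]));
        if (i % per_line == 0) {
            out += (i == 0 ? "db " : "\ndb ");  // start a new db line
        } else {
            out += ", ";
        }
        out += buf;
    }
    return out;
}
```

Calling `to_masm_db({0x6A, 0x33})` produces `db 06Ah, 033h`; how the bytes are extracted from the compiled x86 object is left open here.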
  
  1. Than August 18, 2015 at 20:26
    
    @ReWolf
    
    Let me describe the scenario:
    
    #define IMPL_MERGETHUNK(T, ThunkF, IntfcF, Off)\
    __declspec(naked) inline HRESULT T::f##ThunkF()\
    {\
    __asm mov eax, [esp+4]\
    __asm mov eax, dword ptr [eax+4*Off]\
    __asm mov [esp+4], eax\
    __asm mov eax, dword ptr [eax]\
    __asm mov eax, dword ptr [eax+4*IntfcF]\
    __asm jmp eax\
    }
    
    The above inline assembly code currently compiles for x86, and I want the same functionality on x64 as well. Since the naked attribute is not supported in 64-bit builds, I thought of keeping this whole code running in x86 mode, I mean between X86_Start and X86_End in the x64 build. Does it make sense to do it like that? Would it be possible?
    
    1. ReWolf August 19, 2015 at 12:51
      
      Hmm, if I understand correctly, it doesn’t make sense. I would suggest moving the thunk part to a separate x64 assembly file, but you will probably lose inlining in that case. You can’t run this code in x86 mode, because it is supposed to redirect execution to another x64 function (am I right?). If you run it as x86, it will just not work (pointer truncation, stack problems etc.). You may also try using compiler intrinsics (_AddressOfReturnAddress), but I’m not sure if it is 100% possible in this case.
      
      1. Than August 19, 2015 at 20:26
        
        @ReWolf
        You are right, it points to another 64-bit function. The compiler intrinsic could not help us here either.
        
        Leave that aside. I have another question; if possible, please give me some tips on how to implement it.
        
        I have 2 COM interfaces that are implemented in 2 different projects (call them Project A & B). As part of the 2nd project (B), I need a 3rd interface (C) that inherits from those 2 interfaces (A & B). It is easy to inherit from the project B interface, since its implementation is in the same project. How can we inherit the project A interface here, given that its implementation lives in another project? Any idea, please?
        These interfaces are in separate IDL files.
        
        FYI, this is what they are trying to do with the above assembly code.
        
        
        ReWolf August 19, 2015 at 22:55
        
        Sorry, I’ve not much experience with COM interfaces ;/
        
Jabir June 6, 2016 at 10:39

I have learned that GetModuleHandle64 can only get the handles of four DLLs: ntdll.dll, wow64.dll, wow64cpu.dll and wow64win.dll.
Is there some way to get the handle of a user-defined 64-bit DLL from a 32-bit application?
Looking forward to your reply.
Thanks.

1. ReWolf June 6, 2016 at 10:59
  
  There are only those four 64-bit DLLs loaded in the wow64 process. If you load a custom 64-bit DLL into the wow64 process, you should be able to use GetModuleHandle64 for this DLL as well. Please let me know in what exact scenario it doesn’t work and I’ll try to figure out what the problem is.
  
  1. Jabir June 6, 2016 at 11:33
    
    @ReWolf
    I have tried loading a custom 64-bit DLL into a 32-bit process (x64 OS) using LoadLibrary, but it failed. The error code is 193, which means “not a valid Win32 application”.
    
    1. ReWolf June 6, 2016 at 12:04
      
      You can’t do it with the standard LoadLibrary. You can prepare an X64Call wrapper for the 64-bit version of LdrLoadDll and load the x64 DLL, but the DLL can’t depend on anything except NTDLL, because it’s not really possible (*) to load the 64-bit version of kernel32/user32/… inside a wow64 process.
      
      (*) Maybe it is possible, but it is not worth the effort required to do it.
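When preparing such a wrapper, the arguments have to use the 64-bit structure layouts, because the 64-bit LdrLoadDll reads the DLL name through a pointer to a 64-bit UNICODE_STRING. As a rough sketch, and assuming nothing about how wow64ext itself declares this structure, the layout can be spelled out with fixed-width fields; `UnicodeString64` and `make_unicode_string64` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// 64-bit layout of UNICODE_STRING, written with fixed-width fields so that
// the same definition can be filled in from 32-bit code.
struct UnicodeString64 {
    std::uint16_t Length;         // string size in bytes, without the terminator
    std::uint16_t MaximumLength;  // buffer size in bytes, with the terminator
    std::uint32_t Padding;        // alignment gap in front of the 8-byte pointer
    std::uint64_t Buffer;         // address of the UTF-16 characters
};
static_assert(sizeof(UnicodeString64) == 16, "must match the x64 structure size");

// Describes 'path' without copying it; 'path' has to outlive the result.
inline UnicodeString64 make_unicode_string64(const std::u16string& path) {
    assert(path.size() < 0x7FFF);  // both lengths are 16-bit byte counts
    UnicodeString64 us{};
    us.Length = static_cast<std::uint16_t>(path.size() * sizeof(char16_t));
    us.MaximumLength = static_cast<std::uint16_t>(us.Length + sizeof(char16_t));
    us.Buffer = static_cast<std::uint64_t>(reinterpret_cast<std::uintptr_t>(path.c_str()));
    return us;
}
```

The address of such a structure, widened to 64 bits, is what would be handed over as the DLL name argument through X64Call; the string storage must stay alive for the duration of the call.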
      
      1. Jabir June 7, 2016 at 08:35
        
        @ReWolf
        LdrLoadDll doesn’t seem to work on Win10. I always get an “Access Violation” error. Do you have any idea about this? Or maybe I should try other ways.
        
        
        ReWolf June 8, 2016 at 08:39
        
        Generally 64bit LdrLoadDll should work (I’ll check it later on Win10).
        
DLL Injection: Part Two – Nettitude Labs July 8, 2016 at 15:42

[…] It may be possible to work around this limitation and craft your 32bit injector code to switch into 64bit mode and then call SetWindowsHookEx, the technique is detailed in ReWolf’s blog. […]

学习 May 11, 2017 at 04:24

I want to run x86 code in x64 mode, but it doesn’t work. I don’t know why; could you help me? Here is the assembly:
X86_Start MACRO
LOCAL xx, rt
call $+5
xx equ $
mov dword ptr [rsp + 4], 23h
add dword ptr [rsp], rt - xx
retf
rt:
db 60h ;pushad
db 0B8h, 11h, 11h, 0, 0 ;mov eax,1111h
db 61h ;popad
ENDM

X86_End MACRO
db 6Ah, 33h ; push 33h
db 0E8h, 0, 0, 0, 0 ; call $+5
db 83h, 4, 24h, 5 ; add dword ptr [esp], 5
db 0CBh ; retf
ENDM
.CODE

Test PROC
MOV EAX, 1234h ; returns 1234
X86_Start
X86_End
RET
Int_3 ENDP

When I call Test(), the process crashes.

Michael July 3, 2017 at 13:09

Very nice article. Similar to @Jabir, I tried to load an x64 DLL into an x86 application using rewolf-wow64ext. I used snowie2000’s version (a pull request on your implementation), which provides a LoadLibraryW64 function. Unfortunately, LoadLibraryW64 does not find my library.

What would be required to load an x64 DLL to an x86 application and call my “void_test()” function? Can you maybe provide some sample code/application?

extern "C" { void _declspec(dllexport) void_test() { std::cout << "awesome" << std::endl; } }

1. ReWolf July 20, 2017 at 23:38
  
  No idea about the version you are using, but my guess is that your x64 library has dependencies other than just NTDLL. In that case you will not be able to load it, because it is not so trivial to have both x86 and x64 win32 subsystems loaded and initialized (at least it is not possible with wow64ext library).
  
  1. Michael August 15, 2017 at 15:38
    
    @ReWolf
    Ah okay, so wow64ext does not provide a framework for loading ANY x64 library in x86. Do you think that would be possible?
    
    1. ReWolf September 11, 2017 at 21:16
      
      “Nothing is impossible”, but I guess wow64ext is not the right tool for the job, and there is still the question of whether it is worth the time.
      
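[Tech Sharing] All About DLL Injection – 安百科技 August 24, 2017 at 03:35

[…] Needless to say, to be on the safe side it is best to always use injectAllTheThings_32.exe to inject into 32-bit processes and AllTheThings_64.exe to inject into 64-bit processes. Of course, you can also use injectAllTheThings_64.exe to inject into 32-bit processes. I haven’t actually implemented this, but I may try again later; you can try tinkering with 64-bit processes from WoW64. Metasploit’s smart_migrate is basically this case; see here for details. […]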

Deep Hooks: Monitoring native execution in WoW64 applications - Part 1 September 19, 2018 at 18:42

[…] (32-bit processes running on top of a 64-bit Windows platform). As documented by numerous other sources, WoW64 processes contain two versions of NTDLL. The first is a dedicated 32-bit version, which […]

MB October 7, 2018 at 00:35

DWORD64 NtOpenProcess64 = GetProcAddress64(GetModuleHandle64(L"ntdll.dll"), "NtOpenProcess") + 3;
DWORD64 NtOpenProcess64_Return = NtOpenProcess64 + 5;

__declspec(naked) void NtOpenProcess64_Hook()
{
    X64_End();
    __asm
    { // Original opcode
        mov eax, 0x00000026
        jmp dword ptr [NtOpenProcess64_Return]
    }
    X64_Start();
}

DWORD dwOldProtect = { 0 };
VirtualProtectEx64(GetCurrentProcess(), NtOpenProcess64, 5, PAGE_EXECUTE_READWRITE, &dwOldProtect);
*(BYTE*)(NtOpenProcess64) = 0xE9;
*(DWORD*)(NtOpenProcess64 + 1) = (int)(((int)NtOpenProcess64_Hook - (int)NtOpenProcess64) - 5);

NtOpenProcess mov r10,rcx
NtOpenProcess+3 mov eax,00000026 <-- Hook here

Do you happen to know what’s wrong with this code? The hook is never applied.
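One thing that can be checked in a hook like this (a possible cause, not a confirmed diagnosis) is whether a 5-byte jmp rel32 can reach the target at all. NtOpenProcess64 is a DWORD64, so casting it to a 32-bit pointer or to int keeps only its low 32 bits, and the 0xE9 form only works when source and destination are less than 2 GB apart. The sketch below uses a hypothetical helper, `jmp_rel32`, to show the range check.

```cpp
#include <cstdint>

// Computes the displacement for a 5-byte 'jmp rel32' (opcode 0xE9) placed at
// 'from' so that it lands on 'to'. Returns false when the distance does not
// fit into a signed 32-bit value, i.e. when this patch form cannot be used.
inline bool jmp_rel32(std::uint64_t from, std::uint64_t to, std::int32_t& rel32) {
    // The displacement is counted from the end of the 5-byte instruction.
    const std::int64_t delta = static_cast<std::int64_t>(to - (from + 5));
    if (delta < INT32_MIN || delta > INT32_MAX) return false;
    rel32 = static_cast<std::int32_t>(delta);
    return true;
}
```

With a source below 4 GB, such as 0x77000003, and a target at 0x00401000 the displacement fits. With a source such as 0x00007FFA12340003 (an invented address high in the 64-bit address space) the function reports that a rel32 jump cannot reach 32-bit code, and a different patch form would be needed.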

MartinEr March 11, 2021 at 20:39

X64_Start() is still working on Win10 Pro 20H2 19042.804; no more callgates, as I used on Win7.
