// Andy Friesen --

Ghetto Closures in C++ II: __thiscall

2009-02-09 00:00:00 +0000

Today, we’re going to extend our little closure library to support the __thiscall calling convention.

After some poking around, I have discovered that some guy has figured this out already. Rad! I’m going to go over it quickly anyway so that I can build on it for tomorrow.

First, the setup/demo part changes a bit:

struct I {
    virtual void print() = 0;
};

struct C : I {
    C(int x)
        : x(x)
    { }

    void print() {
        printf("My x is %in", x);
    }

    int x;
};

typedef void (__stdcall *Function0)();

It turns out that __thiscall and __stdcall are not all that different. All you need to do is stuff your this pointer in ECX, and you’re set. Our satanic little blob of assembly changes to:

__asm {
    mov ecx, my_this        // B9 xx xx xx xx
    mov eax, real_func      // B8 yy yy yy yy
    jmp eax                 // FF E0
}

As before, we just have to create a block of memory, plug the code and pointers into it, and execute it:

// YEEHAW
*((I**)(code + 1)) = instance;
*((void**)(code + 6)) = really_reinterpret_cast<void*>(&I::print);

void* buffer = VirtualAlloc(0, sizeof(code), MEM_COMMIT, PAGE_EXECUTE_READWRITE);
memcpy(buffer, &code, sizeof(code));
FlushInstructionCache(GetCurrentProcess(), buffer, sizeof(code));

Function0 f0 = reinterpret_cast<Function0>(buffer);

really_reinterpret_cast is a hack because we are doing Very Bad Things. The C++ standard says that you cannot convert pointer-to-methods to any other kind of pointer (not even void*!). I was unable to dig up the exact reasoning, but I think it is because the C++ implementation is allowed to make pointer-to-members have any format it wants. They don’t even have to be the same size as a normal pointer.

But that’s boring. :D I doubt Microsoft will change how this works any time soon now that existing code depends on it, and there is potential here to do something that is very useful.

It turns out that the unspecified implementation that they chose was for a pointer-to-member-function to either point to the code directly (like a stdcall function), or to point to a thunk that will go to the right place, if the method is virtual. In other words, we can just jump to it and we will be jumping to the right place.

Here’s my evil, standard-subverting cast:

template <typename D, typename S>
D really_reinterpret_cast(S s) {
    char __static_assert_that_types_have_same_size[sizeof(S) == sizeof(D)];

    union {
        S s;
        D d;
    } u;

    u.s = s;
    return u.d;
}

That weird looking char array is an ad-hoc compile-time assertion. A C array of length 0 is illegal, so if sizeof(S) != sizeof(D), then the compile will fail.

And, as before, the sweet thrill of victory:

Function0 f0 = reinterpret_cast<Function0>(buffer);
f0();

woo.