0x20


A blog about mobile, maker, and embedded development.


__fastcall

I was doing a little x86 assembly the other day on the train to California when I came up with this neat little optimization. I was looking at a sort of "hello world" example that implements a "power" function in inline assembly, which looked like this:

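int __stdcall power( int num, int power)
{
   __asm
   {
      mov eax, num;   EAX = num
      mov ecx, power; ECX = power (shl takes its count from CL)
      shl eax, cl;    EAX = num * (2 to the power of CL), left in EAX as the result
   }
}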

Nothing revolutionary there. (Strictly speaking, shl multiplies num by 2 to the power of "power" rather than raising num to a power, but the name will do.) The one "exotic" feature is that there is no return statement; the result is simply left in the EAX register, which is where the caller looks for the return value. What I then started to wonder was: what would this same function look like using the __fastcall calling convention? To start with, the signature would look like this:

int __fastcall power( int num,   //ECX REGISTER
                      int power )//EDX REGISTER
{

Now, since shl takes its shift count from CL, the low byte of ECX, we can better write our function signature like this:

int __fastcall power( int power,  //ECX REGISTER
                      int num    )//EDX REGISTER
{

So now when we enter the __asm block we already have the operands in the proper 32-bit registers (more on this in a second) and are ready to shl:

   __asm
   {
      shl edx, cl; EDX = EDX * (2 to the power of CL)

Now we just move the result into the EAX register and we are done, in two assembly instructions (minus the function overhead, of course). Now, as I hinted at before, there is one thing to be careful of: __fastcall only guarantees that the first two DWORD-or-smaller arguments in the argument list are passed in ECX and EDX (on a 32-bit CPU; I need to update this for 64-bit). So to ensure that ECX and EDX hold what we expect, we should write the function like this:

int __fastcall power( DWORD power,  //ECX REGISTER
                      DWORD num    )//EDX REGISTER
{
   __asm
   {
      shl edx, cl; EDX = EDX * (2 to the power of CL)
      mov eax, edx; EAX holds the return value
   }
}
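To sanity-check the result on a compiler that has no MSVC-style inline assembly, a portable C model of what the two instructions compute could look something like the sketch below. The names shift_power and shift_power_selfcheck are hypothetical, uint32_t stands in for DWORD, and the mask with 31 is there because 32-bit shl only uses the low 5 bits of CL (it also keeps the C shift well defined for large counts).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference model, not the inline-assembly version:
   returns value * 2^count, truncated to 32 bits.
   The count is masked to 5 bits to mirror how 32-bit shl treats CL. */
uint32_t shift_power(uint32_t count, uint32_t value)
{
    return value << (count & 31u);
}

/* Small self-check with hand-computed values. */
void shift_power_selfcheck(void)
{
    assert(shift_power(4u, 3u) == 48u);  /* 3 * 2^4 */
    assert(shift_power(0u, 7u) == 7u);   /* shifting by zero changes nothing */
    assert(shift_power(33u, 1u) == 2u);  /* count 33 is masked down to 1 */
}
```

Comparing the __fastcall version against a model like this for a handful of inputs is a cheap way to confirm that the arguments really did arrive in ECX and EDX in the order the signature assumes.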

Now, Microsoft has stated at times that __asm blocks should not be used inside __fastcall functions, but this is only because __fastcall passes arguments in the general-purpose registers ECX and EDX, so the __asm block could confuse the compiler if it modifies these registers before the compiler has used the first two arguments. However, since we are programming against the __fastcall specification, I see no reason why the above example could not be considered safe.