As promised I will now attempt to reverse engineer the remaining and most
difficult instruction.. I think I understand the other instructions... but I
might have to check up on those as well... just in case ;)
Here goes the most difficult one:
From the Intel instruction reference:
"
IMUL—Signed Multiply
"
That's interesting... this is like mul but for signed integers... hmmm...
who would have guessed ? I sure wouldn't :P
The instruction I need to reverse engineer looks like this:
IMUL EDX,[EBX].RandSeed,08088405H
I wonder what [EBX].RandSeed means.
It's really strange looking to me.
I think it's probably just a memory address.
So I think r/m32 is the correct parameter type ?
Which Intel describes as:
"
r/m32 — A doubleword general-purpose register or memory operand used for
instructions whose operand-size attribute is 32 bits. The doubleword
general-purpose registers are: EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI. The
contents of memory are found at the address provided by the effective address
computation. Doubleword registers R8D - R15D are available when using REX.R
in 64-bit mode.
"
I searched for "effective address computation" in the document to see if
that means anything... but nothing found.
So I guess it's just the memory address of the variable or so.
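For what it's worth, "effective address" in the Intel manuals generally means the address a memory operand works out to: base register + index register * scale + displacement. A minimal sketch of that arithmetic in Python (chosen only because it's quick to run; the function name and the sample numbers are made up for illustration):

```python
MASK32 = 0xFFFFFFFF  # 32-bit addresses wrap around

def effective_address(base=0, index=0, scale=1, disp=0):
    """Hypothetical helper: base + index*scale + displacement, modulo 2^32."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    return (base + index * scale + disp) & MASK32

# Invented example: a base register holding 0x1000, an index of 3 scaled
# by 4 (an array of doublewords), and a displacement of 8.
print(hex(effective_address(base=0x1000, index=3, scale=4, disp=8)))

# With a zero base and no index, the address is just the displacement.
print(hex(effective_address(base=0, disp=0x00400000)))
```

So when the base register is zero and there is no index, the "computation" boils down to a plain memory address.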
Another possibility could be:
"
imm32 — An immediate doubleword value used for instructions whose
operand-size attribute is 32 bits. It allows the use of a number between
+2,147,483,647 and -2,147,483,648 inclusive.
"
But that's surely not it... because that's like passing a constant value...
so that's not it.
Another possibility could be:
"
m32 — A doubleword operand in memory, usually expressed as a variable or
array name, but pointed to by the DS:(E)SI or ES:(E)DI registers. This
nomenclature is used only with the string instructions.
"
Nah, probably not... besides, it says this is only used with the string
instructions.
Another possibility:
"
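moffs8, moffs16, moffs32, moffs64 — A simple memory variable (memory offset)
of type byte, word, or doubleword used by some variants of the MOV
instruction. The actual address is given by a simple offset relative to the
segment base. No ModR/M byte is used in the instruction. The number shown
with moffs indicates its size, which is determined by the address-size
attribute of the instruction.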
"
Could be... but I don't quite understand this one...
Maybe the other instruction forms will clarify if this is a possibility.
There are even other possibilities... so let's stop with looking at these
"parameter types".
And let's start looking at the instruction prototype which matches the
assembler code closest.
IMUL EDX,[EBX].RandSeed,08088405H
First I'll take a look at Delphi 2006's assembler help to see if it can
explain this weird notation a little bit better.
Well this text explains it a bit, though the example is probably still a
little bit confusing... maybe using pointers to variables would be a better
way to explain it... I am not sure though:
What is a bit doubtful is: ECX,[Start]
It says:
"in this case, the word at offset 10 in the data segment"
What is located at offset 10 ?
That's not really part of the example.
So apparently it's referencing an arbitrary position.
It would have been clearer if the example contained another variable...
and showed how to retrieve the address of that variable... and make ECX point
to it etc...
I think that should be possible ;)
However the real address value is of course only known after everything is
compiled... so simply pointing to offset 10 is pretty meaningless I guess.
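To make the difference between the four MOVs in the help example (quoted further down) a bit more concrete, here is a toy model in Python. It assumes memory is just a lookup table from address to value. The address given to Count and the values stored in memory are invented; only Start = 10 comes from the example:

```python
# Toy model: "memory" maps an address to the value stored there.
START = 10                   # untyped constant, like "const Start = 10"
COUNT_ADDR = 0x0040B000      # hypothetical address the linker gave Count

memory = {
    COUNT_ADDR: 1234,        # invented current value of Count
    START: 5678,             # invented: whatever happens to live at address 10
}

def mov_imm(value):
    """MOV reg, imm: the register receives the number itself."""
    return value

def mov_mem(address):
    """MOV reg, [address]: the register receives what is stored there."""
    return memory[address]

eax = mov_imm(START)         # MOV EAX,Start         -> the constant 10
ebx = mov_mem(COUNT_ADDR)    # MOV EBX,Count         -> the contents of Count
ecx = mov_mem(START)         # MOV ECX,[Start]       -> the contents of address 10
edx = mov_imm(COUNT_ADDR)    # MOV EDX,OFFSET Count  -> the address of Count

print(eax, ebx, ecx, hex(edx))
```

Seen this way, the brackets turn a number into "the thing stored at that number", and OFFSET goes the other way.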
I wonder if this is possible as well:
mov ECX, [1235234]
Using an immediate value ;) would that be possible as well ?
Well I don't really care at this moment.. ;)
Anyway let's move on...
"
Expression Classes The built-in assembler divides expressions into three
classes: registers, memory references, and immediate values. An expression
that consists solely of a register name is a register expression. Examples
of register expressions are AX, CL, DI, and ES. Used as operands, register
expressions direct the assembler to generate instructions that operate on
the CPU registers. Expressions that denote memory locations are memory
references. Delphi's labels, variables, typed constants, procedures, and
functions belong to this category. Expressions that aren't registers and
aren't associated with memory locations are immediate values. This group
includes Delphi's untyped constants and type identifiers. Immediate values
and memory references cause different code to be generated when used as
operands. For example,
const
  Start = 10;
var
  Count: Integer;
  .
  .
  .
asm
  MOV EAX,Start          { MOV EAX,xxxx }
  MOV EBX,Count          { MOV EBX,[xxxx] }
  MOV ECX,[Start]        { MOV ECX,[xxxx] }
  MOV EDX,OFFSET Count   { MOV EDX,xxxx }
end;
Because Start is an immediate value, the first MOV is assembled into a move
immediate instruction. The second MOV, however, is translated into a move
memory instruction, as Count is a memory reference. In the third MOV, the
brackets convert Start into a memory reference (in this case, the word at
offset 10 in the data segment). In the fourth MOV, the OFFSET operator
converts Count into an immediate value (the offset of Count in the data
segment). The brackets and OFFSET operator complement each other. The
following asm statement produces identical machine code to the first two
lines of the previous asm statement.
asm
  MOV EAX,OFFSET [Start]
  MOV EBX,[OFFSET Count]
end;
Memory references and immediate values are further classified as either
relocatable or absolute. Relocation is the process by which the linker
assigns absolute addresses to symbols. A relocatable expression denotes a
value that requires relocation at link time, while an absolute expression
denotes a value that requires no such relocation. Typically, expressions
that refer to labels, variables, procedures, or functions are relocatable,
since the final address of these symbols is unknown at compile time.
Expressions that operate solely on constants are absolute. The built-in
assembler allows you to carry out any operation on an absolute value, but it
restricts operations on relocatable values to addition and subtraction of
constants
"
Finally here is a short description:
"
[... ] Memory reference. The expression within brackets is evaluated
completely prior to being treated as a single expression element. Another
expression can precede the expression within the brackets; the result in
this case is the sum of the values of the two expressions, with the type of
the first expression. The result is always a memory reference.
"
I know what the dot means.. but just in case another reader doesn't
understand here is a short description as well:
"
. Structure member selector. The result is the sum of the expression before
the period and the expression after the period, with the type of the
expression after the period. Symbols belonging to the scope identified by
the expression before the period can be accessed in the expression after the
period.
"
Of course there is much more to it than these short descriptions...
I am not sure whether the assembler itself supports types or whether basm
works with types... the help does mention types in a few places.
Now back to the code...
"
IMUL EDX,[EBX].RandSeed,08088405H
"
[EBX] doesn't seem like a type or anything like that.
To me this looks like a reference. Like a pointer to something.
Since EBX is zero (the instruction just before it in System.pas is
XOR EBX,EBX), it's probably the top or bottom of some kind of data segment
or so.
Top vs bottom that's not important right now.. it's probably the start...
that's most important.
The dot is the member selection.
So that could mean that RandSeed is part of the DataSegment.
So this text "[EBX].RandSeed" could be translated into:
"DataSegment.RandSeed"
I don't really know what a DataSegment is... or if it's even a
DataSegment...
I don't know if a DataSegment is global/unique... or that each module/unit
has a DataSegment.
It doesn't really matter what it is...
What matters is that it simply points to RandSeed. I know what that is ;)
That's a variable located in the system unit declared in/after the interface
section, so that means it's a public unit variable.
I still wonder if delphi uses one global data segment, or one data segment
per unit... or maybe some other wacky number.
Anyway back to the code:
IMUL EDX,SomeWackyDataSegmentOrSomething.RandSeed,08088405H
Hmm now I notice something.
There are 2 commas... so that means there are 3 parameters ?! wow ! Didn't
know an assembler instruction could have 3 parameters... gee.
Well maybe this makes finding the correct prototype in the manual easier
let's see:
Yeah fortunately... the number of prototypes/overloads so to speak is very
limited...
My first pick is this one:
I just realized something... maybe I am taking the long tour/wrong approach
to this reverse engineering problem.
If I simply run the code and halt the program at the specific assembler
instruction I can see the opcodes... and maybe then I can find the correct
prototype in the manual much quicker... the funny thing is that with my
current code... I could not reach the random function because of a range
check error.
Of course I could have made a new application/test program which didn't
produce the range check error... or I could even have modified the parameter
on the fly and proceeded with a lower value preventing any range check
errors, thus saving me from having to create a new test program.
But first here is what I selected from the manual:
"
69 /r id    IMUL r32, r/m32, imm32    Valid    Valid
doubleword register <- r/m32 * immediate doubleword.
"
Now I'll try the other approach... running the program, modifying the value
and trying to decode the instruction opcodes, hoping to find the correct
prototype faster that way.
Well that attempt failed. For some reason I can't change the value of
Self.mTransferSize. It says invalid expression in Delphi 2006. Maybe it has
to do with TFileStream.FileSize being related to it.. or maybe it's because
I am in a TThread... strange. I thought changing values on the fly was
possible in Delphi.
Hmm strange... I'll have to look into that some more later...
Besides I don't really like this approach.. it's kinda risky to modify
variables on the fly... could be dangerous.. so let's stop with that.
Back to the manual decoding etc.
Yeah.. though another approach is the one already mentioned: simply
using a test program... that's a good/safe way...
So I'll give that way a try to see if reverse engineering while knowing the
instruction opcodes is faster...
So here is a little test program to dig deeper into the Random function,
place a breakpoint and watch the CPU window etc:
program Project1;

{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  mTransferSize : int64;
  mOffset : int64;
  I : integer;

begin
  mTransferSize := 1024*1024*10; // 10 MB ;) well within range of 32 bits.
  for I := 1 to 10 do
  begin
    mOffset := Random( mTransferSize );
    writeln( mOffset );
  end;
  writeln( 'press enter to continue' );
  readln;
end.
The first value returned was zero... so that troubled me some... so I added
a loop... and the Random function seems to be working just fine...
Maybe RandSeed should first be initialized to prevent a zero value...
which looks kinda strange... but that doesn't matter right now.. it's
working ok and that's what's most important.
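A rough sketch in Python of why the first value comes out as zero. It assumes Random multiplies RandSeed by $08088405, adds 1, stores that back, and returns the upper 32 bits of the range times the new seed. That is only my reading of the System.pas listing shown further down in the CPU window, not something confirmed from documentation, and it assumes RandSeed starts at 0. The function names are made up:

```python
MASK32 = 0xFFFFFFFF
MULTIPLIER = 0x08088405      # the immediate from the IMUL instruction

def make_random(seed=0):
    """Hypothetical model of the generator, with the seed kept in a closure."""
    state = {"seed": seed & MASK32}

    def random(range_):
        # IMUL EDX,[RandSeed],08088405H / INC EDX / MOV [RandSeed],EDX
        state["seed"] = (state["seed"] * MULTIPLIER + 1) & MASK32
        # MUL EDX / MOV EAX,EDX: keep the high 32 bits of range * seed
        return (range_ * state["seed"]) >> 32

    return random

rnd = make_random(0)                       # assumption: RandSeed starts at 0
values = [rnd(10 * 1024 * 1024) for _ in range(3)]
print(values)                              # the list starts with 0, 329042
```

Under these assumptions the first new seed is 0 * $08088405 + 1 = 1, and 10 MB * 1 is far too small to reach the upper 32 bits, hence the zero. From the second call on the seed is large and the results look random.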
Now going to place breakpoint in random function at the IMUL instruction.
"Debug with dcu's" must be on otherwise it won't stop at the breakpoint ;)
Now the moment of truth for assembler experts reading this and wondering if
they knew the correct instruction opcode.
Viewing the cpu window tells me that the instruction opcode issssss:
System.pas.3988: IMUL EDX,[EBX].RandSeed,08088405H
00402D2F 699308B040000584 imul edx,[ebx+$0040b008],$08088405
Excccccellllent DELPHI 2006...
I love these people.
It seems these people have listened to my or other people's complaints or
quality/feature requests.
In Delphi 7 at least... it wasn't possible to copy anything from the CPU
window... so I had to type everything manually... that fucked/sucked big
time...
But now with Delphi 2006 I can just lazily copy and paste the damn
instruction and everything else in that same window too. Haven't tried the
other windows yet..
Let's see if they were smart enough to make the copy command/functionality
available for the other debug windows as well.
Here goes:
00000000
Hmmmm not quite...
The window shows:
EBX 00000000
But when trying to copy & paste it it only shows:
00000000
Well that's not too shabby.
It's still not quite as good as I would like it to be.
I still cannot select all the registers...
So I would still have to copy paste them individually.. which is still
fucking bad and sucking fucked.
Let's see what happens if I tried to copy paste multiple instructions:
System.pas.3743: fild qword ptr [eax]
00402D20 DF28 fild qword ptr [eax]
System.pas.3744: fistp qword ptr [edx]
00402D22 DF3A fistp qword ptr [edx]
System.pas.3745: end;
00402D24 C3 ret
00402D25 8D4000 lea eax,[eax+$00]
System.pas.3944: Byte(s^[0]) := newLength; // should also fill new space
00402D28 8810 mov [eax],dl
System.pas.3945: end;
00402D2A C3 ret
00402D2B 90 nop
System.pas.3976: PUSH EBX
00402D2C 53 push ebx
System.pas.3987: XOR EBX, EBX
00402D2D 31DB xor ebx,ebx
System.pas.3988: IMUL EDX,[EBX].RandSeed,08088405H
00402D2F 699308B040000584 imul edx,[ebx+$0040b008],$08088405
System.pas.3989: INC EDX
00402D39 42 inc edx
System.pas.3990: MOV [EBX].RandSeed,EDX
00402D3A 899308B04000 mov [ebx+$0040b008],edx
System.pas.3992: MUL EDX
00402D40 F7E2 mul edx
System.pas.3993: MOV EAX,EDX
00402D42 89D0 mov eax,edx
System.pas.3994: POP EBX
00402D44 5B pop ebx
Ok... that's definitely possible...
Well a little bit of inconsistency there...
Also multiple memory/data locations can't be selected.
So delphi 2006 could still be improved...
It's kinda funny and amazing that "quality assurance" missed this.
(Hey mister Q&A dumbass wake up ok ? ;))
Well at least the instructions can be fully copied... so I am grateful for
that... sure am... makes writing this post a lot easier for me ;)
Now let's get back to the mission at hand...
Finding the correct prototype in the manual.
Gee... so much work...
I almost wish it was automated.
That would be a pretty cool and neat feature for a very advanced Delphi 2007
;)
Automatic look-up of assembler instructions.
Yeeeeeah neat feature...
Me gonna request it in a separate thread... after I am done with this
post...
So that I will never ever have to go through all this stuff again... looking
at the manual... finding the correct prototype... takes much much much time
man... oh man, sure does.
Automate it ! yeah that's
ggggggggggooooooooooooddddddddddddddddddddddddddddddddddd.
Anyway back to the time intensive manual labor.
Now I must go decode this opcode.
System.pas.3988: IMUL EDX,[EBX].RandSeed,08088405H
00402D2F 699308B040000584 imul edx,[ebx+$0040b008],$08088405
Let's first check if this opcode or part of it happens to be mentioned in
the manual section under IMUL.
It says 69 /r id
I think it's safe to say: "JACKPOTT" =DDDDD
Yup I selected the correct prototype.
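A rough sketch of decoding those bytes by hand, written in Python. One assumption: the CPU window shows only eight bytes (69 93 08 B0 40 00 05 84), but the next instruction starts at 00402D39, ten bytes after 00402D2F, so the last two bytes are taken to be 08 08, the rest of the little-endian immediate $08088405. The decoder only handles this one addressing form (register + 32-bit displacement) and the function name is made up:

```python
import struct

REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def decode_imul_69(code):
    """Decode 'IMUL r32, r/m32, imm32' (69 /r id) in 32-bit mode, but only
    for mod = 10 (register + disp32) without a SIB byte."""
    if code[0] != 0x69:
        raise ValueError("not opcode 69")
    modrm = code[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b10 or rm == 0b100:
        raise ValueError("addressing form not handled by this sketch")
    # Two little-endian doublewords follow: the displacement, then the immediate.
    disp, imm = struct.unpack_from("<II", code, 2)
    return REG32[reg], REG32[rm], disp, imm

# Eight bytes from the CPU window plus the two assumed trailing bytes.
code = bytes.fromhex("69 93 08B04000 05840808")
dst, base, disp, imm = decode_imul_69(code)
print(f"imul {dst},[{base}+${disp:08x}],${imm:08x}")
```

The ModR/M byte $93 is binary 10 010 011: mod = 10 (a 32-bit displacement follows), reg = 010 (EDX), r/m = 011 (EBX). That is the /r part of the prototype, and the trailing four bytes are the id part (immediate doubleword).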
Here it is one more time:
"
69 /r id    IMUL r32, r/m32, imm32    Valid    Valid
doubleword register <- r/m32 * immediate doubleword.
"
Now back to this code:
IMUL EDX,[EBX].RandSeed,08088405H
First the translation of IMUL instruction:
EDX := [EBX].RandSeed * 08088405H;
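One detail hidden in that translation: EDX is only 32 bits wide, so this form of IMUL keeps just the low 32 bits of the product. A small Python sketch of that behaviour, under the assumption that the overflow is simply discarded (the helper names are hypothetical):

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret the low 32 bits of x as a signed doubleword."""
    x &= MASK32
    return x - 0x100000000 if x & 0x80000000 else x

def imul32(a, b):
    """Signed multiply, truncated to 32 bits like the destination register."""
    return to_signed32(to_signed32(a) * to_signed32(b))

print(hex(imul32(1, 0x08088405)))     # small product, nothing is lost
print(imul32(0xFFFFFFFF, 2))          # $FFFFFFFF is -1 when read as signed

# The low 32 bits match a plain unsigned multiply for these sample values.
for a, b in [(1, 0x08088405), (0x08088406, 0x08088405), (0xFFFFFFFF, 2)]:
    assert imul32(a, b) & MASK32 == (a * b) & MASK32
```

The low 32 bits come out the same whether the operands are read as signed or unsigned, so for the seed update the signedness of IMUL should make no difference.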
Ah finally I have done it.
Hopefully I understood the rest of the instruction as well..
I will now attempt to write a pure Delphi version of this assembler stuff
and see if both versions produce the same results ;)
First I am gonna test if Delphi already has the mentioned feature by
pressing F1 on the assembler instruction in the CPU window.
When I press F1, I get a help screen which explains the CPU window stuff...
that's all kinda nice and everything but kinda inconsistent with how the
editor's F1 works... but then again the CPU window is not really an editor...
but still I will request this feature because it could be handy ;) maybe even
very handy ;) for understanding the assembler better !!! and quicker !!!
And then me make pascal version of random function.
Bye,
Skybuck.
Received on Mon May 1 02:05:27 2006