18 August 2009

Obtaining the name of the actual procedure (2)

Does a compiled GFA-BASIC 32 application contain debug information? I got interested and disassembled a compiled (EXE) GFA-BASIC 32 application.

I immediately noticed why Avira anti-virus complained about compiled GFA-BASIC 32 programs in the past. The start-up code of a GFA-BASIC 32 EXE is quite different from that of ‘normal’ C/C++, VB, and Pascal applications. In fact, there is no such start-up code! The compiled program immediately calls the exported DLL function ‘GfaWin23_5()’ from the GfaWin23.Ocx runtime. The DLL function calculates the start address of the main program part and executes it. The entire program runs from inside the GfaWin23_5() function, which contains all start-up and program-exit instructions.

My main concern was ‘how is the stack frame of a compiled procedure initialized?’ For more on stack frames, see http://gfabasic32.blogspot.com/2009/08/obtaining-name-of-actual-procedure.html.

It turns out that the runtime DLL contains two additional INITPROC() functions; I called them INITPROC_EXE() because the call to INITPROC() in the debug version has been replaced by a call to an EXE-specific initialization procedure. It still creates a stack frame, but a slightly smaller one because it lacks debug information. In addition, the EXE doesn’t contain any symbol information whatsoever. There is no way an EXE can return information about symbols: procedure and variable names and their locations.

15 August 2009

Obtaining the name of the actual procedure

Obtaining information about a currently executing procedure is only possible when the program is running in the IDE. One of the main differences between the debug version and the stand-alone version (EXE) is the storage of debug information. Sounds logical, doesn't it?
To compile a program, the compiler collects all identifiers (procedure names, global and local variables, labels, etc.). It allocates memory locations for the variables and compiles the source code to object code (machine code instructions). Each procedure gets its own portion of memory, and the starting addresses are stored in a table. After compilation, the procedure and function calls are patched so that they call the correct object code addresses. Once the compiler finishes, the connection between an identifier's name and its location is broken. A call to a subroutine in the source code is replaced by a call to a memory address. A reference to a variable is replaced by a reference to a memory location. The compiler emits machine language instructions that use hardcoded memory locations. There is no room for the original names of the identifiers, be they subroutine names or variable names.
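The idea can be illustrated with a toy model in Python. This is only a sketch of the general principle, not of the GFA-BASIC 32 compiler itself: the function compile_program, the made-up addresses, and the keep_debug_info flag are all hypothetical names invented for the example.

```python
# Toy model only: this is NOT how the GFA-BASIC 32 compiler is implemented.
# All names and addresses here are invented for illustration.

def compile_program(procedures, keep_debug_info):
    """procedures maps a procedure name to the list of names it calls."""
    symbol_table = {}      # name -> address, known only while compiling
    address = 0x1000       # arbitrary, made-up start address
    for name in procedures:
        symbol_table[name] = address
        address += 0x100   # pretend every procedure occupies 256 bytes

    # Fix-up pass: every call by name becomes a call to a hardcoded address.
    object_code = {}
    for name, callees in procedures.items():
        object_code[symbol_table[name]] = [("call", symbol_table[c]) for c in callees]

    # An address -> name table survives only when debug info is requested.
    debug_table = None
    if keep_debug_info:
        debug_table = {addr: name for name, addr in symbol_table.items()}
    return object_code, debug_table


source = {"main": ["Draw", "Save"], "Draw": [], "Save": ["Draw"]}

exe_code, exe_debug = compile_program(source, keep_debug_info=False)
ide_code, ide_debug = compile_program(source, keep_debug_info=True)

print(exe_code[0x1000])    # only addresses remain in the object code
print(exe_debug)           # no table: names cannot be recovered
print(ide_debug[0x1100])   # the debug build can map an address back to a name
```

The object code is identical in both cases; the only difference is whether the table that maps addresses back to names is kept.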
 
However, when a program is compiled inside the IDE (F5), the compiler inserts debug information. How exactly the compiler handles this process I don't know. I do know, however, that each (non-Naked) subroutine calls the INITPROC() library routine inside GfaWin23.Ocx. This library function creates a stack frame for the subroutine. Each function has a stack frame, and most compilers create them in a similar way. Usually you'll find these instructions to create a stack frame:
 
push ebp
mov ebp, esp
 
The INITPROC() function installs a much larger stack frame. The first 96 bytes are used to store an extended exception record to make sure local variables are cleared properly in case of a problem. In GFA-BASIC 32 the use of structured exception handling (SEH) is similar to that in C/C++. The SEH technique allows more information to be stored than C/C++ actually stores. GFA-BASIC 32 cleverly uses this freedom to store additional debug information. One of the members of the stack frame contains a 'pointer' to the procedure information that is stored in a table by the compiler. The compiler doesn't delete this (hash) table after it has finished compiling. The GFA-BASIC 32 debug commands get their information from this stack frame member, and essentially they use one function to process it: CallTree(). CallTree figures out the stack frame of the current subroutine, and it is able to figure out the extended stack frames of the subroutines that called it. (It uses the SEH list to do this.)
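The walk over the chained frames can be pictured with a small Python model. This is a conceptual sketch only: the classes ProcInfo and Frame, their fields, and the function call_tree are hypothetical and do not reflect the real layout of the 96-byte record or the actual implementation of CallTree() in the runtime.

```python
from dataclasses import dataclass
from typing import Optional

# Conceptual model only; the field names and layout are assumptions.

@dataclass
class ProcInfo:
    name: str                       # entry in the table the compiler keeps

@dataclass
class Frame:
    proc_info: Optional[ProcInfo]   # None models a frame without debug info
    prev: Optional["Frame"]         # next-older record in the chain (the caller)

def call_tree(current):
    """Collect procedure names from the current frame back to the oldest one."""
    names = []
    frame = current
    while frame is not None:
        if frame.proc_info is not None:
            names.append(frame.proc_info.name)
        frame = frame.prev
    return names

# main called LoadFile, which called ParseLine (all names invented).
main_frame = Frame(ProcInfo("main"), None)
load_frame = Frame(ProcInfo("LoadFile"), main_frame)
parse_frame = Frame(ProcInfo("ParseLine"), load_frame)

print(call_tree(parse_frame))
```

In this model the currently executing procedure is always the first element of the result. A frame whose proc_info is None contributes nothing, which mirrors the situation where no debug information is stored.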
 
The first entry of the string that CallTree returns is the procedure that is currently executing. The name of the actual procedure is obtained by invoking:
 
Print CallTree(1)
 
but it only works in the IDE.

14 August 2009

After a good holiday

I hope you had a good holiday; we did. We rented a house on the shore of a fjord in Norway, got ourselves a small boat, and went out fishing every day. We definitely want to get a small “hytte” over there.

During the holidays I got some mail at gfabasic32@gmail.com. Some messages were personal opinions about GFA-BASIC in general and some were about GFA-BASIC 32 specifically. I mostly reply to these messages as soon as possible, but sometimes I save them for further examination. In general everybody gets a reply. If you haven’t received one yet, please resend your message.

A new season, a new start. Hope to post more blog entries this season.