Program analysis with gdb

The gdb debugger allows us to observe and analyse program execution step-by-step. We illustrate, through examples, the main functionalities of gdb.

A simple example

Consider again our simple C program.

#include <stdio.h>
int main()
{
	int i;
	for (i=0; i<10; i++)
		printf("%d ",i);
	printf("\n");
}

If we compile it with the -g option, the executable includes debugging information that lets gdb relate the machine code to the source code.

testbed@testbed:~/lab1$ gcc -g hello.c -o hello

Listing and disassembling

We can now run gdb and issue the list command to show the source code (available thanks to the -g option given to gcc):

r1x@testbed ~/gdb $ gdb -q hello                                                                         
Reading symbols from hello...done.
(gdb) list
1       #include <stdio.h>
2       int main()
3       {
4               int i;
5               for (i=0; i<10; i++)
6                       printf("%d ",i);
7               printf("\n");
8       }
(gdb) 

We now want to disassemble the program. Remember we are using the Intel syntax for assembly. In gdb we can set Intel syntax as follows:

(gdb) set disassembly-flavor intel

NOTE: This can be made the default syntax by appending the command to the .gdbinit file in the home directory, so that we don't have to issue set disassembly-flavor intel every time:

echo "set disassembly-flavor intel" >> ~/.gdbinit

By issuing disassemble main we can now show the assembly code of function main.

(gdb) disassemble main
Dump of assembler code for function main:
   0x0804844a <+0>:     lea    ecx,[esp+0x4]
   0x0804844e <+4>:     and    esp,0xfffffff0
   0x08048451 <+7>:     push   DWORD PTR [ecx-0x4]
   0x08048454 <+10>:    push   ebp
   0x08048455 <+11>:    mov    ebp,esp
   0x08048457 <+13>:    push   ecx
   0x08048458 <+14>:    sub    esp,0x14
   0x0804845b <+17>:    mov    DWORD PTR [ebp-0xc],0x0
   0x08048462 <+24>:    jmp    0x804847b <main+49>
   0x08048464 <+26>:    sub    esp,0x8
   0x08048467 <+29>:    push   DWORD PTR [ebp-0xc]
   0x0804846a <+32>:    push   0x8048530
   0x0804846f <+37>:    call   0x8048310 <printf@plt>
   0x08048474 <+42>:    add    esp,0x10
   0x08048477 <+45>:    add    DWORD PTR [ebp-0xc],0x1
   0x0804847b <+49>:    cmp    DWORD PTR [ebp-0xc],0x9
   0x0804847f <+53>:    jle    0x8048464 <main+26>
   0x08048481 <+55>:    sub    esp,0xc
   0x08048484 <+58>:    push   0xa
   0x08048486 <+60>:    call   0x8048340 <putchar@plt>
   0x0804848b <+65>:    add    esp,0x10
   0x0804848e <+68>:    mov    ecx,DWORD PTR [ebp-0x4]
   0x08048491 <+71>:    leave  
   0x08048492 <+72>:    lea    esp,[ecx-0x4]
   0x08048495 <+75>:    ret    
End of assembler dump.

Breakpoints

gdb lets us define breakpoints: points in the program where execution stops so that we can inspect the state. For example, break main places a breakpoint at the beginning of the code of function main.

(gdb) break main
Breakpoint 1 at 0x804845b: file hello.c, line 5.

We note that the breakpoint is set at instruction main+17 (address 0x804845b). The preceding instructions form the function prologue, which sets up the stack frame (they can vary depending on the compiler).

NOTE: To set a breakpoint at a specific address you need to use the syntax break *addr. For example break *0x804845b.

We can now run the program by issuing run:

(gdb) run
Starting program: /home/r1x/gdb/hello 
Breakpoint 1, main () at hello.c:5
5               for (i=0; i<10; i++)

Once we are in the middle of a program execution we can inspect the state of processor and memory as described below.

Inspecting registers

Registers can be inspected by issuing info registers

(gdb) info registers
eax            0x1      1
ecx            0xbffffa00       -1073743360
edx            0xbffffa24       -1073743324
ebx            0xb7fcf000       -1208160256
esp            0xbffff9d0       0xbffff9d0
ebp            0xbffff9e8       0xbffff9e8
esi            0x0      0
edi            0x8048350        134513488
eip            0x804845b        0x804845b <main+17>
eflags         0x282    [ SF IF ]
cs             0x73     115
ss             0x7b     123
ds             0x7b     123
es             0x7b     123
fs             0x0      0
gs             0x33     51

We have illustrated registers in the previous class.

We observe that the instruction pointer eip contains value 0x804845b which is, in fact, where we placed the breakpoint. So the execution has stopped just before the first instruction of our C program, as expected.

NOTE: We can inspect a single register by putting its name after info register. For example:

(gdb) info register eip
eip            0x804845b        0x804845b <main+17>

Examining memory

We can examine memory locations using the examine command (abbreviated as x). The command accepts a format, separated from the command name by /. For example, in x/x the second x stands for hexadecimal and prints the content of the location in hexadecimal format.

Here is a list of possible formats:

  • o, octal
  • x, hexadecimal
  • u, decimal (unsigned)
  • t, binary
  • i, instruction
  • c, character
  • s, string

In the assembly of our program there is an explicit reference to a memory address:

   0x0804846a <+32>:    push   0x8048530

We inspect the content of that address using the different formats:

(gdb) x/o 0x8048530
0x8048530:      010062045
(gdb) x/x 0x8048530                                                                                      
0x8048530:      0x00206425
(gdb) x/u 0x8048530
0x8048530:      2122789
(gdb) x/t 0x8048530
0x8048530:      00000000001000000110010000100101
(gdb) x/i 0x8048530
   0x8048530:   and    eax,0x1002064
(gdb) x/c 0x8048530
0x8048530:      37 '%'
(gdb) x/s 0x8048530
0x8048530:      "%d "

Notice that the last format (string) returns "%d ", which is in fact the format string of the printf! The push instruction places the address of this string on the stack before the call to printf. The o, x, u and t formats show the same 4 bytes as a single word in different bases, the i format decodes the bytes at that address as if they were an instruction, and c shows only the first byte. For example, the hexadecimal 0x00206425 is composed of 0x00 (string termination), 0x20 (space), 0x64 (character d) and 0x25 (character %). Read left to right, this is the string in reverse order: the terminator followed by " d%". This is due to the endianness of x86 processors: words are stored in little-endian byte order (least-significant byte first), so the first character of the string ends up in the least-significant byte of the word.

Before the format we can specify the number of elements to show. For example x/1i shows one instruction while x/4i shows four instructions. We exemplify below. Notice that we can refer to the content of a register using $name_of_register:

(gdb) x/1i $eip
=> 0x804845b <main+17>: mov    DWORD PTR [ebp-0xc],0x0
(gdb) x/4i $eip
=> 0x804845b <main+17>: mov    DWORD PTR [ebp-0xc],0x0
   0x8048462 <main+24>: jmp    0x804847b <main+49>
   0x8048464 <main+26>: sub    esp,0x8
   0x8048467 <main+29>: push   DWORD PTR [ebp-0xc]

Finally, after the format we can optionally specify the unit size: b (byte), h (halfword, 2 bytes), w (word, 4 bytes), g (giant word, 8 bytes). In the example below we print 8 bytes in hexadecimal format starting from the address held in register eip:

(gdb) x/8xb $eip
0x804845b <main+17>:    0xc7    0x45    0xf4    0x00    0x00    0x00    0x00    0xeb

Step-by-step execution

The current instruction stores the value 0x0 in the word pointed to by ebp-0xc, which we deduce to be the position on the stack of the integer variable i.

(gdb) x/1i $eip
=> 0x804845b <main+17>: mov    DWORD PTR [ebp-0xc],0x0

We examine the value at ebp-0xc, execute one instruction with the nexti command, and then check the value again. Notice the $ before the register name:

(gdb) x/xw $ebp-0xc
0xbffff9dc:     0x00000001
(gdb) nexti
0x08048462      5               for (i=0; i<10; i++)
(gdb) x/xw $ebp-0xc
0xbffff9dc:     0x00000000

We can see that the value of i has been set to 0, which corresponds to the initialization i=0 in the C code.
Let us see what the current instruction is:

(gdb) x/1i $eip                                                                                          
=> 0x8048462 <main+24>: jmp    0x804847b <main+49>

The for loop is implemented by first jumping to address 0x804847b, where the loop condition is checked. We perform the jump and inspect the next two instructions:

(gdb) nexti
0x0804847b      5               for (i=0; i<10; i++)
(gdb) x/2i $eip
=> 0x804847b <main+49>: cmp    DWORD PTR [ebp-0xc],0x9
   0x804847f <main+53>: jle    0x8048464 <main+26>

The above instructions compare i (stored at ebp-0xc) with the value 9 and, if i is less than or equal to 9, the program jumps to <main+26>. Since i is 0 the program jumps. After two more nexti commands we are in the following situation:

(gdb) nexti
0x0804847f      5               for (i=0; i<10; i++)
(gdb) nexti
6                       printf("%d ",i);
(gdb) x/6i $eip                                                                                          
=> 0x8048464 <main+26>: sub    esp,0x8
   0x8048467 <main+29>: push   DWORD PTR [ebp-0xc]
   0x804846a <main+32>: push   0x8048530
   0x804846f <main+37>: call   0x8048310 <printf@plt>
   0x8048474 <main+42>: add    esp,0x10
   0x8048477 <main+45>: add    DWORD PTR [ebp-0xc],0x1

The first four instructions perform the call to printf: they adjust the stack, push i and the address of "%d " (0x8048530) on the top of the stack, and then call the function. The fifth instruction removes the arguments from the stack and the sixth increments i.
We can execute all six instructions by issuing nexti 6 (nexti steps over the call, counting it as a single instruction). This brings the program back to the comparison code, but now the value of i is 1:

(gdb) nexti 6
0x0804847b      5               for (i=0; i<10; i++)
(gdb) x/2i $eip
=> 0x804847b <main+49>: cmp    DWORD PTR [ebp-0xc],0x9
   0x804847f <main+53>: jle    0x8048464 <main+26>
(gdb) x/1xw $ebp-0xc
0xbffff9dc:     0x00000001

After eight more instructions (the two of the comparison plus the six of the loop body) i is 2, and so on:

(gdb) nexti 8
0x0804847b      5               for (i=0; i<10; i++)
(gdb) x/2i $eip                                                                                          
=> 0x804847b <main+49>: cmp    DWORD PTR [ebp-0xc],0x9
   0x804847f <main+53>: jle    0x8048464 <main+26>
(gdb) x/1xw $ebp-0xc                                                                                     
0xbffff9dc:     0x00000002
(gdb) nexti 8
0x0804847b      5               for (i=0; i<10; i++)
(gdb) x/1xw $ebp-0xc                                                                                     
0xbffff9dc:     0x00000003
(gdb) nexti 8                                                                                            
0x0804847b      5               for (i=0; i<10; i++)
(gdb) x/1xw $ebp-0xc
0xbffff9dc:     0x00000004

By issuing continue (abbreviated as c) we can complete execution:

(gdb) c
Continuing.
0 1 2 3 4 5 6 7 8 9 
[Inferior 1 (process 12123) exited with code 012]

Program analysis

gdb can be used to analyse executable programs. We illustrate a simple analysis of a program that checks a password. The program, unfortunately, does not protect the password using cryptographic hashes (as should always be done) making it possible to recover it.

The executable program is available here and can also be found as /opt/shared/gdb/password on testbed. After you have watched the demo, try the attack yourself!

Exercise

Try to find Darth Vader's email password. The client is available on testbed at /opt/shared/gdb/vadermail.

Resources