The gdb debugger allows us to observe and analyse program execution step by step. We illustrate the main functionalities of gdb through examples.
A simple example
Consider again our simple C program.
#include <stdio.h>
int main()
{
  int i;
  for (i=0; i<10; i++)
    printf("%d ",i);
  printf("\n");
}
If we compile it with the -g option, the executable carries debugging information about the source code that gdb can use directly:
testbed@testbed:~/lab1$ gcc -g hello.c -o hello
Listing and disassembling
We can now run gdb and issue the command list to show the source code (thanks to the -g option given to gcc):
r1x@testbed ~/gdb $ gdb -q hello
Reading symbols from hello...done.
(gdb) list
1 #include <stdio.h>
2 int main()
3 {
4 int i;
5 for (i=0; i<10; i++)
6 printf("%d ",i);
7 printf("\n");
8 }
(gdb)
We now want to disassemble the program. Remember that we are using the Intel syntax for assembly. In gdb we can select it as follows (set disassembly is an abbreviation of the full command set disassembly-flavor):
(gdb) set disassembly intel
NOTE: To make Intel syntax the default, so that we do not have to issue set disassembly intel in every session, add the command to the gdb startup file in the home directory. Note that > overwrites an existing ~/.gdbinit; use >> to append instead:
echo "set disassembly intel" > ~/.gdbinit
By issuing disassemble main we can now show the assembly code of function main:
(gdb) disassemble main
Dump of assembler code for function main:
0x0804844a <+0>: lea ecx,[esp+0x4]
0x0804844e <+4>: and esp,0xfffffff0
0x08048451 <+7>: push DWORD PTR [ecx-0x4]
0x08048454 <+10>: push ebp
0x08048455 <+11>: mov ebp,esp
0x08048457 <+13>: push ecx
0x08048458 <+14>: sub esp,0x14
0x0804845b <+17>: mov DWORD PTR [ebp-0xc],0x0
0x08048462 <+24>: jmp 0x804847b <main+49>
0x08048464 <+26>: sub esp,0x8
0x08048467 <+29>: push DWORD PTR [ebp-0xc]
0x0804846a <+32>: push 0x8048530
0x0804846f <+37>: call 0x8048310
0x08048474 <+42>: add esp,0x10
0x08048477 <+45>: add DWORD PTR [ebp-0xc],0x1
0x0804847b <+49>: cmp DWORD PTR [ebp-0xc],0x9
0x0804847f <+53>: jle 0x8048464 <main+26>
0x08048481 <+55>: sub esp,0xc
0x08048484 <+58>: push 0xa
0x08048486 <+60>: call 0x8048340
0x0804848b <+65>: add esp,0x10
0x0804848e <+68>: mov ecx,DWORD PTR [ebp-0x4]
0x08048491 <+71>: leave
0x08048492 <+72>: lea esp,[ecx-0x4]
0x08048495 <+75>: ret
End of assembler dump.
Breakpoints
gdb allows us to define breakpoints: points in the program where execution will stop so that we can inspect the state. This can be done, for example, by issuing break main, which puts a breakpoint at the beginning of the code of function main:
(gdb) break main
Breakpoint 1 at 0x804845b: file hello.c, line 5.
We note that the breakpoint is set at instruction main+17. The preceding instructions are the function prologue, which sets up the stack frame (they can vary depending on the compiler).
NOTE: To set a breakpoint at a specific address you need to use the syntax break *addr. For example, break *0x804845b.
We can now run the program by issuing run:
(gdb) run
Starting program: /home/r1x/gdb/hello
Breakpoint 1, main () at hello.c:5
5 for (i=0; i<10; i++)
Once execution is paused in the middle of the program, we can inspect the state of the processor and memory as described below.
Inspecting registers
Registers can be inspected by issuing info registers:
(gdb) info registers
eax 0x1 1
ecx 0xbffffa00 -1073743360
edx 0xbffffa24 -1073743324
ebx 0xb7fcf000 -1208160256
esp 0xbffff9d0 0xbffff9d0
ebp 0xbffff9e8 0xbffff9e8
esi 0x0 0
edi 0x8048350 134513488
eip 0x804845b 0x804845b
eflags 0x282 [ SF IF ]
cs 0x73 115
ss 0x7b 123
ds 0x7b 123
es 0x7b 123
fs 0x0 0
gs 0x33 51
We have illustrated registers in the previous class. We observe that the instruction pointer eip contains the value 0x804845b, which is exactly where we placed the breakpoint. So execution has stopped just before the first instruction of our C code, as expected.
NOTE: We can inspect a single register by putting its name after info register. For example:
(gdb) info register eip
eip 0x804845b 0x804845b
Examining memory
We can examine memory locations using the x command (short for "examine"). When we issue this command we can specify a format, separated from the command by /. For example, in x/x the second x stands for hexadecimal and prints the content of the location in hexadecimal format.
Here are some of the available formats:
- o, octal
- x, hexadecimal
- u, decimal (unsigned)
- t, binary
- i, instruction
- c, character
- s, string
In the assembly of our program there is an explicit reference to a memory address:
0x0804846a <+32>: push 0x8048530
We inspect the content of that address using the different formats:
(gdb) x/o 0x8048530
0x8048530: 010062045
(gdb) x/x 0x8048530
0x8048530: 0x00206425
(gdb) x/u 0x8048530
0x8048530: 2122789
(gdb) x/t 0x8048530
0x8048530: 00000000001000000110010000100101
(gdb) x/i 0x8048530
0x8048530: and eax,0x1002064
(gdb) x/c 0x8048530
0x8048530: 37 '%'
(gdb) x/s 0x8048530
0x8048530: "%d "
Notice that the last one (string) returns "%d ", which is in fact the format string of the printf! The push instruction places the address of this string on the stack before the call to printf. The o, x, u and t formats show the same 4 bytes as a single word in different bases, i decodes the bytes starting at that address as an instruction, and c shows only the first byte. For example, the hexadecimal word 0x00206425 is made of 0x00 (string termination), 0x20 (space), 0x64 (character d) and 0x25 (character %). Read left to right, this is the string "%d " and its terminator in reverse order. This is due to the endianness of x86 processors: words are stored in little-endian byte order (the least-significant byte first).
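The byte-order argument above can be checked with a few lines of C. This is a minimal sketch rather than anything gdb does internally: the helper name le_word is invented here, and it simply combines four bytes the way a little-endian processor reads a 32-bit word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* le_word is a hypothetical helper written for this illustration.
   It assembles four consecutive bytes into a 32-bit word as a
   little-endian CPU such as x86 does: the byte at the lowest address
   becomes the least-significant byte of the word. */
uint32_t le_word(const unsigned char *bytes)
{
    assert(bytes != NULL);
    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}
```

Calling le_word on the four bytes of the string "%d " (including its terminating zero byte) yields the word 0x00206425, the same value that x/x printed for address 0x8048530.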
Before the format we can specify the number of elements to show. For example x/1i shows one instruction while x/4i shows four instructions. We exemplify below. Notice that we can refer to the content of a register using $name_of_register:
(gdb) x/1i $eip
=> 0x804845b <main+17>: mov DWORD PTR [ebp-0xc],0x0
(gdb) x/4i $eip
=> 0x804845b <main+17>: mov DWORD PTR [ebp-0xc],0x0
0x8048462 <main+24>: jmp 0x804847b <main+49>
0x8048464 <main+26>: sub esp,0x8
0x8048467 <main+29>: push DWORD PTR [ebp-0xc]
Finally, after the format we can optionally specify the unit size: b (byte), h (halfword, 2 bytes), w (word, 4 bytes), g (giant word, 8 bytes). In the example below we print 8 bytes in hexadecimal format starting from the address held in register eip:
(gdb) x/8xb $eip
0x804845b <main+17>: 0xc7 0x45 0xf4 0x00 0x00 0x00 0x00 0xeb
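To make the byte-wise view concrete, here is a small C sketch that formats a memory region one byte at a time, in the spirit of x/8xb. The function name hex_bytes and its interface are assumptions made for this example; gdb itself is not implemented this way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* hex_bytes is a hypothetical helper for this illustration.
   It writes the n bytes starting at p into out as "0x.. 0x.. ...",
   one unit per byte as with the b size of the x command.
   Returns the number of characters written (without the final NUL),
   or -1 if out is too small. */
int hex_bytes(const unsigned char *p, size_t n, char *out, size_t out_size)
{
    size_t used = 0;

    assert(p != NULL && out != NULL);
    if (out_size == 0)
        return -1;
    out[0] = '\0';

    for (size_t k = 0; k < n; k++) {
        /* Separate units with a single space, except before the first. */
        int w = snprintf(out + used, out_size - used, "%s0x%02x",
                         k ? " " : "", (unsigned)p[k]);
        if (w < 0 || (size_t)w >= out_size - used)
            return -1;
        used += (size_t)w;
    }
    return (int)used;
}
```

Applied to the eight bytes 0xc7 0x45 0xf4 0x00 0x00 0x00 0x00 0xeb, it reproduces the line printed by gdb, without the address prefix. The first seven bytes are the encoding of the mov instruction at main+17, and 0xeb is the first byte of the following jmp.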
Step-by-step execution
The current instruction stores the value 0x0 in the word pointed to by ebp-0xc, which we deduce to be the position on the stack of the integer variable i.
(gdb) x/1i $eip
=> 0x804845b <main+17>: mov DWORD PTR [ebp-0xc],0x0
We examine the value at position ebp-0xc, execute one instruction with the command nexti, and finally check the value again. Notice the usage of $ before the register name:
(gdb) x/xw $ebp-0xc
0xbffff9dc: 0x00000001
(gdb) nexti
0x08048462 5 for (i=0; i<10; i++)
(gdb) x/xw $ebp-0xc
0xbffff9dc: 0x00000000
We can see that the value of i has been set to 0, which corresponds to the assignment i=0 in the C code.
Let us see what the current instruction is:
(gdb) x/1i $eip
=> 0x8048462 <main+24>: jmp 0x804847b <main+49>
The for loop starts by jumping to the loop test at address 0x804847b. We perform the jump and we inspect the next two instructions:
(gdb) nexti
0x0804847b 5 for (i=0; i<10; i++)
(gdb) x/2i $eip
=> 0x804847b <main+49>: cmp DWORD PTR [ebp-0xc],0x9
0x804847f <main+53>: jle 0x8048464 <main+26>
The above instructions compare i (at ebp-0xc) with the value 9 and, if i is less than or equal to 9, the program jumps back to <main+26>. Since i is 0, the jump is taken. After two more nexti we are in the following situation:
(gdb) nexti
0x0804847f 5 for (i=0; i<10; i++)
(gdb) nexti
6 printf("%d ",i);
(gdb) x/6i $eip
=> 0x8048464 <main+26>: sub esp,0x8
0x8048467 <main+29>: push DWORD PTR [ebp-0xc]
0x804846a <main+32>: push 0x8048530
0x804846f <main+37>: call 0x8048310
0x8048474 <main+42>: add esp,0x10
0x8048477 <main+45>: add DWORD PTR [ebp-0xc],0x1
The first four instructions perform the call to printf: sub esp,0x8 reserves 8 bytes of padding on the stack, the two push instructions place i and the address of "%d " (0x8048530) on top of the stack, and call invokes the function. The fifth instruction releases the 16 bytes used for the call, and the sixth increments i.
We can execute all six instructions by issuing nexti 6 (nexti steps over the call to printf rather than into it). This brings the program back to the comparison code, but now the value of i is 1:
(gdb) nexti 6
0x0804847b 5 for (i=0; i<10; i++)
(gdb) x/2i $eip
=> 0x804847b <main+49>: cmp DWORD PTR [ebp-0xc],0x9
0x804847f <main+53>: jle 0x8048464 <main+26>
(gdb) x/1xw $ebp-0xc
0xbffff9dc: 0x00000001
After eight more instructions i is 2, and so on:
(gdb) nexti 8
0x0804847b 5 for (i=0; i<10; i++)
(gdb) x/2i $eip
=> 0x804847b <main+49>: cmp DWORD PTR [ebp-0xc],0x9
0x804847f <main+53>: jle 0x8048464 <main+26>
(gdb) x/1xw $ebp-0xc
0xbffff9dc: 0x00000002
(gdb) nexti 8
0x0804847b 5 for (i=0; i<10; i++)
(gdb) x/1xw $ebp-0xc
0xbffff9dc: 0x00000003
(gdb) nexti 8
0x0804847b 5 for (i=0; i<10; i++)
(gdb) x/1xw $ebp-0xc
0xbffff9dc: 0x00000004
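The control flow traced in this section can be written back in C with explicit gotos. The following is only a sketch of the shape of the compiled loop: the function name loop_like_asm is invented, and the output is collected in a buffer instead of being printed so that the result can be inspected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* loop_like_asm is a hypothetical function for this illustration.
   It follows the same control flow as the disassembled main:
   initialise i, jump to the test, and run the body while i <= 9.
   The text "0 1 ... 9 " needs 21 bytes including the final NUL.
   Returns the final value of i. */
int loop_like_asm(char *out, size_t out_size)
{
    int i;
    size_t used = 0;

    assert(out != NULL && out_size >= 21);
    out[0] = '\0';

    i = 0;              /* mov DWORD PTR [ebp-0xc],0x0 */
    goto check;         /* jmp to the test at main+49  */
body:
    /* stands in for the two pushes and the call to printf */
    used += (size_t)snprintf(out + used, out_size - used, "%d ", i);
    i += 1;             /* add DWORD PTR [ebp-0xc],0x1 */
check:
    if (i <= 9)         /* cmp DWORD PTR [ebp-0xc],0x9 ; jle main+26 */
        goto body;
    return i;
}
```

Placing the test after the body, with an initial jump to it, means that each iteration executes a single conditional jump. This matches the eight instructions per iteration counted above: cmp, jle and the six instructions of the body.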
By issuing continue (abbreviated as c) we can complete execution:
(gdb) c
Continuing.
0 1 2 3 4 5 6 7 8 9
[Inferior 1 (process 12123) exited with code 012]
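gdb prints the exit code in octal, so 012 is decimal 10. A plausible explanation, stated here as an assumption, is that main has no return statement and the disassembly shows no instruction that sets eax before ret, so eax still holds the return value of the last call. The push 0xa before that call suggests a putchar-like call for the newline, which would return the character written, that is 10. The conversion itself can be sketched in C (the helper name exit_code_from_octal is invented for this example):

```c
#include <assert.h>
#include <stdlib.h>

/* exit_code_from_octal is a hypothetical helper for this illustration.
   It converts the octal exit code printed by gdb (e.g. "012") to an int. */
int exit_code_from_octal(const char *text)
{
    assert(text != NULL);
    return (int)strtol(text, NULL, 8);
}
```

For the text "012" the function returns 10, which is also the character code of '\n'.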
Program analysis
gdb can be used to analyse executable programs. We illustrate a simple analysis of a program that checks a password. The program, unfortunately, does not protect the password using cryptographic hashes (as should always be done), which makes it possible to recover the password.
The executable program is available here and can also be found as /opt/shared/gdb/password on testbed. After you have watched the demo, try the attack yourself!
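The source of the password program is not shown here, so the following is only a hypothetical sketch of the kind of check that is vulnerable. The function name check_password and the secret are invented for this example and have nothing to do with the actual binary.

```c
#include <assert.h>
#include <string.h>

/* HYPOTHETICAL secret, invented for this sketch. It is stored in
   plaintext, so it ends up as a readable string constant in the
   executable. */
static const char SECRET[] = "correct-horse";

/* check_password is a hypothetical function for this illustration.
   Returns 1 if attempt matches the stored plaintext, 0 otherwise. */
int check_password(const char *attempt)
{
    assert(attempt != NULL);
    return strcmp(attempt, SECRET) == 0;
}
```

In a program of this shape the address of the secret is passed to the comparison function, so stopping at a breakpoint just before the call and examining that address with x/s would reveal the password. Storing only a cryptographic hash of the password avoids keeping the plaintext in the binary.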
Exercise
Try to find Darth Vader's email password. The client is available on testbed at /opt/shared/gdb/vadermail.
Resources
- gdb home page
- gdb cheat sheet
- the official gdb cheat sheet of c00kies@venice
- peda (available on testbed by issuing source /opt/shared/peda/peda.py in gdb)