The term program exploitation refers to techniques that allow us to “make a program do something unexpected and not planned”. To make use of these techniques it is essential to understand how programs are compiled and executed, and to use tools for controlled execution (debuggers).
NOTE: All the proposed exercises can be tried on a fairly standard 32-bit Linux system. We have also created an account on a ‘testbed’ host where you can perform all the proposed exercises. More information can be found on the home page of the course.
The Assembly language
Executable programs are written in machine code, which varies depending on the computer architecture. The Assembly language makes such low-level code more readable. In this Lab we will focus on x86 Assembly, whose reference manuals can be found here.
Consider this simple C program:
#include <stdio.h>

int main() {
    int i;
    for (i=0; i<10; i++)
        printf("%d ", i);
    printf("\n");
}
You can copy and paste it into a text editor and save it as hello.c. Then, compile and execute it from a terminal as follows:
r1x@testbed ~ $ gcc hello.c -o hello
r1x@testbed ~ $ ./hello
0 1 2 3 4 5 6 7 8 9
As expected, the program prints integers from 0 to 9 separated by a space.
We can now dump its Assembly code using the following command:
r1x@testbed ~ $ objdump -M intel -D hello | grep -A20 main
804844a: 8d 4c 24 04 lea ecx,[esp+0x4]
804844e: 83 e4 f0 and esp,0xfffffff0
8048451: ff 71 fc push DWORD PTR [ecx-0x4]
8048454: 55 push ebp
8048455: 89 e5 mov ebp,esp
8048457: 51 push ecx
8048458: 83 ec 14 sub esp,0x14
804845b: c7 45 f4 00 00 00 00 mov DWORD PTR [ebp-0xc],0x0
8048462: eb 17 jmp 804847b <main+0x31>
8048464: 83 ec 08 sub esp,0x8
8048467: ff 75 f4 push DWORD PTR [ebp-0xc]
804846a: 68 30 85 04 08 push 0x8048530
804846f: e8 9c fe ff ff call 8048310 <printf@plt>
8048474: 83 c4 10 add esp,0x10
8048477: 83 45 f4 01 add DWORD PTR [ebp-0xc],0x1
804847b: 83 7d f4 09 cmp DWORD PTR [ebp-0xc],0x9
804847f: 7e e3 jle 8048464 <main+0x1a>
8048481: 83 ec 0c sub esp,0xc
8048484: 6a 0a push 0xa
8048486: e8 b5 fe ff ff call 8048340 <putchar@plt>
804848b: 83 c4 10 add esp,0x10
804848e: 8b 4d fc mov ecx,DWORD PTR [ebp-0x4]
8048491: c9 leave
8048492: 8d 61 fc lea esp,[ecx-0x4]
8048495: c3 ret
- Option -M intel shows the Assembly in the Intel syntax instead of the AT&T one (we will use this syntax to be consistent with the textbook “Hacking, the art of exploitation”). In the Intel syntax, an assembly instruction for data manipulation has the following general form:
  command <dst>, <src>
  For example, a mov whose source operand is 0x8048520 (an address, in fact) copies that value into the destination register.
- Option -D stands for Disassemble.
- Since we want to inspect the code of main, we search for it and print the 20 lines that follow. The grep command filters what is given as input and prints only the lines that contain the specified string. Option -A20 also prints the 20 lines after each line containing the string main.
Assembly directly refers to the processor registers. In this code we can see the general-purpose register ecx and the two registers esp and ebp, the stack pointer and the base pointer, which delimit the current stack frame. Recall that variables declared inside functions are stored on the stack, at a position relative to one of these two registers, in this case ebp. DWORD PTR indicates that the address enclosed in square brackets refers to a 32-bit value. For example:
mov DWORD PTR [ebp-0xc],0x0
stores the 32-bit representation of 0x0 at address ebp-0xc, i.e., the stack location 0xc bytes below the address held in ebp (the stack grows toward lower addresses). This is in fact the initialisation of variable i to value 0 in the C source code.
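To make the addressing concrete, the following C sketch models a stack frame as a plain byte array and performs the same kind of 32-bit store at an offset from a base pointer. The names frame_store32 and frame_load32 and the array-based frame are illustrative assumptions, not part of the lab; a real frame lives on the process stack and is addressed through ebp directly.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Write a 32-bit value at base + offset, in the spirit of
   `mov DWORD PTR [ebp+offset], value`. Here `base` plays the role of ebp;
   a negative offset addresses lower memory, where locals are stored. */
void frame_store32(uint8_t *base, int offset, uint32_t value)
{
    assert(base != NULL);
    memcpy(base + offset, &value, sizeof value);   /* exactly 4 bytes */
}

/* Read back the 32-bit value stored at base + offset. */
uint32_t frame_load32(const uint8_t *base, int offset)
{
    uint32_t value;
    assert(base != NULL);
    memcpy(&value, base + offset, sizeof value);
    return value;
}
```

With a pointer ebp placed inside a sufficiently large array, frame_store32(ebp, -0xc, 0x0) mirrors the initialisation of i, and storing frame_load32(ebp, -0xc) + 1 back at the same offset mirrors the add instruction that increments it. Only the four bytes at ebp-0xc through ebp-0x9 are touched.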
Exercise: Try to understand the above Assembly code pointing out the interesting instructions (such as variable initializations, tests, function calls, …).
Modifying the executable program
To the left of the assembly instructions we can see the hex bytes corresponding to each instruction. Those bytes are the actual binary code that is executed by the processor. We can use a hex editor to modify the binary code directly. In the following video we show how to make the above program print only even numbers without recompiling it: we look for the specific bytes corresponding to the increment and change the increment from 0x1 to 0x2. After you have seen the demo you can try even more sophisticated changes yourself, like the one proposed in the exercise.
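The same byte-level edit can also be sketched in C instead of a hex editor. The sketch below searches a buffer for the four bytes 83 45 f4 01 (the add DWORD PTR [ebp-0xc],0x1 from the listing) and overwrites the immediate operand with 0x2. The function names find_bytes and patch_increment are hypothetical, and the code works on an in-memory copy of the file contents; reading and writing the executable is left out. It patches the first match only, so on a real binary one should first check that the pattern occurs exactly once.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the offset of the first occurrence of pat in buf, or -1 if absent. */
long find_bytes(const unsigned char *buf, size_t len,
                const unsigned char *pat, size_t plen)
{
    assert(buf != NULL && pat != NULL);
    if (plen == 0 || plen > len)
        return -1;
    for (size_t i = 0; i + plen <= len; i++)
        if (memcmp(buf + i, pat, plen) == 0)
            return (long)i;
    return -1;
}

/* Turn `add DWORD PTR [ebp-0xc],0x1` into `add DWORD PTR [ebp-0xc],0x2`.
   Returns 1 if the instruction was found and patched, 0 otherwise. */
int patch_increment(unsigned char *buf, size_t len)
{
    static const unsigned char add_one[] = { 0x83, 0x45, 0xf4, 0x01 };
    long off = find_bytes(buf, len, add_one, sizeof add_one);
    if (off < 0)
        return 0;
    buf[off + 3] = 0x02;   /* the last byte is the 8-bit immediate operand */
    return 1;
}
```

The instruction keeps its length, so no other byte of the file moves and all jump targets stay valid. This is why in-place patches of this kind are limited to replacements of the same size.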
Exercise: Try to change the original executable code so that it loops while i is strictly less than 9 rather than less than or equal to 9. To achieve this you have to change the actual opcode of the assembly instruction (from jle to jl). See for example here for a quick reference to x86 opcodes.
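As a hint on the encoding, a short jump such as 7e e3 in the listing consists of two bytes: the first selects the kind of jump (and, for a conditional jump, its condition), the second is a signed 8-bit displacement counted from the address of the next instruction. Changing the condition therefore leaves the second byte untouched. The C sketch below recomputes the jump targets printed by objdump; short_jump_target is a hypothetical helper introduced here for illustration.

```c
#include <stdint.h>

/* Target of a 2-byte short jump located at addr whose second byte is rel8.
   The displacement is signed and relative to the next instruction (addr + 2). */
uint32_t short_jump_target(uint32_t addr, uint8_t rel8)
{
    /* Sign-extend the 8-bit displacement without relying on
       implementation-defined conversions. */
    int32_t disp = rel8 < 0x80 ? (int32_t)rel8 : (int32_t)rel8 - 0x100;
    return addr + 2 + (uint32_t)disp;
}
```

For the listing above, short_jump_target(0x8048462, 0x17) gives 0x804847b for the initial jmp, and short_jump_target(0x804847f, 0xe3) gives 0x8048464 for the backward jump that closes the loop.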