CS 270 Project 3: Buffer Overflow Bugs


This assignment, produced by the authors of our textbook, involves generating a total of five attacks on two programs having different security vulnerabilities. It will help you develop a detailed understanding of x86_64 calling conventions and stack organization.

This project gives you firsthand experience with the ways attackers can exploit security vulnerabilities when programs do not safeguard themselves well enough against buffer overflows. You will learn about the runtime operation of programs and come to understand this security weakness so that you can avoid it when you write system code. We do not condone the use of this or any other form of attack to gain unauthorized access to any system resources. There are criminal statutes governing such activities.

As usual, this is an individual project.

Obtaining the files

You can obtain the files by pointing your Web browser at http://raphael.cs.uky.edu:19123/.

The server then builds your files and returns them to your browser in a file called targetk.tar, where k is the unique number of your target programs.

Save that file in a Linux directory in which you plan to do your work. Then give the command tar xvf targetk.tar. This command creates a directory called targetk/ containing several files.

You should only download one set of files. If for some reason you download multiple targets, choose one target to work on and delete the rest.

The files in the new directory are:

In the following instructions, we assume that you have copied the files to a local directory and that you are executing them in that local directory.


  1. You must do the assignment on your class virtual machine in order to get credit.
  2. Your solutions may not use attacks to circumvent the validation code in the programs. Specifically, any address you incorporate into an attack string for use by a ret instruction should be to one of the following destinations:
    1. The addresses for functions touch1, touch2, or touch3.
    2. The address of your injected code.
    3. The address of one of your gadgets from the gadget farm.
  3. You may only construct gadgets from rtarget with addresses ranging between those for functions start_farm and end_farm.

Target Programs

Both ctarget and rtarget read strings from standard input. They do so with the function getbuf() defined below:
  int getbuf() {
      char buf[BUFFER_SIZE];
      Gets(buf);
      return 1;
  }

The function Gets() is similar to the standard library function gets(): It reads a string from standard input (terminated by \n or end-of-file) and stores it (along with a null terminator) at the specified destination. In this code, you can see that the destination is an array buf having BUFFER_SIZE bytes. The number of bytes is specific to your copy of the program.

Gets(), like gets(), grabs a string from the input stream and stores it into its destination address (in this case buf). However, Gets() has no way of determining whether buf is large enough to store the whole input. It simply copies the entire input string, possibly overrunning the bounds of the storage allocated at the destination.
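
To make the overrun concrete, here is a small C simulation, offered as a sketch rather than a picture of the real program. It models a stack frame as a flat byte array in which an 8-byte buffer is followed directly by an 8-byte saved return address. The sizes, the layout, and the names SIM_BUF_SIZE, sim_frame_init, sim_gets, and sim_return_address are all assumptions made for illustration; the actual layout of the frame for getbuf has to be read from the disassembly. Unlike Gets(), the simulated copy does not append a null terminator.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SIM_BUF_SIZE   8                    /* hypothetical; not your BUFFER_SIZE */
#define SIM_FRAME_SIZE (SIM_BUF_SIZE + 8)   /* buffer + saved return address */

/* Fill the simulated frame: zeroed buffer, then the return address,
 * stored least significant byte first as on x86_64. */
void sim_frame_init(unsigned char frame[SIM_FRAME_SIZE], uint64_t ret_addr) {
    memset(frame, 0, SIM_BUF_SIZE);
    for (int i = 0; i < 8; i++)
        frame[SIM_BUF_SIZE + i] = (unsigned char)(ret_addr >> (8 * i));
}

/* Like Gets(), this copies the input without comparing its length to
 * SIM_BUF_SIZE. The copy is capped at the frame size only so that the
 * simulation itself stays well defined. */
void sim_gets(unsigned char frame[SIM_FRAME_SIZE],
              const unsigned char *input, size_t len) {
    if (len > SIM_FRAME_SIZE)
        len = SIM_FRAME_SIZE;
    memcpy(frame, input, len);
}

/* Read back the saved return address that a ret instruction would use. */
uint64_t sim_return_address(const unsigned char frame[SIM_FRAME_SIZE]) {
    uint64_t addr = 0;
    for (int i = 0; i < 8; i++)
        addr |= (uint64_t)frame[SIM_BUF_SIZE + i] << (8 * i);
    return addr;
}
```

With an input of at most SIM_BUF_SIZE bytes, sim_return_address still reports the value given to sim_frame_init. With a longer input, the bytes past the buffer replace the saved return address.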

If the string typed by the user and read by getbuf() is short enough, getbuf() returns 1, as shown by the following execution example:

    unix> ./ctarget
    Cookie: 0x76995469
    Type string: I love cs270.
    No exploit. Getbuf returned 0x1
    Normal return

Typically an error occurs if you enter a longer string:

    unix> ./ctarget
    Cookie: 0x76995469
    Type string: It is easier to love this class when you finish all assignments on time.
    Ouch!: You caused a segmentation fault!
    Better luck next time

The rtarget program has similar behavior. As the error message indicates, overrunning the buffer typically causes the program state to be corrupted, leading to a memory access error. Your task is to be more clever with the strings you feed these programs so they do more interesting things. These strings are called exploit strings.

Both programs take several different command-line parameters:

-h       Print a list of valid command-line parameters.
-q       Don't send results to the grading server.
-i FILE  Supply input from a file, rather than from standard input.

Your exploit strings typically contain byte values that do not correspond to the ASCII values for printing characters. The program hex2raw can help you generate these raw strings. It is described in the writeup for Laboratory 4.

Important points

When you have correctly solved one of the levels, your target program automatically sends a notification to the grading server, which tests your exploit string and validates it. For example:
    unix> ./hex2raw < ctarget.l2.txt | ./ctarget
    Cookie: 0x1a7dd803
    Type string:Touch2!: You called touch2(0x1a7dd803)
    Valid solution for level 2 with target ctarget
    PASS: Sent exploit string to server to be validated.

Unlike the Binary Bomb project, this project was designed by Dr. Nice. There is no penalty for making mistakes in this project. Feel free to fire away at ctarget and rtarget with any string you like.

You can work on your solution on any x86_64 Linux machine, but in order to submit your solution, you need to be running on your virtual machine. Submit your solution simply by uploading a file with your name, your LinkBlue ID, and target number to https://www.cs.uky.edu/csportal/.

Project phases

Phase  Program  Level  Method                       Function  Points
1      ctarget  1      Code injection               touch1    10
2      ctarget  2      Code injection               touch2    25
3      ctarget  3      Code injection               touch3    25
4      rtarget  2      Return-oriented programming  touch2    35
5      rtarget  3      Return-oriented programming  touch3    5

Level 1

For Phase 1, you do not inject new code. Instead, your exploit string redirects the program to execute an existing procedure.

Function getbuf is called within ctarget by a function test having the following C code:

void test() {
    int val;
    val = getbuf();
    printf("No exploit.  Getbuf returned 0x%x\n", val);
}
When getbuf executes its return statement, the program ordinarily resumes execution within function test, at the printf call that follows the call to getbuf. We want to change this behavior. Within the file ctarget, there is code for a function touch1 having the following C representation:
void touch1() {
    vlevel = 1;       /* Part of validation protocol */
    printf("Touch1!: You called touch1()\n");
}
Your task is to get ctarget to execute the code for touch1 when getbuf executes its return statement, rather than returning to test. Your exploit string may also corrupt parts of the stack not directly related to this stage, but this corruption does not cause a problem, because touch1 does not return; it causes the program to exit directly.

Some Advice

  1. Most of the information you need to devise your exploit string for this level can be determined by examining a disassembled version of ctarget. Use objdump to get this disassembled version.
  2. However, objdump does not give you the actual addresses where code ends up when the program runs. You might use gdb to start the program, then use its disassemble command to discover the actual addresses where items are stored.
  3. Even worse, Linux tries to prevent the sorts of attacks you will use by randomizing the locations of code. You need to disable this feature:
    echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
    If you reboot your VM, you need to disable this feature again.
  4. The idea is to position a byte representation of the starting address for touch1 so that the ret instruction at the end of the code for getbuf transfers control to touch1.
  5. Be careful about byte ordering.
  6. You might want to verify the output of hex2raw with a command like hex2raw < hexdata.txt | od -t x1, which prints each byte in hexadecimal.
  7. You might want to use gdb to step the program through the last few instructions of getbuf to make sure it is doing what you expect.
  8. The placement of buf within the stack frame for getbuf depends on the value of the compile-time constant BUFFER_SIZE, as well as the allocation strategy used by gcc. You need to examine the disassembled code to determine its position.
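
The byte-ordering advice above can be sketched in C. This is a minimal illustration, not a solution: the padding length and the address passed in are placeholders you must replace with values read from your own disassembly, the function names address_to_le_bytes and print_exploit_hex are invented for this example, and the output format assumes that hex2raw accepts two-digit hex values separated by whitespace, as its writeup describes.

```c
#include <stdint.h>
#include <stdio.h>

/* Store a 64-bit address as 8 bytes, least significant byte first,
 * which is the order x86_64 uses in memory. */
void address_to_le_bytes(uint64_t addr, unsigned char out[8]) {
    for (int i = 0; i < 8; i++)
        out[i] = (unsigned char)(addr >> (8 * i));
}

/* Write pad_len filler bytes followed by the 8 address bytes, all as
 * two-digit hex values separated by spaces. The filler value 00 is an
 * arbitrary choice; any value other than 0a (newline) would do, since
 * Gets() stops reading at a newline. */
void print_exploit_hex(FILE *fp, size_t pad_len, uint64_t addr) {
    unsigned char le[8];
    address_to_le_bytes(addr, le);
    for (size_t i = 0; i < pad_len; i++)
        fprintf(fp, "00 ");
    for (int i = 0; i < 8; i++)
        fprintf(fp, "%02x%s", le[i], i < 7 ? " " : "\n");
}
```

For example, print_exploit_hex(stdout, 4, 0x4017c0), with a made-up padding length and a made-up address, writes the line 00 00 00 00 c0 17 40 00 00 00 00 00.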

Level 2

Phase 2 involves injecting a small amount of code as part of your exploit string. Within ctarget there is code for a function touch2 having the following C representation:
void touch2(unsigned val) {
    vlevel = 2;       /* Part of validation protocol */
    if (val == cookie) {
        printf("Touch2!: You called touch2(0x%.8x)\n", val);
    } else {
        printf("Misfire: You called touch2(0x%.8x)\n", val);
    }
}
Your task is to get ctarget to execute the code for touch2 rather than returning to test. In this case, however, you must make it appear to touch2 as if you have passed your cookie as its parameter.

Some Advice

  1. You want to position a byte representation of the address of your injected code in such a way that the ret instruction at the end of the code for getbuf transfers control to it.
  2. gcc passes the first parameter to a function in register %rdi.
  3. Your injected code should set the %rdi register to your cookie and then use a ret instruction to transfer control to the first instruction in touch2.
  4. Do not attempt to use jmp or call instructions in your exploit code. The encodings of destination addresses for these instructions are difficult to formulate. Use ret instructions for all transfers of control, even when you are not returning from a call.
  5. Here is how to generate byte-level representations of instruction sequences:
    1. Put your instruction sequence (in assembler language) in a file with a name ending in .s, such as sequence.s.
    2. Run gcc on the file this way:
        gcc -c sequence.s
    3. The result is in sequence.o. Use objdump to see the result:
        objdump -d sequence.o > sequence.txt

Level 3

Phase 3 also involves a code-injection attack, but this time you pass a string as the parameter. Within ctarget there is code for functions hexmatch and touch3 having the following C representations:
/* Compare string to hex representation of unsigned value */
int hexmatch(unsigned val, char *sval) {
    char cbuf[110];
    /* Make position of check string unpredictable */
    char *s = cbuf + random() % 100;
    sprintf(s, "%.8x", val);
    return strncmp(sval, s, 9) == 0;
}

void touch3(char *sval) {
    vlevel = 3;       /* Part of validation protocol */
    if (hexmatch(cookie, sval)) {
        printf("Touch3!: You called touch3(\"%s\")\n", sval);
    } else {
        printf("Misfire: You called touch3(\"%s\")\n", sval);
    }
}
Your task is to get ctarget to execute the code for touch3 rather than returning to test. You must make it appear to touch3 as if you have passed a string representation of your cookie as its parameter.

Some Advice

  1. You need to include a string representation of your cookie in your exploit string. The string should consist of the eight hexadecimal digits (ordered from most to least significant) without a leading "0x."
  2. A string is represented in C as a sequence of bytes followed by a byte with value 0. Type man ascii on any Linux machine to see the byte representations of the characters you need. For example, the ASCII code for the character 0 is 0x30, and the ASCII code for a is 0x61.
  3. Your injected code should set register %rdi to the address of this string.
  4. When functions hexmatch and strncmp are called, they push data onto the stack, overwriting portions of memory that held the buffer used by getbuf. As a result, you need to be careful where you place the string representation of your cookie.
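
As an illustration of the first two points, the following C sketch turns a cookie into its string form and lists the bytes that represent it. The names cookie_to_string and print_string_bytes are invented for this example, and the code assumes a 32-bit unsigned, as on x86_64 Linux. It uses the same "%.8x" format that hexmatch uses, so the digits come out in the order hexmatch expects.

```c
#include <stdio.h>

/* Format a cookie as eight lowercase hex digits with no leading "0x".
 * out must have room for the eight digits plus the terminating 0 byte. */
void cookie_to_string(unsigned cookie, char out[9]) {
    snprintf(out, 9, "%.8x", cookie);
}

/* Print each character of s as a two-digit hex byte value, followed by
 * the 00 byte that terminates a C string. */
void print_string_bytes(FILE *fp, const char *s) {
    for (; *s != '\0'; s++)
        fprintf(fp, "%02x ", (unsigned char)*s);
    fprintf(fp, "00\n");
}
```

For the sample cookie 0x76995469 shown in the transcripts above, cookie_to_string produces "76995469", and print_string_bytes writes 37 36 39 39 35 34 36 39 00. Your own cookie gives different bytes.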

Part II: Return-Oriented Programming

Performing code-injection attacks on program rtarget is much more difficult than it is for ctarget, because it uses two techniques to thwart such attacks.
  1. It randomizes the position of the stack, so stack positions differ from one run to another. It is impossible to determine a fixed address where your injected code should be located.
  2. It marks the section of memory holding the stack as non-executable, so even if you could jump to the start of your injected code, the program would fail with a segmentation fault.

Fortunately, clever people have devised strategies for getting useful things done in a program by executing existing code, rather than injecting new code. The most general form of this technique is called return-oriented programming (ROP). The ROP strategy is to identify byte sequences within an existing program that consist of one or more instructions followed by the instruction ret. Such a segment is referred to as a gadget. The figure below illustrates how the stack can be set up to execute a sequence of n gadgets.


In this figure, the stack contains a sequence of gadget addresses. Each gadget consists of a series of instruction bytes, with the final one being 0xc3, which encodes the ret instruction. When the program executes a ret instruction starting with this configuration, it initiates a chain of gadget executions. The ret instruction at the end of each gadget causes the program to jump to the beginning of the next.
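
The chaining described above can be imitated with a toy interpreter in C. This is only a sketch under loud assumptions: the two gadget "addresses" are made-up constants rather than addresses in rtarget, the machine has just two registers, and the names sim_cpu and sim_run_chain are invented here. Each loop iteration plays the role of one ret instruction, popping the next word from the simulated stack and treating it as the address of the next gadget.

```c
#include <stddef.h>
#include <stdint.h>

/* Made-up "addresses" standing in for two gadgets. */
enum {
    SIM_POP_RAX     = 0x1000,   /* gadget: popq %rax ; ret        */
    SIM_MOV_RAX_RDI = 0x2000    /* gadget: movq %rax, %rdi ; ret  */
};

struct sim_cpu {
    uint64_t rax;
    uint64_t rdi;
};

/* Execute the chain stored in stack[0..n_words). Returns the number of
 * gadgets executed, or -1 if the chain is malformed. */
int sim_run_chain(struct sim_cpu *cpu, const uint64_t *stack, size_t n_words) {
    size_t sp = 0;      /* index of the word the stack pointer refers to */
    int executed = 0;

    while (sp < n_words) {
        uint64_t target = stack[sp++];      /* ret: pop the next address */
        switch (target) {
        case SIM_POP_RAX:
            if (sp >= n_words)
                return -1;                  /* nothing left to pop */
            cpu->rax = stack[sp++];         /* popq consumes a data word */
            break;
        case SIM_MOV_RAX_RDI:
            cpu->rdi = cpu->rax;
            break;
        default:
            return -1;                      /* not a known gadget address */
        }
        executed++;
    }
    return executed;
}
```

Running the chain { SIM_POP_RAX, 0x76995469, SIM_MOV_RAX_RDI } executes two gadgets and leaves 0x76995469 in the simulated %rdi. Note that the middle word is data consumed by the popq, not a gadget address.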

A gadget can make use of code corresponding to assembler-language statements generated by the compiler, especially ones at the ends of functions. In practice, there may be some useful gadgets of this form, but not enough to implement many important operations. For example, it is highly unlikely that a compiled function would have popq %rdi as its last instruction before ret. Fortunately, with a byte-oriented instruction set, such as in the x86_64, a gadget can often be found by extracting patterns from other parts of the instruction byte sequence. For example, one version of rtarget contains code generated for the following C function:

void setval_210(unsigned *p) {
    *p = 3347663060U;
}
The chances of this function being useful for attacking a system seem pretty slim. But the disassembled machine code for this function shows an interesting byte sequence:
0000000000400f15 <setval_210>:
  400f15:       c7 07 d4 48 89 c7       movl   $0xc78948d4,(%rdi)
  400f1b:       c3                      retq   
The byte sequence 48 89 c7 encodes the instruction movq %rax, %rdi. (See below for the encodings of useful movq instructions.) This sequence is followed by byte value c3, which encodes the ret instruction. The function starts at address 0x400f15, and the sequence starts on the fourth byte of the function. Thus, this code contains a gadget, having a starting address of 0x400f18, which copies the 64-bit value in register %rax to register %rdi.
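
The address arithmetic in this example can be checked with a short C sketch. The function name find_gadget is invented for illustration, and the search is deliberately simple: it only recognizes a pattern that is followed immediately by the ret byte 0xc3, and it treats a return value of 0 as "not found".

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Search code[0..code_len) for pattern followed immediately by ret (0xc3).
 * base is the address of code[0]. Returns the address of the first match,
 * or 0 if there is none. */
uint64_t find_gadget(const unsigned char *code, size_t code_len, uint64_t base,
                     const unsigned char *pattern, size_t pat_len) {
    /* i + pat_len < code_len keeps code[i + pat_len] inside the array. */
    for (size_t i = 0; i + pat_len < code_len; i++) {
        if (memcmp(code + i, pattern, pat_len) == 0 &&
            code[i + pat_len] == 0xc3)
            return base + i;
    }
    return 0;
}
```

Given the seven bytes of setval_210 (c7 07 d4 48 89 c7 c3), the base address 0x400f15, and the pattern 48 89 c7, the function returns 0x400f18, which matches the gadget address worked out above.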

The code for rtarget contains a number of functions similar to the setval_210 function shown above in a region we refer to as the gadget farm. Your job is to identify useful gadgets in the gadget farm and use these to perform attacks similar to those you did in Phases 2 and 3.

Important: The gadget farm is demarcated by functions start_farm and end_farm in your copy of rtarget. Do not attempt to construct gadgets from other portions of the program code.


Encodings of movq instructions

movq S, D
          Destination D
Source S  %rax      %rcx      %rdx      %rbx      %rsp      %rbp      %rsi      %rdi
%rax      48 89 c0  48 89 c1  48 89 c2  48 89 c3  48 89 c4  48 89 c5  48 89 c6  48 89 c7
%rcx      48 89 c8  48 89 c9  48 89 ca  48 89 cb  48 89 cc  48 89 cd  48 89 ce  48 89 cf
%rdx      48 89 d0  48 89 d1  48 89 d2  48 89 d3  48 89 d4  48 89 d5  48 89 d6  48 89 d7
%rbx      48 89 d8  48 89 d9  48 89 da  48 89 db  48 89 dc  48 89 dd  48 89 de  48 89 df
%rsp      48 89 e0  48 89 e1  48 89 e2  48 89 e3  48 89 e4  48 89 e5  48 89 e6  48 89 e7
%rbp      48 89 e8  48 89 e9  48 89 ea  48 89 eb  48 89 ec  48 89 ed  48 89 ee  48 89 ef
%rsi      48 89 f0  48 89 f1  48 89 f2  48 89 f3  48 89 f4  48 89 f5  48 89 f6  48 89 f7
%rdi      48 89 f8  48 89 f9  48 89 fa  48 89 fb  48 89 fc  48 89 fd  48 89 fe  48 89 ff

Encodings of popq instructions

Operation  Register R
           %rax  %rcx  %rdx  %rbx  %rsp  %rbp  %rsi  %rdi
popq R     58    59    5a    5b    5c    5d    5e    5f

Encodings of movl instructions

movl S, D
          Destination D
Source S  %eax   %ecx   %edx   %ebx   %esp   %ebp   %esi   %edi
%eax      89 c0  89 c1  89 c2  89 c3  89 c4  89 c5  89 c6  89 c7
%ecx      89 c8  89 c9  89 ca  89 cb  89 cc  89 cd  89 ce  89 cf
%edx      89 d0  89 d1  89 d2  89 d3  89 d4  89 d5  89 d6  89 d7
%ebx      89 d8  89 d9  89 da  89 db  89 dc  89 dd  89 de  89 df
%esp      89 e0  89 e1  89 e2  89 e3  89 e4  89 e5  89 e6  89 e7
%ebp      89 e8  89 e9  89 ea  89 eb  89 ec  89 ed  89 ee  89 ef
%esi      89 f0  89 f1  89 f2  89 f3  89 f4  89 f5  89 f6  89 f7
%edi      89 f8  89 f9  89 fa  89 fb  89 fc  89 fd  89 fe  89 ff

Encodings of 2-byte functional nop instructions

Operation    Register R
             %al    %cl    %dl    %bl
andb  R, R   20 c0  20 c9  20 d2  20 db
orb   R, R   08 c0  08 c9  08 d2  08 db
cmpb  R, R   38 c0  38 c9  38 d2  38 db
testb R, R   84 c0  84 c9  84 d2  84 db
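
The entries in the four tables above follow a regular pattern, which the C sketch below reproduces. The enum and function names are invented for this example. The register numbering (0 for %rax through 7 for %rdi, in table order) and the layout of the final byte as binary 11, then the source register, then the destination register, are the standard x86_64 register-to-register encoding. The sketch is meant as a way to cross-check a table entry, not as a general assembler.

```c
#include <stdint.h>

/* Register numbers, in the same order as the rows and columns of the tables. */
enum reg { RAX, RCX, RDX, RBX, RSP, RBP, RSI, RDI };

/* Final (ModRM) byte of a register-to-register form: binary 11 sss ddd. */
uint8_t modrm_reg_reg(enum reg src, enum reg dst) {
    return (uint8_t)(0xc0 | ((unsigned)src << 3) | (unsigned)dst);
}

/* movq S, D is 48 89 followed by the ModRM byte. */
void encode_movq(enum reg src, enum reg dst, uint8_t out[3]) {
    out[0] = 0x48;
    out[1] = 0x89;
    out[2] = modrm_reg_reg(src, dst);
}

/* movl S, D is the same encoding without the 48 prefix. */
void encode_movl(enum reg src, enum reg dst, uint8_t out[2]) {
    out[0] = 0x89;
    out[1] = modrm_reg_reg(src, dst);
}

/* popq R is the single byte 58 plus the register number. */
uint8_t encode_popq(enum reg r) {
    return (uint8_t)(0x58 + (unsigned)r);
}

/* Functional nops: opcode 20 (andb), 08 (orb), 38 (cmpb), or 84 (testb),
 * followed by a ModRM byte naming the same byte register twice. The table
 * lists only %al, %cl, %dl, %bl, which are register numbers 0 to 3. */
void encode_nop2(uint8_t opcode, enum reg r, uint8_t out[2]) {
    out[0] = opcode;
    out[1] = modrm_reg_reg(r, r);
}
```

For instance, encode_movq(RAX, RDI, out) yields 48 89 c7, the movq %rax, %rdi sequence found inside setval_210, and encode_popq(RDI) yields 5f.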

Level 2

For Phase 4, you repeat the attack of Phase 2, but you do so on program rtarget using gadgets from your gadget farm. You can construct your solution using gadgets consisting of the following instruction types, and using only the first eight x86_64 registers (%rax through %rdi).

Some Advice

  1. All the gadgets you need can be found in the region of the code for rtarget demarcated by the functions start_farm and mid_farm.
  2. You can do this attack with just two gadgets.
  3. When a gadget uses a popq instruction, it pops data from the stack. As a result, your exploit string contains a combination of gadget addresses and data.

Level 3

Before you take on Phase 5, pause to consider what you have accomplished so far. In Phases 2 and 3, you caused a program to execute machine code of your own design. If ctarget had been a network server, you could have injected your own code into a distant machine. In Phase 4, you circumvented two of the main devices modern programs use to thwart buffer overflow attacks. Although you did not inject your own code, you were able to inject a type of program that operates by stitching together sequences of existing code. You have also gotten 95/100 points for the project. That's a good score. If you have other pressing obligations, consider stopping right now.

Phase 5 requires you to do an ROP attack on rtarget to invoke function touch3 with a pointer to a string representation of your cookie. That may not seem significantly more difficult than using an ROP attack to invoke touch2, except that we have made it so. Moreover, Phase 5 counts for only 5 points, which is not a true measure of the effort it requires. Think of it as more of an extra-credit problem for those who want to go beyond the normal expectations for the course.

To solve Phase 5, you can use gadgets in the region of the code in rtarget demarcated by functions start_farm and end_farm. In addition to the gadgets used in Phase 4, this expanded farm includes the encodings of different movl instructions, as shown above. The byte sequences in this part of the farm also contain 2-byte instructions that serve as functional nops, that is, they do not change any register or memory values. These include instructions, shown above, such as andb %al,%al, that operate on the low-order bytes of some of the registers but do not change their values.

Some Advice

  1. Review the effect a movl instruction has on the upper 4 bytes of a register, as is described on page 183 of the text.
  2. The official solution requires eight gadgets (not all of which are unique).
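
The first point can be modeled in a few lines of C. On x86_64, an instruction that writes a 32-bit register sets the upper 32 bits of the corresponding 64-bit register to zero. The function names sim_movl and sim_movq below are invented for this sketch, which models registers as plain 64-bit integers.

```c
#include <stdint.h>

/* Model of movl S, D: the low 32 bits of the source are copied, and the
 * upper 32 bits of the destination register are set to zero. */
void sim_movl(uint64_t *dst, uint64_t src) {
    *dst = (uint64_t)(uint32_t)src;
}

/* Model of movq S, D: all 64 bits are copied. */
void sim_movq(uint64_t *dst, uint64_t src) {
    *dst = src;
}
```

Starting from a destination holding 0xffffffffffffffff, sim_movl with source 0x1122334455667788 leaves 0x0000000055667788, while sim_movq leaves the full 0x1122334455667788. A value survives a movl unchanged only if it fits in 32 bits.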
Good luck and have fun!