Sunday, July 16, 2017

From fuzzing Apache httpd server to CVE-2017-7668 and a $1500 bounty

Intro

In the previous post I thoroughly described how to fuzz Apache's httpd server with American Fuzzy Lop. After writing that post, to my surprise, I got a few crashing test cases. I say "to my surprise" because anybody who had managed to get some good test cases could have done it before me and yet, despite that, I was the first to report the vulnerability. So here's the blog post about it!

Goal

After seeing Apache httpd crash under AFL, several problems arise: the crashing tests don't crash outside of the fuzzer, the stability of the fuzzed program goes way down, and so on. In this blog post we will try to explain these happenings while showing how to reach the bug and, finally, we will shed some light on the crash itself.

Takeaways for the reader

  • Test cases scraped from Wikipedia
  • Bash-fu Taos
  • Valgrind for triage
  • Valgrind + gdb: Learn to not always trust Valgrind
  • rr

The test cases

Since this was just a test for myself on fuzzing network-based programs with AFL, I did not bother too much with getting complex test cases or ones with a lot of coverage.
So, in order to get a few test cases that would cover a fair amount of a vanilla installation of Apache's httpd server, I decided to look for an easy way to scrape all the headers from the List of headers - Wiki page.

Bash-fu Tao 1: Butterfly knife cut

The first thing I did was simply copy-paste the two tables under Request Fields into a text file with my editor of choice. It is important that your editor doesn't replace tabs with spaces or the cut command will lose all its power. I called my file "wiki-http-headers" and, after populating it, we can select the third column of the tables as follows. Remember that the default delimiter for cut is the TAB character:

cat wiki-http-headers | cut -f3 | grep ":" | sed "s#Example....##g" | sort -u

We can see that some headers are gone, such as the TSV header. I ignored those and went on to fuzzing, since coverage was not my concern - the real goal was to fuzz. Maybe you can find new 0-days with the missing headers! Why not? ;)
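To see what that pipeline actually does, here is a toy run against a single fabricated, tab-separated table row (the columns and header value are made up for illustration):

```shell
# A fabricated three-column row, tab-separated, as copied from the wiki table.
# cut -f3 keeps the third column; grep keeps only lines that look like headers.
printf 'Accept\tPermanent\tAccept: text/html\n' | cut -f3 | grep ":"
# prints: Accept: text/html
```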

Bash-fu Tao 2: Chain punching with "for"

Now that we have learned our first Tao, it is time to iterate over each header and create a test case per line. Avid bash users will already know how to do this, but for newcomers and learners, here's how:

a=0 && IFS=$'\n' && for header in $(cat wiki-http-headers | cut -f3 | grep ":" | sort -u); do echo -e "GET / HTTP/1.0\r\n$header\r\n\r\n" > "testcase$a.req";a=$(($a+1)); done && unset IFS

Let me explain that abomination quickly. There is a thing called the Internal Field Separator (IFS), an environment variable holding the tokens that delimit fields in bash. By default, the IFS in bash considers the space, the tab and the newline. Those separators would interfere with headers containing spaces, because the for loop in bash iterates over a given list of fields (fields being separated by the IFS) - this is why we need to set the IFS to just the newline. Now we are ready to iterate and echo each header to a different file (the a variable helps dump each header to a file with a different name).
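To make the IFS behaviour concrete, here is a minimal sketch (with made-up header values) of why restricting the separator to the newline matters:

```shell
# Two fabricated headers, one per line; note the spaces inside each value.
headers='Accept: text/html
User-Agent: fuzzer'

# With the default IFS, "for" would also split on the spaces inside each
# header; restricting IFS to the newline keeps each header in one piece.
IFS=$'\n'
for header in $headers; do
  echo "[$header]"
done
unset IFS
# prints:
# [Accept: text/html]
# [User-Agent: fuzzer]
```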

Bash-fu Tao Video

Here is one way to approach the full bash-fu Taos:

The fuzzing

Now that we have gathered a fair amount of (rather basic) test cases, we can start our fuzzing sessions. This section is fairly short, as everything about fuzzing Apache httpd is explained in the previous post. However, the minimal steps are:
  1. Download apr, apr-util, nghttp2, pcre-8 and Apache httpd 2.4.25
  2. Install the following dependencies:
    1. sudo apt install pkg-config
    2. sudo apt install libssl-dev
  3. Patch Apache httpd
  4. Compile with the appropriate flags and installation path (PREFIX environment variable)
Now everything should be set up to start fuzzing Apache httpd. As you can see in the following video, with slightly improved test cases the crash doesn't take long to show up:
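As a rough illustration of step 4, the configure/compile stage can look like the following when building httpd with AFL's compiler wrapper. The prefix path and options below are assumptions for the sketch, not the verbatim commands; the exact walk-through is in the previous post:

```shell
# Illustrative only: build httpd with AFL instrumentation.
# Assumes apr/apr-util were unpacked into srclib/ so that
# --with-included-apr picks them up, and that afl-clang-fast is on PATH.
CC=afl-clang-fast ./configure \
    --prefix=/usr/local/apache-afl-persist \
    --with-included-apr
make -j"$(nproc)"
make install
```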

It is worth mentioning that I cheated a bit for this demo, as I introduced a test case I already knew would make it crash "soon". I obtained the crashing test case through a combination of honggfuzz, radamsa and AFL while keeping an eye on the stability and on the "variable behaviour" folder of AFL.

The crashing

Disappointment

First things first. When you have a crashing test case, it is mandatory to check whether it is a false positive, right? Let's try it:
Euh... it doesn't crash Apache outside of the fuzzer. What could be happening?

Troubleshooting

There are a few things to test against here...

- First of all we are fuzzing in persistent mode:
This means that maybe our test case did make the program crash, but that it was one of many. In our case, the __AFL_LOOP variable was set to over 9000 (a bit too much, to be honest). For those who don't know, that variable sets the number of fuzzing iterations AFL will run before restarting the whole process. So, in the worst case, reproducing the crash AFL discovered would mean replaying all 8999 non-crashing inputs and then the crashing one, i.e. test case number 9000.

- The second thing to take into account is the stability that AFL reports:
Stability keeps going lower and lower. Usually (if you have read the AFL readme you can skip this part) low stability can be due either to the use of random values (or of date functions, hint hint) in your code, or to the usage of uninitialised memory. This is key to our bug.

- The third and last (and least, in our case) would be the memory assigned to the fuzzed process:
In this case memory is unlimited, as we are running AFL with "-m none", but in other cases it can be an indicator of overflows (stack or heap based) and accesses to unallocated memory.

Narrowing down the 9000

To test our first assumption we need more crashing cases. To do so we just run AFL with our "crashing" test case as the only input. It takes no time to find new paths/crashes, which helps us narrow down our over-9000 inputs to a much lower value.

Now, onto our second assumption...

Relationship goals: Stability

While fuzzing, we could see stability going down as AFL was getting deeper and deeper into crashing test cases - so we can tell there is some kind of correlation between the crashes and memory. To test whether we are actually using uninitialised memory we can use a very handy tool called...

Valgrind

Valgrind is a set of instrumentation tools for doing dynamic analysis of your programs. By default, it runs "memcheck", a tool that inspects memory management.
To install Valgrind on my Debian 8 I just needed to install it straight from the repositories:
sudo apt install valgrind
After doing that we need to run Apache server under Valgrind with:
NO_FUZZ=1 valgrind -- /usr/local/apache-afl-persist/bin/httpd -X
The NO_FUZZ environment variable is read by the code in the patch to prevent the fuzzing loop from kicking in. After this, we need to launch one of our "crashing" test cases against the Apache server running under Valgrind and, hopefully, our second assumption about usage of uninitialised memory will be confirmed:

We can confirm that, yes, Apache httpd is making use of uninitialised values but, still... I wasn't happy that Apache wouldn't crash, so let's use our Bash-fu Tao 2 to iterate over each test case and launch it against Apache.

Good, good, it's crashing now! We can proceed to some basic triage.

The triage

Let's do a quick analysis and see which (spoiler) header is the guilty one...

gdb + valgrind

One cool feature of Valgrind is that it lets you analyse the state of the program at the moment an error occurs. We can do this through the --vgdb-error flag: --vgdb-error=1 tells Valgrind to stop execution on the first reported error and wait for a debugger to attach (--vgdb-error=0 stops before execution even begins). This is perfect for our case, since we seem to be accessing uninitialised values and reading outside of a buffer (an out-of-bounds read), which is not a segfault but is still an error under Valgrind.
To use this feature, first we need to run in one shell:
NO_FUZZ=1 valgrind --vgdb-error=0 -- /usr/local/apache_afl_blogpost/bin/httpd -X
Then, in a second separate shell, we send our input that triggers the bug:
cat crashing-testcase.req | nc localhost 8080
Finally, in a third shell, we run gdb and attach through valgrind's command:
target remote | /usr/lib/valgrind/../../bin/vgdb
We are now inspecting what is happening inside Apache at the exact point of the error:

Figure 1 - Inspecting on first valgrind reported error.

As you can see, the first reported error is on line 1693. Our intuition points at the variable s, as it is being increased without any "proper" checks apart from the *s test, which will be true unless it points to a null byte. Since s is optimised out at compile time, we need to dive into the backtrace by going up one level and inspecting the conn variable, which is the one s points into. It is left as an exercise for the reader why the backtrace shown by pwndbg differs from the one shown by the "bt" command.
For the next figures, keep in mind the 2 highlighted values on Figure 1: 0x6e2990c and 8749.


Here is where the number from Figure 1, 8749, makes sense for our analysis: we can see that the variable conn is allocated with 8192 bytes at 0x6e2990c. We can tell that something is wrong, as offset 8749 lies well past the allocated 8192 bytes.


This is how we calculated the previous 8749 bytes. We stepped into the next error reported by Valgrind by issuing the gdb "continue" command and letting it error out. There was an invalid read at 0x6e2bb39, and the initial pointer to the conn variable was at 0x6e2990c. Remember that s is optimised out, so we cannot get the real pointer from s at debugging time and need to do some math. That said, we get the offset with:
invalid_read_offset = valgrind_error_pointer - conn
which is:
8749 = 0x6e2bb39 - 0x6e2990c
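That subtraction can be sanity-checked straight from the shell:

```shell
# Offset of the invalid read relative to the start of the conn buffer.
printf '%d\n' $((0x6e2bb39 - 0x6e2990c))
# prints: 8749
```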

rr - Record & Replay Framework

During triage, one can run into several happenings that hinder the debugging process: Apache will stop out of nowhere (I haven't managed to find out why), Valgrind will make it crash in places it is not supposed to because it adds its own function wrappers, the heap will be laid out differently under Valgrind than in plain gdb or vanilla runs, etc.
Here is where the Record & Replay Framework (rr) comes in handy: deterministic replaying of the program's state. You can even replay the execution backwards which, in our case, is totally awesome! I must say I discovered this tool thanks to a good friend and colleague of mine, Symeon Paraschoudis, who introduced this marvellous piece of software to me.
Let's cause the segmentation fault while recording with rr and replay the execution:
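The rr workflow boils down to roughly these commands. This is a sketch based on rr's documented interface, not a transcript of the original session; the httpd path matches the one used earlier in this post:

```shell
# Record one run of httpd (single-process mode, -X) up to the crash...
rr record /usr/local/apache-afl-persist/bin/httpd -X
# ...then replay that exact execution inside gdb, as many times as needed.
rr replay
# Inside the replay session, gdb's reverse commands become available, e.g.:
#   (rr) continue          # run forward to the segfault
#   (rr) reverse-continue  # run backwards from it
```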

Full analysis is not provided as it is outside of the scope of this post.

Conclusions

We have learned how to use bash to effectively scrape stuff from the web as test cases, and that, even though hundreds of people might be fuzzing a certain piece of software, we can still add value by using the right combination of tools, mutations and knowledge.
We have also discovered tools along the way that will aid further triage.

Stay tuned for the search of animal 0day! Cross-posts from the SensePost blog upcoming with challenges on heap-exploitation!

Post-Scriptum

I am willing to donate the $1500 bounty I received from the Internet Bug Bounty to an organisation related to kids' schooling and, especially, to those teaching and providing means regarding Information Technologies. Knowledge is Power! So tune in and leave your suggestions in the comment section below; I have thought of ComputerAid, any opinions on this?

10 comments:

  1. Great post! The Rural Technology Fund might fit what you're looking for in terms of donating the bug bounty. http://ruraltechfund.org/

  2. Great article! Thanks Javier, I have a question for you: why did you use persistent mode? I mean, in the first post you used Robert's technique to unshare the resources (including the network interface and port) so you could fuzz from within the process (which I assume could be faster than fuzzing with persistent mode). According to my understanding of the AFL author's blog https://lcamtuf.blogspot.com/2015/06/new-in-afl-persistent-mode.html?m=1, persistent mode can make things faster by improving on process forking, and according to his blog it suits only some cases... I am not sure whether persistent mode would be the best choice for fuzzing Apache, please let me know what you think. Thanks!

    Replies
    1. I don't quite get your question but I will try answering it.
      The persistent mode is definitely the way to go for long lived programs and heavy load programs - fuzzing Apache without persistence would run about 5-10 executions per second.
      The downside of persistent fuzzing, as mentioned in the previous post, is the lack of stability when the process makes use of some functions like date, random, etc.
      So yes, I only use persistent mode for fuzzing Apache (and whenever possible for other software). The "-F" option was just a Proof of Concept, the simplest starting point for understanding the patch from Robert.

    2. Thank you Javier, that does answer my question :)

  3. This comment has been removed by the author.

  4. This comment has been removed by the author.

  5. Hello Javier, is it a bug if it crashes the target when compiled with clang (Illegal instruction), while it doesn't crash when compiled with gcc? Thanks.

    Replies
    1. Does it crash inside or outside of Valgrind? If it crashes within Valgrind, it might be due to Valgrind's implementation and should be reported to Valgrind with the SIGILL report. On the other hand, if it is a SIGILL when running a vanilla compilation of Apache's httpd, it might be worth reporting to Apache Security with the backtrace and as much information as possible.
