Over the past few weeks, for Google Summer of Code’s radare2 project, I have been working on the detection of variables and formal arguments that are stored on the stack, plus a little hack to add those for the fastcall convention.
My first iteration was to let people do everything by hand: pick the calling convention, arguments, and variables.
For this I basically added support for adding variables and arguments relative to the base pointer and the stack pointer. At first, things were very x86-specific, but then I extended them to be more arch-neutral, as long as the architecture has a stack pointer and a base pointer. To see how to add arguments and variables, check the afa? command!
Although everything mentioned above works perfectly, it is of little use on its own, since filling in arguments and variables should be part of the automatic analysis process. By default, the command aa fills in some of them, but it fails to recognize others, such as variables accessed only by push, pop, or arithmetic operations. For example, at address 0x0804846a, analysis with aa fails to recognize ebp + 0x14 as ebp + arg_14h.
I decided to go a bit deeper than simply analyzing the disassembly listing of functions: I used the ESIL form. ESIL is meant to be used for emulation, but since I don’t fully understand how to use it with the emulator API at the time of writing this post, I went for text matching.
It may seem hack-ish, but it proved to be efficient and to capture all the corner cases. Basically, it looks for dereferences of the base pointer and the stack pointer, then says “hey, that offset should be a variable” or “that offset should be an argument”, according to the offset relative to the base pointer: positive ones are arguments (as with ebp + 0x14 becoming arg_14h), negative ones are variables. The command responsible for doing that is afCa, but if you want to know everything about it, you should read the code, or rather check the afc? command ;)
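To make the text-matching idea concrete, here is a minimal Python sketch. It is not the actual radare2 implementation, just an illustration under a few assumptions: that an x86 base-pointer access shows up in ESIL as something like 0x14,ebp,+,[4],eax,= (a read) or eax,0x4,ebp,-,=[4] (a write), that positive offsets are arguments (as in the arg_14h example above) and negative ones are locals, and that locals get a hypothetical var_ prefix. The names classify_bp_accesses and parse_offset are made up for this example.

```python
import re

# Assumed ESIL shape for a base-pointer access:
#   "<offset>,<base pointer>,<+ or ->,..."   e.g. "0x14,ebp,+,[4],eax,="
# The leading (?:^|,) makes sure the offset is a whole ESIL token.
BP_ACCESS = re.compile(r"(?:^|,)(0x[0-9a-fA-F]+|\d+),(ebp|rbp),([+-]),")


def parse_offset(token):
    """Parse a hex (0x-prefixed) or decimal ESIL immediate."""
    return int(token, 16) if token.lower().startswith("0x") else int(token, 10)


def classify_bp_accesses(esil):
    """Return a (kind, offset, name) tuple for each base-pointer access found."""
    found = []
    for match in BP_ACCESS.finditer(esil):
        offset = parse_offset(match.group(1))
        # Positive offsets from the base pointer are treated as arguments,
        # negative ones as local variables (x86 frame layout assumption).
        kind = "arg" if match.group(3) == "+" else "var"
        found.append((kind, offset, f"{kind}_{offset:x}h"))
    return found


if __name__ == "__main__":
    samples = ("0x14,ebp,+,[4],eax,=", "eax,0x4,ebp,-,=[4]", "ebx,eax,+=")
    for esil in samples:
        print(esil, "->", classify_bp_accesses(esil))
```

Matching on the ESIL string rather than on the disassembly text means that one pattern covers reads, writes, and arithmetic on the same stack slot, since they all share the offset,register,operator prefix.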
How to use (example)
This is a little demonstration of the kind of information afCa adds to the disassembly. You are encouraged to check out the other afCx commands as well: most of them are done, but a few are still work in progress. Guessing the calling convention, for example, is not done yet.