[fuzzing] Fwd: Commercial Fuzzer and Open-source Fuzzer comparison

Sergio 'shadown' Alvarez shadown at gmail.com
Tue Jul 7 19:52:51 UTC 2009


[I forgot to add the mailing list; sometimes pressing reply is not
the right thing to do :P]

Begin forwarded message:

> From: Sergio 'shadown' Alvarez <shadown at gmail.com>
> Date: July 7, 2009 5:06:05 PM GMT+02:00
> To: Ari Takanen <ari.takanen at codenomicon.com>
> Subject: Re: [fuzzing] Commercial Fuzzer and Open-source Fuzzer  
> comparison
>
> Hi Ari,
>
> Answers inline:
>
> On Jul 7, 2009, at 9:27 AM, Ari Takanen wrote:
>
>> Hi all,
>>
>> Kind of FAQ material again, but here it goes...
>
> Well, this is a fuzzing mailing list. You can expect anything here,
> from basic questions to discussions of interesting techniques. And of
> course different views on things, as well as academic vs. researcher
> terminology that in the end refers to the same thing.
>
>> The real division of future fuzzers is really about generation versus
>> mutation.
>>
>> Simple mutation fuzzers use templates, which you need to modify to
>> add meta-data on data types. Full model-based fuzzers, on the other
>> hand, take the protocol specification and build tests from that.
>> Benefits are obvious:
>>
>> Mutation fuzzers:
>> + quick and dirty fuzzer in less time
>> - will only cover most common elements in the protocol
>> - requires huge amount of tests for any level of confidence
>
> This depends heavily on how the mutations are generated: randomly
> or with troublesome values.
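
To make the distinction concrete, here is a minimal sketch of the two mutation styles, assuming a made-up message layout (one type byte, a 4-byte big-endian length, then a payload). The function names and the list of boundary values are hypothetical illustrations, not taken from any fuzzer mentioned in this thread.

```python
import random

# Hypothetical boundary values for a 32-bit field; a real list would be
# tuned to the target being tested.
TROUBLESOME_U32 = [0x00000000, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]


def mutate_random(data, rng, flips=4):
    """Overwrite `flips` randomly chosen bytes with random values.

    Assumes `data` is non-empty.
    """
    buf = bytearray(data)
    for _ in range(flips):
        pos = rng.randrange(len(buf))
        buf[pos] = rng.randrange(256)
    return bytes(buf)


def mutate_troublesome(data, offset):
    """Return one variant per boundary value, written big-endian at `offset`."""
    variants = []
    for value in TROUBLESOME_U32:
        buf = bytearray(data)
        buf[offset:offset + 4] = value.to_bytes(4, "big")
        variants.append(bytes(buf))
    return variants


# Made-up sample: type byte, 4-byte length field, 8-byte payload.
sample = bytes.fromhex("0100000008") + b"ABCDEFGH"
random_case = mutate_random(sample, random.Random(1234))
boundary_cases = mutate_troublesome(sample, offset=1)
```

The random variant needs many runs before it happens to hit the length field with an interesting value, while the troublesome-value variant needs only four test cases but requires knowing where the field is.
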
>
>> Model-based fuzzers:
>> - building the model is often a laborious job
>> + "100%" test coverage (at least feasible)
>> + more optimized test cases, less time used in testing
>
> That is test coverage against a given protocol, file format or
> language syntax: it won't cover proprietary implementations/extensions,
> which are very common in real-world software. (no theory here)
>
>> In addition to division between mutation and model-based fuzzers, any
>> fuzzer can be categorized based on how it builds the tests:
>>
>> 1) captures and templates
>> 2) specifications
>> 3) test target behavior
>
> That's pretty much what publicly available fuzzers do, but believe me
> on this one, there are a lot of private fuzzers that go way beyond
> these architectures. Mine is one of them; you already heard some
> things from your colleagues about it. ;)
>
>>> The evolution fuzzer from Jared also has a very cool approach to
>>> dynamically analyze and improve test case generation; the same goes
>>> for fuzzgrind.
>>
>> The "evolution fuzzers" are just one approach to improve mutation
>> fuzzers. I think it is a big step forward for dumb pcap fuzzers which
>> just parse traffic from network and start semi-randomly mutating
>> that. It is sad to see that those dumb fuzzers are still widely
>> used. Discussion on evolution fuzzers has no relevance at all to
>> specification-based fuzzers.
>
> I like to think about it as 'evolution technology' instead of
> 'evolution fuzzers', because it is not just based on packet capture and
> program behavior; there is a lot of static and runtime analysis
> involved that makes the testing way more effective than anything else
> out there. What happens is that nobody that I'm aware of is willing
> to release these technologies to the public.
> To give you a quick understanding of what I mean, there are
> frameworks that are basically 'multi-architecture' simulation
> environments that are themselves constraint solvers on a pseudo
> process 'snapshot' and 'restore'. This approach attacks the logic
> instead of the datatypes that are being parsed. It allows you
> to just dump a piece of code from an embedded device and simulate its
> entire behavior in order to find the breaking points.
>
> Anyway, it's really interesting to see that random mutation fuzzers
> are extremely effective even when compared to commercial fuzzers.
> Have a look at Charlie Miller's presentation at CanSecWest 2008
> (check the results of ProxyFuzz). What Charlie did not do was compare
> the results against fuzzing frameworks like peach and sulley.
>
>>> I don't know how many frameworks are out there. I personally
>>> consider frameworks to be things that allow you to create customized
>>> complex datatypes, and agents for protocol communication, debugging,
>>> monitoring, etc. Basically they have the core (APIs, modules, etc)
>>> that allows you to rapidly develop tools to face any given task, that
>>> afterwards you can reuse. That's why I'd like to know what Ari
>>> considers a framework. For me frameworks are peach, sulley and the
>>> like.
>>
>> To me, frameworks are development environments for developing
>> fuzzers.
>
> I agree, it is what I've tried to say in the quoted text :)
>
>> To some people, Python itself is already a fuzzing
>> framework. ;)
>
> Well, to any coder anything that does what you need is enough ;)
> python, ruby, java, perl, .net, whatever will be a fuzzer, a
> debugger, etc, etc, etc, it's up to you.
> But I believe you mean that given the fact that a lot of tools
> were built in such a language. Same would go for people who believe
> that ruby is an exploitation framework :P
>
>> Most frameworks are actually just libraries of helpful routines and
>> libraries that can be re-used to build fuzzers. Note that all
>> commercial vendors most probably also have an internal fuzzing
>> framework. E.g. the Codenomicon framework is perhaps fourth or fifth
>> generation PROTOS.
>
> that's the power of code reuse :)
>
>>> I personally believe that the whole fuzzing thing should be
>>> heading/improving in another direction, one that involves
>>> complex/intelligent lexers, path-flow analysis and dynamic
>>> instrumentation (among other things). But that is a huge discussion
>>> that won't fit in a set of mails.
>>
>> I disagree. For best fuzzing results, you should use one stupid
>> mutation fuzzer, one wicked evolving one, and the most intelligent
>> specification-based fuzzer. They each have strengths and
>> weaknesses. If you need to pick only one:
>>
>> 1) Mutation is often cheapest, but has worst test efficiency. They
>>  are fast to build and deploy, but can take forever to run.
>>
>> 2) Specification-based fuzzers will have best efficiency, but have a
>>  price, or take a lot of time to build. Test execution time is often
>>  a matter of minutes or hours due to better test optimization.
>>
>> 3) Evolving fuzzers can find some magic tests that cannot be found
>>  with any other means, but miss a lot of simple bugs. AFAIK these
>>  often take the longest time to develop, and to use.
>
> I disagree here, as expected ;)
>
> 1/2/3):
> - When I say template-based, I do not mean using a base file and
> creating a template from it. I mean a template that has the pathflows
> (academic people would say control flow) of a given
> datatype/protocol/whatever, a template that describes the
> opcodes/commands (TLV and so on). This template, when traversed,
> generates test cases that cover all the different 'pathflows'.
> - While launching these test cases against a target, an agent
> (debugger) on the target process analyzes NOT the code coverage, BUT
> the pathflows that the functions and basic blocks can have.
> - This agent interacts with the fuzzer to track which test case took
> which path.
> - After the initial test cases finish, the paths that were taken are
> eliminated from the list of pathflows, and the ones that were not
> taken remain there. The remaining paths are cross-referenced with the
> adjacent paths that were taken; those test cases are the starting
> point for the evolution approach. The evolution approach has a
> constraint solver in order to instrument the test cases so that
> they take those remaining pathflows.
>
> This is part of what my fuzzer actually does ;), this is what I've
> been working on because that's where I believe fuzzing should be
> heading.
> Of course I might be wrong!
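
The bookkeeping in the steps above (drop the pathflows that were taken, then cross-reference the rest with adjacent taken paths) can be sketched roughly as follows. This is only an illustration under stated assumptions: a pathflow is modeled as a tuple of branch decisions, "adjacent" is assumed to mean "differs only in the final decision", and all names are hypothetical. The agent/debugger and the constraint solver are not shown; the sketch only selects which existing test cases would be handed to that stage.

```python
# Hypothetical pathflows: each is the tuple of branch decisions that a
# template traversal can drive the target through.
ALL_PATHFLOWS = {
    ("hdr_ok", "type_a", "len_short"),
    ("hdr_ok", "type_a", "len_long"),
    ("hdr_ok", "type_b", "len_short"),
    ("hdr_bad",),
}


def remaining_pathflows(all_paths, observed):
    """Drop every pathflow that some test case already took."""
    return set(all_paths) - set(observed.values())


def adjacent(path_a, path_b):
    """Assumed adjacency: same length, same prefix, different final decision."""
    return (
        len(path_a) == len(path_b)
        and path_a[:-1] == path_b[:-1]
        and path_a != path_b
    )


def seeds_for_evolution(all_paths, observed):
    """Map each untaken pathflow to the test cases whose path is adjacent to it."""
    seeds = {}
    for missing in remaining_pathflows(all_paths, observed):
        seeds[missing] = sorted(
            case for case, taken in observed.items() if adjacent(missing, taken)
        )
    return seeds


# What an agent might report: test case name -> pathflow it took.
observed = {
    "case_1": ("hdr_ok", "type_a", "len_short"),
    "case_2": ("hdr_bad",),
}
seeds = seeds_for_evolution(ALL_PATHFLOWS, observed)
```

With this sample data, `case_1` is selected as the starting point for reaching the `len_long` path, while the `type_b` path has no adjacent test case under this narrow definition of adjacency and would need a looser one.
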
>
>> I hope this was helpful. I would be happy to hear if you disagree
>> with my categorization of fuzzers.
>
> No disagreement in the categorization, just different 'naming  
> conventions' because I'm not an academic guy. :P
>
> Cheers,
>  Sergio
>
>





