Thursday, January 21, 2010

How bad is it?

Thank you to John Menerick and Ben Nagy for entertaining my questions on the Daily Dave list.

Q: Is the recent IE6 0-day anything special?

John: Not really. Not as special as the NT <-> Win 7 issue recently highlighted.

Q: How many similar 0-days are for sale on the black market?

John: Quite a few.

Ben: I'd love to see your basis for this assertion. I'm not saying that in the "I don't believe you" sense, only in the "everyone always says that but nobody ever puts up any facts" sense.

Q: What is the rate/difficulty for discovery of new Windows-based 0-days for the common MS and Adobe products that are installed on almost every corporate client? (I heard Dave mention that discovery is getting more difficult.)

John: Not terribly difficult for someone who is dedicated. Then again, my idea of difficult is much different from the avg. person's.

Ben: I think that while finding 0-days might be 'not terribly difficult', selecting and properly weaponising useful 0-days from the masses of dreck your fuzzer spits out IS difficult - at least in my experience. There was some discussion of the 'too many bugs' problem on this list previously and I know several of the other fuzzing guys are currently researching the same area. Of course you'd explain this to your 'avg. person', as well as explaining that the skillset for finding bugs is not necessarily the same as the skillset for writing reliable exploits for them, and that 'dedication' may not sufficiently substitute for either.

Lurene Grenier: I really feel that the "selecting good crashes" problem is not that hard to overcome if you have a proper bucketing system, and the ability to do just a bit of auto-triage at crash time. For example, the fuzzer I use now both separates crashes by what it perceives to be the base issue at hand, and provides a brief notes file with some information about the crash and what is controlled. This requires just a bit of sense in providing fuzzed input, and very little smarts on the part of the debugger. I really think the next step is automating that brain-jutsu; much of it is hard to keep in your head, but not hard to do in code.

Using this output, it's pretty easy to spend a lazy morning with your coffee grepping the notes files for the sorts of things you usually find to be reliably exploitable. From there you can call in your 30 ninjas and have at.
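The bucketing and notes-file idea described above can be sketched roughly as follows. This is only an illustration, not the fuzzer Lurene describes: the crash record layout, the marker values, the note strings, and every function name here are assumptions made for the example. Crashes are grouped by a hash of the exception type plus the top few stack frames, and a register counts as "controlled" when it holds a marker value that the fuzzer planted in its input.

```python
import hashlib
from collections import defaultdict

# Hypothetical marker values planted in fuzzed input so that
# attacker-controlled data is recognisable in registers at crash time.
MARKERS = {0x41414141, 0x42424242}


def bucket_key(crash, depth=3):
    """Group crashes by exception type plus the top `depth` stack frames."""
    raw = "|".join([crash["exception"]] + crash["stack"][:depth])
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()[:12]


def triage_notes(crash):
    """Return short note lines; these are hints, not an exploitability verdict."""
    notes = ["exception: " + crash["exception"]]
    controlled = sorted(
        reg for reg, value in crash["registers"].items() if value in MARKERS
    )
    if controlled:
        notes.append("controlled: " + ",".join(controlled))
    if crash["exception"] == "write_av":
        notes.append("hint: write access violation")
    return notes


def bucket_crashes(crashes):
    """Map bucket key -> list of crashes that share it."""
    buckets = defaultdict(list)
    for crash in crashes:
        buckets[bucket_key(crash)].append(crash)
    return dict(buckets)


def grep_notes(buckets, needle):
    """Return keys of buckets whose first crash has a note containing `needle`."""
    return sorted(
        key
        for key, members in buckets.items()
        if any(needle in note for note in triage_notes(members[0]))
    )


# Made-up crash records, purely for illustration.
SAMPLE_CRASHES = [
    {"exception": "read_av", "stack": ["parse_tag", "parse_doc", "main"],
     "registers": {"eax": 0, "ebx": 0x41414141}},
    {"exception": "read_av", "stack": ["parse_tag", "parse_doc", "main"],
     "registers": {"eax": 7, "ebx": 0x41414141}},
    {"exception": "write_av", "stack": ["copy_buf", "parse_doc", "main"],
     "registers": {"eax": 0x42424242, "ebx": 0}},
]

BUCKETS = bucket_crashes(SAMPLE_CRASHES)
```

With the sample data, the two read crashes land in one bucket and the write crash in another, and `grep_notes(BUCKETS, "hint: write")` plays the role of the lazy-morning grep. A real system would need a far better notion of "same base issue" than a stack hash.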

Creating reliable exploits is for sure the hardest part, but once you've done the initial work on a program, the next few exploits in it are of course more quickly and easily done.

As for the thought experiment, I think that the benefit of the top four researchers is that they've trained themselves over a long period of time (and with passion) to have a very good set of pattern-recognition tools which they call instincts. They know how to get crashes, and they know, having seen one crash, what's likely to turn up more. They know how to think about a process to get proper execution, and they're rewarded emotionally by success, which makes the lesson learned this time around stick for when they need it again.

I honestly think that there is more pattern recognition "muscle-memory" type skill involved in RE, bug hunting, and exploit dev than pure mechanical process, which is why the numbers are so skewed. It's like taking 4 native speakers of a language (who love to read!) and 100 students of general linguistics with a zillion dollars. Who will read a book in the language faster?

Q: How easy is discovery for someone with resources like the Chinese government?

John: Much simpler.

Ben: Setting aside the previous point that discovery is only the start, I think it's instructive to consider which elements of the process scale well with money.

Finding the bugs: You need a fuzzing infrastructure that scales - running Peach on one laptop with 30 ninjas standing around it with IDA Pro open is not going to work. Also consider tracking what you've already tested, tracking the results, storing all the crashes, blah blah blah. This does scale well with money, but it's an area that not as many people have looked at as I would like.

Seeing which bugs are exploitable: Using a naive approach, this scales horribly poorly with money - non-linearly, to put it mildly. There are only so many analysts you will be able to hire that have enough smarts to look at a non-trivial bug and correctly determine its exploitability. You only have to look at some of the Immunity guys' (hi Kostya) records of turning bugs that other people had discarded as DoS or "Just Too Hard" into tight exploits. Even for ninjas, it's slow. There is research being done into doing 'some' of this process automatically (well, I'm doing some, and I know a couple of other guys are too, so that counts), but I don't know of anyone that has a great result in the area yet - I'd love to be corrected.

Creating nice, reliable exploits: I'd assert that this is like the previous point, but even harder. To be honest, it's not really my thing, so probably one of the people that write exploits for a living would be better to comment, but from talking to those kinds of guys, it's often a very long road from 'woo we control ebx' to reliable exploitation, especially against modern OSes and modern software that has lots of stuff built in to make your life harder. I don't know how much of the process can really be automated - I mean there are some nice things like the (old now) EEREAP and newer WinDbg extensions from the Metasploit guys that will find you jump targets according to parameters and so forth, but up until now I was labouring under the impression that a lot of it remains brain-jitsu, which is hard to scale linearly with money.

So, while I think that 'simpler' is certainly unassailable, I would need more than a two-word assertion to be convinced that it is 'much' simpler. If you give one team a million dollars and 100 people selected at random from the top 10% of graduating computer science students, and you give the other team their pick of any 4 researchers in the world and 3 iMacs, whom does the smart money think will produce more weapons-grade 0-day after 6 months?

(No it's not a fair comparison. It's a thought experiment.)

Food for thought, perhaps, since sound bites need little care and feeding.

Q: How bad is it really?

John: Look at the CVSSv2 score and adjust it to the environments where you determine "how bad it is." It could be much worse.
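To make "adjust it to the environments" concrete, here is a rough Python sketch of the CVSS v2 base and environmental equations. The metric weights and formulas follow the published CVSS v2 guide; the function names and the choice to leave all temporal metrics at Not Defined (multiplier 1.0) are assumptions made for this example, and Python's round() only approximates the guide's round-to-one-decimal step.

```python
# Metric weights as published in the CVSS v2 guide.
ACCESS_VECTOR = {"L": 0.395, "A": 0.646, "N": 1.0}
ACCESS_COMPLEXITY = {"H": 0.35, "M": 0.61, "L": 0.71}
AUTHENTICATION = {"M": 0.45, "S": 0.56, "N": 0.704}
IMPACT = {"N": 0.0, "P": 0.275, "C": 0.660}
REQUIREMENT = {"L": 0.5, "M": 1.0, "H": 1.51, "ND": 1.0}
COLLATERAL = {"N": 0.0, "L": 0.1, "LM": 0.3, "MH": 0.4, "H": 0.5, "ND": 0.0}
TARGET_DIST = {"N": 0.0, "L": 0.25, "M": 0.75, "H": 1.0, "ND": 1.0}


def _combine(impact, av, ac, au):
    """Shared base equation: weigh impact against exploitability."""
    exploitability = (
        20 * ACCESS_VECTOR[av] * ACCESS_COMPLEXITY[ac] * AUTHENTICATION[au]
    )
    f_impact = 0.0 if impact == 0 else 1.176
    return (0.6 * impact + 0.4 * exploitability - 1.5) * f_impact


def base_score(av, ac, au, c, i, a):
    """CVSS v2 base score from the six base metrics."""
    impact = 10.41 * (1 - (1 - IMPACT[c]) * (1 - IMPACT[i]) * (1 - IMPACT[a]))
    return round(_combine(impact, av, ac, au), 1)


def environmental_score(av, ac, au, c, i, a,
                        cr="ND", ir="ND", ar="ND", cdp="ND", td="ND"):
    """Environmental score; temporal metrics are assumed Not Defined."""
    adjusted_impact = min(10.0, 10.41 * (
        1
        - (1 - IMPACT[c] * REQUIREMENT[cr])
        * (1 - IMPACT[i] * REQUIREMENT[ir])
        * (1 - IMPACT[a] * REQUIREMENT[ar])
    ))
    adjusted_base = round(_combine(adjusted_impact, av, ac, au), 1)
    adjusted_temporal = adjusted_base  # temporal multipliers all 1.0 here
    score = (adjusted_temporal
             + (10 - adjusted_temporal) * COLLATERAL[cdp]) * TARGET_DIST[td]
    return round(score, 1)
```

For example, a vector of AV:N/AC:M/Au:N/C:C/I:C/A:C gives a base score of 9.3, but if only a small slice of an environment runs the affected software (target distribution Low), `environmental_score("N", "M", "N", "C", "C", "C", td="L")` drops to 2.3. The same bug can be far more or less urgent depending on where you stand.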

Q: I suspect we are just looking at one grain of sand in a beach of 0-days....

John: Correct. No one wants to let everyone else know what cards they hold in their hand, the tools in their toolbox, etc....