Home » Uncategorized

Probabilistic Pentesting

Pentesting tools like Metasploit, Burp, ExploitPack, BeEF, etc. are used by security practitioners to identify possible vulnerability points and to assess compliance with security policies. Pentesting tools come with a library of known exploits that have to be configured or customized for your particular environment.  This configuration typically takes the form of a DSL or a set of fairly complex UIs to configure individual attacks.  

Conventionally the workflow of pentesting is comprised of the following fixed steps (Sarraute):

  • perform a network discovery to obtain a list of all the reachable hosts
  • port scan all the known hosts
  • perform OS detection
  • launch matching exploits against the potentially vulnerable machines

There are two major shortcomings with this approach (1) scanning doesn’t yield perfect knowledge (2) scanning generates significant network traffic and can run for a very long time on a large network (Sarraute).

It is perhaps due to these shortcomings (and maybe 0day exploits) that “most testing tools, provide no guarantee of soundness. Indeed, in the last few years, several reports have shown that state-of-the-art web application scanners fail to detect a significant number of vulnerabilities in test applications” (Doupé).

Often when deterministic approaches prove to be brittle and inefficient, we look to probabilistic methods to do better.   

Sarraute, Buffet and Hoffmann and have explored the application of partially observable Markov decision processes (POMDP) to this domain. They demonstrate that POMDP can increase the efficiency of the pentesting process by making probabilistic assumptions that (when true) can successfully find exploits without following the conventional workflow.  

Our own research is taking it further and applying reinforcement learning and deepnets to this domain.

Figure 2: Left: reinforcement learning problem. Right: Markov decision process. Source: Tambet Matiisen, Nervana Systems 

Figure 2: Left: reinforcement learning problem. Right: Markov decision process. Source: Tambet Matiisen, Nervana Systems 

We are also working on novel ways to collect the massive datasets necessary to train these algorithms. Here we should acknowledge the tireless dedication of hackers, particularly in Russia and China.

The application of probabilistic methods are long overdue in the field of information security. Although research in this area has been extremely limited, initial experiments appear promising. 


Doupé, Adam, Marco Cova, and Giovanni Vigna. Why Johnny can’t pentest: An analysis of black-box web vulnerabilit…. International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment. Springer Berlin Heidelberg, 2010.

Sarraute, Carlos, Olivier Buffet, and Jörg Hoffmann. Penetration Testing== POMDP Solving? arXiv preprint arXiv:1306.4714 (2013).