On the infeasibility of modeling polymorphic shellcode

Abstract
Polymorphic malcode remains a troubling threat. The ability for malcode to automatically transform into semantically equivalent variants frustrates attempts to rapidly construct a single, simple, easily verifiable representation. We present a quantitative analysis of the strengths and limitations of shellcode polymorphism and consider its impact on current intrusion detection practice. We focus on the nature of shellcode decoding routines. The empirical evidence we gather helps show that modeling the class of self-modifying code is likely intractable by known methods, including both statistical constructs and string signatures. In addition, we develop and present measures that provide insight into the capabilities, strengths, and weaknesses of polymorphic engines. In order to explore countermeasures to future polymorphic threats, we show how to improve polymorphic techniques and create a proof-of-concept engine expressing these improvements. Our results indicate that the class of polymorphic behavior is too greatly spread and varied to model effectively. Our analysis also supplies a novel way to understand the limitations of current signature-based techniques. We conclude that modeling normal content is ultimately a more promising defense mechanism than modeling malicious or abnormal content.
