Until recently, the management of large image databases has relied exclusively on manually entered alphanumeric annotations. Systems are beginning to emerge in both the research and commercial sectors based on 'content-based' image retrieval, a technique which explicitly manages image assets by directly representing their visual attributes. The Virage image search engine provides an open framework for building such systems. The Virage engine expresses visual features as image 'primitives.' Primitives can be very general (such as color, shape, or texture) or quite domain specific (face recognition, cancer cell detection, etc.). The basic philosophy underlying this architecture is a transformation from the data-rich representation of explicit image pixels to a compact, semantic-rich representation of visually salient characteristics. In practice, the design of such primitives is non-trivial, and is driven by a number of conflicting real-world constraints (e.g. computation time vs. accuracy). The virage engine provides an open framework for developers to 'plug-in' primitives to solve specific image management problems. The architecture has been designed to support both static images and video in a unified paradigm. The infrastructure provided by the Virage engine can be utilized to address high-level problems as well, such as automatic, unsupervised keyword assignment, or image classification.