Vision-assisted image editing

Abstract
When I think of image editing, packages such as Photoshop and Paint Shop Pro come to mind. These packages are used to edit, transform, or manipulate, typically with extensive user guidance, one or more images to produce a desired result. Tasks such as selection, matting, blending, warping, and morphing are often tedious and time consuming. Vision-assisted editing can lighten the burden, whether the goal is a simple cut-and-paste composition or a major special effect for a movie (see Doug Roble's article in this issue). Thus, this article focuses on computer vision techniques that reduce, often significantly, the time and effort involved in editing images and video.

The goal of vision systems is to detect edges, regions, shapes, surface features, lighting properties, 3-D geometry, and so on. Most currently available image editing tools and filters rely on low-level, 2-D geometric or image processing operations that manipulate pixels. Vision techniques, in contrast, extract descriptive object or scene information, allowing a user to edit in terms of higher-level features.

Fully automatic computer vision remains a major focus of the computer vision community. Complete automation is certainly preferred for tasks such as robotic navigation, image/video compression, model-driven object delineation, multiple-image correspondence, image-based modeling, or any time autonomous interpretation of images or video is desired. However, general-purpose image editing will continue to require human guidance because of the user's essential role in the creative process and in identifying which image components are of interest.

Most vision-assisted image editing techniques fall somewhere between user-assisted vision and vision-based interaction. User-assisted vision describes those techniques where the user interacts in image (or parameter) space to begin and/or guide a vision algorithm so that it produces a desired result.
For example, Photoshop's magic wand computes a connected region of similar pixels based on a mouse click in the area to be selected. Vision-based interaction refers to those methods where the computer has done some or all of the "vision" part and the user interacts within the resulting vision-based feature space. One example is the ICE (Interactive Contour Editing) system [2], which computes an image's edge representation and then allows a user to interactively select edge groupings to extract or remove image features.

A tool is classified based on where the user can "touch" the data of the underlying vision function, the process that computes results from inputs. User-assisted vision manipulates the input (or domain) space of the vision function.
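Photoshop's actual magic wand implementation is proprietary; the following is only a minimal region-growing sketch of the idea, assuming a single-channel image stored as a NumPy array. A seed pixel (the mouse click) is grown into a 4-connected region of pixels whose values stay within a user-chosen tolerance of the seed value. The function name `magic_wand` and the `tolerance` parameter are illustrative, not part of any real tool's API.

```python
from collections import deque

import numpy as np


def magic_wand(image, seed, tolerance=10):
    """Sketch of a magic-wand style selection: breadth-first region
    growing from a seed pixel, keeping 4-connected neighbors whose
    intensity is within `tolerance` of the seed's intensity."""
    h, w = image.shape
    seed_value = int(image[seed])
    selected = np.zeros((h, w), dtype=bool)
    selected[seed] = True
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        # Examine the four axis-aligned neighbors of (r, c).
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < h and 0 <= nc < w and not selected[nr, nc]:
                if abs(int(image[nr, nc]) - seed_value) <= tolerance:
                    selected[nr, nc] = True
                    queue.append((nr, nc))
    return selected


# Toy example: a bright 4x4 square on a dark background.
img = np.zeros((8, 8), dtype=np.uint8)
img[2:6, 2:6] = 200
mask = magic_wand(img, (3, 3), tolerance=10)
print(mask.sum())  # 16 -- the bright square, and nothing else
```

Real tools add refinements (8-connectivity, anti-aliased or feathered selection edges, multi-channel color distance), but the user-assisted pattern is the same: the click supplies the domain-space input that steers the vision computation.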
