Abstract
In this paper we study the problem of acquiring a topological model of an indoor environment from visual sensing, and of subsequent localization within that model. The resulting model consists of a set of locations and the neighborhood relationships between them. Each location is represented by a collection of representative views and their associated descriptors, selected from a temporally sub-sampled video stream captured by a mobile robot during exploration. We compare the recognition performance of two kinds of image descriptors, global image histograms and local scale-invariant features, demonstrate their strengths and weaknesses, and show how to model the spatial relationships between individual locations with a Hidden Markov Model. The quality of the acquired model is tested in the localization stage by means of location recognition: given a new view or a sequence of views, we determine the most likely location from which it was taken.
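To make the role of the Hidden Markov Model concrete, the following is a minimal sketch of how neighborhood relationships can help resolve an ambiguous view during localization. It assumes standard HMM forward filtering, which the abstract does not specify; the function names, the three-location corridor, and all probabilities are hypothetical values chosen only for illustration. The per-view likelihoods stand in for whatever score the image descriptors (histograms or local features) would produce.

```python
def forward_localize(likelihoods, transition, prior):
    """Return the filtered belief over locations after each view.

    likelihoods[t][i]: assumed p(view_t | location i), e.g. a descriptor match score
    transition[i][j]:  assumed p(location j now | location i at the previous view)
    prior[i]:          belief over locations before any view is seen
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for t, obs in enumerate(likelihoods):
        if t > 0:
            # Prediction: propagate the belief through the neighborhood structure.
            belief = [sum(belief[i] * transition[i][j] for i in range(n))
                      for j in range(n)]
        # Update: weight each location by how well the current view matches it.
        belief = [belief[j] * obs[j] for j in range(n)]
        total = sum(belief)
        if total == 0:
            raise ValueError("view has zero likelihood under every location")
        belief = [b / total for b in belief]
        history.append(belief)
    return history


def most_likely_locations(history):
    """Index of the highest-belief location after each view."""
    return [max(range(len(b)), key=b.__getitem__) for b in history]


# Hypothetical corridor of three locations: 0 - 1 - 2 (only neighbors connect).
TRANSITION = [
    [0.8, 0.2, 0.0],
    [0.1, 0.8, 0.1],
    [0.0, 0.2, 0.8],
]
PRIOR = [1 / 3, 1 / 3, 1 / 3]
# The second view, taken alone, looks most like location 2.
LIKELIHOODS = [
    [0.7, 0.2, 0.1],
    [0.4, 0.1, 0.5],
]

history = forward_localize(LIKELIHOODS, TRANSITION, PRIOR)
labels = most_likely_locations(history)
```

In this toy setup the second view on its own would be assigned to location 2, but because location 2 is not adjacent to location 0, the filtered estimate stays at location 0. This is the kind of disambiguation that a sequence of views, as opposed to a single view, makes possible.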
