Are we ready for autonomous driving? The KITTI vision benchmark suite
- 1 June 2012
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- No. 10636919, pp. 3354-3361
- https://doi.org/10.1109/cvpr.2012.6248074
Abstract
Today, visual recognition systems are still rarely employed in robotics applications. Perhaps one of the main reasons for this is the lack of demanding benchmarks that mimic such scenarios. In this paper, we take advantage of our autonomous driving platform to develop novel challenging benchmarks for the tasks of stereo, optical flow, visual odometry/SLAM and 3D object detection. Our recording platform is equipped with four high resolution video cameras, a Velodyne laser scanner and a state-of-the-art localization system. Our benchmarks comprise 389 stereo and optical flow image pairs, stereo visual odometry sequences of 39.2 km length, and more than 200k 3D object annotations captured in cluttered scenarios (up to 15 cars and 30 pedestrians are visible per image). Results from state-of-the-art algorithms reveal that methods ranking high on established datasets such as Middlebury perform below average when moved outside the laboratory to the real world. Our goal is to reduce this bias by providing challenging benchmarks with novel difficulties to the computer vision community. Our benchmarks are available online at: www.cvlibs.net/datasets/kitti
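As a rough illustration (not taken from the paper) of how a stereo submission might be scored against such ground truth, the sketch below computes a pixel-level disparity outlier rate. It assumes a KITTI-style encoding in which disparity maps are stored as 16-bit PNGs scaled by 256 with zero marking pixels without ground truth; the file paths and the 3-pixel threshold are illustrative assumptions, not the benchmark's official evaluation code.

```python
import numpy as np
from PIL import Image

def load_disparity(path):
    """Load a disparity map.

    Assumption: KITTI-style encoding, i.e. a 16-bit PNG whose values are
    the true disparities multiplied by 256, with 0 marking pixels that
    have no ground truth.
    """
    disp = np.asarray(Image.open(path), dtype=np.float32) / 256.0
    valid = disp > 0
    return disp, valid

def outlier_rate(est, gt, valid, threshold=3.0):
    """Fraction of valid ground-truth pixels whose estimated disparity
    differs from the ground truth by more than `threshold` pixels."""
    err = np.abs(est - gt)
    return float(np.mean(err[valid] > threshold))

# Hypothetical paths; the actual directory layout depends on the benchmark download.
gt, valid = load_disparity("training/disp_occ/000000_10.png")
est, _ = load_disparity("results/disp_0/000000_10.png")
print(f"Outlier rate (>3 px): {outlier_rate(est, gt, valid):.3%}")
```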