Welcome to the Computer Vision Group at RWTH Aachen University!

The Computer Vision group has been established at RWTH Aachen University in context with the Cluster of Excellence "UMIC - Ultra High-Speed Mobile Information and Communication" and is associated with the Chair Computer Sciences 8 - Computer Graphics, Computer Vision, and Multimedia. The group focuses on computer vision applications for mobile devices and robotic or automotive platforms. Our main research areas are visual object recognition, tracking, self-localization, 3D reconstruction, and in particular combinations between those topics.

We offer lectures and seminars about computer vision and machine learning.

You can browse through all our publications and the projects we are working on.

We have one papers accepted at the IEEE International Conference on Computer Vision (ICCV) 2017, 3DRMS Workshop.

Sept. 18, 2017

We have two papers accepted at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2017.

June 15, 2017

We have two papers accepted at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017. One oral and one spotlight.

Feb. 28, 2017

We have two papers accepted at the IEEE Winter Conference on Applications of Computer Vision (WACV) 2017.

Jan. 4, 2017

We have a paper on Scene Flow Propagation for Semantic Mapping and Object Discovery in Dynamic Street Scenes at IROS 2016

Aug. 19, 2016

We have three papers accepted at the British Machine Vision Conference (BMVC) 2016.

Aug. 19, 2016

Recent Publications

Track, then Decide: Category-Agnostic Vision-based Multi-Object Tracking

Accepted for IEEE Int. Conference on Robotics and Automation (ICRA'18), to appear

The most common paradigm for vision-based multi-object tracking is tracking-by-detection, due to the availability of reliable detectors for several important object categories such as cars and pedestrians. However, future mobile systems will need a capability to cope with rich human-made environments, in which obtaining detectors for every possible object category would be infeasible. In this paper, we propose a model-free multi-object tracking approach that uses a category-agnostic image segmentation method to track objects. We present an efficient segmentation mask-based tracker which associates pixel-precise masks reported by the segmentation. Our approach can utilize semantic information whenever it is available for classifying objects at the track level, while retaining the capability to track generic unknown objects in the absence of such information. We demonstrate experimentally that our approach achieves performance comparable to state-of-the-art tracking-by-detection methods for popular object categories such as cars and pedestrians. Additionally, we show that the proposed method can discover and robustly track a large variety of other objects.


Large-Scale Object Discovery and Detector Adaptation from Unlabeled Video


We explore object discovery and detector adaptation based on unlabeled video sequences captured from a mobile platform. We propose a fully automatic approach for object mining from video which builds upon a generic object tracking approach. By applying this method to three large video datasets from autonomous driving and mobile robotics scenarios, we demonstrate its robustness and generality. Based on the object mining results, we propose a novel approach for unsupervised object discovery by appearance-based clustering. We show that this approach successfully discovers interesting objects relevant to driving scenarios. In addition, we perform self-supervised detector adaptation in order to improve detection performance on the KITTI dataset for existing categories. Our approach has direct relevance for enabling large-scale object learning for autonomous driving.


Exploring Spatial Context for 3D Semantic Segmentation of Point Clouds

IEEE International Conference on Computer Vision (ICCV'17) 3DRMS Workshop

Deep learning approaches have made tremendous progress in the field of semantic segmentation over the past few years. However, most current approaches operate in the 2D image space. Direct semantic segmentation of unstructured 3D point clouds is still an open research problem. The recently proposed PointNet architecture presents an interesting step ahead in that it can operate on unstructured point clouds, achieving decent segmentation results. However, it subdivides the input points into a grid of blocks and processes each such block individually. In this paper, we investigate the question how such an architecture can be extended to incorporate larger-scale spatial context. We build upon PointNet and propose two extensions that enlarge the receptive field over the 3D scene. We evaluate the proposed strategies on challenging indoor and outdoor datasets and show improved results in both scenarios.

Disclaimer Home Visual Computing institute RWTH Aachen University