Publications
- 2011
- Manhattan Scene Understanding Using Monocular, Stereo, and 3D Features
- Keyframe-based recognition and localization during video-rate parallel tracking and mapping
- Wide-area Augmented Reality using Camera Tracking and Mapping in Multiple Regions
- RSLAM: A System for Large-Scale Mapping in Constant-Time using Stereo
- Shared shape spaces
- Nonlinear shape manifolds as shape priors in level set segmentation and tracking
- Combining Traffic Sign Detection with 3D Tracking Towards Better Driver Assistance
- Regressing Local to Global Shape Properties for Online Segmentation and Tracking
- Robust 3D hand tracking for human computer interaction
- Unsupervised Learning of a Scene-Specific Coarse Gaze Estimator
- gSLIC: a real-time implementation of SLIC superpixel segmentation
- Stable Multi-Target Tracking in Real-Time Surveillance Video
- Gaze Directed Camera Control for Face Image Acquisition
- Hidden View Synthesis using Real-Time Visual SLAM for Simplifying Video Surveillance Analysis
- Active Visual Scene Exploration
- The Acquisition of Coarse Gaze Estimates in Visual Surveillance
Stable Multi-Target Tracking in Real-Time Surveillance Video
B Benfold and I D Reid
Proc Computer Vision and Pattern Recognition (CVPR), Colorado Springs, June 2011
Links to Authors: bbenfold ian
Abstract
The majority of existing pedestrian trackers concentrate on maintaining the identities of targets; however, systems for remote biometric analysis or activity recognition in surveillance video often require stable bounding boxes around pedestrians rather than approximate locations. We present a multi-target tracking system that is designed specifically for the provision of stable and accurate head location estimates. By performing data association over a sliding window of frames, we are able to correct many data association errors and fill in gaps where observations are missed. The approach is multi-threaded and combines asynchronous HOG detections with simultaneous KLT tracking and Markov-Chain Monte-Carlo Data Association (MCMCDA) to provide guaranteed real-time tracking in high-definition video. Where previous approaches have used ad-hoc models for data association, we use a more principled approach based on a Minimal Description Length (MDL) objective which accurately models the affinity between observations. We demonstrate by qualitative and quantitative evaluation that the system is capable of providing precise location estimates for large crowds of pedestrians in real time. To facilitate future performance comparisons, we make a new dataset with hand-annotated ground truth head locations publicly available.
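As a rough illustration of what "filling in gaps where observations are missed" within a sliding window can mean, the sketch below interpolates missing head locations linearly between the nearest observations on either side. This is not the method of the paper, which relies on KLT tracking and MCMCDA; linear interpolation is swapped in here purely for simplicity, and the function name `fill_gaps`, the `window` parameter, and the track representation are all hypothetical.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def fill_gaps(track: List[Optional[Point]], window: int) -> List[Optional[Point]]:
    """Fill runs of missing observations (None) by linear interpolation.

    Assumption for illustration only: a gap is filled when the observations
    on both sides of it fit inside a window of `window` frames. Longer gaps
    are left as None.
    """
    filled = list(track)
    last_seen: Optional[int] = None  # index of the most recent observation
    for i, obs in enumerate(track):
        if obs is None:
            continue
        # Frames last_seen..i span (i - last_seen + 1) frames in total.
        if last_seen is not None and 1 < i - last_seen < window:
            (x0, y0), (x1, y1) = track[last_seen], obs
            span = i - last_seen
            for j in range(last_seen + 1, i):
                t = (j - last_seen) / span
                filled[j] = (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
        last_seen = i
    return filled


# Hypothetical head locations for one target; None marks a missed detection.
example = [(0.0, 0.0), None, None, (3.0, 6.0), None, None, None, None, (8.0, 6.0)]
result = fill_gaps(example, window=4)
```

With `window=4`, the two-frame gap is interpolated, while the four-frame gap is too long to fit in the window and stays unfilled.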
Links
PDF: File (8.7MB)
BIB: citation
Additional Material
Video
The video demonstrates the MCMCDA-based head tracking system, which runs at 25 fps on 1920x1080 video using a standard desktop computer. The system is capable of obtaining stable head images and is robust to temporary occlusions.
Dataset
The Town Centre video and ground truth data can be found on the project page.
Copyright Notice
This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.