闲庭信步: RTL--Real Time Tracking and Localization Based on Object Recognition

In a previous article, I mentioned that I had developed a real-time localization system using a single USB camera, featuring real-time tracking and localization and robust object recognition. Video demos can be found here.
Recently I have extended that work into the RTL system (Real-time Tracking and Localization based on Object Recognition, or Recognition-Tracking-Localization), which combines recognition, real-time tracking and localization. The features of the system are as follows:

Accurate and fast recognition
Active tracking
3D pose estimation for coplanar objects
Real-time performance
Re-localization and no accumulating error
Multi-object RTL

The localization also has some limitations:

Purely based on visual landmarks
Only for coplanar visual landmarks
The distance to each landmark must be measured when its image is taken
Occasional mis-tracking, which leads to false localization
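
To illustrate why the landmarks must be coplanar and measured in advance, here is a minimal sketch of one standard way to get a 3D pose from a planar landmark: decomposing the plane-to-image homography. This is not necessarily how RTL computes its pose; the function name `pose_from_homography`, the intrinsic matrix `K` and the assumption that the landmark lies in the plane Z = 0 are all mine, for illustration only.

```python
import numpy as np

def pose_from_homography(H, K):
    """Recover (R, t) of a planar landmark from a homography.

    Assumes (hypothetically) that H maps landmark-plane coordinates
    (X, Y, 1), with the landmark lying in Z = 0, to pixel coordinates,
    and that K is the 3x3 camera intrinsic matrix.
    """
    # For a plane at Z = 0: H ~ K [r1 r2 t], up to an unknown scale.
    A = np.linalg.inv(K) @ H

    # Fix the scale so that the two rotation columns have unit length on average.
    scale = 2.0 / (np.linalg.norm(A[:, 0]) + np.linalg.norm(A[:, 1]))
    A = A * scale

    # Fix the sign so that the landmark is in front of the camera (t_z > 0).
    if A[2, 2] < 0:
        A = -A

    r1, r2, t = A[:, 0], A[:, 1], A[:, 2]
    R_approx = np.column_stack([r1, r2, np.cross(r1, r2)])

    # With noisy data R_approx is not exactly a rotation;
    # take the nearest orthogonal matrix via SVD.
    U, _, Vt = np.linalg.svd(R_approx)
    R = U @ Vt
    return R, t
```

The translation `t` comes out in the same units as the landmark-plane coordinates, so the metric size or distance of the landmark has to be known beforehand for the result to be a real-world position.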

TODO list:

Improve the localization algorithm, considering SFM, inverse depth, etc.
A GUI based on OpenGL, like that of MonoSLAM

At present, visual localization and mapping is a very active research topic. Davison's MonoSLAM, in some sense, provides a new approach to vSLAM by combining tracking, an EKF, a sparse map and active vision to achieve real-time performance and accurate localization. R. O. Castle et al. extended MonoSLAM by incorporating object recognition, enabling it to re-localize itself and to eliminate the accumulated error of the tracking system. My RTL, inspired by their work, focuses instead on real-time active tracking and localization rather than SLAM. So I chose KLT tracking and a SIFT-based recognition system, instead of an EKF, a sparse map and active vision, for fast tracking and robust object recognition. Below is a comparison between RTL and Castle's recent results (BMVC 2007):
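
As a rough illustration of the SIFT-based recognition step, the sketch below matches query descriptors against a descriptor database using nearest-neighbour search with Lowe's ratio test. The post does not say which matching strategy RTL uses, so the function name `ratio_test_matches`, the brute-force search and the 0.8 threshold are assumptions of mine, not the actual implementation.

```python
import numpy as np

def ratio_test_matches(query, database, ratio=0.8):
    """Match each query descriptor to its nearest database descriptor.

    query:    (n, d) array of descriptors from the current frame
    database: (m, d) array of stored landmark descriptors, m >= 2
    A match is kept only if the nearest neighbour is clearly closer
    than the second nearest (Lowe's ratio test).
    Returns a list of (query_index, database_index) pairs.
    """
    # Pairwise squared Euclidean distances, shape (n, m).
    d2 = (np.sum(query ** 2, axis=1)[:, None]
          + np.sum(database ** 2, axis=1)[None, :]
          - 2.0 * query @ database.T)
    d2 = np.maximum(d2, 0.0)  # guard against small negative rounding errors

    # Indices of the two closest database descriptors for each query.
    order = np.argsort(d2, axis=1)[:, :2]
    rows = np.arange(len(query))
    best = np.sqrt(d2[rows, order[:, 0]])
    second = np.sqrt(d2[rows, order[:, 1]])

    # Keep only unambiguous matches.
    keep = best < ratio * second
    return [(int(i), int(order[i, 0])) for i in np.nonzero(keep)[0]]
```

A brute-force search like this scales linearly with the database size; with tens of thousands of stored descriptors, an approximate nearest-neighbour index would be the more likely choice in practice.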

	RTL	MonoSLAM
Image Size	640x480	640x480
SIFT Features	500	500
Extracting Time	250ms	700ms
Matching Time	30ms	100ms
Tracking Points	50(average)	20
Tracking Time	12.5ms	10ms
Object Models	14	16
Database Capacity	33,010	32,000

The RTL demo runs on a P4 2.8 GHz dual-core CPU with 1 GB of memory, under Windows XP.
Image size: 640x480; video frame rate: 20 fps.
Note: unlike the MonoSLAM demo, the video does not display location information, since developing such a GUI with OpenGL would take me some time and effort. Maybe I'll do it later.
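
The timings in the table can be set against the 20 fps frame rate with a little arithmetic. The sketch below uses only the RTL figures quoted above; whether RTL actually spreads recognition over several frames while tracking runs on every frame is my assumption, not something stated here.

```python
import math

FRAME_RATE_FPS = 20        # video frame rate of the demo
EXTRACT_MS = 250.0         # SIFT extraction time (RTL column)
MATCH_MS = 30.0            # SIFT matching time (RTL column)
TRACK_MS = 12.5            # tracking time per frame (RTL column)

# Time available per frame at 20 fps.
frame_budget_ms = 1000.0 / FRAME_RATE_FPS

# One full recognition pass = extraction + matching.
recognition_ms = EXTRACT_MS + MATCH_MS

# Number of frame periods that one recognition pass spans.
frames_per_recognition = math.ceil(recognition_ms / frame_budget_ms)

# Tracking alone fits within a single frame period.
tracking_fits = TRACK_MS <= frame_budget_ms

print(frame_budget_ms, recognition_ms, frames_per_recognition, tracking_fits)
```

On these numbers tracking uses a quarter of each 50 ms frame period, while one recognition pass takes 280 ms, i.e. it spans about six frame periods.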
Visual landmarks used in the video demo:

Thursday, August 16, 2007
