Program

09:25–09:30 opening
09:30–09:50
Metric Geometry in Action
Ron Kimmel (Technion-Israel Institute of Technology, Israel)

Treating geometric objects as metric spaces allows us to measure similarities between surfaces, such as faces under different expressions or the human body in different poses. The question we try to answer is how such spaces should be defined so that the definition provides computationally feasible processing tools.

09:50–10:15
Combining Laser-Scanning Data and Images for Target Tracking and Scene Modeling
Hongbin Zha (Peking University, China)

Working environments for modern robots have shifted to unstructured, dynamic, outdoor scenes. These changes bring several new challenges, mainly in the perception of both static and moving objects in the scene. To tackle these challenges, we have carried out research on advanced perception systems that can simultaneously model static scenes and track moving objects. Our research has a number of new features. Multi-view and multi-type sensors, together with machine-learning-based algorithms, are used to obtain robust and reliable mapping/tracking results. In addition, a car-based mobile sensor system has been developed to explore large sites. This talk presents an overview of the study, focusing in particular on multi-target tracking and on simultaneous 3D mapping and tracking using the mobile sensing platform.

10:15–10:40
Visual Loop-Closure Detection for Robot SLAM
Hong Zhang (University of Alberta, Edmonton, Canada)

A fundamental problem in robot simultaneous localization and mapping (SLAM) is the detection of loop closure, the event in which a mobile robot returns to a previously visited place. This talk is concerned with detecting loop closure visually. I will briefly discuss the literature on robot SLAM, the importance of loop closure, and existing approaches to visual loop-closure detection. In contrast to the popular bag-of-words approach to this problem, I will present our solution, which relies directly on visual features rather than their vector-quantized forms. I will also present results of our study applying efficient nearest-neighbor (NN) search algorithms to the feature-matching step in visual loop-closure detection.
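
The abstract contrasts direct feature matching with the bag-of-words approach; as a rough illustration only (not the authors' implementation), a brute-force nearest-neighbour matcher with a ratio test on raw descriptors might look like:

```python
import numpy as np

def match_features(query_desc, db_desc, ratio=0.8):
    """Brute-force nearest-neighbour matching of raw descriptors with a
    ratio test; returns, for each accepted query feature, the index of
    its database match."""
    matches = []
    for q in query_desc:
        d = np.linalg.norm(db_desc - q, axis=1)   # distance to every db feature
        i1, i2 = np.argsort(d)[:2]                # two nearest neighbours
        if d[i1] < ratio * d[i2]:                 # keep only unambiguous matches
            matches.append(int(i1))
    return matches

def loop_closure_score(query_desc, db_desc):
    """Fraction of query features with a confident match -- a crude
    similarity score between the current view and a stored place."""
    return len(match_features(query_desc, db_desc)) / len(query_desc)
```

In practice the linear scan above is replaced by an efficient NN search structure (e.g., a kd-tree), which is exactly the step the talk addresses.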

10:40–11:05
Passive Vision and The Power of Collective Imaging
Robert Pless (Washington University in St. Louis, USA)

The web hosts an enormous collection of live cameras imaging parks, roads, cities, beaches, mountains, buildings, and parking lots. Over the last 5 years, I have been working to understand how to effectively use this massively distributed, scalable, and *already existing* camera network. In this talk I will introduce the AMOS (Archive of Many Outdoor Scenes) database, which now includes images from 13,000 cameras captured every half hour over the last 4 years.

Algorithms for analyzing this massive data set are inspired by a combination of the time-lapse video artists Jason Salavon and Hiroshi Sugimoto and the work on natural scene statistics that grounds many bio-mimetic image representations. I will introduce robust algorithms for automatically geo-locating and calibrating all these cameras, and for inferring 3D scene structure from natural outdoor time-lapses.

One long-term goal of this project is to use all the webcams attached to the internet collectively as a novel global sensor to help measure weather, social, and climate changes; this talk will conclude with some initial work in those directions.

11:05–11:15 break
11:15–11:40
Interactive Visualization of Hyperspectral Images of Historical Documents
Michael S. Brown (National University of Singapore, Singapore)

This talk will give an overview of an interactive visualization tool for studying and analyzing hyperspectral images (HSI) of historical documents. This work is part of a collaborative effort with the Nationaal Archief of the Netherlands (NAN) and Art Innovation, a manufacturer of hyperspectral imaging hardware designed specially for historical documents. To assist their work, we have developed a comprehensive visualization tool that offers an assortment of visualization and analysis methods, including interactive spectral selection, spectral similarity analysis, time-varying data analysis and visualization, and selective spectral band fusion. While this work is more focused on visualization than on computer vision, the overall topic should be of interest to colloquium participants, and we are keen to hear their feedback and suggestions.
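
The abstract does not specify its similarity measure; as one standard, illustrative choice for comparing hyperspectral pixels, the spectral angle between two spectra can be computed as:

```python
import numpy as np

def spectral_angle(a, b):
    """Spectral angle (radians) between two spectra: the angle between
    them as vectors, which is insensitive to overall brightness scaling."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))
```

A small angle means the two pixels have similar spectral shape even if one is brighter, which is why angle-based measures are common in hyperspectral analysis.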

11:40–12:05
Some Thoughts on Food Appearance: The Color Space of Single Attribute Variations
Yaser Yacoob (University of Maryland, College Park, USA)

We consider the intra-image color space of an object or a scene subject to a dominant single source of variation. The source of variation can be intrinsic or extrinsic (i.e., imaging conditions) to the object. We observe that the quantized colors of such objects typically lie on a planar subspace of RGB (following the Dichromatic Reflection Model), and in some cases linear or polynomial curves on this plane effectively capture the color variations. We illustrate the use of this analysis for discriminating between shading changes and reflectance changes in image patches, and for object detection, segmentation, and recognition from a single exemplar.
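
A minimal sketch of the planar-subspace observation, assuming simple synthetic dichromatic colours (a body-reflection colour plus a white illuminant; not the authors' method): PCA reveals how much colour variance lies off the best-fit plane.

```python
import numpy as np

def fit_color_plane(rgb):
    """Fit a plane (through the mean) to a set of RGB colours by PCA.
    Returns the plane's unit normal and the fraction of colour variance
    lying off the plane (near zero => a nearly planar colour subspace)."""
    rgb = np.asarray(rgb, dtype=float)
    centered = rgb - rgb.mean(axis=0)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    normal = vt[-1]                  # direction of least variance
    var = s ** 2
    off_plane = var[-1] / var.sum()  # energy outside the best-fit plane
    return normal, off_plane
```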

12:05–12:30
Imaging Spectroscopy: Is there anything to gain from more bands?
Antonio Robles-Kelly (National ICT, Canberra, Australia)

In this seminar, I will introduce imaging spectroscopy, its capabilities, and its complexities. We will then look at the relationship between spectra and material identification and explore the challenges and opportunities posed by the high-dimensional data delivered by spectral imagers. Along these lines, I will show how the generally ill-posed problem of recovering the illuminant power spectrum can be rendered tractable using multispectral imaging, with results on material identification and re-illumination. If time permits, I will also show how multispectral imagery can be used for multiple-instance learning applied to early detection of plant pathogens, where the complexity of the data requires instance pruning for computational efficiency.

12:30–13:45 lunch break
13:45–14:10
Nonparametric Higher-order Learning Approach to Interactive Image Segmentation
Kyoung Mu Lee (Seoul National University, Korea)

In this talk, a new generative model for multi-label, interactive segmentation is introduced. To estimate the pixel likelihoods for each label, we propose a new nonparametric higher-order formulation that enforces soft label consistency within each region generated by unsupervised segmentation. In contrast to previous parametric approaches, we efficiently account for the pairwise relationships between pixels and their corresponding regions via a multi-layer graphical model. We show that the formulation reduces to two joint quadratic cost functions of pixel and region likelihoods, which can be solved simultaneously by a simple optimization technique. In this manner, we capture long-range connections between regions that propagate local grouping cues across larger image areas. Experiments on challenging data sets show that integrating regional higher-order cues significantly improves segmentation results with detailed boundaries, and reduces sensitivity to seed quantity and placement.

14:10–14:35
Spatial Sampling for Image Segmentation
Mariano Rivera (CIMAT Guanajuato, Mexico)

A novel framework for image segmentation based on the maximum likelihood estimator is presented. A common hypothesis for explaining the differences among image regions is that they are generated by sampling different likelihood functions, called models. We adopt this hypothesis and additionally assume that the samples come from independent and identically distributed random variables. Thus, the probability (likelihood) that a particular model generates the observed value at a given pixel is estimated by computing the likelihood of the sample composed of the surrounding pixels. This simple approach allows us to propose efficient segmentation methods that can deal with textured images, and it extends naturally to combining different features. The capabilities of the approach are demonstrated by experiments in interactive image segmentation, automatic stereo analysis, image denoising, and reconstruction of brain water-diffusion multi-tensor fields.
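
A toy sketch of the sampling idea, assuming simple Gaussian intensity models (the actual models and features in the talk are richer): each pixel is scored by the joint log-likelihood of the i.i.d. sample formed by its surrounding window.

```python
import numpy as np

def window_log_likelihood(image, models, radius=2):
    """For each pixel, score each model by the joint log-likelihood of the
    (2*radius+1)^2 window around it, assuming i.i.d. draws. `models` is a
    list of (mean, std) Gaussian intensity models. Returns a label map
    that picks the best-scoring model per pixel."""
    h, w = image.shape
    pad = np.pad(image, radius, mode='reflect')
    labels = np.zeros((h, w), dtype=int)
    for y in range(h):
        for x in range(w):
            win = pad[y:y + 2*radius + 1, x:x + 2*radius + 1].ravel()
            scores = []
            for mu, sd in models:
                # sum of per-pixel Gaussian log-densities over the window
                ll = -0.5 * np.sum(((win - mu) / sd) ** 2) - win.size * np.log(sd)
                scores.append(ll)
            labels[y, x] = int(np.argmax(scores))
    return labels
```

Using the window as a sample, rather than the single pixel value, is what lets likelihood-based labelling cope with textured regions.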

14:35–15:00
Robust Fitting on Steroids — Getting the Maximum Information from Residuals
David Suter (University of Adelaide, Australia)

The standard robust fitting methods of the “RANSAC family” (e.g., RANSAC, LMedS, and many newer variants) have two key components or stages: random sampling (generating trial hypotheses) and scoring those hypotheses. Typically, scoring reduces the residuals to a single summary statistic and discards all but the best hypothesis by that measure, wasting the information residing in the residual distributions. The random sampling is likewise very wasteful if not guided in some way. This talk summarises recent work exploring more effective and efficient approaches that “squeeze” more information from the traditional RANSAC-like stages.
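
A bare-bones sketch of the RANSAC pattern the talk critiques, for 2D line fitting: minimal random samples generate hypotheses, each scored only by its inlier count. The full residual vector (returned here alongside the fit) is the per-point information that is normally discarded.

```python
import numpy as np

def ransac_line(points, n_trials=200, inlier_thresh=0.05, seed=0):
    """Vanilla RANSAC for a 2D line y = a*x + b. Each trial fits a line to
    a random minimal sample (2 points) and scores it by inlier count.
    Returns the best (a, b), its score, and its residual vector."""
    rng = np.random.default_rng(seed)
    x, y = points[:, 0], points[:, 1]
    best = (None, -1, None)
    for _ in range(n_trials):
        i, j = rng.choice(len(points), size=2, replace=False)
        if x[i] == x[j]:
            continue                          # vertical sample: skip
        a = (y[j] - y[i]) / (x[j] - x[i])
        b = y[i] - a * x[i]
        residuals = np.abs(y - (a * x + b))
        score = int(np.sum(residuals < inlier_thresh))
        if score > best[1]:
            best = ((a, b), score, residuals)
    return best
```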

15:00–15:25
Development of Single Pass Connected Components Analysis for FPGA
Donald G. Bailey (Massey University Palmerston North, New Zealand)

FPGAs are increasingly used as an implementation platform for real-time image processing applications because their structure can exploit spatial and temporal parallelism. However, many designs do not realise their full potential because current techniques are often optimised for implementation on serial computers. Standard connected components analysis divides the process into two distinct steps: connected component labelling, followed by analysis of the resultant components. Labelling requires two passes through the image, and analysis requires at least one further pass. When implemented on an FPGA, this requires frame buffer memories and introduces considerable latency. Careful analysis of the problem shows that the labelling and analysis stages may be pipelined and the frame buffer eliminated, yielding a single-pass algorithm. This algorithm is further transformed to reduce the memory requirements and latency. The result is an efficient algorithm able to directly process images streamed from a camera.
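
A software sketch of the single-pass idea, assuming a simple union-find label table (the FPGA design itself uses stream-oriented hardware structures): component statistics are accumulated and merged during the same raster scan that assigns labels, so no relabelling pass over the image is needed.

```python
def single_pass_cca(img):
    """Single-pass connected components ANALYSIS sketch (4-connectivity):
    labels are assigned in one raster scan while per-component statistics
    (area, bounding box) are accumulated and merged on the fly.
    Returns {root_label: (area, min_r, min_c, max_r, max_c)}."""
    parent = {}
    stats = {}  # label -> [area, min_r, min_c, max_r, max_c]

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return ra
        parent[rb] = ra
        sa, sb = stats[ra], stats.pop(rb)  # merge statistics immediately
        stats[ra] = [sa[0] + sb[0], min(sa[1], sb[1]), min(sa[2], sb[2]),
                     max(sa[3], sb[3]), max(sa[4], sb[4])]
        return ra

    next_label = 0
    rows = [[0] * len(r) for r in img]  # label image (only prev row is read)
    for r, row in enumerate(img):
        for c, v in enumerate(row):
            if not v:
                continue
            up = rows[r - 1][c] if r else 0
            left = rows[r][c - 1] if c else 0
            if up and left:
                lab = union(up, left)
            elif up or left:
                lab = find(up or left)
            else:
                next_label += 1
                lab = next_label
                parent[lab] = lab
                stats[lab] = [0, r, c, r, c]
            rows[r][c] = lab
            s = stats[lab]
            stats[lab] = [s[0] + 1, min(s[1], r), min(s[2], c),
                          max(s[3], r), max(s[4], c)]
    return {k: tuple(v) for k, v in stats.items()}
```

Note that only the previous row of labels is ever read, which is what makes a streamed, frame-buffer-free hardware implementation plausible.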

15:25–15:35 break
15:35–16:00
3D Modeling: A New Framework for Emerging Applications
Hamid Krim (North Carolina State University, Raleigh, USA)

Shape analysis plays an increasingly important role in many applications where object classification and understanding are of interest. Solutions to many existing as well as newly emerging applied problems (e.g., object recognition, biometrics, etc.) crucially depend on object modeling and parsimonious object representation.

Our unified approach parses geometry and topology of a shape (planar and 3D) and exploits the resulting simplicity to propose simple models amenable to classification and recognition.

We use the biology of bee vision as a source of inspiration to invoke Morse theory as well as the Whitney embedding theorem, and propose novel and powerful weighted graphical models for 2D/3D shapes with demonstrated simplicity and wide applicability.

16:00–16:25
Local and Global Diffusion Geometry in Non-rigid Shape Analysis
Alex Bronstein (Technion, Haifa, Israel)

Diffusion geometry, scale-space analysis, and the study of heat propagation on manifolds have recently become popular tools for data analysis in a variety of applications. In this talk, we will explore applications of diffusion geometry to the problems of non-rigid shape representation, comparison, and retrieval. We will show that diffusion processes allow us to define both local and global geometric structures. Local shape descriptors based on heat kernels let us represent shapes as collections of geometric “words” and “expressions” and approach shape similarity as a problem of text search and matching. Global structures are diffusion metrics, insensitive to shape deformations and topological changes.

Representing shapes as metric spaces endowed with diffusion distances, we can pose shape similarity as a comparison of metric spaces using the Gromov-Hausdorff distance. As example applications, we will show large-scale shape retrieval, correspondence computation, and detection of intrinsic symmetries in non-rigid shapes.
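
As an illustration of a heat-kernel-based local descriptor (a sketch on a small graph Laplacian, not the talk's mesh pipeline), the heat kernel signature at vertex i and time t is the diagonal entry of exp(-tL):

```python
import numpy as np

def graph_laplacian(edges, n):
    """Unnormalised Laplacian L = D - A of an undirected graph."""
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

def heat_kernel_signature(L, times):
    """HKS(i, t) = sum_k exp(-t * lam_k) * phi_k(i)^2, the diagonal of the
    heat kernel exp(-t * L), computed from the eigendecomposition of the
    symmetric Laplacian L. Returns an array of shape (n_vertices, n_times)."""
    lam, phi = np.linalg.eigh(L)
    return np.stack([(np.exp(-t * lam) * phi ** 2).sum(axis=1) for t in times],
                    axis=1)
```

On a vertex-transitive graph such as a cycle, every vertex gets the same signature, reflecting the descriptor's invariance to intrinsic symmetry.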

16:25–16:50
Some Metric Geometry Ideas for Matching Shapes
Facundo Memoli (Stanford University, USA)

I will review some ideas revolving around the Gromov-Wasserstein and Gromov-Hausdorff distances for matching shapes and their restriction to shapes exhibiting different levels of smoothness. I will describe connections with different pre-existing methods.

16:50–17:00 break
17:00–17:25
Research on Multi-view Reconstruction and Other Topics
Peter Sturm (INRIA Grenoble, France)

I will present some recent works on multi-view 3D reconstruction carried out in my lab, concerning photo-realistic modeling of Lambertian and non-Lambertian objects and real-time multi-view depth map estimation. A short overview of other recent research activities will be given, time permitting.

17:25–17:50
Online Visual Tracking with Histograms and Articulating Blocks
Ming-Hsuan Yang (University of California, Merced, USA)

We propose an algorithm for accurate tracking of articulated objects using online updates of appearance and shape. The challenge is to model foreground appearance with histograms in a way that is both efficient and accurate. In this algorithm, the constantly changing foreground shape is modeled as a small number of rectangular blocks whose positions within the tracking window are determined adaptively. Under the general assumption of stationary foreground appearance, we show that robust object tracking is possible by adaptively adjusting the locations of these blocks. Implemented in MATLAB without substantial optimization, our tracker already runs at 3.7 frames per second on a 3 GHz machine. Experimental results demonstrate that the algorithm efficiently tracks articulated objects undergoing large variations in appearance and shape.
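
As a sketch of histogram-based appearance matching (illustrative only; the talk's block model is more elaborate), a normalised intensity histogram per block can be compared with the Bhattacharyya coefficient:

```python
import numpy as np

def intensity_histogram(patch, bins=8):
    """Normalised intensity histogram of an image patch (values in [0, 1])."""
    h, _ = np.histogram(patch, bins=bins, range=(0.0, 1.0))
    return h / max(h.sum(), 1)

def bhattacharyya(p, q):
    """Bhattacharyya coefficient between two normalised histograms
    (1.0 = identical appearance, 0.0 = disjoint)."""
    return float(np.sum(np.sqrt(p * q)))
```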

17:50–18:10
Test and Design of Stereo and Motion Techniques for Vision-Based Driver Assistance
Reinhard Klette (University of Auckland, New Zealand)

The talk starts by describing a way to evaluate stereo image data with respect to their inherent stereo-matching complexity. It then demonstrates the evaluation of stereo and motion techniques on image data of different complexities, with a focus on long sequences of 100 or more (stereo) frames. The talk closes with some conclusions about how the techniques have been modified for better performance.

18:10–18:15 ACCV2010 announcement

Sponsors

  • Asian Federation of Computer Vision Societies (AFCV)
  • Forum for Image Informatics in Japan
  • National Institute of Informatics

In cooperation with

  • IPSJ SIG CVIM
