Supervisor: Associate Professor Henrik Aanæs, DTU Compute
Co-supervisor: Professor Norbert Krüger, SDU
Abstract:
3D vision technology is the process of estimating 3D geometry from 2D image data. In
recent years, it has reached a maturity that allows for real-world usage, no longer being
confined to a laboratory. We see this in the availability of commercial 3D scanners
(e.g. Kinect, RealSense, GOM) and their many applications (e.g. self-driving cars,
automation, quality control). However, as any engineer knows, the transition from
lab to real-world application is not trivial, often with unforeseen challenges. In this
sense, much 3D vision technology is built on an unknown foundation, as there have
been few studies on its practical problems and limitations.
This thesis contributes to several subjects within the field of 3D vision with such
studies. These encompass dataset creation, empirical evaluation and system engineering.
Datasets are essential to quantitative evaluation and testing. Thus, we have created
datasets for two fields that lack them.
The first is a dataset for Non-Rigid Structure from Motion (NRSfM). NRSfM
estimates the 3D geometry of a deforming object from a 2D point sequence, so the
dataset comprises 2D point sequences with a recorded 3D reference. We accomplished
this using structured light scanning and several stop-motion animatronics.
This allowed for much greater deformation variety than what has previously been
available. Structured light scanning provides dense reference geometry and surface
normals, which allowed us to create occlusion-based missing data for each point
sequence, something that has not been done before.
The second dataset is built for evaluation of rendering techniques for challenging
scenes. In it, we record a series of images along with precise geometry, radiometry,
environment and camera pose. The intent is for a rendering algorithm to use said
data to recreate the recorded image.
Datasets serve little purpose unless they are used. Therefore, we have applied our
NRSfM dataset to analyze the field using 16 methods representative of the state of
the art. Our factorial analysis shows not only which methods give the most precise
results, but also overall trends in the field, for example which deformations are the
most challenging to reconstruct and how the camera impacts reconstruction quality.
We also show that the previous reliance on random missing data has led to algorithms
that handle the missing data from self-occlusion poorly.
We have also evaluated several structured light techniques on biological material.
Structured light is designed with the assumption of diffuse reflection, but most biological
material has heavy subsurface scattering. We show that this results in subtle,
systematic overestimation of depth (up to 1 mm), even for state-of-the-art techniques.
However, we also demonstrate that a large part of this error can be corrected with a
linear, geometry-based model.
This thesis also presents some vision-based solutions to practical problems, as some
information can only be gained through application. First, we investigate the interaction
between 3D vision and robotics by engineering a solution for non-rigid bin
picking. Our system shows that the problem is solvable, but error correction remains
a big concern. Errors from multiple sources such as calibration, 3D scanner, segmentation
and pose estimation might seem insignificant individually, but are problematic
when taken as a whole.
Second, we designed an algorithm for automatic measurement of contact surface
areas for usage in tribology testing. The method performs measurements with an
error of less than 0.4 μm.
Read more about this thesis in DTU Orbit.
Examiners:
Associate Professor Jens Michael Carstensen, DTU Compute
Associate Professor Søren Ingvor Olsen, University of Copenhagen
Senior Lecturer George Vogiatzis, Aston University
Chairman:
Associate Professor Rasmus Reinhold Paulsen, DTU
Everyone is welcome.