Frank Evers
Department of Information and Computing Sciences, Utrecht University
This is the report on the experimentation project of Frank Evers, master's
student in Game and Media Technology, carried out under the supervision of
Dr. Robby Tan.
The purpose of this project is to acquaint the student with
carrying out experiments for research and/or development purposes.
The goal of this experimentation project is to create a sparse 3D
reconstruction from video without any prior knowledge of the camera
calibration. This problem, known as Structure from Motion, has already been
researched extensively in the past. The large and still growing number of
videos available on websites such as YouTube makes it an interesting topic
once again: with this much data, it becomes possible to automatically create
3D reconstructions of scenery all around the world. By combining these
reconstructions with the videos' metadata, such as title, description and
GPS coordinates, they could be linked to actual places in the world, thus
allowing for a searchable 3D earth.
I used a video of this scene; frames 1, 50 and 100 are shown here:
From this video I created the following sparse 3D reconstruction:

Top view:
I used a video of this scene; frames 1 and 23 are shown here. The dots represent
locations where SIFT keypoints were detected.
From this video I created the following sparse 3D reconstruction (screenshot taken from approximately the same viewpoint as the original camera):
