mirror of
https://github.com/tu-darmstadt-informatik/bsc-thesis.git
synced 2025-12-13 09:55:49 +00:00
42 lines
2.0 KiB
TeX
\begin{abstract}

% Many state-of-the-art energy-minimization approaches to optical flow and scene
% flow estimation rely on a rigid scene model, where the scene is
% represented as an ensemble of distinct, rigidly moving components, a static
% background and a moving camera.
% By constraining the optimization problem with a physically sound scene model,
% these approaches enable state-of-the-art motion estimation.

With the advent of deep learning methods, it has become popular to repurpose
generic deep networks for classical computer vision problems involving
pixel-wise estimation.

Following this trend, many recent end-to-end deep learning approaches to optical
flow and scene flow predict full-resolution flow fields with
a generic network for dense, pixel-wise prediction, thereby ignoring the
inherent structure of the underlying motion estimation problem and any physical
constraints within the scene.

We introduce a scalable end-to-end deep learning approach for dense motion estimation
that respects the structure of the scene as a composition of distinct objects,
thus combining the representation learning benefits and speed of end-to-end deep networks
with a physically plausible scene model inspired by slanted-plane energy-minimization approaches to
scene flow.

Building on recent advances in region-based convolutional networks (R-CNNs),
we integrate motion estimation with instance segmentation.
Given two consecutive frames from a monocular RGB-D camera,
our resulting end-to-end deep network detects objects with accurate per-pixel
masks and estimates the 3D motion of each detected object between the frames.
By additionally estimating a global camera motion in the same network,
we compose a dense optical flow field based on instance-level and global motion
predictions. Our network is trained on the synthetic Virtual KITTI dataset,
which provides ground truth for all components of the system.

\end{abstract}
\renewcommand{\abstractname}{Zusammenfassung}
\begin{abstract}
\todo{german abstract}
\end{abstract}