Mirror of https://github.com/tu-darmstadt-informatik/bsc-thesis.git (synced 2025-12-13 09:55:49 +00:00)

Commit e832c23983: Merge branch 'master' of github.com:simonmeister/bsc-thesis
@@ -89,7 +89,7 @@ predict $\sin(\alpha)$, $\sin(\beta)$, $\sin(\gamma)$ and $t_t^{cam}$ in the same
 
 \subsection{Supervision}
 
-\paragraph{Per-RoI supervision with motion ground truth}
+\paragraph{Per-RoI supervision with 3D motion ground truth}
 The most straightforward way to supervise the object motions is by using ground truth
 motions computed from ground truth object poses, which is in general
 only practical when training on synthetic datasets.
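Editorial note: to make this supervision scheme concrete, here is a minimal sketch of what a per-RoI motion loss with ground truth could look like. The function and tensor names, and the choice of an L1 penalty, are illustrative assumptions, not the thesis' actual loss; the rotation is represented by the sines of the three angles, matching the prediction targets above.

import tensorflow as tf

def motion_supervision_loss(pred_sines, pred_trans, pred_pivot,
                            gt_sines, gt_trans, gt_pivot):
    # Hypothetical per-RoI L1 loss between predicted and ground truth
    # motion terms: sines of the rotation angles, translation, and pivot.
    # All tensors are assumed to have shape [num_rois, 3].
    loss = tf.abs(pred_sines - gt_sines) \
         + tf.abs(pred_trans - gt_trans) \
         + tf.abs(pred_pivot - gt_pivot)
    return tf.reduce_mean(tf.reduce_sum(loss, axis=-1))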
@@ -124,7 +124,7 @@ We supervise the camera motion with ground truth analogously to the
 object motions, with the only difference being that we only have
 a rotation and translation, but no pivot term for the camera motion.
 
-\paragraph{Per-RoI supervision \emph{without} motion ground truth}
+\paragraph{Per-RoI supervision \emph{without} 3D motion ground truth}
 A more general way to supervise the object motions is a re-projection
 loss similar to the unsupervised loss in SfM-Net \cite{SfmNet},
 which we can apply to coordinates within the object bounding boxes,
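Editorial note: a hedged sketch of such a re-projection loss follows. RoI pixels are back-projected to 3D using depth, transformed by the predicted rigid motion (rotation R about a pivot, plus translation t), and re-projected into the second frame; the residual against corresponding target coordinates is penalized. All names, the pinhole intrinsics tuple, and the source of target_coords are assumptions for illustration.

import tensorflow as tf

def reprojection_loss(coords, depth, R, t, pivot, target_coords, intrinsics):
    # coords: [N, 2] (x, y) pixel coordinates inside an object box
    # depth: [N] depth at those pixels in the first frame
    # R: [3, 3] predicted rotation; t, pivot: [3] translation and pivot
    # target_coords: [N, 2] corresponding (x, y) points in the second frame
    # intrinsics: (fx, fy, cx, cy) pinhole camera parameters
    fx, fy, cx, cy = intrinsics
    x, y = coords[:, 0], coords[:, 1]
    # Back-project pixels to 3D camera-space points.
    X = (x - cx) / fx * depth
    Y = (y - cy) / fy * depth
    points = tf.stack([X, Y, depth], axis=-1)                  # [N, 3]
    # Apply the predicted rigid motion about the pivot.
    moved = tf.matmul(points - pivot, R, transpose_b=True) + pivot + t
    # Re-project into the second frame.
    proj_x = fx * moved[:, 0] / moved[:, 2] + cx
    proj_y = fy * moved[:, 1] / moved[:, 2] + cy
    proj = tf.stack([proj_x, proj_y], axis=-1)
    return tf.reduce_mean(tf.reduce_sum(tf.abs(proj - target_coords), axis=-1))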
@@ -1,7 +1,8 @@
 \subsection{Summary}
-We have introduced an extension on top of region-based convolutional networks to enable object motion estimation
-in parallel to instance segmentation.
-\todo{complete}
+We have introduced an extension on top of region-based convolutional networks to enable 3D object motion estimation
+in parallel to instance segmentation, given two consecutive frames. Additionally, our network estimates the 3D
+motion of the camera between frames. Based on this, we compose optical flow from the 3D motions in an end-to-end manner.
 
 
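Editorial note: the flow composition mentioned in the summary can be illustrated with a small sketch for the camera-motion component alone. Given per-pixel depth and the estimated camera motion, each pixel is back-projected, rigidly transformed, and re-projected; the displacement is the induced flow. Function and parameter names are hypothetical, and per-object motions would be applied analogously within the predicted instance masks.

import tensorflow as tf

def flow_from_camera_motion(depth, R_cam, t_cam, intrinsics):
    # depth: [H, W] per-pixel depth in the first frame (static shape assumed)
    # R_cam: [3, 3], t_cam: [3] estimated camera motion
    # intrinsics: (fx, fy, cx, cy)
    fx, fy, cx, cy = intrinsics
    H, W = depth.shape
    ys, xs = tf.meshgrid(tf.range(H, dtype=tf.float32),
                         tf.range(W, dtype=tf.float32), indexing='ij')
    # Back-project the pixel grid to 3D.
    X = (xs - cx) / fx * depth
    Y = (ys - cy) / fy * depth
    points = tf.reshape(tf.stack([X, Y, depth], axis=-1), [-1, 3])
    # Apply the camera motion and re-project.
    moved = tf.matmul(points, R_cam, transpose_b=True) + t_cam
    moved = tf.reshape(moved, [H, W, 3])
    x2 = fx * moved[..., 0] / moved[..., 2] + cx
    y2 = fy * moved[..., 1] / moved[..., 2] + cy
    # Optical flow is the displacement of each pixel.
    return tf.stack([x2 - xs, y2 - ys], axis=-1)               # [H, W, 2]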
 \subsection{Future Work}
 \paragraph{Predicting depth}
@@ -28,3 +29,11 @@ On Cityscapes, we could continue training the instance segmentation components to
 improve detection and masks and avoid forgetting instance segmentation.
 As an alternative to this training scheme, we could investigate training on a pure
 instance segmentation dataset with unsupervised warping-based proxy losses for the motion (and depth) prediction.
+
+\paragraph{Temporal consistency}
+A next step after the two aforementioned ones could be to extend our network to exploit more than two
+temporally consecutive frames, which has previously been shown to be beneficial in the
+context of scene flow \cite{TemporalSF}.
+In fact, by incorporating recurrent neural networks, e.g. LSTMs \cite{LSTM},
+into our architecture, we could enable temporally consistent motion estimation
+from image sequences of arbitrary length.
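Editorial note: one possible form of the warping-based proxy loss mentioned above, sketched under the assumption that the flow composed from the 3D motions is available, uses backward warping via tfa.image.dense_image_warp from TensorFlow Addons and an L1 photometric penalty. This is an illustration, not the thesis' design.

import tensorflow as tf
import tensorflow_addons as tfa

def photometric_proxy_loss(frame1, frame2, flow):
    # frame1, frame2: [B, H, W, 3] consecutive frames
    # flow: [B, H, W, 2] forward (x, y) flow composed from the 3D motions
    # dense_image_warp samples image[i - warp_y, j - warp_x], so the forward
    # flow is negated and swapped to (y, x) ordering for backward warping.
    warp = -tf.reverse(flow, axis=[-1])
    frame2_warped = tfa.image.dense_image_warp(frame2, warp)
    # L1 photometric penalty between the warped second frame and the first.
    return tf.reduce_mean(tf.abs(frame2_warped - frame1))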
@@ -5,7 +5,7 @@ computations. To make our code easy to extend and flexible, we build on
 the TensorFlow Object Detection API \cite{TensorFlowObjectDetection}, which provides a Faster R-CNN baseline
 implementation.
 On top of this, we implemented Mask R-CNN and the Feature Pyramid Network (FPN)
-as well all extensions for motion estimation and related evaluations
+as well as extensions for motion estimation and related evaluations
 and postprocessing. In addition, we generated all ground truth for
 Motion R-CNN in the form of TFRecords from the raw Virtual KITTI
 data to enable fast loading during training.
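Editorial note: a minimal sketch of the TFRecord generation step, using the standard tf.train.Example protocol buffers. The feature keys and payloads are illustrative assumptions rather than the repository's actual schema.

import tensorflow as tf

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _float_feature(values):
    return tf.train.Feature(float_list=tf.train.FloatList(value=values))

def write_frame_pair(writer, image_png, next_image_png, gt_motions):
    # Hypothetical schema: an encoded image pair plus flattened motion
    # ground truth, serialized as one tf.train.Example per frame pair.
    example = tf.train.Example(features=tf.train.Features(feature={
        'image/encoded': _bytes_feature(image_png),
        'next_image/encoded': _bytes_feature(next_image_png),
        'gt/motions': _float_feature(gt_motions),
    }))
    writer.write(example.SerializeToString())

# Usage sketch:
# with tf.io.TFRecordWriter('vkitti_train.tfrecord') as writer:
#     for pair in frame_pairs:
#         write_frame_pair(writer, pair.img, pair.next_img, pair.motions)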