Merge branch 'master' of github.com:simonmeister/bsc-thesis
commit e832c23983
@@ -89,7 +89,7 @@ predict $\sin(\alpha)$, $\sin(\beta)$, $\sin(\gamma)$ and $t_t^{cam}$ in the same

 \subsection{Supervision}

-\paragraph{Per-RoI supervision with motion ground truth}
+\paragraph{Per-RoI supervision with 3D motion ground truth}
 The most straightforward way to supervise the object motions is by using ground truth
 motions computed from ground truth object poses, which is in general
 only practical when training on synthetic datasets.
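As a point of reference for the supervision scheme introduced above, a per-RoI loss against 3D motion ground truth could be sketched as follows. This is a hypothetical sketch, not the thesis code: the function and argument names are invented and an L1 penalty is assumed. The camera motion term would be analogous, only without the pivot.

# Hypothetical sketch of a supervised per-RoI motion loss (assumed L1 penalty).
import tensorflow as tf

def motion_ground_truth_loss(pred_sines, pred_trans, pred_pivot,
                             gt_sines, gt_trans, gt_pivot, fg_mask):
    """L1 loss between predicted and ground-truth per-RoI 3D motions.

    pred_sines: [N, 3] predicted sin(alpha), sin(beta), sin(gamma)
    pred_trans: [N, 3] predicted object translation
    pred_pivot: [N, 3] predicted pivot point the object rotates about
    fg_mask:    [N]    1.0 for foreground RoIs with valid motion ground truth
    """
    per_roi = (
        tf.reduce_sum(tf.abs(pred_sines - gt_sines), axis=1)
        + tf.reduce_sum(tf.abs(pred_trans - gt_trans), axis=1)
        + tf.reduce_sum(tf.abs(pred_pivot - gt_pivot), axis=1))
    # Average over foreground RoIs only; background RoIs carry no motion target.
    num_fg = tf.maximum(tf.reduce_sum(fg_mask), 1.0)
    return tf.reduce_sum(per_roi * fg_mask) / num_fg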
@@ -124,7 +124,7 @@ We supervise the camera motion with ground truth analogously to the
 object motions, with the only difference being that we only have
 a rotation and translation, but no pivot term for the camera motion.

-\paragraph{Per-RoI supervision \emph{without} motion ground truth}
+\paragraph{Per-RoI supervision \emph{without} 3D motion ground truth}
 A more general way to supervise the object motions is a re-projection
 loss similar to the unsupervised loss in SfM-Net \cite{SfmNet},
 which we can apply to coordinates within the object bounding boxes,
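The ground-truth-free alternative above relies on a differentiable warping penalty in the spirit of SfM-Net. A minimal sketch of such a photometric re-projection loss is given below, assuming the frame-2 coordinates have already been obtained by transforming and re-projecting the box pixels with the predicted 3D motion; the bilinear sampler and all names are assumptions, not the thesis implementation.

# Hypothetical sketch of a warping-based re-projection penalty for box pixels.
import tensorflow as tf

def bilinear_sample(image, x, y):
    """Differentiably sample image [H, W, C] at float pixel coordinates x, y [N]."""
    h = tf.shape(image)[0]
    w = tf.shape(image)[1]
    x0 = tf.floor(x)
    y0 = tf.floor(y)
    wx = x - x0
    wy = y - y0

    def gather(xi, yi):
        xi = tf.clip_by_value(tf.cast(xi, tf.int32), 0, w - 1)
        yi = tf.clip_by_value(tf.cast(yi, tf.int32), 0, h - 1)
        return tf.gather_nd(image, tf.stack([yi, xi], axis=1))

    top = (1. - wx[:, None]) * gather(x0, y0) + wx[:, None] * gather(x0 + 1., y0)
    bot = (1. - wx[:, None]) * gather(x0, y0 + 1.) + wx[:, None] * gather(x0 + 1., y0 + 1.)
    return (1. - wy[:, None]) * top + wy[:, None] * bot

def reprojection_loss(frame2, coords2, values1):
    """Photometric L1 between frame-1 pixel values inside the box and frame 2
    sampled at the coordinates implied by the predicted 3D motion.

    coords2: [N, 2] (x, y) re-projected pixel coordinates in frame 2
    values1: [N, C] frame-1 intensities at the corresponding frame-1 pixels
    """
    warped = bilinear_sample(frame2, coords2[:, 0], coords2[:, 1])
    return tf.reduce_mean(tf.abs(warped - values1))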
@@ -1,7 +1,8 @@
 \subsection{Summary}
-We have introduced an extension on top of region-based convolutional networks to enable object motion estimation
-in parallel to instance segmentation.
-\todo{complete}
+We have introduced an extension on top of region-based convolutional networks to enable 3D object motion estimation
+in parallel to instance segmentation, given two consecutive frames. Additionally, our network estimates the 3D
+motion of the camera between frames. Based on this, we compose optical flow from the 3D motions in an end-to-end manner.


 \subsection{Future Work}
 \paragraph{Predicting depth}
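The flow composition mentioned in the revised summary amounts to back-projecting every pixel with its depth, moving it rigidly with the responsible object motion and the camera motion, and re-projecting it into the second frame. A rough NumPy sketch under these assumptions (interfaces invented for illustration; the actual pipeline operates on TensorFlow tensors) could look like:

# Hypothetical sketch: composing dense optical flow from 3D motions and depth.
import numpy as np

def compose_flow(depth, K, cam_R, cam_t, obj_motions, masks):
    """depth: [H, W]; K: [3, 3] intrinsics; cam_R/cam_t: camera rotation/translation;
    obj_motions: list of (R, t, pivot); masks: list of [H, W] boolean instance masks."""
    h, w = depth.shape
    xs, ys = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=-1).reshape(-1, 3).astype(np.float64)
    # Back-project pixels to 3D points in the first frame's camera coordinates.
    pts = (np.linalg.inv(K) @ pix.T).T * depth.reshape(-1, 1)

    # Apply each object's rigid motion about its pivot to the points it covers.
    for (R, t, pivot), mask in zip(obj_motions, masks):
        idx = mask.reshape(-1)
        pts[idx] = (pts[idx] - pivot) @ R.T + pivot + t

    # Apply the camera motion to all points, then project into the second frame.
    pts = pts @ cam_R.T + cam_t
    proj = (K @ pts.T).T
    proj = proj[:, :2] / proj[:, 2:3]

    # Optical flow is the displacement of the projected pixel coordinates.
    flow = proj - pix[:, :2]
    return flow.reshape(h, w, 2)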
@@ -28,3 +29,11 @@ On Cityscapes, we could continue to train the instance segmentation components to
 improve detection and masks and avoid forgetting instance segmentation.
 As an alternative to this training scheme, we could investigate training on a pure
 instance segmentation dataset with unsupervised warping-based proxy losses for the motion (and depth) prediction.
+
+\paragraph{Temporal consistency}
+A next step after the two aforementioned ones could be to extend our network to exploit more than two
+temporally consecutive frames, which has previously been shown to be beneficial in the
+context of scene flow \cite{TemporalSF}.
+In fact, by incorporating recurrent neural networks, e.g. LSTMs \cite{LSTM},
+into our architecture, we could enable temporally consistent motion estimation
+from image sequences of arbitrary length.
@@ -5,7 +5,7 @@ computations. To make our code easy to extend and flexible, we build on
 the TensorFlow Object detection API \cite{TensorFlowObjectDetection}, which provides a Faster R-CNN baseline
 implementation.
 On top of this, we implemented Mask R-CNN and the Feature Pyramid Network (FPN)
-as well all extensions for motion estimation and related evaluations
+as well as extensions for motion estimation and related evaluations
 and postprocessings. In addition, we generated all ground truth for
 Motion R-CNN in the form of TFRecords from the raw Virtual KITTI
 data to enable fast loading during training.
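The TFRecord generation mentioned in this last hunk could, in its simplest form, resemble the TensorFlow 1.x style sketch below; the feature keys and the exact set of stored fields are assumptions and do not mirror the actual Virtual KITTI conversion scripts.

# Hypothetical sketch of serializing one Virtual KITTI frame pair to a TFRecord.
import tensorflow as tf

def _bytes_feature(value):
    # Wrap a bytes value (e.g. an encoded PNG) as a tf.train.Feature.
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def write_example(writer, image_png, next_image_png, depth_png, motions_raw):
    """Write one frame pair and its ground truth as a tf.train.Example."""
    example = tf.train.Example(features=tf.train.Features(feature={
        'image/encoded': _bytes_feature(image_png),
        'image/next/encoded': _bytes_feature(next_image_png),
        'depth/encoded': _bytes_feature(depth_png),
        'motions/raw': _bytes_feature(motions_raw),  # e.g. serialized float array
    }))
    writer.write(example.SerializeToString())

# Usage (TF 1.x):
#   with tf.python_io.TFRecordWriter('train.tfrecord') as writer:
#       write_example(writer, image_png, next_image_png, depth_png, motions_raw)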