commit c157e9e1dd (parent 183dc72669)
Simon Meister, 2017-11-14 19:57:58 +01:00
3 changed files with 7 additions and 7 deletions

@@ -196,7 +196,7 @@ predict $\sin(\alpha)$, $\sin(\beta)$, $\sin(\gamma)$ and $t_t^{cam}$ in the sam
Again, we predict a softmax score $o_t^{cam}$ for differentiating between
a still and moving camera.
-\subsection{Motion R-CNN network design}
+\subsection{Network design}
\label{ssec:design}
\paragraph{Camera motion network}
@@ -327,7 +327,7 @@ loss could benefit motion regression by removing any loss balancing issues betwe
rotation, translation and pivot terms \cite{PoseNet2},
which can make it interesting even when 3D motion ground truth is available.
-\subsection{Training and Inference}
+\subsection{Training and inference}
\label{ssec:training_inference}
\paragraph{Training}
We train the Motion R-CNN RPN and RoI heads in the exact same way as described for Mask R-CNN.

@@ -16,7 +16,7 @@ requires estimating depth for each pixel. Generally, stereo input is used for sc
to estimate disparity-based depth, however monocular depth estimation with deep networks is becoming
popular \cite{DeeperDepth}.
-\subsection{Convolutional neural networks for dense motion estimation}
+\subsection{CNNs for dense motion estimation}
Deep convolutional neural network (CNN) architectures
\cite{ImageNetCNN, VGGNet, ResNet}
became widely popular through numerous successes in classification and recognition tasks.
@@ -210,7 +210,7 @@ Figure taken from \cite{ResNet}.
\label{figure:bottleneck}
\end{figure}
-\subsection{Region-based convolutional networks}
+\subsection{Region-based CNNs}
\label{ssec:rcnn}
We now give an overview of region-based convolutional networks, which are currently by far the
most popular deep networks for object detection, and have recently also been applied to instance segmentation.
@@ -439,7 +439,7 @@ Figure taken from \cite{FPN}.
\label{figure:fpn_block}
\end{figure}
-\subsection{Mask R-CNN: Training and Inference}
+\subsection{Mask R-CNN: Training and inference}
\paragraph{Loss definitions}
For regression, we define the smooth $\ell_1$ regression loss as
\begin{equation}
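The equation body is cut off by the hunk boundary above. For reference, the smooth $\ell_1$ loss as commonly defined for region-based detectors (following Fast R-CNN; the exact form used in this file may differ) is:

```latex
\mathrm{smooth}_{\ell_1}(x) =
\begin{cases}
  0.5\,x^2   & \text{if } |x| < 1, \\
  |x| - 0.5  & \text{otherwise.}
\end{cases}
```

It behaves quadratically near zero and linearly for large residuals, which makes regression less sensitive to outliers than a plain $\ell_2$ loss.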

@@ -147,7 +147,7 @@ fn = \sum_k [o^{k,c_k} = 0 \land o^{gt,i_k} = 1].
Analogously, we define error metrics $E_{R}^{cam}$ and $E_{t}^{cam}$ for
predicted camera motions.
-\subsection{Virtual KITTI training setup}
+\subsection{Virtual KITTI: Training setup}
\label{ssec:setup}
For our initial experiments, we concatenate both RGB frames as
@@ -178,7 +178,7 @@ Note that a larger weight prevented the
angle sine estimates from properly converging to the very small values they
are in general expected to output.
-\subsection{Virtual KITTI evaluation}
+\subsection{Virtual KITTI: Evaluation}
\label{ssec:vkitti}
\begin{figure}[t]