mirror of
https://github.com/tu-darmstadt-informatik/bsc-thesis.git
synced 2025-12-13 09:55:49 +00:00
WIP
This commit is contained in:
parent
b9e4173f7f
commit
183dc72669
@ -63,10 +63,10 @@ Bei Eingabe von zwei aufeinanderfolgenden frames aus einer monokularen RGB-D
|
||||
Kamera erkennt unser end-to-end Deep Network Objekte mit pixelgenauen Objektmasken
|
||||
und schätzt die 3D-Bewegung jedes erkannten Objekts zwischen den frames ab.
|
||||
Indem wir zusätzlich im selben Netzwerk die globale Kamerabewegung schätzen,
|
||||
setzen wir aus den instanzbasierten und globalen Bewegungsschätzungen dichten
|
||||
optischen Fluss zusammen.
|
||||
setzen wir aus den instanzbasierten und globalen Bewegungsschätzungen ein dichtes
|
||||
optisches Flussfeld zusammen.
|
||||
Wir trainieren unser Netzwerk auf dem synthetischen Virtual KITTI Datensatz,
|
||||
der ground truth für alle Komponenten unseres Systems bereitstellt.
|
||||
der Ground Truth für alle Komponenten unseres Systems bereitstellt.
|
||||
|
||||
|
||||
\end{abstract}
|
||||
|
||||
@ -205,7 +205,7 @@ Batch normalization \cite{BN} is used after every convolution.
|
||||
\caption{
|
||||
ResNet \cite{ResNet} \enquote{bottleneck} convolutional block introduced to reduce computational
|
||||
complexity in deeper network variants, shown here with 256 input and output channels.
|
||||
Figure from \cite{ResNet}.
|
||||
Figure taken from \cite{ResNet}.
|
||||
}
|
||||
\label{figure:bottleneck}
|
||||
\end{figure}
|
||||
@ -278,10 +278,10 @@ The basic Mask R-CNN ResNet-50 architecture is shown in Table \ref{table:maskrcn
|
||||
Note that the per-class masks logits are put through a sigmoid layer, and thus there is no
|
||||
comptetition between classes for the mask prediction branch.
|
||||
|
||||
One important technical aspect of Mask R-CNN is the replacement of RoI pooling with
|
||||
One important additional technical aspect of Mask R-CNN is the replacement of RoI pooling with
|
||||
bilinear sampling for extracting the RoI features, which is much more precise.
|
||||
%In RoI pooling, at the borders, the bins for max-pooling are not aligned with the actual pixel
|
||||
%boundary of the bounding box.
|
||||
In the original RoI pooling from Fast R-CNN, the bins for max-pooling are not aligned with the actual pixel
|
||||
boundary of the bounding box, and thus some detail is lost.
|
||||
|
||||
|
||||
{
|
||||
@ -434,7 +434,7 @@ block (see Figure \ref{figure:fpn_block}).
|
||||
FPN block from \cite{FPN}.
|
||||
Lower resolution features coming from the bottleneck are bilinearly upsampled
|
||||
and added with higher resolution skip connections from the encoder.
|
||||
Figure from \cite{FPN}.
|
||||
Figure taken from \cite{FPN}.
|
||||
}
|
||||
\label{figure:fpn_block}
|
||||
\end{figure}
|
||||
|
||||
@ -48,9 +48,9 @@ often fails to properly segment the pixels into the correct masks or assigns bac
|
||||
\includegraphics[width=\textwidth]{figures/sfmnet_kitti}
|
||||
\caption{
|
||||
Results of SfM-Net \cite{SfmNet} on KITTI \cite{KITTI2015}.
|
||||
From left to right we show, instance segmentation into up to 3 independent objects,
|
||||
From left to right, we show their instance segmentation into up to 3 independent objects,
|
||||
ground truth instance masks for the segmented objects, composed optical flow and ground truth optical flow.
|
||||
Figure from \cite{SfmNet}.
|
||||
Figure taken from \cite{SfmNet}.
|
||||
}
|
||||
\label{figure:sfmnet_kitti}
|
||||
\end{figure}
|
||||
@ -67,7 +67,7 @@ and predicts pixel-precise segmentation masks for each detected object (Figure \
|
||||
\includegraphics[width=\textwidth]{figures/maskrcnn_cs}
|
||||
\caption{
|
||||
Instance segmentation results of Mask R-CNN ResNet-50-FPN \cite{MaskRCNN}
|
||||
on Cityscapes \cite{Cityscapes}. Figure from \cite{MaskRCNN}
|
||||
on Cityscapes \cite{Cityscapes}. Figure taken from \cite{MaskRCNN}.
|
||||
}
|
||||
\label{figure:maskrcnn_cs}
|
||||
\end{figure}
|
||||
|
||||
Loading…
x
Reference in New Issue
Block a user