Localization¶
Image → Bounding box
- Train classification model
- Attach new fully-connected "regression head"
- Train regression head with regression loss
- At inference, use both heads
Regression Head Approaches¶
- Class agnostic: one box in total
- Class specific: one box per class
- more intuitive
- works for multiple object localization; For eg: represent human pose with \(k\) joints
Regression Head Position¶
Sliding Window¶
Naive¶
- Run classification + regression head at multiple location on high resolution network
- Combine classifier and regressor predictions across all scales for final prediction
Efficient¶
Convert FC layers into conv layers
Train | |
Inference (larger image) |
Advantage: Extra compute only for extra pixels