August 22, 2022

[Skim] ObjectBox

From Centers to Boxes for Anchor-Free Object Detection

Key Points

Label Assignment

  • Our method predicts bounding boxes at 3 different scales to handle object scale variations.
  • We map the object center (x, y) to the center location in the embedding for scale i, then regress the distances from that cell's bottom-right corner to the box's left and top boundaries (L and T), and from its top-left corner to the right and bottom boundaries (R and B).
  • The predictions corresponding to these distances are computed from the raw network outputs (p0, p1, p2, p3) through the logistic sigmoid function σ.
  • The overall network output includes one prediction per location per scale, each comprising the above distance values as well as an objectness score and class label probabilities for each bounding box.
  • Our formulation ensures that all the regressed distances remain positive under different conditions.
  • More importantly, we treat all objects as positive samples at all scales.
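The target-distance computation described above can be sketched as follows (a minimal sketch in grid units; the function name, the (x1, y1, x2, y2) box format, and the stride handling are illustrative assumptions, not the paper's exact code):

```python
import math

def objectbox_targets(box, stride):
    """Sketch of ObjectBox regression targets (L, T, R, B) for one box at one scale.

    box = (x1, y1, x2, y2) in image pixels; stride = feature-map stride at this
    scale (assumed). The center grid cell spans [gx, gx+1] x [gy, gy+1] in grid units.
    """
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2 / stride, (y1 + y2) / 2 / stride  # center in grid units
    gx, gy = math.floor(cx), math.floor(cy)                  # cell holding the center
    L = (gx + 1) - x1 / stride  # bottom-right cell corner -> left box boundary
    T = (gy + 1) - y1 / stride  # bottom-right cell corner -> top box boundary
    R = x2 / stride - gx        # top-left cell corner -> right box boundary
    B = y2 / stride - gy        # top-left cell corner -> bottom box boundary
    return L, T, R, B
```

Because the cell contains the box center, all four distances come out strictly positive, and the box width and height in grid units are recovered as L + R − 1 and T + B − 1 (the unit cell is counted once from each side).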

SDIoU Loss

Loss = 1 − SDIoU, with ρ = 1.

# b1, b2 are (L, T, R, B) distance values in grid units (cf. the label-assignment
# section): b*_x1 = L, b*_y1 = T, b*_x2 = R, b*_y2 = B. The grid cell has unit
# size, hence the "- 1" when a pair of opposite distances becomes a width/height.
eps = 1e-9  # numerical stability

# S: sum of squared differences between the corresponding distances of the two boxes
S = ((b2_x1 - b1_x1) ** 2) + ((b2_y1 - b1_y1) ** 2) + ((b2_x2 - b1_x2) ** 2) + ((b2_y2 - b1_y2) ** 2)

# Intersection term (squared diagonal of the intersection box)
inter_x1, inter_y1 = torch.min(b1_x1, b2_x1), torch.min(b1_y1, b2_y1)
inter_x2, inter_y2 = torch.min(b1_x2, b2_x2), torch.min(b1_y2, b2_y2)
I = (inter_x1 + inter_x2 - 1) ** 2 + (inter_y1 + inter_y2 - 1) ** 2

# Smallest covering box (squared diagonal)
cw = torch.max(b1_x1, b2_x1) + torch.max(b1_x2, b2_x2) - 1  # convex (smallest enclosing box) width
ch = torch.max(b1_y1, b2_y1) + torch.max(b1_y2, b2_y2) - 1  # convex height
C = cw ** 2 + ch ** 2 + eps

rho = 1  # ρ in the paper; fixed to 1
iou = (I - rho * S) / C  # SDIoU; the loss is 1 - iou
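A scalar walk-through of the computation above (plain Python for clarity; the actual code operates on torch tensors, and the example box values are made up):

```python
def sdiou(b1, b2, rho=1.0, eps=1e-9):
    """SDIoU between two boxes given as (L, T, R, B) distance tuples in grid units."""
    S = sum((a - b) ** 2 for a, b in zip(b1, b2))   # squared-distance term
    iw = min(b1[0], b2[0]) + min(b1[2], b2[2]) - 1  # intersection width
    ih = min(b1[1], b2[1]) + min(b1[3], b2[3]) - 1  # intersection height
    I = iw ** 2 + ih ** 2                           # squared intersection diagonal
    cw = max(b1[0], b2[0]) + max(b1[2], b2[2]) - 1  # covering-box width
    ch = max(b1[1], b2[1]) + max(b1[3], b2[3]) - 1  # covering-box height
    C = cw ** 2 + ch ** 2 + eps                     # squared covering diagonal
    return (I - rho * S) / C

box = (1.75, 2.75, 1.75, 3.25)
loss = 1 - sdiou(box, box)  # identical boxes: S = 0 and I = C (up to eps), so loss ≈ 0
```

For identical boxes the squared-distance penalty S vanishes and the intersection equals the covering box, so SDIoU ≈ 1 and the loss ≈ 0, as expected.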

Ablation Study

A. Regression locations

(1) only one location at the center (referred to as ‘center’),

(2) center location augmented with its neighboring locations (as done in ObjectBox, denoted by ‘aug. center’),

(3) the centers of the connecting lines between the box center and two top-left and bottom-right box corner points (referred to as ‘h-centers’),

(4) central locations in (2) plus all locations in (3) (denoted by ‘aug. center + h-centers’),

(5) four corners of the bounding box,

(6) corner points in (5) plus the center location.

B. Predictions per location

In this experiment, we assigned four predictions to each location, based on the offset of the object center within that location. Specifically, each location was divided into four equal finer locations, with one prediction given to each of them.
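The four-way split can be sketched as follows (the 2×2 quadrant indexing is an illustrative assumption; the paper does not specify an ordering):

```python
def finer_location_index(cx, cy):
    """Ablation-B sketch: pick one of 4 finer locations inside a grid cell.

    cx, cy: object center in grid units; the fractional part is the offset of
    the center inside its cell. Returns a quadrant index in {0, 1, 2, 3}.
    """
    off_x = cx - int(cx)  # horizontal offset within the cell, in [0, 1)
    off_y = cy - int(cy)  # vertical offset within the cell, in [0, 1)
    return (1 if off_x >= 0.5 else 0) + 2 * (1 if off_y >= 0.5 else 0)
```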

C. Scale constraints

  • With scale constraints, an object at scale i is considered a negative sample if {w, h} < mi−1 or {w, h} > mi, for i = 1, 2, 3.
  • This experiment verifies our choice of considering embeddings at all scale levels for all objects, as thresholding the feature maps drastically hurts the results.
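The thresholded baseline tested in this ablation can be sketched as follows (the values in `m` are placeholders, not the paper's thresholds):

```python
def scale_positive(w, h, i, m=(0.0, 64.0, 128.0, float("inf"))):
    """Ablation-C sketch: with scale constraints enabled, an object at scale i
    (i = 1, 2, 3) stays positive only if both w and h lie in [m[i-1], m[i]].
    The threshold tuple m holds placeholder values for illustration.
    """
    return m[i - 1] <= w <= m[i] and m[i - 1] <= h <= m[i]
```

ObjectBox's final design drops this constraint entirely and keeps every object positive at every scale.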

D. Loss Functions

  • The compared loss functions are not suitable for anchor-free detectors like ObjectBox.
  • We provide more experiments in the supplemental materials (Sec. S.4) to verify the effectiveness of SDIoU in other anchor-free approaches like FCOS.

Results

Runtime is measured on a Titan RTX GPU.
