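Drone-Yolo performs very well on drone datasets, with a marked improvement in mAP0.5: a gain of 13.4% on VisDrone2019-test and 17.40% on VisDrone2019-val. In this article I first reproduce Drone-Yolo and then add my own small-object detection improvements on top of it.
Article link: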
https://blog.csdn.net/m0_47867638/article/details/134277375?spm=1001.2014.3001.5501
YOLOv5l summary: 267 layers, 46275213 parameters, 0 gradients, 108.2 GFLOPs
Class Images Instances P R mAP50 mAP50-95: 100%|██████████| 15/15 [00:02<00:00, 5.16it/s]
all 230 1412 0.971 0.93 0.986 0.729
c17 230 131 0.992 0.992 0.995 0.797
c5 230 68 0.953 1 0.994 0.81
helicopter 230 43 0.974 0.907 0.948 0.57
c130 230 85 1 0.981 0.994 0.66
f16 230 57 0.999 0.93 0.975 0.677
b2 230 2 0.971 1 0.995 0.746
other 230 86 0.987 0.915 0.974 0.545
b52 230 70 0.983 0.957 0.981 0.803
kc10 230 62 1 0.977 0.985 0.819
command 230 40 0.971 1 0.986 0.782
f15 230 123 0.992 0.976 0.994 0.655
kc135 230 91 0.988 0.989 0.986 0.699
a10 230 27 1 0.526 0.912 0.391
b1 230 20 0.949 1 0.995 0.719
aew 230 25 0.952 1 0.993 0.781
f22 230 17 0.901 1 0.995 0.763
p3 230 105 0.997 0.99 0.995 0.789
p8 230 1 0.885 1 0.995 0.697
f35 230 32 0.969 0.984 0.985 0.569
f18 230 125 0.974 0.992 0.99 0.806
v22 230 41 0.994 1 0.995 0.641
su-27 230 31 0.987 1 0.995 0.842
il-38 230 27 0.994 1 0.995 0.785
tu-134 230 1 0.879 1 0.995 0.796
su-33 230 2 1 0 0.995 0.846
an-70 230 2 0.943 1 0.995 0.895
tu-22 230 98 0.983 1 0.995 0.788
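The BiC module takes three inputs and produces one output, as shown in the figure below:
Following the YoloV6 source code and adapting it to YoloV8, I made a few changes to the BiC module so that it handles the input and output channel counts. The code is as follows: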
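import torch
import torch.nn as nn
from ultralytics.nn.modules import Conv, ConvTranspose  # assumes the Ultralytics (YoloV8) codebase


class BiFusion(nn.Module):
    """BiFusion block in PAN: fuses three feature maps into one."""

    def __init__(self, in_channels1, in_channels2, in_channels3, out_channels):
        super().__init__()
        # 1x1 convs project each input to out_channels
        self.cv1 = Conv(in_channels1, out_channels, 1, 1)
        self.cv2 = Conv(in_channels2, out_channels, 1, 1)
        self.cv3 = Conv(in_channels3, out_channels, 1, 1)
        # 1x1 conv fuses the three concatenated branches
        self.cv_out = Conv(out_channels * 3, out_channels, 1, 1)
        # Transposed conv upsamples (2x with the Ultralytics defaults k=2, s=2)
        self.upsample = ConvTranspose(out_channels, out_channels)
        # 3x3 stride-2 conv halves the resolution
        self.downsample = Conv(out_channels, out_channels, 3, 2)

    def forward(self, x):
        # x holds three feature maps: [to be upsampled, same scale, to be downsampled]
        x0 = self.upsample(self.cv1(x[0]))
        x1 = self.cv2(x[1])
        x2 = self.downsample(self.cv3(x[2]))
        return self.cv_out(torch.cat((x0, x1, x2), dim=1))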
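To check that the three branches really end up at the same resolution before concatenation, here is a minimal shape test. It is only a sketch: `conv_bn_act`, `BiFusionSketch`, and `fused_shape` are hypothetical names, the Conv-BN-SiLU stand-in is assumed to behave like the `Conv` used above, and the channel counts (256, 128, 64) and feature-map sizes are arbitrary illustration values, not those of the trained model.

```python
import torch
import torch.nn as nn


def conv_bn_act(c_in, c_out, k=1, s=1):
    """Stand-in Conv-BN-SiLU block (hypothetical helper, not the library's Conv)."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, k, s, padding=k // 2, bias=False),
        nn.BatchNorm2d(c_out),
        nn.SiLU(),
    )


class BiFusionSketch(nn.Module):
    """Same wiring as BiFusion above, built from plain PyTorch layers."""

    def __init__(self, c1, c2, c3, c_out):
        super().__init__()
        self.cv1 = conv_bn_act(c1, c_out)
        self.cv2 = conv_bn_act(c2, c_out)
        self.cv3 = conv_bn_act(c3, c_out)
        self.cv_out = conv_bn_act(c_out * 3, c_out)
        # kernel 2, stride 2: doubles height and width
        self.upsample = nn.ConvTranspose2d(c_out, c_out, kernel_size=2, stride=2)
        # kernel 3, stride 2, padding 1: halves height and width
        self.downsample = conv_bn_act(c_out, c_out, k=3, s=2)

    def forward(self, x):
        x0 = self.upsample(self.cv1(x[0]))
        x1 = self.cv2(x[1])
        x2 = self.downsample(self.cv3(x[2]))
        return self.cv_out(torch.cat((x0, x1, x2), dim=1))


def fused_shape(c_out=64, size=20):
    """Run dummy feature maps at half, equal, and double the target size."""
    block = BiFusionSketch(256, 128, 64, c_out).eval()
    feats = [
        torch.randn(1, 256, size // 2, size // 2),  # low resolution, gets upsampled
        torch.randn(1, 128, size, size),            # target resolution
        torch.randn(1, 64, size * 2, size * 2),     # high resolution, gets downsampled
    ]
    with torch.no_grad():
        return tuple(block(feats).shape)


print(fused_shape())
```

The output keeps the spatial size of the middle input and has `c_out` channels, which is what lets the block drop into the neck in place of a plain concatenation.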
YOLOv5l summary: 292 layers, 52014989 parameters, 0 gradients, 138.8 GFLOPs
Class Images Instances P R mAP50 mAP50-95: 100%|██████████| 15/15 [00:03<00:00, 4.47it/s]
all 230 1412 0.966 0.94 0.989 0.721
c17 230 131 0.979 0.977 0.993 0.813
c5 230 68 0.948 0.985 0.994 0.823
helicopter 230 43 0.975 0.905 0.985 0.611
c130 230 85 1 0.998 0.995 0.657
f16 230 57 0.991 0.93 0.975 0.645
b2 230 2 1 0.96 0.995 0.597
other 230 86 1 0.954 0.974 0.54
b52 230 70 0.977 0.971 0.972 0.792
kc10 230 62 0.992 0.968 0.984 0.803
command 230 40 0.981 1 0.995 0.792
f15 230 123 0.99 0.984 0.995 0.673
kc135 230 91 0.974 0.989 0.978 0.672
a10 230 27 1 0.929 0.983 0.455
b1 230 20 0.994 1 0.995 0.737
aew 230 25 0.947 1 0.993 0.777
f22 230 17 0.938 1 0.995 0.697
p3 230 105 0.994 1 0.995 0.803
p8 230 1 0.871 1 0.995 0.796
f35 230 32 1 0.827 0.961 0.566
f18 230 125 0.967 0.992 0.988 0.813
v22 230 41 0.993 1 0.995 0.662
su-27 230 31 0.942 1 0.995 0.841
il-38 230 27 0.991 1 0.995 0.772
tu-134 230 1 0.745 1 0.995 0.895
su-33 230 2 1 0 0.995 0.746
an-70 230 2 0.913 1 0.995 0.697
tu-22 230 98 0.992 1 0.995 0.798
Class Images Instances P R mAP50 mAP50-95: 100%|██████████| 15/15 [00:03<00:00, 4.07it/s]
all 230 1412 0.966 0.921 0.984 0.71
c17 230 131 0.967 0.992 0.994 0.791
c5 230 68 0.974 0.971 0.993 0.821
helicopter 230 43 0.926 0.953 0.956 0.598
c130 230 85 0.988 0.982 0.994 0.672
f16 230 57 0.944 0.93 0.974 0.654
b2 230 2 1 0.723 0.995 0.547
other 230 86 0.918 0.919 0.977 0.505
b52 230 70 0.972 0.957 0.984 0.803
kc10 230 62 0.991 0.968 0.987 0.798
command 230 40 0.99 1 0.995 0.785
f15 230 123 0.946 1 0.993 0.652
kc135 230 91 0.977 0.989 0.992 0.661
a10 230 27 0.982 0.667 0.852 0.371
b1 230 20 0.974 0.95 0.993 0.675
aew 230 25 0.943 1 0.978 0.795
f22 230 17 0.981 1 0.995 0.691
p3 230 105 1 0.97 0.995 0.784
p8 230 1 0.861 1 0.995 0.796
f35 230 32 1 0.902 0.99 0.592
f18 230 125 0.973 0.992 0.983 0.8
v22 230 41 0.994 1 0.995 0.734
su-27 230 31 0.973 1 0.995 0.842
il-38 230 27 0.972 1 0.995 0.784
tu-134 230 1 0.959 1 0.995 0.796
su-33 230 2 1 0 0.995 0.646
an-70 230 2 0.904 1 0.995 0.796
tu-22 230 98 0.968 1 0.995 0.782
Class Images Instances P R mAP50 mAP50-95: 100%|██████████| 35/35 [00:24<00:00, 1.43it/s]
all 548 38759 0.529 0.421 0.431 0.257
pedestrian 548 8844 0.597 0.456 0.503 0.234
people 548 5125 0.555 0.389 0.411 0.161
bicycle 548 1287 0.342 0.219 0.196 0.0805
car 548 14064 0.738 0.801 0.822 0.583
van 548 1975 0.511 0.414 0.422 0.305
truck 548 750 0.528 0.402 0.424 0.288
tricycle 548 1045 0.485 0.303 0.303 0.174
awning-tricycle 548 532 0.232 0.179 0.142 0.091
bus 548 251 0.693 0.576 0.592 0.427
motor 548 4886 0.608 0.472 0.499 0.224
Results saved to runs\train\exp6
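This is somewhat lower than the result in the paper, which comes down to batch size and epoch count! I used 150 epochs and a batch size of 8. Training for 300 epochs as in the paper would probably score higher. The results below were trained for 300 epochs, and the scores improved considerably!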
300 epochs completed in 11.267 hours.
YOLOv5l summary: 340 layers, 49465140 parameters, 0 gradients, 144.2 GFLOPs
Class Images Instances P R mAP50 mAP50-95: 100%|██████████| 35/35 [00:13<00:00, 2.53it/s]
all 548 38759 0.562 0.458 0.472 0.285
pedestrian 548 8844 0.608 0.509 0.551 0.264
people 548 5125 0.549 0.424 0.437 0.173
bicycle 548 1287 0.424 0.219 0.239 0.102
car 548 14064 0.746 0.837 0.851 0.612
van 548 1975 0.559 0.464 0.484 0.353
truck 548 750 0.61 0.408 0.438 0.297
tricycle 548 1045 0.495 0.362 0.361 0.205
awning-tricycle 548 532 0.302 0.212 0.173 0.109
bus 548 251 0.746 0.585 0.646 0.482
motor 548 4886 0.581 0.559 0.542 0.251
Results saved to runs\train\exp
YOLOv5l summary: 369 layers, 52601844 parameters, 0 gradients, 174.8 GFLOPs
Class Images Instances P R mAP50 mAP50-95: 100%|██████████| 35/35 [00:14<00:00, 2.41it/s]
all 548 38759 0.586 0.471 0.49 0.3
pedestrian 548 8844 0.628 0.539 0.582 0.284
people 548 5125 0.566 0.427 0.451 0.185
bicycle 548 1287 0.416 0.235 0.241 0.105
car 548 14064 0.753 0.846 0.859 0.624
van 548 1975 0.621 0.47 0.506 0.369
truck 548 750 0.618 0.416 0.45 0.304
tricycle 548 1045 0.546 0.369 0.372 0.222
awning-tricycle 548 532 0.33 0.224 0.202 0.131
bus 548 251 0.778 0.614 0.678 0.512
motor 548 4886 0.602 0.57 0.559 0.264
A further improvement: mAP50 rises from 0.472 to 0.49 and mAP50-95 from 0.285 to 0.3!
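The gains between the three VisDrone runs can be worked out from the "all" rows of the logs above. The snippet below is a small arithmetic sketch. The run labels are my own shorthand, and it assumes the first two runs differ only in epoch count, which the text implies but the logs do not confirm.

```python
# "all"-row metrics copied from the three VisDrone validation logs above
runs = {
    "150 epochs": {"mAP50": 0.431, "mAP50-95": 0.257},
    "300 epochs": {"mAP50": 0.472, "mAP50-95": 0.285},
    "369-layer model": {"mAP50": 0.490, "mAP50-95": 0.300},
}


def gain(before, after, metric):
    """Return (absolute gain, relative gain in percent) between two runs."""
    b, a = runs[before][metric], runs[after][metric]
    return round(a - b, 3), round(100 * (a - b) / b, 1)


for before, after in [("150 epochs", "300 epochs"), ("300 epochs", "369-layer model")]:
    for metric in ("mAP50", "mAP50-95"):
        abs_gain, rel_gain = gain(before, after, metric)
        print(f"{before} -> {after}, {metric}: +{abs_gain} ({rel_gain}% relative)")
```

Reporting both the absolute and the relative change helps here, because the baseline scores on VisDrone are low enough that a small absolute gain is a noticeable relative one.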
This article tried three different improvements aimed at small objects, and they bring clear gains. Feel free to try them on your own datasets!