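Improved YOLOv4 for dangerous goods detection in X-ray inspection combined with atrous convolution and transfer learning

WU Hai-bin; WEI Xi-ying; LIU Mei-hong; WANG Ai-li; LIU He; IWAHORI Yu-ji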

doi:10.37188/CO.2021-0078

cstr: 32171.14.CO.2021-0078

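1. Heilongjiang Province Key Laboratory of Laser Spectroscopy Technology and Application, Harbin University of Science and Technology, Harbin 150080, China
2. Department of Computer Science, Chubu University, Kasugai 487-8501, Japan

Funds: Supported by the National Natural Science Foundation of China (No. 61671190, No. 61801149) and the Japan Society for the Promotion of Science (No. 20K11873)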

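Author biographies:
WU Hai-bin (born 1977), male, from Shanghai, Ph.D., professor and doctoral supervisor. He received his master's degree from Harbin Institute of Technology in 2002 and his Ph.D. from Harbin University of Science and Technology in 2008. He is currently a professor at the School of Measurement-Control Technology and Communications Engineering, Harbin University of Science and Technology. His research interests include machine vision, medical virtual reality, and deep learning for image classification. E-mail: woo@hrbust.edu.cn

WANG Ai-li (born 1979), female, from Tianjin, Ph.D., associate professor and master's supervisor. She received her Ph.D. from Harbin Institute of Technology in 2008. She is currently an associate professor at the School of Measurement-Control Technology and Communications Engineering, Harbin University of Science and Technology. Her research interests include machine vision and deep learning for image classification. E-mail: aili925@hrbust.edu.cn

CLC number: TP391.4; TH691.9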
Publication history
- Received: 2021-04-13
- Revised: 2021-05-11
- Published online: 2021-08-11
- Issue date: 2021-11-19

Improved YOLOv4 for dangerous goods detection in X-ray inspection combined with atrous convolution and transfer learning

WU Hai-bin^1, WEI Xi-ying^1, LIU Mei-hong^1, WANG Ai-li^1, LIU He^1, IWAHORI Yu-ji^2

1. Heilongjiang Province Key Laboratory of Laser Spectroscopy Technology and Application, Harbin University of Science and Technology, Harbin 150080, China
2. Department of Computer Science, Chubu University, Kasugai 487-8501, Japan

Funds: Supported by the National Natural Science Foundation of China (No. 61671190, No. 61801149) and the Japan Society for the Promotion of Science (No. 20K11873)

Corresponding author: aili925@hrbust.edu.cn

Abstract

X-ray security images are difficult to analyze because their backgrounds are complex, objects overlap and occlude one another heavily, and dangerous goods vary widely in placement and shape. To address these problems, this paper improves the network structure of YOLOv4 with atrous convolution by adding an Atrous Spatial Pyramid Pooling (ASPP) module, which enlarges the receptive field and aggregates multi-scale context information. The K-means clustering method is then used to generate initial candidate boxes (anchors) that are better suited to dangerous goods detection in X-ray security images. During training, cosine annealing is used to schedule the learning rate, which further accelerates convergence and improves detection accuracy. Experimental results show that the proposed ASPP-YOLOv4 achieves an mAP of 85.23% on the SIXRay dataset. The method effectively reduces the false detection rate for dangerous goods in X-ray security images and improves the detection of small dangerous objects.
- X-ray security images /
- YOLOv4 /
- atrous convolution /
- spatial pyramid pooling /
- cosine annealing
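
To make the receptive-field claim in the abstract concrete, the following Python sketch computes the span covered by a single atrous (dilated) convolution kernel, k_eff = k + (k - 1)(r - 1), for kernel size k and dilation rate r. The function name and the dilation rates 1, 6, 12, and 18 are illustrative assumptions; this page does not list the rates used in ASPP-YOLOv4.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span, in input pixels, covered by one dilated convolution kernel."""
    if kernel_size < 1 or dilation < 1:
        raise ValueError("kernel_size and dilation must be positive integers")
    return kernel_size + (kernel_size - 1) * (dilation - 1)


# Hypothetical dilation rates for parallel ASPP branches (not taken from the paper).
assumed_rates = [1, 6, 12, 18]
spans = {rate: effective_kernel_size(3, rate) for rate in assumed_rates}
print(spans)  # {1: 3, 6: 13, 12: 25, 18: 37}
```

A 3×3 kernel with dilation 18 spans 37 input pixels while still using only nine weights, which is why running several dilation rates in parallel can gather context at multiple scales without adding many parameters.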

Figure 1. ASPP-YOLOv4 model design

Figure 2. Network structure of YOLOv4

Figure 3. Improved YOLOv4 framework combined with ASPP

Figure 4. Loss curves during training

Figure 5. Detection results of different algorithms for each class of dangerous goods

Figure 6. Detection results for dangerous goods

Table 1. Anchor box calculation results

Feature map	Receptive field	Anchor
13×13	Large	(124×111)
		(171×61)
		(200×151)
26×26	Medium	(75×34)
		(82×188)
		(93×75)
52×52	Small	(24×78)
		(50×67)
		(62×111)
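
Anchor boxes such as those in Table 1 are commonly obtained by clustering the widths and heights of ground-truth boxes. The sketch below is a minimal pure-Python K-means that uses d = 1 - IoU as the distance, a common choice for YOLO-style anchors. The distance metric, the function names, and the synthetic box sizes are all assumptions; this page does not specify the exact clustering variant, and the data here are not SIXRay annotations.

```python
import random


def iou_wh(box, centroid):
    """IoU of two boxes given as (width, height), both anchored at the origin."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iterations=100, seed=0):
    """Cluster (w, h) pairs with distance 1 - IoU; return k anchors sorted by area."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)
    for _ in range(iterations):
        # Assignment step: each box joins the centroid with the highest IoU.
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, centroids[i]))
            clusters[best].append(box)
        # Update step: mean width and height; an empty cluster keeps its old centroid.
        updated = []
        for members, old in zip(clusters, centroids):
            if members:
                w = sum(b[0] for b in members) / len(members)
                h = sum(b[1] for b in members) / len(members)
                updated.append((w, h))
            else:
                updated.append(old)
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids, key=lambda c: c[0] * c[1])


# Synthetic box sizes in pixels (hypothetical, for illustration only).
data_rng = random.Random(1)
boxes = [(data_rng.uniform(20, 220), data_rng.uniform(20, 200)) for _ in range(300)]
anchors = kmeans_anchors(boxes, k=9)
for w, h in anchors:
    print(f"({w:.0f}x{h:.0f})")
```

With nine clusters, the anchors can then be divided three per detection scale, as Table 1 does for the 13×13, 26×26, and 52×52 feature maps.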

Table 2. Cosine annealing decay process

Algorithm: cosine annealing decay
Input: number of training epochs $E_{\rm p}$, batch size $B_{\rm s}$, number of warmup epochs $w\_epoch$, preset base learning rate $\eta_{\rm base}$, maximum learning rate $\eta_{\rm max}$, minimum learning rate $\eta_{\rm min}$, number of training samples $S_{\rm c}$
Output: learning rate for the current training step $\eta_t$
Steps:
(1) Initialize the total number of steps $Steps_{\rm total} = (E_{\rm p} \times S_{\rm c})/B_{\rm s}$ and the number of warmup steps $Steps_{\rm warmup} = (w\_epoch \times S_{\rm c})/B_{\rm s}$
(2) Repeat:
After each restart:
Update the current global step count $Steps_{\rm global}$ and record the current learning rate
Update the learning rate:
if $Steps_{\rm global} \lt Steps_{\rm warmup}$:
Compute the linearly increasing learning rate, which starts from the warmup learning rate $\eta_{\rm warmup}$ at step 0: $\eta_t = \dfrac{\eta_{\rm base} - \eta_{\rm warmup}}{Steps_{\rm warmup}} \times Steps_{\rm global} + \eta_{\rm warmup}$
else:
Compute the cosine-annealed learning rate: $\eta_t = \dfrac{1}{2} \times \eta_{\rm base} \times \left(1 + \cos\left(\pi \times \dfrac{Steps_{\rm global} - Steps_{\rm warmup}}{Steps_{\rm total} - Steps_{\rm warmup}}\right)\right)$
$\eta_t = \max(\eta_t, \eta_{\rm min})$
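
The schedule in Table 2 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it uses the standard cosine form 0.5 · η_base · (1 + cos(π · progress)) and floors the result at the minimum rate. The function name, the sample count of 8000, the warmup start rate, and the mapping of Table 3's maximum learning rate onto the base rate are assumptions.

```python
import math


def warmup_cosine_lr(step, total_steps, warmup_steps, lr_base, lr_warmup, lr_min):
    """Learning rate at a global step: linear warmup, then cosine annealing."""
    if total_steps <= warmup_steps:
        raise ValueError("total_steps must exceed warmup_steps")
    if step < warmup_steps:
        # Linear ramp from lr_warmup at step 0 up to lr_base at the end of warmup.
        lr = (lr_base - lr_warmup) / warmup_steps * step + lr_warmup
    else:
        progress = (step - warmup_steps) / (total_steps - warmup_steps)
        lr = 0.5 * lr_base * (1.0 + math.cos(math.pi * progress))
    # Floor at lr_min so the rate never decays all the way to zero.
    return max(lr, lr_min)


# Epochs, batch size, and warmup epochs follow the frozen-backbone stage of Table 3;
# sample_count and the warmup start rate are hypothetical values.
epochs, batch_size, warmup_epochs = 50, 8, 10
sample_count = 8000
total_steps = epochs * sample_count // batch_size
warmup_steps = warmup_epochs * sample_count // batch_size
for step in (0, warmup_steps, (total_steps + warmup_steps) // 2, total_steps):
    lr = warmup_cosine_lr(step, total_steps, warmup_steps, 1e-3, 1e-6, 1e-6)
    print(step, f"{lr:.2e}")
```

Under these assumed values the rate starts at 1e-6, reaches 1e-3 when warmup ends, falls to half the base rate midway through the decay, and is held at the 1e-6 floor at the final step.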

Table 3. Training hyperparameters

Stage	Name	Value
Backbone frozen	batch_size	8
	epoch	50
	Maximum learning rate	1e-3
	Minimum learning rate	1e-6
	Warmup_epoch	10
Backbone unfrozen	batch_size	2
	epoch	50
	Maximum learning rate	1e-4
	Minimum learning rate	1e-6
	Warmup_epoch	10

Table 4. Comparison of AP for different models (%)

Method	AP (Gun)	AP (Knife)	AP (Wrench)	AP (Pliers)	AP (Scissors)	mAP
YOLOv3	93.18	78.00	68.55	79.69	76.97	79.28
M2Det	95.49	75.70	70.17	83.00	82.96	81.47
SSD	94.91	77.87	74.82	84.51	82.69	82.96
YOLOv4	94.40	81.69	77.38	84.50	77.55	83.11
ASPP-YOLOv4	95.78	81.39	77.84	87.36	83.76	85.23
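
The mAP column in Table 4 is the unweighted mean of the five per-class AP values. The short Python check below recomputes it from the table; the variable and function names are illustrative. Because the per-class values are already rounded to two decimals, the recomputed means can differ from the reported ones in the last digit.

```python
# Per-class AP (Gun, Knife, Wrench, Pliers, Scissors) and reported mAP from Table 4, in %.
ap_table = {
    "YOLOv3": ([93.18, 78.00, 68.55, 79.69, 76.97], 79.28),
    "M2Det": ([95.49, 75.70, 70.17, 83.00, 82.96], 81.47),
    "SSD": ([94.91, 77.87, 74.82, 84.51, 82.69], 82.96),
    "YOLOv4": ([94.40, 81.69, 77.38, 84.50, 77.55], 83.11),
    "ASPP-YOLOv4": ([95.78, 81.39, 77.84, 87.36, 83.76], 85.23),
}


def mean_ap(per_class_ap):
    """Unweighted mean of the per-class average precision values."""
    return sum(per_class_ap) / len(per_class_ap)


for name, (aps, reported) in ap_table.items():
    print(f"{name}: computed {mean_ap(aps):.2f}, reported {reported:.2f}")
```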

Table 5. Performance of ASPP-YOLOv4

Class	AP	Precision	Recall	F1-Measure
Gun	95.78%	98.44%	85.32%	0.91
Knife	81.39%	91.48%	67.40%	0.78
Wrench	77.84%	81.61%	71.05%	0.76
Pliers	87.36%	93.15%	75.79%	0.84
Scissors	83.76%	86.28%	76.23%	0.81
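
The F1-Measure column in Table 5 follows from precision P and recall R as the harmonic mean F1 = 2PR / (P + R). The Python sketch below recomputes it for each class from the table's precision and recall values; the names are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) per class, taken from Table 5.
table5 = {
    "Gun": (0.9844, 0.8532, 0.91),
    "Knife": (0.9148, 0.6740, 0.78),
    "Wrench": (0.8161, 0.7105, 0.76),
    "Pliers": (0.9315, 0.7579, 0.84),
    "Scissors": (0.8628, 0.7623, 0.81),
}

for name, (p, r, reported) in table5.items():
    print(f"{name}: computed F1 {f1_score(p, r):.2f}, reported {reported:.2f}")
```

For every class the recomputed value rounds to the reported one; for Knife, for example, the low recall of 67.40% pulls F1 down to 0.78 despite a precision of 91.48%.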

Table 6. Comparison of YOLOv4 performance before and after improvement

Method	mAP	Precision	Recall	F1-Measure
YOLOv4	83.11%	90.35%	73.00%	0.80
ASPP-YOLOv4	85.23%	90.20%	75.16%	0.82
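
The ASPP-YOLOv4 row of Table 6 appears to be the unweighted average of the per-class values in Table 5. This is an observation from the numbers rather than something the page states, so the Python check below should be read as a consistency test under that assumption; the names are illustrative.

```python
# Per-class (AP %, precision %, recall %, F1) for ASPP-YOLOv4, from Table 5.
per_class = {
    "Gun": (95.78, 98.44, 85.32, 0.91),
    "Knife": (81.39, 91.48, 67.40, 0.78),
    "Wrench": (77.84, 81.61, 71.05, 0.76),
    "Pliers": (87.36, 93.15, 75.79, 0.84),
    "Scissors": (83.76, 86.28, 76.23, 0.81),
}


def macro_average(rows, column):
    """Unweighted mean of one metric column over all classes."""
    values = [row[column] for row in rows.values()]
    return sum(values) / len(values)


# ASPP-YOLOv4 row of Table 6: mAP %, precision %, recall %, F1.
reported = (85.23, 90.20, 75.16, 0.82)
computed = tuple(macro_average(per_class, i) for i in range(4))
for name, c, r in zip(("mAP", "Precision", "Recall", "F1"), computed, reported):
    print(f"{name}: computed {c:.3f}, reported {r}")
```

All four averages agree with the reported row to within 0.01, which is consistent with rounding of the per-class inputs.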
