Improved prohibited item detection in double-view X-ray images combined with YOLOv11

WU Hai-bin, LIU Wen-bai, YUAN Peng-fei, WANG Ai-li

Citation: WU Hai-bin, LIU Wen-bai, YUAN Peng-fei, WANG Ai-li. Improved prohibited item detection in double-view X-ray images combined with YOLOv11[J]. Chinese Optics. doi: 10.37188/CO.2026-0062

cstr: 32171.14.CO.2026-0062
Funds: Supported by Natural Science Foundation of Heilongjiang Province of China (No. LH2023F034)
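    About the authors:

    WU Hai-bin (b. 1977), male, born in Shanghai, Ph.D., professor, doctoral supervisor. He received his B.S. and M.S. degrees from Harbin Institute of Technology in 2000 and 2002, and his Ph.D. degree from Harbin University of Science and Technology in 2008. He is currently a professor at Harbin University of Science and Technology. His research interests include computer vision, virtual reality, and remote sensing image processing. E-mail: woo@hrbust.edu.cn

    WANG Ai-li (b. 1979), female, born in Tianjin, Ph.D., professor, master's supervisor. She received her B.S., M.S., and Ph.D. degrees from Harbin Institute of Technology in 2002, 2004, and 2008. She is currently a professor at Harbin University of Science and Technology. Her research interest is remote sensing image processing. E-mail: aili925@hrbust.edu.cn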

  • CLC number: TP391.4

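  • Abstract:

    Existing methods for detecting prohibited items in dual-view X-ray security images fuse cross-view features with limited adaptivity and make insufficient use of complementary information. To address this, this paper proposes a dual-view fusion detection method built on YOLOv11 (Dual View Fusion combined with YOLOv11, DVF-YOLOv11). The algorithm uses a parameter-sharing two-branch YOLOv11 backbone to extract multi-scale features from the top view and the side view separately. A Cross-View Attention Fusion (CVAF) module cascades channel attention and spatial attention to adaptively enhance the features of both views. An adaptive weight prediction network dynamically adjusts the fusion weight of each view and is combined with a channel-compression convolution to form a two-path fusion strategy. A joint loss function composed of a feature preservation loss, a complementarity loss, and a weight balance loss guides the fusion learning. On the DvXray dataset, the method reaches an mAP50 of 94.02% and an mAP50-95 of 79.41%, gains of 2.99 and 5.29 percentage points over the top-view single-view result. The experimental results show that the method improves the accuracy and robustness of prohibited item detection in dual-view X-ray security images.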
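    The adaptive weighting step described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' network: `predict_view_weights`, its `scale` and `bias` parameters, and the toy feature values are all hypothetical stand-ins. The sketch only assumes that a predictor maps the two views' features to two weights that sum to 1, and that the weights are applied as an element-wise weighted sum.

```python
import math
from typing import List

Feature = List[List[float]]  # [channels][positions], spatial dims flattened


def global_average(feature: Feature) -> float:
    """Mean activation over all channels and positions of one view."""
    total = sum(sum(channel) for channel in feature)
    count = sum(len(channel) for channel in feature)
    return total / count


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax; outputs are positive and sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def predict_view_weights(top: Feature, side: Feature,
                         scale: float = 1.0, bias: float = 0.0) -> List[float]:
    """Hypothetical weight predictor (assumption, not from the paper).

    A single scale and bias stand in for learned parameters; a real
    predictor would be a small network. Only the interface matters here:
    two feature maps in, two normalized weights out.
    """
    scores = [scale * global_average(view) + bias for view in (top, side)]
    return softmax(scores)


def fuse_views(top: Feature, side: Feature) -> Feature:
    """Weighted element-wise sum of the two views' features."""
    w_top, w_side = predict_view_weights(top, side)
    return [[w_top * t + w_side * s for t, s in zip(top_ch, side_ch)]
            for top_ch, side_ch in zip(top, side)]


# Toy features: 2 channels, 2 positions per view.
top_view = [[1.0, 2.0], [3.0, 4.0]]
side_view = [[0.0, 1.0], [1.0, 0.0]]
weights = predict_view_weights(top_view, side_view)
fused = fuse_views(top_view, side_view)
```

    Because the weights come from a softmax, each fused value is a convex combination of the two views' values, so neither view can be amplified beyond its original range by the weighting alone.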

     

  • Figure 1.  Overall architecture of DVF-YOLOv11
    Figure 2.  Structure of the CVAF module
    Figure 3.  Structure of the channel attention mechanism
    Figure 4.  Structure of the spatial attention mechanism
    Figure 5.  X-ray images in DvXray
    Figure 6.  Confusion matrix of detection results
    Figure 7.  Visualization of ablation experiment training process
    Figure 8.  Examples of dual-view morphological information acquisition
    Figure 9.  Examples of morphological discrimination detection
    Figure 10.  Examples of complementary blind zones between two views
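    Figures 2 to 4 concern a cascade of channel attention followed by spatial attention. The sketch below shows that ordering in the general style of CBAM [22], and should not be read as the CVAF module itself. As a simplifying assumption, the learned layers (the shared MLP in channel attention and the convolution in spatial attention) are replaced by a plain sum of the average-pooled and max-pooled descriptors, so the code runs with the standard library only.

```python
import math
from typing import List

FeatureMap = List[List[List[float]]]  # [channels][height][width]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(x: FeatureMap) -> FeatureMap:
    """Scale each channel by a gate built from its average and maximum.

    Assumption: avg + max feeds the sigmoid directly; a trained module
    would pass both descriptors through a learned MLP first.
    """
    out = []
    for channel in x:
        values = [v for row in channel for v in row]
        gate = sigmoid(sum(values) / len(values) + max(values))
        out.append([[gate * v for v in row] for row in channel])
    return out


def spatial_attention(x: FeatureMap) -> FeatureMap:
    """Scale each position by a gate built across channels.

    Assumption: avg + max over channels feeds the sigmoid directly; a
    trained module would apply a learned convolution to the two maps.
    """
    height, width = len(x[0]), len(x[0][0])
    gates = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            column = [channel[i][j] for channel in x]
            gates[i][j] = sigmoid(sum(column) / len(column) + max(column))
    return [[[gates[i][j] * channel[i][j] for j in range(width)]
             for i in range(height)] for channel in x]


def cascade_attention(x: FeatureMap) -> FeatureMap:
    """Channel attention first, then spatial attention on its output."""
    return spatial_attention(channel_attention(x))


# Toy input: 2 channels of size 2 x 2.
features = [[[1.0, 2.0], [3.0, 4.0]],
            [[0.5, 0.5], [0.5, 0.5]]]
enhanced = cascade_attention(features)
```

    Both gates lie strictly between 0 and 1, so the cascade rescales the features without changing their shape or sign.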

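    Table 1.  The role of each loss term

    | Loss term | Symbol | Role | Weight |
    |---|---|---|---|
    | Detection loss | $ {\mathcal{L}}_{\text{det}} $ | Supervises the object detection task | 1.0 |
    | Feature preservation loss | $ {\mathcal{L}}_{\text{preserve}} $ | Keeps the fused features consistent with the original features | $ {\lambda }_{1} $ |
    | Complementarity loss | $ {\mathcal{L}}_{\text{comp}} $ | Promotes diversity and complementarity between the features of the two views | $ {\lambda }_{2} $ |
    | Weight balance loss | $ {\mathcal{L}}_{\text{balance}} $ | Prevents extreme weight allocation | $ {\lambda }_{3} $ |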
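    The weights column suggests that the four terms are combined as a weighted sum. Under that assumption (the page does not print the formula, and the symbol $ {\mathcal{L}}_{\text{total}} $ is introduced here only for illustration), the joint objective would read:

```latex
\mathcal{L}_{\text{total}}
  = \mathcal{L}_{\text{det}}
  + \lambda_{1}\,\mathcal{L}_{\text{preserve}}
  + \lambda_{2}\,\mathcal{L}_{\text{comp}}
  + \lambda_{3}\,\mathcal{L}_{\text{balance}}
```

    The detection loss carries a fixed weight of 1.0, so $ {\lambda }_{1} $, $ {\lambda }_{2} $, and $ {\lambda }_{3} $ set the strength of each fusion term relative to it.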

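    Table 2.  Comparison of detection performance under different models

    | Model | View | P (%) | R (%) | F1 (%) | mAP50 (%) | mAP50-95 (%) | Params (M) | GFLOPs | FPS |
    |---|---|---|---|---|---|---|---|---|---|
    | YOLOv8 | OL | 92.03 | 83.52 | 87.56 | 89.63 | 72.48 | 3.01 | 8.1 | 98.3 |
    | YOLOv8 | SD | 84.31 | 74.21 | 78.93 | 79.28 | 56.83 | 3.01 | 8.1 | 98.9 |
    | YOLOv8 | Dual | 94.27 | 86.58 | 90.27 | 92.41 | 77.03 | 4.75 | 17.0 | 45.2 |
    | YOLOv10 | OL | 91.42 | 82.76 | 86.87 | 88.91 | 71.58 | 2.27 | 6.5 | 75.8 |
    | YOLOv10 | SD | 83.68 | 73.47 | 78.26 | 78.62 | 55.93 | 2.27 | 6.5 | 76.3 |
    | YOLOv10 | Dual | 93.71 | 85.86 | 89.61 | 91.92 | 76.04 | 4.01 | 13.8 | 35.6 |
    | YOLOv12 | OL | 92.82 | 84.23 | 88.32 | 90.53 | 73.51 | 2.51 | 5.8 | 73.8 |
    | YOLOv12 | SD | 85.14 | 74.92 | 79.71 | 80.02 | 57.86 | 2.51 | 5.8 | 74.5 |
    | YOLOv12 | Dual | 95.08 | 87.42 | 91.07 | 93.21 | 78.23 | 4.25 | 12.4 | 35.0 |
    | YOLOv13 | OL | 90.73 | 82.13 | 86.22 | 88.17 | 70.68 | 2.45 | 6.2 | 77.1 |
    | YOLOv13 | SD | 83.04 | 72.71 | 77.56 | 77.93 | 55.12 | 2.45 | 6.2 | 77.8 |
    | YOLOv13 | Dual | 93.02 | 85.17 | 88.91 | 91.23 | 75.14 | 4.19 | 13.2 | 36.5 |
    | YOLOv11 | OL | 93.28 | 84.51 | 88.69 | 91.03 | 74.12 | 2.58 | 6.3 | 107.3 |
    | YOLOv11 | SD | 85.47 | 75.62 | 80.27 | 80.81 | 59.07 | 2.58 | 6.3 | 108.1 |
    | YOLOv11 | Dual | 95.91 | 88.32 | 91.93 | 94.02 | 79.41 | 4.32 | 13.4 | 49.6 |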
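    The F1 column and the gains quoted in the abstract can be checked from the YOLOv11 rows with a few lines of Python. This is a reader-side sanity check, not part of the paper: `f1_score` is the standard harmonic mean of precision and recall, and the variable names are illustrative. The OL row is used as the single-view baseline because its mAP values match the top-view figures implied by the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units in and out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# YOLOv11 rows of Table 2: (P, R, mAP50, mAP50-95), all in percent.
rows = {
    "OL": (93.28, 84.51, 91.03, 74.12),
    "SD": (85.47, 75.62, 80.81, 59.07),
    "Dual": (95.91, 88.32, 94.02, 79.41),
}

# F1 recomputed from the aggregate P and R of the dual-view row.
dual_f1 = f1_score(rows["Dual"][0], rows["Dual"][1])

# Dual-view gains over the OL single view, in percentage points.
gain_map50 = round(rows["Dual"][2] - rows["OL"][2], 2)
gain_map50_95 = round(rows["Dual"][3] - rows["OL"][3], 2)
```

    The recomputed F1 agrees with the tabulated value to within a few hundredths of a point; a small gap of that size is expected if the reported F1 was averaged per class or computed before rounding P and R.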

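    Table 3.  Performance comparison of different fusion methods

    | Method | Precision (%) | Recall (%) | F1 score (%) | mAP50 (%) | mAP50-95 (%) |
    |---|---|---|---|---|---|
    | Feature concatenation | 94.13 | 85.27 | 89.48 | 92.61 | 76.38 |
    | Feature addition | 94.42 | 84.53 | 89.21 | 92.14 | 75.23 |
    | SE-Net fusion [21] | 95.08 | 86.57 | 90.68 | 93.42 | 77.61 |
    | CBAM fusion [22] | 95.31 | 86.24 | 90.54 | 93.47 | 78.18 |
    | ECA-Net fusion [23] | 94.93 | 87.38 | 91.02 | 93.83 | 78.71 |
    | Proposed method | 95.91 | 88.32 | 91.93 | 94.02 | 79.41 |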

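    Table 4.  Results of the ablation experiments

    | Baseline model | A | B | C | D | Precision (%) | Recall (%) | mAP50 (%) | mAP50-95 (%) |
    |---|---|---|---|---|---|---|---|---|
    | | | | | | 93.51 | 84.82 | 91.27 | 74.53 |
    | | | | | | 94.03 | 85.38 | 91.82 | 75.47 |
    | | | | | | 93.76 | 85.63 | 91.57 | 75.68 |
    | | | | | | 93.68 | 85.14 | 91.46 | 75.03 |
    | | | | | | 93.57 | 84.93 | 91.38 | 74.72 |
    | | | | | | 94.52 | 86.41 | 92.53 | 76.87 |
    | | | | | | 94.18 | 85.82 | 92.14 | 76.12 |
    | | | | | | 94.07 | 85.57 | 91.93 | 75.83 |
    | | | | | | 94.03 | 85.96 | 92.04 | 76.28 |
    | | | | | | 93.91 | 85.78 | 91.82 | 76.01 |
    | | | | | | 93.82 | 85.31 | 91.63 | 75.27 |
    | | | | | | 95.37 | 87.72 | 93.58 | 78.63 |
    | | | | | | 95.21 | 87.48 | 93.42 | 78.27 |
    | | | | | | 94.68 | 86.53 | 92.71 | 77.14 |
    | | | | | | 94.47 | 86.58 | 92.63 | 77.23 |
    | | | | | | 95.91 | 88.32 | 94.02 | 79.41 |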
  • [1] LIN J H, ZHANG Y F, CHEN SH W, et al. Unsupervised masked cycle-adversarial network for cellular virtual staining[J]. Chinese Optics, 2026, 19(4). (in Chinese). doi: 10.37188/CO.2026-0021
    [2] WANG J M, ZHAO H B, WANG K, et al. Attitude compensation and reconstruction methods for single-photon dynamic imaging during UAV flight[J]. Chinese Optics, 2026, 19(3): 605-618. doi: 10.37188/CO.2026-0004
    [3] XU Y, ZHANG Q Y, SU Q, et al. PIXDet: prohibited item detection in X-ray image based on whole-process feature fusion and local-global semantic dependency interaction[J]. IEEE Transactions on Instrumentation and Measurement, 2023, 72: 5032917. doi: 10.1109/tim.2023.3330184
    [4] WEI Y L, TAO R SH, WU ZH J, et al. Occluded prohibited items detection: an X-ray security inspection benchmark and de-occlusion attention module[C]. Proceedings of the 28th ACM International Conference on Multimedia, ACM, 2020: 138-146.
    [5] TAO R SH, WEI Y L, JIANG X J, et al. Towards real-world X-ray security inspection: a high-quality benchmark and lateral inhibition module for prohibited items detection[C]. 2021 IEEE/CVF International Conference on Computer Vision, IEEE, 2021: 10923-10932.
    [6] ZHU Z M, ZHU Y, WANG H R, et al. FDTNet: enhancing frequency-aware representation for prohibited object detection from X-ray images via dual-stream transformers[J]. Engineering Applications of Artificial Intelligence, 2024, 133: 108076. doi: 10.1016/j.engappai.2024.108076
    [7] LIU J J, FENG P, LIAO W, et al. YOLO-STM: a network model for identifying prohibited items in X-ray security inspection images based on Swin-Transformer and MSDA[J]. Chinese Journal of Stereology and Image Analysis, 2024, 29(3): 230-241. (in Chinese). doi: 10.13505/j.1007-1482.2024.29.03.008
    [8] REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]. Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition, IEEE, 2016: 779-788.
    [9] KHANAM R, HUSSAIN M. YOLOv11: an overview of the key architectural enhancements[J]. arXiv preprint arXiv:2410.17725, 2024.
    [10] STEITZ J M O, SAEEDAN F, ROTH S. Multi-view X-ray R-CNN[C]. Proceedings of the 40th German Conference on Pattern Recognition, Springer, 2019: 153-168.
    [11] TULI A, BOHRA R, MOGHE T, et al. Automatic threat detection in single, stereo (two) and multi view X-ray images[C]. Proceedings of 2020 IEEE 17th India Council International Conference, IEEE, 2020: 1-7.
    [12] WU M D, YI F F, ZHANG H G, et al. Dualray: dual-view X-ray security inspection benchmark and fusion detection framework[C]. Proceedings of the 5th Chinese Conference on Pattern Recognition and Computer Vision, Springer, 2022: 721-734.
    [13] MENG X L, FENG H, REN Y, et al. Transformer-based dual-view X-ray security inspection image analysis[J]. Engineering Applications of Artificial Intelligence, 2024, 138: 109382. doi: 10.1016/j.engappai.2024.109382
    [14] HONG S L, ZHOU Y Z, XU W C. DAGNet: a dual-view attention-guided network for efficient X-ray security inspection[C]. Proceedings of 2025 International Joint Conference on Neural Networks, IEEE, 2025: 1-8.
    [15] TAO R SH, WANG H Y, GUO Y ZH, et al. Dual-view X-ray detection: can AI detect prohibited items from dual-view X-ray images like humans?[C]. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, 2025: 10338-10347.
    [16] MA B W, JIA T, LI M Y, et al. Toward dual-view X-ray baggage inspection: a large-scale benchmark and adaptive hierarchical cross refinement for prohibited item discovery[J]. IEEE Transactions on Information Forensics and Security, 2024, 19: 3866-3878. doi: 10.1109/TIFS.2024.3372797
    [17] VARGHESE R, SAMBATH M. YOLOv8: a novel object detection algorithm with enhanced performance and robustness[C]. Proceedings of the 2024 International Conference on Advances in Data Engineering and Intelligent Computing Systems, IEEE, 2024: 1-6.
    [18] WANG A, CHEN H, LIU L H, et al. YOLOv10: real-time end-to-end object detection[C]. Proceedings of the 38th International Conference on Neural Information Processing Systems, Curran Associates Inc., 2024: 3429.
    [19] TIAN Y J, YE Q X, DOERMANN D. YOLOv12: attention-centric real-time object detectors[J]. arXiv preprint arXiv:2502.12524, 2025.
    [20] LEI M Q, LI S Q, WU Y H, et al. YOLOv13: real-time object detection with hypergraph-enhanced adaptive visual perception[J]. arXiv preprint arXiv:2506.17733, 2025.
    [21] HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]. Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, 2018: 7132-7141.
    [22] WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]. Proceedings of the 15th European Conference on Computer Vision, Springer, 2018: 3-19.
    [23] WANG Q L, WU B G, ZHU P F, et al. ECA-Net: efficient channel attention for deep convolutional neural networks[C]. Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, 2020: 11531-11539.
Publication history
  • Published online:  2026-07-04