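Region proposal optimization algorithm based on convolutional neural networks

WANG Chun-zhe^{1,2}, AN Jun-she^{1}, JIANG Xiu-jie^{1}, XING Xiao-xue^{3}

1. Key Laboratory of Electronics and Information Technology for Space Systems, National Space Science Center, Chinese Academy of Sciences, Beijing 100190, China
2. University of Chinese Academy of Sciences, Beijing 100049, China
3. Changchun University, Changchun, Jilin 130022, China

doi: 10.3788/CO.20191206.1348

cstr: 32171.14.CO.20191206.1348

Funding: National Natural Science Foundation of China (No. 61805021)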

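Author biographies:

WANG Chun-zhe (born 1989), male, from Songyuan, Jilin, is a Ph.D. candidate. He received his bachelor's degree from Changchun University in 2012 and his master's degree from Changchun University of Science and Technology in 2015. His research focuses on deep learning and object detection. E-mail: wangchunzhe163@sina.com

AN Jun-she (born 1969), male, from Weinan, Shaanxi, holds a Ph.D. and is a research professor. He received his bachelor's degree from Beihang University in 1992, his master's degree from the University of Science and Technology Beijing in 1995, and his Ph.D. from Northwestern Polytechnical University in 2004. He is currently a research professor at the National Space Science Center, Chinese Academy of Sciences. His research focuses on integrated electronic systems for spacecraft and on deep learning. E-mail: anjunshe@nssc.ac.cn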

CLC number: TP394.1
Publication history
- Received: 2019-05-28
- Revised: 2019-06-14
- Published: 2019-12-01


Corresponding author: AN Jun-she, E-mail: anjunshe@nssc.ac.cn

Abstract: Region proposals are commonly used to make object detection more efficient. To address the low quality of current region proposals, this paper introduces convolutional edge features, object saliency, and object position information into the region proposal algorithm. First, a convolutional neural network generates semantically richer edge features from the image to be detected, and an edge information score is obtained for each sliding window through edge-point grouping and an edge-group similarity strategy. Second, a salient-object score is computed for each sliding window from the local features of salient objects. Third, an object position score is calculated for each sliding window according to the locations where objects are likely to occur. Finally, the region proposals are determined from the edge information, saliency, and position scores. In experiments on the PASCAL VOC 2007 validation set, given 10 000 region proposals and an intersection over union (IoU) threshold of 0.7, the proposed algorithm achieves a recall of 90.50%, about 3 percentage points higher than Edge Boxes. The run time is about 0.76 s per image. These results show that the proposed algorithm can quickly generate region proposals of higher quality.

Keywords:
- computer vision
- object detection
- region proposals
- convolutional neural networks
- salient object
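
The recall figure quoted in the abstract depends on how a proposal is matched to a ground-truth object. The following is a minimal sketch of that evaluation for a single image, not the authors' evaluation code. It assumes boxes are given as `(x_min, y_min, x_max, y_max)` in continuous coordinates, and the names `iou` and `recall_at_iou` are illustrative.

```python
from typing import Sequence, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in continuous coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Overlap along each axis, clamped at zero when the boxes are disjoint.
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_at_iou(gt_boxes: Sequence[Box],
                  proposals: Sequence[Box],
                  thresh: float = 0.7) -> float:
    """Fraction of ground-truth boxes matched by at least one proposal
    whose IoU with the ground-truth box is at least `thresh`."""
    if not gt_boxes:
        return 0.0
    hits = sum(
        1 for g in gt_boxes
        if any(iou(g, p) >= thresh for p in proposals)
    )
    return hits / len(gt_boxes)


# Toy example: the first object is covered (IoU = 0.9), the second is missed.
gt = [(0, 0, 10, 10), (20, 20, 40, 40)]
props = [(0, 0, 10, 9), (100, 100, 120, 120)]
print(recall_at_iou(gt, props))  # 0.5
```

A dataset-level recall would normally sum the matched and total ground-truth objects over all images before dividing. Truncating `proposals` to the top-ranked N boxes gives the recall at N proposals.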

Figure 1. Block diagram of the proposed algorithm

Figure 2. Structure of RCF

Figure 3. A given image X

Figure 4. Edge feature map of X

Figure 5. Chi-square distance between image patches

Figure 6. Selection strategy for the S image patches

Figure 7. Relationship between object location and number of objects. (a) VOC 2007 dataset; (b) VOC 2012 dataset

Figure 8. Relationship between the parameters α, β and recall

Figure 9. Relationship between the parameter w and recall

Figure 10. Recall versus IoU for different numbers of proposals

Figure 11. Recall versus number of proposals at different IoUs

Figure 12. Recall versus IoU curves for objects at different locations, for 13 algorithms

Figure 13. Recall versus number of proposals at different IoUs

Figure 14. Recall of the proposed algorithm on the PASCAL VOC 2007 test set

Figure 15. Recall at different aspect ratios on the test and validation sets

Figure 16. Detection results of the proposed algorithm for selected objects

Figure 17. Relationship between the size and the number of undetected objects

Table 1. Description of the edge group algorithm

Table 2. Strategy for refining sliding windows

Table 3. Properties of the VOC 2007 dataset

Dataset	Training set	Validation set	Test set
Number of images	2 501	2 510	4 952
Number of objects	6 301	6 307	12 032

Table 4. Experimental results of 13 algorithms at an IoU of 0.7

Algorithm	AUC	N@45%	N@60%	N@75%	R₁₀₀₀	R₂₀₀₀	R₁₀₀₀₀	mAP	t/s
Object-ness	0.27	--	--	--	37.68%	37.89%	37.93%	51.4	3
BING	0.20	--	--	--	27.04%	27.39%	28.14%	49.0	0.2
CPMC	0.41	86	475	--	62.58%	62.59%	62.60%	57.1	250
SS	0.40	171	530	1 812	68.13%	76.13%	89.12%	59.5	10
EB	0.46	77	234	804	77.39%	83.25%	87.19%	60.4	0.25
Rantalankila	0.23	489	1 712	--	55.79%	61.21%	68.94%	57.9	10
Rand. Prim's	0.35	274	950	4 095	60.61%	68.52%	79.33%	57.6	1
MCG	0.48	60	240	1 116	74.14%	79.58%	80.53%	60.3	30
Endres	0.44	75	432	--	63.93%	64.69%	64.88%	57.4	100
Geodesic	0.35	266	630	2 491	66.45%	73.65%	81.05%	57.5	1
Rigor	0.30	600	997	1 948	60.08%	75.59%	75.77%	58.4	10
Improved EdgeBoxes	0.46	80	265	802	77.50%	84.15%	89.25%	60.8	0.43
Proposed algorithm	0.47	103	276	799	77.87%	84.73%	90.50%	61.3	0.764 9
