Dosovitskiy, A., Beyer, L., Kolesnikov, A., et al. (2020) An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale. - References

Dosovitskiy, A., Beyer, L., Kolesnikov, A., et al. (2020) An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale.

has been cited by the following article:

TITLE: Polyp Segmentation Network with Dual-Decoder Pyramid Visual Converter

AUTHORS: Qing’an Yao, Jiapeng Liu, Yuncong Feng, Dongwei Zhuang, Yougang Wang

KEYWORDS: Colorectal Polyp Segmentation, Dual-Decoder Architecture, Reverse Attention Mechanism, Multi-Scale Feature Aggregation, Deep Learning

JOURNAL NAME: Journal of Computer and Communications, Vol.13 No.6, June 30, 2025

ABSTRACT: To address the challenges of morphological irregularity and boundary ambiguity in colorectal polyp image segmentation, we propose a Dual-Decoder Pyramid Vision Transformer Network (DDPVT-Net). This architecture integrates a Pyramid Vision Transformer (PVT) encoder with an innovative dual-decoder design that employs reverse attention mechanisms and multi-scale feature aggregation to effectively handle complex tissue patterns and texture variations. Experimental evaluations demonstrate that DDPVT-Net achieves significant improvements over the standard U-Net, with performance gains of 5.65% in mean Intersection over Union (mIoU) and 3.83% in Dice coefficient on the Kvasir-SEG dataset, along with 5.95% and 4.54% improvements respectively on the CVC-ClinicDB dataset. Notably, independent testing on the ETIS-LaribPolypDB benchmark reveals remarkable enhancements of 26.59% in mIoU and 27.43% in Dice coefficient. These quantitative results validate that DDPVT-Net substantially improves the model’s capability to process polyps with diverse shapes and sizes through enhanced multi-scale contextual understanding and precise boundary localization. The proposed framework demonstrates superior segmentation accuracy and generalization capability, establishing a new state-of-the-art solution for computer-assisted clinical diagnosis in gastrointestinal endoscopy.
