Volume- 12
Issue- 5
Year- 2024
DOI: 10.55524/ijircst.2024.12.5.13 | DOI URL: https://doi.org/10.55524/ijircst.2024.12.5.13
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0) (http://creativecommons.org/licenses/by/4.0)
Bingxing Wang, Hongye Zheng, Yingbin Liang, Guanming Huang, Junliang Du
This paper introduces an attention-guided dual-branch dynamic graph convolutional network for multi-label image classification, addressing the difficulties that current models face when an image carries multiple labels. By integrating multi-scale features, the network retains more of the original category information and makes feature learning more robust. A semantic attention module dynamically reweights the feature categories in the training dataset, which improves the network's ability to identify smaller objects and produces context-sensitive category representations. A cascaded classification structure uses prior information from static images to inform the processing of dynamic ones, and uses the original image category data to strengthen label correlations, thereby improving overall classification accuracy. Evaluated on the MS-COCO 2014 image dataset, the proposed model outperforms other state-of-the-art models on key metrics such as classification precision (CP), recall (CR), and F1 score (CF1).
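As an illustration of the reported metrics, the following is a minimal sketch of how CP, CR, and CF1 might be computed from binary label predictions. It assumes the convention common in the multi-label classification literature, in which CP and CR are precision and recall averaged over classes and CF1 is their harmonic mean; the abstract does not give the exact definitions used, and the function name `per_class_metrics` is hypothetical.

```python
from typing import List, Tuple


def per_class_metrics(
    y_true: List[List[int]], y_pred: List[List[int]]
) -> Tuple[float, float, float]:
    """Return (CP, CR, CF1) for binary multi-label predictions.

    Assumed convention (not confirmed by the source): CP and CR are
    per-class precision and recall averaged over all classes, and
    CF1 = 2 * CP * CR / (CP + CR).
    """
    num_classes = len(y_true[0])
    precisions, recalls = [], []
    for c in range(num_classes):
        # True positives: samples where class c is both present and predicted.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 1 and p[c] == 1)
        predicted = sum(p[c] for p in y_pred)  # predicted positives for class c
        actual = sum(t[c] for t in y_true)     # ground-truth positives for class c
        # Classes with no predictions (or no positives) contribute 0.0.
        precisions.append(tp / predicted if predicted else 0.0)
        recalls.append(tp / actual if actual else 0.0)
    cp = sum(precisions) / num_classes
    cr = sum(recalls) / num_classes
    cf1 = 2 * cp * cr / (cp + cr) if (cp + cr) else 0.0
    return cp, cr, cf1


# Three images, three candidate labels each (toy data, not from the paper).
y_true = [[1, 0, 1], [0, 1, 1], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1], [1, 0, 1]]
cp, cr, cf1 = per_class_metrics(y_true, y_pred)
print(f"CP={cp:.3f} CR={cr:.3f} CF1={cf1:.3f}")
```

Averaging over classes gives rare categories the same weight as frequent ones, which is why per-class metrics are sensitive to how well a model handles small or infrequent objects.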
Illinois Institute of Technology, Chicago, USA