
      Graphic association learning: Multimodal feature extraction and fusion of image and text using artificial intelligence techniques

      research-article


          Abstract

          With the advancement of technology in recent years, artificial intelligence has been applied ever more widely in real life. Graphic recognition is a hot spot in current research on related technologies. It involves machines extracting key information from pictures and combining it with natural language processing for in-depth understanding. Existing methods still have obvious deficiencies in fine-grained recognition and in deep understanding of context. Addressing these issues to achieve high-quality image-text recognition is crucial for various application scenarios, such as accessibility technologies, content creation, and virtual assistants. To tackle this challenge, a novel approach is proposed that combines the Mask R-CNN, DCGAN, and ALBERT models. Specifically, Mask R-CNN performs high-precision image recognition and segmentation, DCGAN captures and generates nuanced features from images, and ALBERT is responsible for deep natural language processing and semantic understanding of this visual information. Experimental results support the advantage of this method: compared with traditional image-text recognition techniques, recognition accuracy improves from 85.3% to 92.5%, and performance in contextual and situational understanding is enhanced. The advancement of this technology has far-reaching implications for research in machine vision and natural language processing and opens new possibilities for practical applications.
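
          The abstract does not say how the image and text features are fused or compared, so the following is only a minimal sketch of one common option, not the authors' implementation. It assumes each model has already produced a fixed-length feature vector (hypothetical stand-ins for Mask R-CNN region features, DCGAN-derived features, and ALBERT sentence embeddings), that the image features are fused by concatenation, and that text embeddings live in the same space as the fused image vector. All function names are illustrative.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse_image_features(region_feats: Sequence[float],
                        generative_feats: Sequence[float]) -> List[float]:
    """Fuse two image feature vectors by normalizing each, concatenating,
    and normalizing the result (an assumed fusion rule, for illustration)."""
    fused = l2_normalize(region_feats) + l2_normalize(generative_feats)
    return l2_normalize(fused)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def best_caption(image_vec: Sequence[float],
                 caption_vecs: Sequence[Sequence[float]]) -> int:
    """Return the index of the caption embedding closest to the image vector."""
    scores = [cosine_similarity(image_vec, c) for c in caption_vecs]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy vectors standing in for real model outputs (hypothetical values).
image_vec = fuse_image_features([1.0, 0.0], [0.0, 1.0])
captions = [[0.0, 1.0, 1.0, 0.0], [1.0, 0.0, 0.0, 1.0]]
match_index = best_caption(image_vec, captions)
```

          In a real system the three encoders would be trained networks and the shared embedding space would typically be learned with a projection layer; the toy vectors here only show the data flow from separate features to a single matching score.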


                Author and article information

                Contributors
                Journal
                Heliyon
                Elsevier
                2405-8440
                30 August 2024
                30 September 2024
                30 August 2024
                Volume: 10
                Issue: 18
                Article number: e37167
                Affiliations
                [a] College of Information Science and Engineering, Liuzhou Institute of Technology, 545616, Liuzhou, Guangxi, China
                [b] College of Automotive Engineering, Liuzhou Institute of Technology, 545616, Liuzhou, Guangxi, China
                Author notes
                Article
                S2405-8440(24)13198-2 e37167
                10.1016/j.heliyon.2024.e37167
                11417159
                39315129
                57a62d06-ecb5-4302-98fb-ca3812d62c24
                © 2024 The Authors

                This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

                History
                : 23 April 2024
                : 16 August 2024
                : 28 August 2024
                Categories
                Research Article

                Keywords: text matching, image matching, ALBERT, Mask R-CNN, DCGAN, multimodal feature, graphic association
