
      Large-scale Semantic Integration of Linked Data: A Survey

      ACM Computing Surveys
      Association for Computing Machinery (ACM)


          Abstract

          A large number of published datasets (or sources) that follow Linked Data principles is currently available, and this number grows rapidly. However, the major target of Linked Data, i.e., linking and integration, is not easy to achieve. In general, information integration is difficult because (a) datasets are produced, kept, or managed by different organizations using different models, schemas, or formats; (b) the same real-world entities or relationships are referred to with different URIs or names and in different natural languages; (c) datasets usually contain complementary information; (d) datasets can contain data that are erroneous, out-of-date, or conflicting; (e) datasets even about the same domain may follow different conceptualizations of the domain; and (f) everything can change (e.g., schemas, data) as time passes. This article surveys the work that has been done in the area of Linked Data integration: it identifies the main actors and use cases, analyzes and factorizes the integration process according to various dimensions, and discusses the methods that are used in each step. Emphasis is given to methods that can be used for integrating several datasets. Based on this analysis, the article concludes with directions that are worth further research.
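          Challenge (b) above, i.e., the same real-world entity being referred to with different URIs across datasets, can be illustrated with a minimal sketch. The URIs, the toy triples, and the alias mapping below are all hypothetical, standing in for what an instance-matching step (e.g., producing owl:sameAs links) would output; this is not code from the survey itself:

```python
# Toy illustration of challenge (b): two datasets use different URIs
# for the same real-world entity, so a naive union of their triples
# does not actually integrate the data about that entity.

# Two hypothetical datasets as sets of (subject, predicate, object) triples.
dataset_a = {
    ("http://ex-a.org/Aristotle", "bornIn", "http://ex-a.org/Stagira"),
}
dataset_b = {
    ("http://ex-b.org/aristotle", "influenced", "http://ex-b.org/alexander"),
}

# A toy alias table (the kind of output an owl:sameAs-producing
# instance-matching step would give), mapping each alias to one
# canonical URI.
same_as = {
    "http://ex-a.org/Aristotle": "http://ex.org/Aristotle",
    "http://ex-b.org/aristotle": "http://ex.org/Aristotle",
}

def canonical(uri):
    """Rewrite a URI to its canonical form if an alias is known."""
    return same_as.get(uri, uri)

def integrate(*datasets):
    """Union the datasets after canonicalizing subject and object URIs."""
    merged = set()
    for ds in datasets:
        for s, p, o in ds:
            merged.add((canonical(s), p, canonical(o)))
    return merged

merged = integrate(dataset_a, dataset_b)
```

          After canonicalization both facts attach to a single subject URI, whereas a plain union would leave them about two apparently unrelated entities; real systems face the same step at the scale of millions of URIs.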


          Most cited references: 154


          Efficient Estimation of Word Representations in Vector Space

          We propose two novel model architectures for computing continuous vector representations of words from very large data sets. The quality of these representations is measured in a word similarity task, and the results are compared to the previously best performing techniques based on different types of neural networks. We observe large improvements in accuracy at much lower computational cost, i.e. it takes less than a day to learn high quality word vectors from a 1.6 billion words data set. Furthermore, we show that these vectors provide state-of-the-art performance on our test set for measuring syntactic and semantic word similarities.
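            The word-similarity evaluation mentioned above is conventionally done with cosine similarity between the learned vectors. A minimal sketch follows, using made-up 3-dimensional vectors purely for illustration (real word2vec embeddings are learned from corpora and have hundreds of dimensions):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Made-up toy vectors; only their relative geometry matters here.
vec = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.8, 0.9, 0.1],
    "apple": [0.1, 0.2, 0.9],
}

sim_related = cosine(vec["king"], vec["queen"])
sim_unrelated = cosine(vec["king"], vec["apple"])
```

            With well-trained embeddings, semantically related words score higher than unrelated ones under this measure, which is what the word-similarity task checks.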

            Linked Data - The Story So Far


              Freebase


                Author and article information

                Journal
                ACM Computing Surveys (ACM Comput. Surv.)
                Association for Computing Machinery (ACM)
                ISSN: 0360-0300 (print); 1557-7341 (electronic)
                September 30, 2020
                Volume 52, Issue 5, Pages 1-40
                Affiliations
                [1] Institute of Computer Science, FORTH-ICS, Greece; Computer Science Department, University of Crete, Crete, Greece
                Article
                DOI: 10.1145/3345551
                © 2020

                http://www.acm.org/publications/policies/copyright_policy#Background
