      Unlocking Data for Statistical Analyses and Data Mining: Generic Case Extraction of Clinical Items from i2b2 and tranSMART.

          Abstract

          In medical science, modern IT concepts are increasingly important for gaining new insights into complex diseases. Data warehouses (DWHs), as central data repositories, play a key role by providing standardized, high-quality and secure medical data for effective analyses. However, DWHs in medicine must fulfil various requirements concerning data privacy and the ability to describe the complexity of (rare) disease phenomena. i2b2 and tranSMART are free DWH solutions developed specifically for medical informatics purposes, but several functionalities are not yet sufficiently provided. In particular, data import and export remain a major problem because schemas, parameter definitions and data quality differ from clinic to clinic. Further, statistical analyses inside i2b2 and tranSMART are possible, but restricted to the implemented functions. Thus, data export is needed to provide a data basis that can be loaded directly into statistics software like SPSS and SAS or data mining tools like Weka and RapidMiner. The standard export tools of i2b2 and tranSMART essentially create a database dump of key-value pairs, which cannot be used immediately by these tools; they need an instance-based or case-based representation of each patient. To close this gap, we developed a concept called Generic Case Extractor (GCE), which pivots the key-value pairs of each clinical fact into a row-oriented format for each patient, sufficient to enable analyses in a broader context. Complex pivotisation routines were necessary to ensure temporal consistency, especially with respect to different data sets and the occurrence of identical but repeated parameters such as follow-up data. GCE is embedded inside a comprehensive software platform for systems medicine.
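
The pivoting step described in the abstract can be illustrated with a minimal Python sketch. This is not the GCE implementation, whose routines the abstract does not detail; it only shows the general idea of turning key-value facts into one row per patient. The fact layout `(patient_id, parameter, observation_date, value)`, the sample values, and the function names `pivot_facts` and `to_table` are all hypothetical. Repeated parameters (for example follow-up measurements) are numbered in chronological order so that their temporal sequence is kept.

```python
from collections import defaultdict

# Hypothetical key-value facts: (patient_id, parameter, observation_date, value).
# Dates are ISO strings, so sorting them as text sorts them chronologically.
facts = [
    ("P1", "age", "2015-01-10", 63),
    ("P1", "fev1", "2015-07-02", 1.9),   # follow-up value, listed out of order
    ("P1", "fev1", "2015-01-10", 2.1),
    ("P2", "age", "2015-03-05", 58),
    ("P2", "fev1", "2015-03-05", 2.8),
]


def pivot_facts(facts):
    """Pivot key-value facts into one row (a dict) per patient.

    Each occurrence of a parameter becomes its own column, numbered in
    chronological order: fev1_1, fev1_2, ...
    """
    per_patient = defaultdict(lambda: defaultdict(list))
    for patient_id, parameter, date, value in facts:
        per_patient[patient_id][parameter].append((date, value))

    rows = {}
    for patient_id, params in per_patient.items():
        row = {}
        for parameter, observations in params.items():
            observations.sort(key=lambda obs: obs[0])  # order by date only
            for index, (_, value) in enumerate(observations, start=1):
                row[f"{parameter}_{index}"] = value
        rows[patient_id] = row
    return rows


def to_table(rows):
    """Build a header and a rectangular body; missing values become None."""
    columns = sorted({col for row in rows.values() for col in row})
    header = ["patient_id"] + columns
    body = [
        [patient_id] + [row.get(col) for col in columns]
        for patient_id, row in sorted(rows.items())
    ]
    return header, body


if __name__ == "__main__":
    header, body = to_table(pivot_facts(facts))
    print(header)
    for line in body:
        print(line)
```

The resulting table has one line per patient and could be written to CSV for import into statistics or data mining tools. Patients without a given follow-up simply get an empty cell in that column.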

          Author and article information

          Journal
          Stud Health Technol Inform (Studies in health technology and informatics)
          ISSN: 0926-9630
          2016, vol. 228
          Affiliations
          [1 ] Institute of Medical Biometry and Informatics, Heidelberg University, Germany.
          [2 ] Translational Research Unit, Thoraxklinik at University Hospital Heidelberg, Heidelberg.
          Article
          27577447