
      netDx: Software for building interpretable patient classifiers by multi-'omic data integration using patient similarity networks

      research-article


          Abstract

          Patient classification based on clinical and genomic data will further the goal of precision medicine. Interpretability is of particular relevance for models based on genomic data, where sample sizes are relatively small (in the hundreds), increasing overfitting risk. netDx is a machine learning method to integrate multi-modal patient data and build a patient classifier. Patient data are converted into networks of patient similarity, a representation that is intuitive to clinicians, who also use patient similarity for medical diagnosis. Features passing selection are integrated, and new patients are assigned to the class with the greatest profile similarity. netDx has excellent performance, outperforming most machine-learning methods in binary cancer survival prediction. It handles missing data, a common problem in real-world data, without requiring imputation. netDx also has excellent interpretability, with native support to group genes into pathways for mechanistic insight into predictive features.
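The similarity-based assignment described above can be illustrated with a small sketch. This is not the netDx implementation (netDx is an R/Bioconductor package); it is a minimal Python illustration under stated assumptions: a single data type, Pearson correlation as the similarity metric, missing values encoded as `None` and skipped rather than imputed, and a new patient assigned to the class whose members are most similar on average. The function names (`pearson_similarity`, `classify`) and the toy data are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_similarity(a, b):
    """Pearson correlation over the features observed in both patients.

    Missing values are encoded as None and skipped (no imputation).
    Returns 0.0 if fewer than two features are shared or variance is zero.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if len(pairs) < 2:
        return 0.0
    xs, ys = zip(*pairs)
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def classify(new_profile, labelled_profiles):
    """Assign the class whose patients are most similar on average (assumed rule)."""
    sims_by_class = {}
    for profile, label in labelled_profiles:
        sims_by_class.setdefault(label, []).append(
            pearson_similarity(new_profile, profile)
        )
    scores = {label: mean(sims) for label, sims in sims_by_class.items()}
    return max(scores, key=scores.get), scores


# Toy training set: four patients, four features, some values missing.
training = [
    ([1.0, 2.0, 3.0, 4.0], "good_survival"),
    ([2.0, 3.0, None, 5.0], "good_survival"),
    ([4.0, 3.0, 2.0, 1.0], "poor_survival"),
    ([5.0, None, 3.0, 2.0], "poor_survival"),
]

predicted, class_scores = classify([1.5, 2.5, 3.5, None], training)
print(predicted, class_scores)
```

In this toy example the new patient's profile rises across features, as the "good_survival" profiles do, so that class receives the higher mean similarity. A real predictor would combine several similarity networks and select features first, as the abstract describes.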

          The netDx Bioconductor package provides multiple workflows for users to build custom patient classifiers. It provides turnkey functions for one-step predictor generation from multi-modal data, including feature selection over multiple train/test data splits. Workflows offer versatility through custom feature design and a choice of similarity metric, and parallel execution improves speed. Built-in functions and examples allow users to compute model performance metrics such as AUROC, AUPR, and accuracy. netDx uses RCy3 to visualize top-scoring pathways and the final integrated patient network in Cytoscape. Advanced users can build more complex predictor designs with the functional building blocks used in the default design. Finally, the netDx Bioconductor package provides a novel workflow for pathway-based patient classification from sparse genetic data.
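For readers unfamiliar with the performance metrics named above, the following Python sketch shows one common way to compute them for a binary classifier. It does not reproduce netDx's built-in functions; the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions for illustration. AUPR is estimated here as average precision, which is one of several conventions for summarizing the precision-recall curve.

```python
def accuracy(labels, predictions):
    """Fraction of patients whose predicted class matches the true class."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def auroc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. Assumes both classes are present.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """AUPR estimated as the mean precision at the rank of each positive.

    Assumes at least one positive; tied scores are ordered arbitrarily.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Toy example: 1 = positive class, scores are predicted probabilities.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
predictions = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold

acc = accuracy(labels, predictions)
roc = auroc(labels, scores)
aupr = average_precision(labels, scores)
print(f"accuracy={acc:.3f} AUROC={roc:.3f} AUPR={aupr:.3f}")
```

When a predictor is evaluated over multiple train/test splits, metrics like these would typically be computed per split and then summarized across splits.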


                Author and article information

                Contributors
                Roles: Software; Visualization; Writing – Original Draft Preparation; Writing – Review & Editing
                Roles: Formal Analysis; Software
                Roles: Conceptualization; Methodology; Software
                Roles: Conceptualization; Methodology; Software
                Roles: Methodology; Software; Writing – Review & Editing
                Roles: Methodology; Software
                Roles: Methodology; Software; Writing – Original Draft Preparation; Writing – Review & Editing
                Roles: Methodology; Supervision; Writing – Review & Editing
                Roles: Software
                Roles: Supervision
                Roles: Conceptualization; Methodology; Resources; Supervision; Writing – Review & Editing
                Journal
                F1000Research (F1000Res)
                F1000 Research Limited (London, UK)
                ISSN: 2046-1402
                22 January 2021
                2020; 9: 1239
                Affiliations
                [1] The Donnelly Centre, University of Toronto, Toronto, Canada
                [2] Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
                [3] Department of Computer Science, University of Verona, Verona, Italy
                [4] The Bioinformatics Centre, Department of Biology, University of Copenhagen, Copenhagen N, Denmark
                [5] H. Lundbeck A/S, Copenhagen, Denmark
                [6] TUM School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany
                [7] Department of Molecular Genetics, University of Toronto, Toronto, Canada
                [8] Department of Computer Science, University of Toronto, Toronto, Canada
                [9] The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Canada
                [1] University of Melbourne, Melbourne, Australia
                [1] INSERM, MMG, Aix Marseille University, Marseille, France
                [2] I2M, Institut de Mathématiques, Aix Marseille University, Marseille, France
                University of Toronto, Canada
                Author notes

                No competing interests were disclosed.

                Competing interests: No competing interests were disclosed.

                Competing interests: We interacted with the authors during the review process to correct warnings and errors obtained while executing the protocols of the 4 use cases.

                Competing interests: We have no competing interests to declare.

                Author information
                https://orcid.org/0000-0002-1048-581X
                https://orcid.org/0000-0003-3101-6817
                https://orcid.org/0000-0002-6805-2080
                https://orcid.org/0000-0002-2243-2010
                https://orcid.org/0000-0003-0185-8861
                Article
                DOI: 10.12688/f1000research.26429.2
                PMC ID: 7883323
                PubMed ID: 33628435
                0939b8d7-8354-4567-8cdf-5d03a3ec63ce
                Copyright: © 2021 Pai S et al.

                This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                4 January 2021
                Funding
                Funded by: Horizon 2020
                Award ID: 777111
                Funded by: National Institutes of Health
                Award ID: R01HG009979
                Award ID: P41GM103504
                Funded by: Villum Fonden
                Award ID: 13154
                This work was supported by U.S. National Institutes of Health grant numbers P41 GM103504 (NRNB) and R01 HG009979 (Cytoscape). JB and PW received financial support from JB's Villum Young Investigator Grant nr. 13154. Part of JB's work was also funded by the H2020 project RepoTrial (nr. 777111).
                The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Software Tool Article
                Articles

                precision medicine, networks, classification, supervised learning, genomics, data integration
