Review of 'Explainable drug repurposing via path-based knowledge graph completion'

Reviewer: Markus List

Inviter role(s): EDITOR

Publication date of review: 2024-01-18

XG4Repo predicts new paths in knowledge graphs to propose new indications for existing drugs

Average rating: Rated 3.5 of 5.
Level of importance: Rated 3 of 5.
Level of validity: Rated 3 of 5.
Level of completeness: Rated 4 of 5.
Level of comprehensibility: Rated 4 of 5.
Competing interests: None

Reviewed article

Explainable drug repurposing via path-based knowledge graph completion

Ana Jimenez, María José Merino, Juan Parras … (2023)

Drug repurposing aims to find new therapeutic applications for existing drugs in the pharmaceutical market, leading to significant savings in time and cost. The use of artificial intelligence and knowledge graphs to propose repurposing candidates facilitates the process, as large amounts of data can be processed. However, it is important to pay attention to the explainability needed to validate the predictions. We propose a general architecture to understand several explainable methods for graph completion based on knowledge graphs and design our own architecture for drug repurposing. We present XG4Repo (eXplainable Graphs for Repurposing), a framework that takes advantage of the connectivity of any biomedical knowledge graph to link compounds to the diseases they can treat. Our method allows metapaths of different types and lengths, which are automatically generated and optimised based on data. XG4Repo focusses on providing meaningful explanations to the predictions, which are based on paths from compounds to diseases. These paths include nodes such as genes, pathways, side effects, or anatomies, so they provide information about the targets and other characteristics of the biomedical mechanism that link compounds and diseases. Paths make predictions interpretable for experts who can validate them and use them in further research on drug repurposing. We also describe three use cases where we analyse new uses for Epirubicin, Paclitaxel, and Prednisone and present the paths that support the predictions.

Review information

DOI: 10.14293/S2199-1006.1.SOR-LIFE.ADBQSG.v1.RUOSGN

License:

This work has been published open access under Creative Commons Attribution License CC BY 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Conditions, terms of use and publishing policy can be found at www.scienceopen.com.

ScienceOpen disciplines: Bioinformatics & Computational biology

Keywords: Interpretability, Hetionet, Heterogeneous Knowledge Graphs, Rule-based link prediction, Drug Repurposing, Knowledge Graph Completion

Review text

Jimenez et al. present a new method for path-based knowledge graph completion applied to drug repurposing. XG4Repo suggests new paths on the Hetionet data set and is one of the few knowledge-graph-based tools that allow queries linking drugs to the diseases they can treat. The method is evaluated by splitting drug-disease associations into training, test and validation sets. Performance is then measured by the mean reciprocal rank of the candidates in the test set and compared to other state-of-the-art tools. The use of XG4Repo is further exemplified for three drugs, for which evidence from the medical literature is presented in support of the predictions. The manuscript is generally well written and offers mostly sufficient detail. More specific comments are found below.

General:

Some paragraphs are difficult to parse; consider revising the manuscript for clarity and grammar.
The introduction is rather short.
The research gap is not well described. The section "Our contribution" leaves open, for example, why a general architecture is currently missing or where the authors see a gap in interpretability w.r.t. current methodology.
Figures do not appear in the order in which they are first mentioned in the paper, e.g. Figure 11.

Introduction:

Examples of graph-based drug repurposing are given but not described; refs. 7-9, for example, are not mentioned in the text.
"These methods cannot capture multistep relations" - please explain in more detail what you mean by that.
"Most of them are based on paths that relate drugs and diseases through the nodes of the graph". It remains very vague what those paths look like, i.e. how drugs and diseases are linked. Are they linked via shared genes or proteins, or do you refer to other types of nodes? How does such a path represent a biological explanation? Please elaborate further here.
Similarly, it is not clear how metapaths differ from the aforementioned paths in this context without further details.
"Metapaths that are known to be useful" - known how?
"Other methods do the opposite, evaluating every possible metapath" - please elaborate in more detail what this evaluation could look like.
"disease modules are used to predict a list of drug candidates to treat a certain disease". It is unclear how this is achieved and what additional data is needed for this.
"They generated disease modules using network-based medical algorithms". This should be explained earlier in the paragraph (before switching to drug candidate prediction). Also, I think you want to refer to network medicine algorithms rather than network-based medical algorithms.
"apply some technique to provide explanations". This is again very ambiguous; please elaborate on what those techniques are.

Data:

The data were split into training, test and validation sets, but the splitting procedure is not described further. What measures were taken to avoid data leakage? Did the authors consider cold splits, where a drug or disease is contained either entirely in the training set or entirely in the test set?
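To make the notion of a cold split concrete, the following is a minimal sketch, not the authors' procedure. It holds out whole drugs so that every drug-disease pair of a held-out drug lands in the test set. The function name `cold_drug_split`, its parameters and the toy identifiers are assumptions made for illustration only; a disease-wise cold split would group by the second element instead.

```python
import random


def cold_drug_split(pairs, test_fraction=0.2, seed=0):
    """Split (drug, disease) pairs so that no drug occurs in both sets."""
    drugs = sorted({drug for drug, _ in pairs})
    rng = random.Random(seed)
    rng.shuffle(drugs)
    # Hold out whole drugs rather than individual drug-disease pairs.
    n_test = max(1, round(len(drugs) * test_fraction))
    test_drugs = set(drugs[:n_test])
    train = [pair for pair in pairs if pair[0] not in test_drugs]
    test = [pair for pair in pairs if pair[0] in test_drugs]
    return train, test


# Toy data; the identifiers are placeholders, not Hetionet entries.
pairs = [
    ("drug_a", "disease_1"),
    ("drug_a", "disease_2"),
    ("drug_b", "disease_1"),
    ("drug_c", "disease_3"),
    ("drug_d", "disease_2"),
]
train, test = cold_drug_split(pairs, test_fraction=0.25)
```

This sketch only partitions the drug-disease edges. In a knowledge graph, a held-out drug would still be connected to the rest of the graph through other edge types, which is what allows a path-based method to make predictions for it at all.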

Evaluation:

The evaluation metric is not properly defined here.
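For reference, the summary above states that performance is measured by mean reciprocal rank (MRR). A minimal sketch of the standard definition follows. Whether the manuscript uses this exact variant (for example raw or filtered ranks) is not stated, so the function below is an assumption, and the names `mean_reciprocal_rank`, `rankings` and `truths` are illustrative.

```python
def mean_reciprocal_rank(ranked_lists, true_items):
    """Average of 1/rank of the true item over all queries.

    A query whose true item is missing from its ranking contributes 0.
    """
    total = 0.0
    for ranking, truth in zip(ranked_lists, true_items):
        if truth in ranking:
            total += 1.0 / (ranking.index(truth) + 1)  # ranks start at 1
    return total / len(true_items)


# Toy example: the true disease is ranked 1st, 2nd and 3rd respectively.
rankings = [
    ["d1", "d2", "d3"],
    ["d2", "d1", "d3"],
    ["d3", "d2", "d1"],
]
truths = ["d1", "d1", "d1"]
mrr = mean_reciprocal_rank(rankings, truths)  # (1 + 1/2 + 1/3) / 3
```

The metric depends only on the position of the known association in each ranking. It carries no information about whether the other highly ranked candidates are correct.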

Results:

A stated strength of the XG4Repo approach is interpretability. While the manuscript compares predictions with the medical literature and related databases such as DrugBank, a functional link between drug and disease that explains the prediction is not discussed. As the authors argue that this is a strength of their approach, this should be improved. For example, epirubicin is a topoisomerase II inhibitor, which leads to double-strand breaks in the DNA. The mechanistic link between the drug and the suggested diseases is not obvious at all from the proposed paths and rules.
Limitations of this approach are not sufficiently discussed. For example, the evaluation is based on mean reciprocal rank, which allows relative comparisons between methods (good methods will rank ground-truth diseases higher). However, this does not indicate that the results are necessarily trustworthy, as suggested in the paper, because the number of false positive predictions cannot be assessed. In my opinion, this limitation needs to be emphasized more clearly. Another limitation is that the knowledge graph can only indirectly leverage molecular associations that have been previously observed, while methods employing molecular data could validate rules. The approach presented here is also not suited for personalized or precision medicine, as patient characteristics are not taken into account.

Code:

The code for XG4Repo is available on GitHub, but it lacks any usage documentation and does not even provide a basic README or usage examples. Furthermore, no open-source license has been added, which makes it hard to use this code for further research.

Version and Review History

Reviewed by Fernando Miguel Delgado-Chaves and Markus List