Computational methods have become increasingly popular in drug repurposing, and artificial intelligence techniques have proven to be valuable tools for predicting new therapeutic opportunities for existing drugs. Within this framework, biomedical data can be represented as graphs, which offer a highly expressive way to depict the underlying structure of the information. Integrating these graph-based data structures with deep learning models enhances the prediction of novel connections, such as potential associations between diseases and drugs, which can be regarded as repurposing opportunities.
In particular, we have employed the heterogeneous integrated biomedical information from the DISNET project (https://disnet.ctb.upm.es) [1], which compiles structured and unstructured data centred on diseases and is based on the concepts of the Human Disease Network (HDN) [2]. This information is represented as a multilayered complex network whose node types, in addition to diseases, include symptoms, genes, proteins, genetic variants, non-coding RNAs, biological pathways, and drugs, among others. The links represent, among others, the connections between diseases and genes and between diseases and drugs, the interactions between proteins (Protein-Protein Interactions, PPIs), and the interactions between drugs (Drug-Drug Interactions, DDIs). We consider two graphs derived from the DISNET knowledge base: a simplified version, which we name simple, and the complete version, which we name complex. The simple graph contains a limited set of information: diseases, symptoms, drugs, and their relationships. The complex graph uses all the available information. Its node types are “phenotype”, “drug”, “pathway”, “protein” and “drug-drug-interaction”, and its link types are Disease-Drug (therapeutic) (“dis_dru_the”), Disease-Symptom (“dse_sym”), Disease-Protein (“dis_pro”), Disease-Pathway (“dis_pat”), Drug-Drug (“druA_druB”), Drug-Protein (“dru_pro”), Drug-Symptom (side effect) (“dru_sym_sef”), Drug-Symptom (indication) (“dru_sym_ind”), Protein-Protein (“proA_proB”), Protein-Pathway (“pro_pat”), DDI-Phenotype (“ddi_phe”), and DDI-Drug (“ddi_dru”).
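As an illustration of the schema described above, the following minimal Python sketch encodes each link type as a pair of endpoint node types. The link-type identifiers are those given in the text. The endpoint assignments are assumptions: in particular, diseases and symptoms are both mapped to the “phenotype” node type, and the names `LINK_TYPES`, `node_types` and `restrict` are hypothetical. The last line approximates the simple graph by keeping only links between phenotype and drug nodes; whether the actual simple graph includes Drug-Drug links is not stated here, so this filter should be read as a guess rather than as the DISNET definition.

```python
# Link-type identifiers come from the text. The (source, target) node types
# are an ASSUMPTION: diseases and symptoms are both treated as "phenotype".
LINK_TYPES = {
    "dis_dru_the": ("phenotype", "drug"),       # Disease-Drug (therapeutic)
    "dse_sym": ("phenotype", "phenotype"),      # Disease-Symptom
    "dis_pro": ("phenotype", "protein"),        # Disease-Protein
    "dis_pat": ("phenotype", "pathway"),        # Disease-Pathway
    "druA_druB": ("drug", "drug"),              # Drug-Drug
    "dru_pro": ("drug", "protein"),             # Drug-Protein
    "dru_sym_sef": ("drug", "phenotype"),       # Drug-Symptom (side effect)
    "dru_sym_ind": ("drug", "phenotype"),       # Drug-Symptom (indication)
    "proA_proB": ("protein", "protein"),        # Protein-Protein
    "pro_pat": ("protein", "pathway"),          # Protein-Pathway
    "ddi_phe": ("drug-drug-interaction", "phenotype"),  # DDI-Phenotype
    "ddi_dru": ("drug-drug-interaction", "drug"),       # DDI-Drug
}


def node_types(link_types):
    """Return the sorted set of node types used by the given link types."""
    return sorted({t for pair in link_types.values() for t in pair})


def restrict(link_types, allowed):
    """Keep only the link types whose two endpoints are both in `allowed`."""
    allowed = set(allowed)
    return {name: pair for name, pair in link_types.items()
            if set(pair) <= allowed}


complex_schema = LINK_TYPES
# ASSUMPTION: the simple graph is approximated as "phenotype and drug only".
simple_schema = restrict(LINK_TYPES, {"phenotype", "drug"})
```

Calling `node_types(complex_schema)` recovers the five node types listed in the text, which is a quick consistency check between the node list and the link list.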
From these two networks, we have developed several models based on Graph Neural Network (GNN) pipelines. Each model follows an encoder-decoder framework: it first embeds the information in the network, representing each node as a vector of features, and then decodes these embedding vectors, optimizing the prediction of new disease-drug links. First, we presented REDIRECTION (dRug rEpurposing Disnet lInk pREdiCTION) [3], a model developed under this encoder-decoder framework and trained on the simple graph to predict “dis_dru_the” links. Afterwards, we presented DMSR (Drug Molecular Structure REDIRECTION), which incorporates drug molecular structures as the initial features of the drug nodes in REDIRECTION [4]. This model also uses GNNs to embed two-dimensional drug molecular structures and obtain the corresponding features. DMSR was originally trained on the simple graph, where it improved on the accuracy of REDIRECTION. We also presented BEHOR (Bidirectional Edge and Hyperparameter Optimized REDIRECTION), in which links were made undirected, hyperparameters were optimized, and the complex graph was used [5]. Finally, we developed a version of DMSR trained on the complex graph. This last model obtained the best accuracy for predicting new repurposing opportunities.
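The encoder-decoder idea described above can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the architecture of REDIRECTION, DMSR or BEHOR: the encoder is a single round of mean aggregation over neighbours with no learned weights, the decoder is a sigmoid of the dot product between a disease embedding and a drug embedding, and all node names, feature vectors and function names (`make_undirected`, `encode`, `decode`) are hypothetical. The `make_undirected` helper mirrors the step of making links undirected by adding the reverse of every edge.

```python
import math


def make_undirected(edges):
    """Add the reverse of every (source, target) edge, without duplicates."""
    return sorted(set(edges) | {(dst, src) for src, dst in edges})


def encode(features, edges, rounds=1):
    """Toy encoder: each node averages its own vector with the vectors of
    its in-neighbours. A real GNN layer would add learned weights."""
    emb = {node: list(vec) for node, vec in features.items()}
    for _ in range(rounds):
        updated = {}
        for node, vec in emb.items():
            messages = [emb[src] for src, dst in edges if dst == node]
            messages.append(vec)  # include the node's own vector
            updated[node] = [sum(col) / len(messages)
                             for col in zip(*messages)]
        emb = updated
    return emb


def decode(emb, disease, drug):
    """Toy decoder: sigmoid of the dot product of the two embeddings."""
    score = sum(a * b for a, b in zip(emb[disease], emb[drug]))
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical toy graph: two diseases sharing a symptom, one known drug each.
features = {
    "disease_1": [1.0, 0.0],
    "disease_2": [0.0, 1.0],
    "drug_1": [1.0, 0.0],
    "drug_2": [0.0, 1.0],
    "symptom_1": [1.0, 0.0],
}
edges = make_undirected([
    ("disease_1", "drug_1"),
    ("disease_1", "symptom_1"),
    ("disease_2", "symptom_1"),
    ("disease_2", "drug_2"),
])

embeddings = encode(features, edges)
# Score a candidate (not yet observed) disease-drug link.
candidate_score = decode(embeddings, "disease_2", "drug_1")
```

In a trained model, the encoder weights and the initial node features (for example, the molecular-structure embeddings used in DMSR) would be learned or supplied from data, and candidate disease-drug pairs would be ranked by the decoder score.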