      Prediction of KRASG12C inhibitors using conjoint fingerprint and machine learning-based QSAR models.

          Abstract

          Kirsten rat sarcoma virus G12C (KRASG12C) is the major protein mutation associated with non-small cell lung cancer (NSCLC) severity. Inhibiting KRASG12C is therefore one of the key therapeutic strategies for NSCLC patients. In this paper, a cost-effective, data-driven drug design approach employing machine learning-based quantitative structure-activity relationship (QSAR) analysis was built to predict ligand affinities against the KRASG12C protein. A curated and non-redundant dataset of 1033 compounds with KRASG12C inhibitory activity (pIC50) was used to build and test the models. The PubChem fingerprint, Substructure fingerprint, Substructure fingerprint count, and the conjoint fingerprint (a combination of PubChem fingerprint and Substructure fingerprint count) were used to train the models. Using comprehensive validation methods and various machine learning algorithms, the results showed that XGBoost regression (XGBoost) achieved the highest performance in terms of goodness of fit, predictivity, generalizability, and model robustness (R² = 0.81, Q²_CV = 0.60, Q²_Ext = 0.62, R² - Q²_Ext = 0.19, R²_Y-Random = 0.31 ± 0.03, Q²_Y-Random = -0.09 ± 0.04). The top 13 molecular fingerprints that correlated with the predicted pIC50 values were SubFPC274 (aromatic atoms), SubFPC307 (number of chiral centers), PubChemFP37 (≥1 chlorine), SubFPC18 (number of alkylarylethers), SubFPC1 (number of primary carbons), SubFPC300 (number of 1,3-tautomerizables), PubChemFP621 (N-C:C:C:N structure), PubChemFP23 (≥1 fluorine), SubFPC2 (number of secondary carbons), SubFPC295 (number of C-ONS bonds), PubChemFP199 (≥4 6-membered rings), PubChemFP180 (≥1 nitrogen-containing 6-membered ring), and SubFPC180 (number of tertiary amines). These molecular fingerprints were virtualized and validated using molecular docking experiments. In conclusion, the conjoint fingerprint and XGBoost-QSAR model proved useful as a high-throughput screening tool for KRASG12C inhibitor identification and drug design.
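
          To make two ideas from the abstract concrete, the following is a minimal sketch in plain Python, not the authors' code. It assumes that the conjoint fingerprint is a simple concatenation of the binary PubChem bits with the Substructure counts, and that the external Q² is computed around the training-set mean (one common convention; the abstract does not say which definition was used). All function names and the toy numbers are hypothetical.

```python
from statistics import mean


def conjoint_fingerprint(pubchem_bits, substructure_counts):
    """Join a binary fingerprint and a count fingerprint into one feature vector.

    Assumption: "conjoint" means plain concatenation of the two vectors.
    """
    return list(pubchem_bits) + list(substructure_counts)


def r_squared(y_true, y_pred, reference_mean=None):
    """Return 1 - SS_res / SS_tot, with SS_tot taken around reference_mean.

    With reference_mean=None the mean of y_true is used (ordinary R2).
    Passing the training-set mean gives one common form of external Q2.
    """
    ref = mean(y_true) if reference_mean is None else reference_mean
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - ref) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data, not taken from the paper.
pubchem_bits = [1, 0, 1]      # presence/absence bits
substructure_counts = [6, 2]  # substructure occurrence counts
features = conjoint_fingerprint(pubchem_bits, substructure_counts)

y_train = [5.0, 6.0, 7.0, 8.0]       # observed pIC50, training set
y_train_pred = [5.5, 6.0, 6.5, 8.0]  # fitted pIC50, training set
y_test = [6.0, 7.0]                  # observed pIC50, external set
y_test_pred = [6.5, 6.5]             # predicted pIC50, external set

r2 = r_squared(y_train, y_train_pred)
q2_ext = r_squared(y_test, y_test_pred, reference_mean=mean(y_train))
gap = r2 - q2_ext  # goodness of fit minus external predictivity

print(features, round(r2, 3), round(q2_ext, 3), round(gap, 3))
```

          The gap between fit and external predictivity is the quantity the abstract reports as R² - Q²_Ext: with the published values, 0.81 - 0.62 = 0.19. A smaller gap suggests less overfitting.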

          Author and article information

          Journal
          J Mol Graph Model
          Journal of molecular graphics & modelling
          Elsevier BV
          1873-4243
          1093-3263
          Jul 2023, volume 122
          Affiliations
          [1] Division of Pharmaceutical Chemistry, Faculty of Pharmaceutical Sciences, Khon Kaen University, 40002, Thailand. Electronic address: tarasri@kku.ac.th.
          [2] Faculty of Pharmaceutical Sciences, Khon Kaen University, 40002, Thailand.
          [3] Division of Pharmaceutical Chemistry, Faculty of Pharmaceutical Sciences, Khon Kaen University, 40002, Thailand.
          Article
          PII: S1093-3263(23)00064-5
          DOI: 10.1016/j.jmgm.2023.108466
          PMID: 37058997
          e516c958-eb2f-4e38-99bf-9d9953e63ddc
          Keywords: Drug design, XGBoost, Support vector regression, Random forest, QSAR, Machine learning, KRAS, Deep neural network
