Binding between proteins and small molecules plays a crucial role in many biological processes, including metabolic reactions, regulatory mechanisms, and signal transduction pathways, so knowing which pairs bind is key to predicting them. However, binding information is limited and covers only a small fraction of the possible protein-molecule combinations. For example, in the Foodome project, we have data on over 135,000 food-related small molecules, yet only 4.58% of these molecules have known binding annotations with proteins [1,2,3]. Although deep learning models have been proposed to speed up the identification of new binding pairs, our research demonstrates that state-of-the-art models struggle to generalize to unfamiliar molecular structures and often learn unintended shortcuts.
In this presentation, I will introduce AI-Bind [4], a novel pipeline that combines network-based sampling strategies with unsupervised pre-training. This combination improves binding predictions for proteins and ligands with insufficient or no prior annotations, as exemplified by food-related small molecules. AI-Bind predictions were validated via docking simulations and comparison with recent experimental evidence. The pipeline also offers a framework for interpreting machine learning predictions of protein-ligand binding by identifying potential active binding sites on the amino acid sequence.
1. Menichetti Giulia. An AI pipeline to investigate the binding properties of poorly annotated molecules. Nature Reviews Physics. Vol. 4(6). 2022. Springer Science and Business Media LLC.
2. Nasirian Farzaneh, Menichetti Giulia. Molecular Interaction Networks and Cardiovascular Disease Risk: The Role of Food Bioactive Small Molecules. Arteriosclerosis, Thrombosis, and Vascular Biology. Vol. 43(6):813–823. 2023. Ovid Technologies (Wolters Kluwer Health).
3. Menichetti Giulia, Barabási Albert-László. Nutrient concentrations in food display universal behaviour. Nature Food. Vol. 3(5):375–382. 2022. Springer Science and Business Media LLC.
4. Chatterjee Ayan, Walters Robin, Shafi Zohair, Ahmed Omair Shafi, Sebek Michael, Gysi Deisy, Yu Rose, Eliassi-Rad Tina, Barabási Albert-László, Menichetti Giulia. Improving the generalizability of protein-ligand binding predictions with AI-Bind. Nature Communications. Vol. 14(1). 2023. Springer Science and Business Media LLC.