Jamal Shamsara* Pages 555-569 (15)
Background: Soluble epoxide hydrolase (sEH) is an enzyme expressed ubiquitously across tissues. Inhibition of sEH has shown promising results in treating hypertension and in alleviating pain and inflammation.
Objective: This study uses machine learning to develop a predictive QSAR model for a large set of sEH inhibitors.
Methods: The random forest method was used to build a validated model for predicting sEH inhibition. In addition, two new methods, the Treeinterpreter Python package and LIME (Local Interpretable Model-agnostic Explanations), were used to explain and interpret the model.
Results: The performance metrics of the model were R² = 0.831, Q² = 0.565, RMSE = 0.552, and R²pred = 0.595. The model also showed good predictive ability on two additional external test sets, at least in terms of ranking: the Spearman's rank correlation coefficients for external test sets 1 and 2 were 0.872 and 0.673, respectively. External test set 2 was diverse compared with the training set. The model could therefore be used in virtual screening to enrich potential sEH inhibitors from a diverse compound library.
Conclusion: Because the model was built solely on a set of simple fragmental descriptors, it could be explained by two local interpretation algorithms, which could guide medicinal chemists in designing new sEH inhibitors. Moreover, the most important general descriptors (fragments) suggested by the model were consistent with the available crystallographic data. The model is available as an executable binary at http://www.pharm-sbg.com and https://github.com/shamsaraj.
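The metrics quoted in the Results (a coefficient of determination, RMSE, and Spearman's rank correlation) can be illustrated with a minimal, standard-library Python sketch. The abstract does not state the exact formulas used for R², Q², or R²pred, so the coefficient-of-determination form below is an assumption, and all function names and activity values are hypothetical placeholders, not data from the study.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed form)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted activities."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(observed, predicted):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(observed), average_ranks(predicted))


# Hypothetical activity values (e.g. pIC50), for illustration only.
observed = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 5.8, 7.4, 7.9]
print(r_squared(observed, predicted), rmse(observed, predicted),
      spearman(observed, predicted))
```

Because Spearman's coefficient depends only on the ordering of compounds, it can remain high even when absolute prediction errors are sizeable, which is why ranking quality is the relevant measure for a virtual screening use case.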
Keywords: Cheminformatics, machine learning, QSAR, random forest, soluble epoxide hydrolase, virtual screening.
Pharmaceutical Research Center, Pharmaceutical Technology Institute, Mashhad University of Medical Sciences, Mashhad