Bluelight

Thread: Predicting Structure-Activity Relationships through Data Mining/Predictive Analytics

    #1
    Hello N&PD,

    This might have come up as a topic of discussion before, but I haven't seen it addressed in a while. Over the last 20 years machine learning has become more mainstream, in part due to advances in theoretical understanding, growing computational power, and general acceptance of how versatile these methods are for discovering hidden relationships in large data sets. I recently built a neural network (NN) and have been playing a lot with different configurations, and I'm growing pretty excited about the ability of NNs to classify data in complex real-world problems where intuition can fail. Some of the recent achievements in this field are hard to ignore.

    For example, Google has used deep learning to train a program that was given only the rules of chess, taught itself to play, and subsequently beat the world champion chess program, Stockfish 8, which has been under development for many years. (Read more: alphazero google deepmind ai beats champion program teaching itself to play in four hours). Other recent headlines related to deep learning show that facial recognition software is now advanced enough to help find a fugitive in a crowd of 60,000 (Read more: chinese facial recognition recognizes wanted man in crowd of 60000). It made me wonder whether these deep learning techniques could make accurate predictions of structure-activity relationships for new molecules, that is, whether a database of existing binding affinities could be used to predict the binding affinities of molecules for which no data has been determined experimentally.
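
    To make the "predict from a database of known affinities" idea concrete, here is a minimal sketch. It is deliberately not a neural network: it is a plain nearest-neighbour baseline using Tanimoto similarity, chosen only because it fits in a few lines. The fingerprints (sets of "on" bit indices), the affinity values, and the function names are all made up for illustration.

```python
from typing import List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity of two fingerprints given as sets of 'on' bits."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def predict_affinity(query: Set[int],
                     known: List[Tuple[Set[int], float]],
                     k: int = 2) -> float:
    """Similarity-weighted mean affinity of the k most similar known molecules."""
    ranked = sorted(((tanimoto(query, fp), aff) for fp, aff in known),
                    key=lambda pair: pair[0], reverse=True)[:k]
    total = sum(sim for sim, _ in ranked)
    if total == 0.0:
        # Nothing similar in the database: fall back to the overall mean.
        return sum(aff for _, aff in known) / len(known)
    return sum(sim * aff for sim, aff in ranked) / total


# Hypothetical database: (fingerprint, measured affinity on a pKi-like scale).
known = [
    ({1, 2, 3, 4}, 8.0),
    ({1, 2, 3, 5}, 7.0),
    ({6, 7, 8}, 5.0),
    ({6, 7, 9, 10}, 4.0),
]

query = {1, 2, 3}  # a "new" molecule with no measured affinity
prediction = predict_affinity(query, known, k=2)
print(f"predicted affinity: {prediction:.2f}")
```

    No physics or chemistry enters the prediction beyond whatever the fingerprint encodes. A deep NN, random forest or SVM would slot into the same place as predict_affinity, learning the mapping from fingerprint to affinity rather than just averaging over neighbours.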

    By no means is this a new area of inquiry. Researchers have been looking at these kinds of questions for decades. In the past, NNs lost favor with researchers as a viable strategy for data mining, giving way to other techniques like random forests and support vector machines (SVMs). Only recently have deep neural networks (with more than one hidden layer of artificial neurons) become viable for these types of research questions, and recent reports have shown that they can actually do a better job than other techniques used in data mining.

    For example, a recent publication, https://pubs.acs.org/doi/abs/10.1021/ci500747n, shows that deep NNs can begin to outperform more established data mining techniques.

    Anyways, I'm relatively new to machine learning and I'm not a medicinal chemist, so I don't have strong views on this, but I've been very impressed with the ability of neural networks to make predictions that defy intuition in complicated data sets. I know the intuition of a skilled medicinal chemist is often hard to beat when proposing candidates for new drugs, but I'm wondering whether data mining might dramatically change the way drug discovery is done in the future, and what some of the most promising techniques for realizing this might be. I know binding affinities can be calculated ab initio in some cases with good results. That is very different from data mining, however, because with data mining the physics and chemistry of the molecule are not required to make predictions. Rather, the complicated SAR is discovered from the data without need of physical intuition. The results of this data mining might then inform refinements in detailed physical models. How might ab initio calculations compare with NNs or other data mining techniques in terms of the accuracy of the predicted binding affinities?

    I don't want to make this an overly restrictive discussion. I look forward to hearing from people who have insights, opinions, open questions and relevant papers related to data mining techniques in the drug discovery pipeline and how they might shape the future of drug discovery.
    Last edited by levels; 16-04-2018 at 22:15. Reason: fixed links/ invite papers

    #2
    In silico models alone have not got a great track record. They are of far more value as an aid to old-fashioned rational design. Even recently, people have made some great discoveries (in established fields) using CHARMM (Chemistry at HARvard Macromolecular Mechanics), which is free to students and non-profit organizations. J Mol Model (2011) 17:477–493 is a CLASSIC example. Of course, since then we have made strides into more subtypes, but with the appropriate training sets it's quite good at spotting an active. For less mature fields, I think others will know better than I.