AI model develops dual-target drug candidates using chemical language processing
Researchers at the University of Bonn have developed an artificial intelligence system that can predict chemical structures of compounds capable of binding to multiple protein targets, potentially streamlining the discovery of more effective therapeutic agents.
A team led by Professor Jürgen Bajorath at the University of Bonn has adapted a transformer architecture, similar to that used in large language models, to process and generate chemical structures. Their system was specifically trained to identify compounds that can interact with two different protein targets at once. This kind of multi-target activity, known as polypharmacology, is particularly valuable in drug development.
The research, published in Cell Reports Physical Science on 23 October 2024 [1], demonstrates how the model can be trained using SMILES notation – a string-based representation of chemical structures – to learn the subtle structural differences between single-target and dual-target compounds.
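To make the idea of treating SMILES as a language concrete, the following Python sketch splits a SMILES string into tokens that a language model could consume. The article does not describe how the Bonn model tokenizes its input, so the regular expression and the function name `tokenize_smiles` are illustrative assumptions, not the authors' method.

```python
import re

# Hypothetical token pattern (an assumption, not taken from the paper):
# bracket atoms first, then two-letter halogens, two-digit ring closures,
# single letters, single digits, and bond/branch symbols.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|[A-Za-z]|\d|[=#\-+\\/().:~@*$]")


def tokenize_smiles(smiles: str) -> list[str]:
    """Split a SMILES string into tokens, one per atom, bond, or ring/branch symbol."""
    tokens = SMILES_TOKEN.findall(smiles)
    # Guard: the tokens must reassemble the input exactly, otherwise the
    # string contained a character the pattern does not cover.
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognised characters in SMILES: {smiles!r}")
    return tokens


# Aspirin, written as a SMILES string.
aspirin = "CC(=O)Oc1ccccc1C(=O)O"
tokens = tokenize_smiles(aspirin)
print(tokens)
```

Each token then plays the role that a word or word fragment plays in a text model. Treating "Cl" and "Br" as single tokens keeps the model from confusing chlorine with a carbon followed by a stray letter.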
Training methodology
The researchers trained their chemical language model on more than 70,000 pairs of SMILES strings, where each pair consisted of a compound known to act on a single target protein and another compound known to affect both that protein and an additional target. This training approach enabled the model to learn the structural features that contribute to dual-target activity.
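The pairing scheme described above can be sketched in a few lines of Python. The compound strings and the target labels T1 to T3 below are placeholders rather than real activity data, and the function `build_pairs` is hypothetical. The article does not say which additional criteria the researchers used to select pairs, so this sketch simply pairs every single-target compound with every dual-target compound that shares its target.

```python
from itertools import product

# Toy records: SMILES string -> set of target labels.
# Placeholder values for illustration only, not real activity annotations.
activity = {
    "CCO": {"T1"},
    "CCN": {"T1"},
    "c1ccccc1O": {"T1", "T2"},
    "c1ccccc1N": {"T2", "T3"},
    "CCC": {"T2"},
}


def build_pairs(activity):
    """Return (single-target SMILES, dual-target SMILES) training pairs."""
    # Compounds annotated with exactly one target, kept with that target.
    singles = [(s, next(iter(t))) for s, t in activity.items() if len(t) == 1]
    # Compounds annotated with exactly two targets.
    duals = [(s, t) for s, t in activity.items() if len(t) == 2]
    pairs = []
    for (source, target), (dual, dual_targets) in product(singles, duals):
        # Keep the pair only if the dual-target compound also hits the
        # single-target compound's protein.
        if target in dual_targets:
            pairs.append((source, dual))
    return pairs


for source, dual in build_pairs(activity):
    print(source, "->", dual)
```

In a sequence-to-sequence setup of this kind, the first string of each pair would serve as the model input and the second as the output it learns to generate.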
“In pharmaceutical research, these types of active compounds are highly desirable due to their polypharmacology,” explained Professor Bajorath, who heads the AI in Life Sciences area at the Lamarr Institute for Machine Learning and Artificial Intelligence. “Because compounds with desirable multi-target activity influence several intracellular processes and signalling pathways at the same time, they are often particularly effective – such as in the fight against cancer.”
Advantages over combination therapy
While similar therapeutic effects might be achieved through the co-administration of multiple drugs, this approach often presents challenges related to drug-drug interactions and varying pharmacokinetics. The development of single compounds with dual-target activity could potentially overcome these limitations.
Following the initial training phase, the researchers performed additional fine-tuning using several dozen specially selected compound pairs. This step was crucial for teaching the model to identify compounds that could target proteins from different functional classes, rather than just similar proteins.
Novel structural insights
One of the most significant outcomes of the research was the model’s ability to suggest unexpected chemical structures. “It is more interesting, from my point of view, that the AI often suggests chemical structures that most chemists would not even think of right away,” noted Prof. Bajorath. “To a certain extent, it triggers ‘out of the box’ ideas and comes up with original solutions that can lead to new design hypotheses and approaches.”
The model’s effectiveness was demonstrated through its ability to reproduce known dual-target compounds, confirming the validity of the approach. Although the immediate discovery of superior drugs is not the primary aim, the system’s capacity to generate novel structural suggestions could help expand the chemical space explored during drug development.
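A reproduction check of this kind can be expressed as a simple overlap measure: what fraction of known dual-target compounds, withheld from training, appears among the model's generated structures? The Python sketch below uses placeholder strings and a hypothetical function `reproduction_rate`; it is not the evaluation protocol from the paper. It also compares strings exactly, which is a simplification, because the same molecule can be written as several different SMILES strings. A real pipeline would first convert all structures to a canonical form with a cheminformatics toolkit.

```python
def reproduction_rate(generated, known):
    """Return the fraction of known compounds found among generated ones, plus the hits."""
    generated_set = set(generated)  # duplicates in the model output count once
    hits = [smiles for smiles in known if smiles in generated_set]
    rate = len(hits) / len(known) if known else 0.0
    return rate, hits


# Placeholder strings for illustration only; not real model output.
generated = ["CCO", "CCN", "c1ccccc1O", "CCO"]
known = ["c1ccccc1O", "c1ccccc1N"]

rate, hits = reproduction_rate(generated, known)
print(f"reproduced {len(hits)} of {len(known)} known compounds ({rate:.0%})")
```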
Future implications
This research represents a significant step forward in the application of artificial intelligence to drug discovery, particularly in the challenging area of multi-target drug design. The approach could accelerate the identification of lead compounds with desired polypharmacological profiles, though further validation and refinement of the model will be necessary.
The work was conducted at the University of Bonn’s Lamarr Institute and the Bonn-Aachen International Center for Information Technology (b-it), combining expertise in computational chemistry and artificial intelligence.
Reference:
1. Srinivasan, S., & Bajorath, J. (2024). Generation of dual-target compounds using a transformer chemical language model. Cell Reports Physical Science. https://doi.org/10.1016/j.xcrp.2024.102255
Three-dimensional structures of two target proteins, histone deacetylase 6 (blue) and tyrosine-protein kinase JAK2 (red), together with a selective inhibitor of each enzyme. The dual inhibitor in the centre is active against both targets. The prediction of compounds with predefined dual-target activity is the task of the chemical language model. (Figure: Sanjana Srinivasan & Jürgen Bajorath)