MIT unveils method to explain protein AI models' predictions


In a significant stride toward demystifying artificial intelligence in biology, researchers at MIT have unveiled a novel method to peer into the inner workings of protein language models. These advanced AI systems, akin to the large language models (LLMs) that power tools like ChatGPT, have become indispensable in recent years for predicting protein structures and functions, aiding in tasks from identifying potential drug targets to designing therapeutic antibodies. While remarkably accurate, their decision-making processes have remained largely opaque—a “black box” phenomenon that has limited researchers’ ability to fully leverage their potential.

The new study, led by MIT graduate student Onkar Gujral and senior author Bonnie Berger, a professor of mathematics and head of the Computation and Biology group at MIT’s Computer Science and Artificial Intelligence Laboratory, offers a critical breakthrough. By illuminating the specific features these models consider when making predictions, the research promises to help scientists select more effective models for particular applications, thereby streamlining the development of new drugs and vaccine candidates. As Berger emphasizes, this work has broad implications for enhancing the interpretability of AI systems crucial for downstream biological applications and could even uncover novel biological insights. The findings are published in the Proceedings of the National Academy of Sciences.

Protein language models operate on principles similar to their text-based counterparts. Instead of analyzing words, they process vast amounts of amino acid sequences, learning patterns that enable them to predict protein characteristics. For instance, Berger’s earlier work in 2021 used such a model to pinpoint sections of viral surface proteins less prone to mutation, identifying potential vaccine targets against influenza, HIV, and SARS-CoV-2. However, the exact mechanisms behind these predictions remained a mystery.
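The idea of treating amino acids the way text models treat words can be made concrete with a toy sketch. The code below is not the model from the study; it stands in for a trained protein language model by mean-pooling hypothetical per-residue vectors into a fixed-size protein embedding (the 480-dimensional size matches the example given later in the article; the residue vectors here are random placeholders, whereas a real model such as ESM learns contextual ones from millions of sequences).

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard residues
D_MODEL = 480                           # embedding size used as an example later in the article

rng = np.random.default_rng(0)
# Hypothetical stand-in for learned per-residue embeddings; a real protein
# language model learns these (and context-dependent versions of them)
# from vast corpora of amino acid sequences.
residue_vectors = {aa: rng.normal(size=D_MODEL) for aa in AMINO_ACIDS}

def embed(sequence: str) -> np.ndarray:
    """Map an amino-acid sequence to a fixed-size protein embedding
    by mean-pooling per-residue vectors (a toy stand-in for a
    transformer's contextual representations)."""
    vecs = [residue_vectors[aa] for aa in sequence]
    return np.mean(vecs, axis=0)

e = embed("MKTAYIAKQR")   # a short toy sequence
print(e.shape)            # (480,)
```

Whatever the internal architecture, the key point is that every protein ends up as a dense numeric vector like `e`, and it is these vectors whose meaning the new method tries to decode.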

To crack open this computational “black box,” the MIT team employed a technique known as a sparse autoencoder, a type of algorithm recently used to shed light on traditional LLMs. Within a neural network, a protein is typically represented by a pattern of activation across a limited number of “nodes,” or “neurons,” loosely analogous to how the brain stores information. A given protein might, for example, be represented by 480 such nodes. A sparse autoencoder dramatically expands this representation, stretching it across a much larger number of nodes, perhaps 20,000. Because the expansion is paired with a “sparsity constraint” that keeps only a few nodes active at once, the information can spread out, so that a feature previously smeared across multiple nodes can come to occupy a single, dedicated node. This makes the activation of each individual node far more meaningful and interpretable.
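A minimal sketch can make the expand-then-sparsify idea concrete. This is not the study's trained autoencoder: the weights below are random placeholders, the latent size is scaled down from the article's ~20,000 for speed, and the sparsity constraint is implemented here as a simple top-k cutoff (one of several common choices; an L1 penalty during training is another).

```python
import numpy as np

rng = np.random.default_rng(0)

D_MODEL = 480     # dense embedding size, as in the article's example
D_LATENT = 4096   # expanded representation (the article cites ~20,000 at full scale)
TOP_K = 32        # sparsity constraint: keep only the K strongest activations

# Hypothetical random weights stand in for a trained sparse autoencoder.
W_enc = rng.normal(0.0, 0.02, (D_MODEL, D_LATENT))
b_enc = np.zeros(D_LATENT)
W_dec = rng.normal(0.0, 0.02, (D_LATENT, D_MODEL))

def encode(x: np.ndarray) -> np.ndarray:
    """Expand a dense embedding into a sparse, overcomplete code."""
    acts = np.maximum(x @ W_enc + b_enc, 0.0)      # linear map + ReLU
    # Top-k sparsity: zero out all but the K largest activations.
    threshold = np.partition(acts, -TOP_K)[-TOP_K]
    return np.where(acts >= threshold, acts, 0.0)

def decode(z: np.ndarray) -> np.ndarray:
    """Reconstruct the original dense embedding from the sparse code."""
    return z @ W_dec

embedding = rng.normal(size=D_MODEL)   # a stand-in protein embedding
code = encode(embedding)
recon = decode(code)

print(np.count_nonzero(code))  # at most TOP_K active nodes out of D_LATENT
```

In a trained autoencoder, the encoder and decoder weights are fit so that `recon` closely matches `embedding` despite the sparsity, which is what forces individual latent nodes to specialize on recurring features.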

Once these sparse representations were generated, the researchers used an AI assistant, Claude, to analyze them. Claude compared the newly explicit representations with known protein features, such as molecular function, family, and cellular location. By analyzing thousands of representations, Claude could identify which specific nodes corresponded to particular protein characteristics and describe them in clear, understandable language. For example, the AI might report that a certain neuron detects proteins involved in transmembrane transport of ions or amino acids, particularly those found in the plasma membrane. The study revealed that protein family and various metabolic and biosynthetic processes were among the features most frequently encoded by these newly interpretable nodes.
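The matching step can be illustrated with a toy version of the same logic: for each sparse node, look at the proteins that strongly activate it and ask which known annotation they share. This sketch is an assumption about the general approach, not the study's pipeline (which used Claude to produce the natural-language descriptions); the protein IDs, activations, annotations, and the `describe_node` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical toy data: one sparse node's activations per protein, plus
# known annotations (function, family, location) for each protein.
activations = {            # node_id -> {protein: activation strength}
    7: {"P1": 3.2, "P2": 0.0, "P3": 2.9, "P4": 0.1},
}
annotations = {            # protein -> list of known feature labels
    "P1": ["ion transport", "plasma membrane"],
    "P2": ["kinase"],
    "P3": ["ion transport", "plasma membrane"],
    "P4": ["kinase", "cytoplasm"],
}

def describe_node(node_id: int, threshold: float = 1.0):
    """Label a node with the annotation most common among the proteins
    that strongly activate it, plus that label's coverage."""
    active = [p for p, a in activations[node_id].items() if a > threshold]
    counts = Counter(label for p in active for label in annotations[p])
    if not counts:
        return None
    label, n = counts.most_common(1)[0]
    return label, n / len(active)

print(describe_node(7))   # ('ion transport', 1.0)
```

Here node 7's strong activators (P1 and P3) are both ion-transport proteins, so the node earns that label, mirroring how the study's transmembrane-transport neuron was characterized.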

This newfound ability to understand what features a protein model is prioritizing opens up exciting possibilities. Researchers can now more intelligently choose or fine-tune models for specific research questions, optimizing their input to achieve superior results. Furthermore, as these models continue to advance in power and sophistication, the capacity to dissect their internal logic holds the promise of uncovering entirely new biological principles, pushing the boundaries of our current understanding of proteins and life itself.
