Science-driven chemical machine learning
Johannes Margraf, Fritz-Haber-Institut, Germany
The overarching goal of my research is to establish a science-driven approach to chemical machine learning (ML). In many fields, ML is a fundamentally data-driven endeavor, meaning that specific databases and benchmark problems (i.e., big data) are at the center of methodological development. While this has certainly led to tremendous advances in recent years (e.g., in image generation and natural language processing), the full diversity and complexity of chemistry cannot be adequately represented by a few predefined databases. A major driver of my work is therefore the desire to build accurate, data-efficient models that do not require enormous reference datasets for training. The aim is to apply our methods to any problem of chemical interest, not just to those for which big data happens to be available. Several examples of this approach in the context of the atomistic simulation of energy materials will be discussed.