The algorithm identifies new materials faster and cuts simulation campaigns by more than 500,000 CPU hours.

Tech giant IBM, in collaboration with the University of Liverpool, has developed a new algorithm that uses Bayesian optimization, a machine learning technique, to identify which simulations are the most valuable to run. The algorithm is designed to help researchers identify new materials for gas storage. According to the company, it can cut more than 500,000 CPU hours (CPUh) from costly simulation campaigns and will help scientists perform materials discovery more efficiently. It achieves this by directing computational resources to the most relevant simulations, ultimately saving computing power over the course of a campaign.
The work is part of IBM’s Bayesian optimization family of algorithms. Bayesian optimization is a machine learning technique typically used for complex, expensive optimization problems. It builds on Bayes’ theorem, which gives the probability of an event based on knowledge of conditions that could influence or be related to that event. In practice, the method fits a probabilistic surrogate model to the objective function and searches that surrogate with an acquisition function; the candidate samples the acquisition function selects are then evaluated on the real objective function. Because the surrogate is cheap to query, good candidates can be found with far fewer expensive evaluations, which is also why the technique is widely used for tasks such as hyperparameter tuning.
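The loop described above can be sketched in a few lines. The following is a minimal, illustrative Bayesian optimization loop, not IBM’s implementation: the Gaussian process surrogate, the RBF kernel, the expected improvement acquisition function, and the toy `objective` standing in for an expensive simulation are all assumptions made for this example.

```python
import numpy as np
from math import erf

_erf = np.vectorize(erf)

def rbf_kernel(a, b, length=0.5):
    # Squared-exponential kernel between two 1-D arrays of points.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-6):
    # Gaussian process posterior mean and std at the query points.
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf_kernel(x_train, x_query)
    Kss = rbf_kernel(x_query, x_query)
    K_inv = np.linalg.inv(K)
    mu = Ks.T @ K_inv @ y_train
    var = np.diag(Kss - Ks.T @ K_inv @ Ks)
    return mu, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mu, sigma, best):
    # Expected improvement over the best observation (minimization).
    z = (best - mu) / sigma
    cdf = 0.5 * (1 + _erf(z / np.sqrt(2)))
    pdf = np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)
    return (best - mu) * cdf + sigma * pdf

def objective(x):
    # Hypothetical stand-in for an expensive simulation.
    return np.sin(3 * x) + 0.5 * x

rng = np.random.default_rng(0)
x_train = rng.uniform(0, 2, 3)          # a few initial "simulations"
y_train = objective(x_train)
candidates = np.linspace(0, 2, 200)

for _ in range(10):
    mu, sigma = gp_posterior(x_train, y_train, candidates)
    ei = expected_improvement(mu, sigma, y_train.min())
    x_next = candidates[np.argmax(ei)]  # most promising simulation to run next
    x_train = np.append(x_train, x_next)
    y_train = np.append(y_train, objective(x_next))

print(float(y_train.min()))
```

At each step the surrogate is refit to all results so far, so the loop spends its "simulation budget" on the candidates the acquisition function judges most promising rather than sweeping the whole search space.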
The application of Bayesian optimization to materials discovery is part of IBM’s Accelerated Discovery strategy. To test the algorithm, the team applied it to a real materials discovery problem, asking it to search simultaneously for materials with good gas storage properties that could also be easily observed in the lab. According to the results, the success was driven by the Batch Generalized Thompson Sampling algorithm, which had previously been developed to handle multiple objectives, such as gas storage property score and lattice energy.
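To make the multi-objective idea concrete, the sketch below shows plain batch Thompson sampling over two hypothetical objectives: a gas storage score to maximize and a lattice energy to minimize. It is not IBM’s Batch Generalized Thompson Sampling; the Gaussian posterior beliefs and the random-weight scalarization are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_candidates, batch_size = 50, 4

# Hypothetical posterior beliefs over each candidate material's
# gas-storage score (higher is better) and lattice energy (lower is better).
storage_mu = rng.normal(0.5, 0.2, n_candidates)
storage_sd = np.full(n_candidates, 0.1)
energy_mu = rng.normal(0.0, 1.0, n_candidates)
energy_sd = np.full(n_candidates, 0.3)

batch = []
for _ in range(batch_size):
    # Thompson step: draw one posterior sample per objective ...
    storage_draw = rng.normal(storage_mu, storage_sd)
    energy_draw = rng.normal(energy_mu, energy_sd)
    # ... scalarize the two objectives with a random weight ...
    w = rng.uniform()
    utility = w * storage_draw - (1 - w) * energy_draw
    # ... and add the best not-yet-chosen candidate to the batch.
    for idx in np.argsort(-utility):
        if idx not in batch:
            batch.append(int(idx))
            break

print(batch)  # distinct candidate indices to simulate next
```

Because each batch member comes from a fresh posterior draw, the batch naturally mixes exploitation of likely winners with exploration of uncertain candidates, which is what makes the approach suited to choosing which simulations to run next.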

This has allowed the team to accelerate a technique called Energy-Structure-Function (ESF) maps. While these maps are a powerful tool for computational materials design, they have so far been too expensive for routine use.
IBM has been actively exploring uses of Bayesian machine learning beyond traditional applications such as hyperparameter tuning, across a range of science and engineering problems. In December 2020, the company launched the Bayesian Optimization Accelerator, a general-purpose parameter optimization tool that speeds up inference using fewer samples and less CPU time. IBM has since packaged its advanced algorithms into the IBM Bayesian Optimization (IBO) service, which gives users access to a range of machine learning models and programs without requiring data science or AI expertise.
This latest algorithm is no different: it is expected to enable industries such as pharmaceuticals, materials chemistry and drug discovery to search for new materials more systematically and efficiently. However, IBM notes that numerous obstacles remain. In a press release, the company explained that handling a variety of constraints and objectives without passing that complexity on to the user is still a challenge, as is incorporating data from multiple sources during the optimization process.
The results of the research are published in Science Advances. Read the full paper here.
For more information, visit IBM’s official website.