More Efficient AI Training

Training neural networks requires enormous computing resources. A new method is expected to save a significant amount of energy.

Researchers at the Technical University of Munich (TUM) have developed a method that is expected to work a hundred times faster and be much more energy-efficient. Instead of adjusting the parameters in an iterative, step-by-step process, the method calculates them directly from the data on the basis of probability. The quality of the results is said to be comparable to that of conventional iterative methods. The university describes the concept behind this approach in the following press release.

More Efficient Training

AI applications, such as large language models (LLMs), have become an integral part of our daily lives. The required computing, storage, and transmission capacities are provided by data centers. However, the energy consumption of these centers is enormous: in 2020, it amounted to approximately 16 billion kilowatt hours in Germany—around one percent of the country’s total electricity demand. By 2025, this figure is expected to rise to 22 billion kilowatt hours.

Furthermore, increasingly complex AI applications will place even greater demands on data centers in the coming years, requiring vast computing resources to train neural networks. To address this challenge, TUM researchers have developed a method that is a hundred times faster and delivers results as accurate as those from conventional training methods. This breakthrough significantly reduces the power required for training.

Neural networks, which are used in AI for tasks such as image recognition and language processing, function similarly to the human brain. They consist of interconnected nodes, known as artificial neurons. These neurons receive input signals, weight them with specific parameters, and sum them up. If a defined threshold is exceeded, the signal is passed on to subsequent nodes.
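
To make this concrete, the following minimal sketch (in Python, with purely illustrative values chosen here rather than taken from the study) shows how a single artificial neuron weights its inputs, sums them, and only passes a signal on once a defined threshold is exceeded.

```python
import numpy as np

def artificial_neuron(inputs, weights, threshold):
    """Weight the input signals, sum them, and fire only above the threshold."""
    weighted_sum = float(np.dot(weights, inputs))  # weight each input and add them up
    if weighted_sum > threshold:                   # threshold exceeded: pass the signal on
        return weighted_sum
    return 0.0                                     # otherwise the neuron stays silent

# Illustrative inputs, weights, and threshold
signal = artificial_neuron(inputs=np.array([0.5, 1.2, -0.3]),
                           weights=np.array([0.8, 0.5, 0.4]),
                           threshold=0.4)
print(signal)
```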

To train the network, parameter values are typically initialized randomly, often following a normal distribution. These values are then gradually adjusted through small changes to improve the network’s predictions. Since this training method requires numerous repetitions, it is highly time-consuming and energy-intensive.
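
As a rough illustration of this conventional procedure (a minimal sketch on a toy problem, not the researchers' code), the loop below initializes the parameters from a normal distribution and then nudges them through many small gradient steps; the sheer number of repetitions is what makes iterative training so time- and energy-consuming.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data (illustrative only)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.7]) + 0.1 * rng.normal(size=200)

# Parameters initialized randomly from a normal distribution
w = rng.normal(size=3)

learning_rate = 0.01
for step in range(5000):                 # many small, repeated updates
    error = X @ w - y
    gradient = X.T @ error / len(y)      # how the loss changes with each parameter
    w -= learning_rate * gradient        # small adjustment to improve the predictions

print(w)                                 # approaches [1.5, -2.0, 0.7] only after many steps
```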

A New Method

Felix Dietrich, Professor of Physics-Enhanced Machine Learning, and his team have developed an alternative approach. Instead of iteratively determining the parameters between nodes, their method relies on probability calculations. This probabilistic approach strategically selects values that are located at critical points in the training data.

The method focuses on points where values change rapidly and significantly. The current study aims to use this approach to model energy-conserving dynamical systems from data. Such systems evolve over time according to specific rules and can be found in areas like climate modeling and financial markets.
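
The press release does not spell out the algorithm, but the general idea can be sketched under the assumption of a sampled-weights scheme in the spirit of random-feature methods: hidden-layer parameters are computed directly from pairs of data points, drawn with a probability that favors regions where the values change strongly, and the output layer is then obtained in a single least-squares solve rather than through many gradient steps. All names and numbers below are illustrative, not taken from the study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1D data with a region of rapid change (illustrative only)
x = np.linspace(-3, 3, 400).reshape(-1, 1)
y = np.tanh(5 * x).ravel()

n_hidden = 50
n_candidates = 2000

# Draw candidate pairs of data points; keep pairs with probability
# proportional to how strongly the target changes between them.
i = rng.integers(0, len(x), size=n_candidates)
j = rng.integers(0, len(x), size=n_candidates)
change = np.abs(y[i] - y[j]) + 1e-12
p = change / change.sum()
chosen = rng.choice(n_candidates, size=n_hidden, replace=False, p=p)
i, j = i[chosen], j[chosen]

# Hidden-layer parameters computed directly from the chosen data pairs
diff = x[j] - x[i]
scale = 2.0 / (np.sum(diff**2, axis=1) + 1e-12)
W = diff * scale[:, None]                       # weights point along the data difference
b = -np.sum(W * (x[i] + x[j]) / 2.0, axis=1)    # activation centered between the pair

# Output layer solved in one shot by least squares -- no iterative training
H = np.tanh(x @ W.T + b)
out_w, *_ = np.linalg.lstsq(H, y, rcond=None)

print("mean squared error:", np.mean((H @ out_w - y) ** 2))
```

Because no parameter is updated repeatedly, the cost in this sketch is dominated by a single matrix factorization instead of thousands of training iterations.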

"Our method allows us to determine the necessary parameters with minimal computational effort, enabling much faster and more energy-efficient neural network training," explains Felix Dietrich. "Furthermore, our approach has been shown to achieve accuracy comparable to that of iteratively trained networks."