More Efficient AI Training

Training neural networks has so far required enormous computational resources. A new method is set to save a significant amount of energy.

Researchers at the Technical University of Munich (TUM) have developed a method that is reported to be a hundred times faster and therefore significantly more energy-efficient. Instead of proceeding iteratively, step by step, it calculates the parameters directly from the data based on their probability. The quality of the results is said to be comparable to that of the iterative methods in common use so far. The university describes the underlying concept in the following press release.

More Efficient Training

AI applications, such as large language models (LLMs), are now an indispensable part of our everyday lives. The required computing, storage, and transmission capacities are provided by data centers. However, the energy consumption of these centers is enormous: in 2020, it was around 16 billion kilowatt-hours in Germany—about one percent of the entire country’s electricity demand. A rise to 22 billion kilowatt-hours is projected for 2025.

In addition, increasingly complex AI applications will further raise the demands on data centers in the coming years. These applications require enormous computational resources to train neural networks. To counter this development, TUM researchers have developed a method that is a hundred times faster while delivering results of comparable accuracy to previous training methods, significantly reducing the energy required for training.

Neural networks, used in AI for tasks such as image recognition and language processing, operate in a way inspired by the human brain. They consist of interconnected nodes called artificial neurons. Each neuron receives input signals, which are weighted and summed according to specific parameters. If a set threshold is exceeded, the signal is passed on to the next nodes.
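
As a rough illustration of this weighted-sum-and-threshold behavior, a single artificial neuron can be sketched in a few lines of Python. The input values, weights, and threshold below are arbitrary examples, not values from the study.

```python
import numpy as np

def artificial_neuron(inputs, weights, threshold):
    """Toy neuron: weight and sum the input signals, then pass the
    signal on (output 1) only if the set threshold is exceeded."""
    weighted_sum = np.dot(weights, inputs)
    return 1.0 if weighted_sum > threshold else 0.0

# Three example input signals with illustrative weights and a threshold
output = artificial_neuron(inputs=np.array([0.2, 0.9, 0.4]),
                           weights=np.array([0.5, 1.0, -0.3]),
                           threshold=0.8)
print(output)  # 1.0, because 0.1 + 0.9 - 0.12 = 0.88 exceeds 0.8
```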

For network training, the parameter values are usually chosen at random initially, for example following a normal distribution. They are then gradually adjusted in small steps to improve the network's predictions. Because this training procedure requires a great many repetitions, it is extremely time-consuming and consumes a lot of energy.
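
A minimal sketch of this conventional iterative scheme, using plain gradient descent on a toy one-dimensional fitting problem (all values are illustrative), shows why so many repetitions are needed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: learn y = 3x + 1 with a single linear unit
x = rng.uniform(-1, 1, size=100)
y = 3 * x + 1

# Parameters are initially chosen at random, e.g. from a normal distribution
w, b = rng.normal(), rng.normal()

learning_rate = 0.1
for step in range(1000):              # many small, repeated adjustments
    error = (w * x + b) - y
    # Gradually adjust the parameters through small changes (gradient step)
    w -= learning_rate * np.mean(error * x)
    b -= learning_rate * np.mean(error)

print(round(w, 2), round(b, 2))       # approaches 3 and 1
```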

New Method

Felix Dietrich, Professor of Physics-enhanced Machine Learning, and his team have now developed a new method. Instead of determining the parameters between the nodes iteratively, their approach is based on probability calculations: it selectively uses values found at critical points in the training data.

It thus focuses on areas where the values change particularly strongly and quickly. The current study uses this approach to learn energy-preserving dynamical systems from data. Such systems change over time according to specific rules and are found, for example, in climate models and financial markets.
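
The press release does not spell out the algorithm itself, but the general idea can be sketched as follows: hidden-layer parameters are drawn directly from pairs of data points, with pairs where the values change strongly and quickly sampled more often, and the output layer is then obtained from a single linear solve rather than an iterative loop. The function sample_trained_network below is a hypothetical illustration of that idea on a simple one-dimensional function, not the team's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_trained_network(x, y, n_hidden=100, n_candidates=2000):
    """Hypothetical sketch: derive hidden weights from data-point pairs,
    sampled with probability proportional to how strongly the values
    change between them, then solve the output layer in one step."""
    i = rng.integers(0, len(x), size=n_candidates)
    j = rng.integers(0, len(x), size=n_candidates)
    slope = np.abs(y[j] - y[i]) / (np.abs(x[j] - x[i]) + 1e-8)
    idx = rng.choice(n_candidates, size=n_hidden, p=slope / slope.sum())
    # Hidden weights and biases come directly from the chosen data points
    w = 1.0 / (x[j[idx]] - x[i[idx]] + 1e-8)
    b = -w * x[i[idx]]
    hidden = np.tanh(np.outer(x, w) + b)
    # Output weights from a single least-squares solve -- no training loop
    c, *_ = np.linalg.lstsq(hidden, y, rcond=None)
    return w, b, c

# Toy usage: approximate a 1-D function with a sharp transition near x = 0
x = np.linspace(-2, 2, 400)
y = np.tanh(5 * x)
w, b, c = sample_trained_network(x, y)
prediction = np.tanh(np.outer(x, w) + b) @ c
print(float(np.max(np.abs(prediction - y))))  # small approximation error
```

Because the only heavy step in this sketch is a single least-squares solve, the cost is dominated by one pass over the data rather than thousands of update iterations, which is where the claimed speed and energy savings would come from.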

"Our method allows the required parameters to be determined with minimal computational effort. This makes it possible to train neural networks much faster and, thus, more energy-efficiently," explains Felix Dietrich. "Moreover, it has been shown that the new method is comparable in accuracy to networks trained iteratively."