9 December 2021
A paper by researchers Sobhan Niknam, Anuj Pathania, and Andy Pimentel of the Parallel Computer Systems (PCS) group at the Informatics Institute (IvI) has been accepted at ICCD, a premier conference in the area of computer design and Electronic Design Automation (EDA).
Processors are at the heart of any modern computing device. Initially, these processors had only one processing core. Today’s high-performance processors are multi-core designs that integrate a number of processing cores on a single die. Dissipating the heat generated by on-chip computations is the number one challenge for these processors. If the heat is not dissipated adequately, the transistors in the chip can be permanently and irreparably damaged. It is therefore essential that the processor keeps the on-chip temperatures of the cores in check using system-level thermal management techniques.
Power budgeting is a commonly used technique to manage these on-chip temperatures safely. A power budgeting algorithm assigns each core a power budget (measured in watts). The technique relies on an accurate calculation of the budget for every individual core in the processor: as long as every core respects the budget assigned to it, the temperature of the cores is guaranteed to stay within the safe limit. Different cores in the same processor can even run at different voltages and frequencies, if the processor's Dynamic Voltage and Frequency Scaling (DVFS) support allows it. By combining the two techniques, an operating system can use a power budgeting algorithm to calculate the budgets and subsequently use DVFS to ensure compliance.
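To illustrate how DVFS can enforce a given budget, here is a minimal Python sketch. It assumes the textbook dynamic-power approximation P = C_eff · V² · f and ignores leakage power; the function names, the effective capacitance, and the table of operating points are hypothetical and not taken from the paper.

```python
def dynamic_power(c_eff, voltage, freq_hz):
    """Approximate dynamic power in watts: P = C_eff * V^2 * f (leakage ignored)."""
    return c_eff * voltage ** 2 * freq_hz


def pick_operating_point(points, c_eff, budget_w):
    """Return the fastest (freq_hz, voltage) pair whose estimated power
    fits within budget_w, or None if no operating point fits."""
    feasible = [p for p in points if dynamic_power(c_eff, p[1], p[0]) <= budget_w]
    return max(feasible, key=lambda p: p[0]) if feasible else None


# Hypothetical DVFS table for one core: (frequency in Hz, voltage in V).
POINTS = [(1.0e9, 0.8), (2.0e9, 1.0), (3.0e9, 1.2)]
C_EFF = 1.0e-9  # assumed effective switched capacitance in farads

if __name__ == "__main__":
    # With a 3 W budget the 3 GHz point (about 4.3 W) is too expensive,
    # so the 2 GHz point is selected.
    print(pick_operating_point(POINTS, C_EFF, budget_w=3.0))
```

A larger budget lets the core run at a higher voltage and frequency, which is why a less conservative budget translates directly into performance.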
Existing state-of-the-art techniques for power budgeting in multi-core processors perform power budgeting for cores based on their projected steady-state temperatures: it is a predictive approach. The steady-state temperature is the stable temperature the core will eventually thermodynamically attain when consuming a certain amount of power indefinitely. Steady-state temperature-based power budgeting calculates the power budget of the cores based on the number and location of active cores in a processor. Even though such power budgeting ensures a processor's thermally-safe operation, it turns out to be too cautious in its approach. Consequently, the temperature of the cores is significantly lower than the critical thermal threshold. Therefore, the power budgeting algorithm wastes a lot of thermal headroom that one could have used for doing more computation.
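The idea of a steady-state budget can be sketched in a few lines of Python. The sketch assumes a simplified linear thermal model in which the steady-state temperature of core i is T_amb + Σ_j R[i][j] · P[j]; the resistance matrix R, the temperatures, and the function name are illustrative assumptions, not values or code from the paper.

```python
def steady_state_budget(resistance, active, t_ambient, t_critical):
    """Uniform per-core budget in watts for the active cores such that no
    core's steady-state temperature exceeds t_critical.

    resistance[i][j] is the assumed steady-state temperature rise (K/W)
    of core i caused by power dissipated in core j.
    """
    n = len(resistance)
    # Worst-case total thermal resistance seen by any core on the chip.
    worst = max(sum(resistance[i][j] for j in active) for i in range(n))
    return (t_critical - t_ambient) / worst


# Hypothetical two-core chip: each core heats itself strongly (2.0 K/W)
# and its neighbour weakly (0.5 K/W).
R = [[2.0, 0.5],
     [0.5, 2.0]]

if __name__ == "__main__":
    print(steady_state_budget(R, active=[0, 1], t_ambient=45.0, t_critical=80.0))
    print(steady_state_budget(R, active=[0], t_ambient=45.0, t_critical=80.0))
```

With both cores active the budget is lower than with a single active core, which reflects the dependence on the number and location of active cores. The budget is fixed for as long as the set of active cores stays the same, regardless of how cold the chip currently is.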
Recently, UvA researchers proposed a new power budgeting algorithm called T-TSP that performs power budgeting based on the current instantaneous (transient) temperature of the cores rather than their long-term steady-state temperature. T-TSP assigns cores a high power budget when they are cold (at room temperature). In the longer term, however, such a high power budget would lead to thermal violations. The T-TSP algorithm therefore reduces the power budget assigned to a core as its transient temperature rises, preventing the temperature from exceeding the thermal threshold. This approach quickly raises the core's temperature close to the thermal threshold and then keeps it there without ever violating the threshold itself. Consequently, T-TSP exploits almost the entire available thermal headroom while still ensuring the processor's thermally-safe operation. Detailed system simulations have shown that power budgeting with T-TSP can boost the performance of an application on a 64-core processor by up to 17.94% compared to existing state-of-the-art steady-state temperature-based power budgeting techniques. Figure 2 shows that the system’s response time improves when using T-TSP on a range of benchmark applications.
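The principle of budgeting on the transient temperature can be illustrated with a small Python simulation. This is not the T-TSP algorithm from the paper: it is a sketch for a single core under an assumed first-order (lumped RC) thermal model, and all parameter values and function names are hypothetical.

```python
import math


def transient_budget(temp, t_ambient, t_critical, r_th, c_th, dt, p_max):
    """Largest constant power (W) for the next dt seconds that brings the
    core to at most t_critical, capped at the core's peak power p_max."""
    decay = math.exp(-dt / (r_th * c_th))
    # Steady-state temperature the core may head towards during this
    # interval so that it lands exactly on t_critical at the end of it.
    target_steady = (t_critical - temp * decay) / (1.0 - decay)
    return min(p_max, (target_steady - t_ambient) / r_th)


def next_temperature(temp, power, t_ambient, r_th, c_th, dt):
    """Exact first-order RC response after dt seconds at constant power."""
    decay = math.exp(-dt / (r_th * c_th))
    steady = t_ambient + r_th * power
    return steady + (temp - steady) * decay


def simulate(steps, t_ambient=45.0, t_critical=80.0, r_th=2.0,
             c_th=0.05, dt=0.01, p_max=40.0):
    """Return a list of (budget_w, temperature) pairs, starting cold."""
    temp, trace = t_ambient, []
    for _ in range(steps):
        budget = transient_budget(temp, t_ambient, t_critical,
                                  r_th, c_th, dt, p_max)
        temp = next_temperature(temp, budget, t_ambient, r_th, c_th, dt)
        trace.append((budget, temp))
    return trace


if __name__ == "__main__":
    for budget, temp in simulate(10):
        print(f"budget = {budget:6.2f} W, temperature = {temp:6.2f}")
```

In this toy model the budget starts at the peak power while the core is cold, then drops as the temperature approaches the threshold and settles at the steady-state value (t_critical - t_ambient) / r_th. The temperature is held at the threshold without crossing it, which is the headroom a purely steady-state budget leaves unused during the warm-up phase.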
Paper: "T-TSP: Transient-Temperature Based Safe Power Budgeting in Multi-/Many-Core Processors"
Code: SobhanNiknam/ttsp
Website: T-TSP