In recent years, the power density of state-of-the-art microprocessors has increased continually, together with their performance and complexity. This trend is expected to continue in the coming years despite the limitations that already exist in integrated circuit manufacturing [1,2]. The design of high-performance circuits currently faces difficulties in sustaining supply and threshold voltage scaling to provide the required performance increase, limit energy consumption, control power dissipation, and maintain reliability. Because the energy consumed by integrated circuits is converted into heat, the thermal aspects of high-performance circuits are also important during all phases of manufacturing and throughout their usage lifetime. Higher power density leads to increased heat dissipation and consequently to a higher operating temperature.
All computing segments face power and thermal challenges. Power consumption limitations are important from different perspectives: they affect the mobility of battery-powered devices and the energy costs of servers and data centers. Thermal dissipation limitations are seen in the high cooling costs of high-performance systems and in reliability problems caused by increased operating temperatures. These power and heat dissipation challenges persist mainly because improvements in battery and cooling technologies are relatively slow and do not follow Moore's law.
Thermal behavior and power consumption are expected to be important design parameters for the next generation of microprocessors, both for high-performance and mobile computing [2,3]. Proper thermal management depends on two major elements: thermal packaging and dynamic thermal management. Thermal packaging includes a heat sink properly mounted on the processor and effective fans directing airflow through the system chassis. Dynamic thermal management (DTM) strategies are used to reduce packaging cost without unnecessarily limiting performance: the package is designed for the worst typical application, and any application that dissipates more heat should activate an alternative, run-time thermal management technique.
Dynamic thermal and power management strategies are related; however, dynamic thermal management (DTM) is a distinct research domain that has been a hot topic in recent years [1,2]. This is because power-aware design alone has failed to address all thermal aspects, mainly the hot spots that occur because power dissipation is non-uniform across the chip. The purpose of DTM is to achieve high-performance computing while keeping the chip below a safe temperature. However, DTM techniques are usually implemented at the lower-level layers of a computing system: hardware level, BIOS level and OS level. Addressing thermal management at the higher levels of a computing system does not replace the DTM already implemented at lower levels, but extends these techniques to user applications and permits applications to adapt themselves in order to reduce the heat dissipation of the whole software and hardware system.
Thermal management research efforts have focused mainly on thermal packaging and new cooling solutions, and secondarily on lower-level dynamic thermal management mechanisms. With our work we address higher-level thermal management and the thermal awareness of user applications.
This paper describes the thermal profiling of software applications. Specifically, we want to know the thermal impact that running software applications have on different system components. The final goal of our work is to establish a set of rules for thermal-aware applications and to implement them in server-side, desktop or mobile applications.
2. Related work
This section briefly describes other recent or important papers in this research field. Because temperature increase is directly related to the consumed power, techniques that aim to decrease power consumption achieve temperature reduction as well. However, power management techniques alone cannot be fully effective for thermal management because of fast local chip heating. Local hot spots occur much faster than chip-wide heating because power dissipates non-uniformly inside the chip. Therefore, low-power techniques have insufficient impact on the local operating temperature. In this study we address only runtime DTM strategies implemented at the operating system, middleware or user application levels for single-core and multi-core CPU architectures.
2.1. DTM techniques and control algorithms
DTM implemented at the higher architectural levels of the system is unique in its ability to use runtime knowledge of application behavior and the current thermal status of different internal units of the CPU to control task execution and, through it, thermal behavior. There are two categories of DTM techniques: heat reduction techniques, and heat distribution and balancing over the chip area [8,9]. In the first category, the available power management techniques are applied to thermal management: stop-and-go or clock gating simply stops a core or an internal unit when high temperatures are detected, while dynamic voltage and frequency scaling (DVFS) controls the voltage and frequency of the CPU or core to reduce chip temperatures. The second category considers thread or workload migration from one core or unit to another in order to balance the overall temperature map of the chip.
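As an illustration of the first category, a reactive stop-and-go policy can be sketched in a few lines of Python. This is only a minimal sketch under our own assumptions: the function name `stop_go_step`, the two threshold values and the use of a lower resume threshold (hysteresis) are hypothetical choices made for the example and are not taken from the cited works.

```python
def stop_go_step(temperature_c, running, stop_at_c=95.0, resume_at_c=85.0):
    """Return the next run state of a core under a stop-and-go policy.

    temperature_c: current core temperature in degrees Celsius.
    running: True if the core is currently executing, False if stopped.
    stop_at_c / resume_at_c: hypothetical thresholds chosen for illustration.
    """
    if running and temperature_c >= stop_at_c:
        return False   # too hot: stop the core
    if not running and temperature_c <= resume_at_c:
        return True    # cooled down enough: resume execution
    return running     # otherwise keep the current state


def simulate(trace_c, running=True):
    """Apply the policy to a list of temperature samples and collect the states."""
    states = []
    for temperature_c in trace_c:
        running = stop_go_step(temperature_c, running)
        states.append(running)
    return states


if __name__ == "__main__":
    # Made-up temperature samples, not measurements.
    print(simulate([80.0, 96.0, 90.0, 84.0, 88.0]))
```

The gap between the two thresholds keeps the core from toggling on every sample when the temperature hovers around a single limit.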
Multicore architectures are becoming the main design paradigm for current and future processors, including mobile processors, because they provide increased parallelism within the energy and thermal limits of such devices. The authors of  present a detailed study of different DTM techniques that can be used in multicore designs. They compare techniques such as stop-go and DVFS with distributed or global control, and different methods of OS-based thread migration. Based on their results, they consider thread migration and DVFS the most promising methods to control temperature in multicore architectures.
The authors of  explore various thermal management techniques that exploit the distributed nature of multicore architectures. They use HotSpot thermal simulations at the architectural level to profile different SPEC 2000 workloads on a 4-core CPU. In their study the authors consider different parameters and schemes for DVFS and thread migration (TM) techniques: global and local DVFS, temperature-based TM, counter-based TM and power-based TM. Their conclusions are similar to those presented in .
In  an architecture-level parameterized transient thermal behavioral modeling algorithm is proposed for emerging thermal analysis and optimization problems in high-performance chip-multiprocessor design. The parameterized transient thermal behavioral models are built from measured or computed thermal and power information at the architecture level. These data can be used as feedback in the control loop of DTM algorithms implemented at system level. The model can include a number of parameters, such as the location of thermal sensors in a heat sink, different internal components and different materials.
Kadin and Rada  proposed a frequency planning methodology that maximizes the total performance of multi-core processors while limiting their maximum temperature as specified by the design constraints. Their results show that it is possible to boost total performance with no effect on the cores' maximum temperatures. This technique could also be used to develop control algorithms for local DFS in order to maximize overall performance.
Most existing heat reduction DTM solutions rely on optimization algorithms that assume power can be estimated accurately, while others adopt oversimplified feedback control strategies that control power and temperature separately, without any theoretical guarantees. Effective control algorithms need to be developed to maximize the performance delivered per watt while keeping both the power consumption and the temperature of a multi-core chip within given constraints. Wang et al. identify in their work  several major challenges in developing unified power and thermal DTM: (a) future multi-core processors will implement local or per-core DVFS, so special thread synchronization and simultaneous DVFS will be needed; (b) current software applications are single-threaded, so different workload migration methods should be investigated; (c) DTM control algorithms should be based on runtime workload variations in order to be self-adaptive; and (d) both power and thermal aspects are important and should be addressed together by future management algorithms. Based on these requirements, the authors of  proposed a chip-level temperature-constrained power control algorithm based on Model Predictive Control theory. The resulting algorithm can precisely control the power of a multi-core chip to the desired values while maintaining the temperature of each core below a specified threshold and, according to their tests, it outperforms current state-of-the-art DTM algorithms.
2.2. Operating system level DTM
Most of the papers addressing DTM techniques at OS level consider the problem of thermal-aware task scheduling. The authors of  investigate the benefits of thermal-aware task scheduling with minimum performance degradation. They consider that the task scheduler in a multi-tasking system can use thermal metrics when running different active tasks in order to reduce the number of cycles spent above the thermal threshold allowed for the microprocessor. Four task scheduling policies were investigated (random, average temperature, maximum temperature and minimum temperature), together with two DTM techniques: DVS and fetch gating.
Current DTM algorithms are reactive in nature because they take corrective action only after the temperature reaches a predetermined threshold value [14,15]. The authors of  propose a predictive DTM based on application-based and core-based thermal models. They modified the Linux kernel running on an Intel quad-core CPU to implement their solution in the task scheduler. According to the results presented in , predictive DTM reduces the overall temperature with minimal performance degradation and provides fairness among the available cores. In  the authors investigate how to use predictors for forecasting future temperature and workload dynamics, and propose proactive thermal management techniques for multiprocessor systems-on-chip.
Other papers address the problem of thermal-aware task scheduling and propose different algorithms to be implemented in the operating system [16,17,18,19]. From all these papers two problems arise that have to be solved: (a) proper thermal sensing or modeling and (b) power consumption variation based on the runtime software workload. It is also known that the accuracy of the thermal measurements directly impacts both the performance of the thermal management system and the performance of the CPU.
At the core of DTM schemes lies the accurate reading of on-die temperatures. Consequently, distributed temperature and leakage sensors should be used to inform dynamic thermal management strategies. Thermal management techniques based on on-line temperature sensing depend on the placement of monitoring sensors inside the processor chip and cores. In  the authors propose three techniques to create sensor infrastructures for monitoring the maximum temperature in a multicore system. They investigate the number of sensors and their placement, and the number of active sensors and their selection, in order to collect and predict the maximum temperature of each core in the microprocessor.
There are several causes of temperature measurement inaccuracy: (a) parameter variance of the non-ideal thermal diode introduced during the manufacturing process; (b) analog-to-digital conversion accuracy; (c) proximity to the hot spot, since CPU performance and reliability are limited by the temperature of the hottest location on the die; and (d) manufacturing temperature control. DTM algorithms should take these sources of inaccuracy in DTS measurements into account and minimize their effect.
2.3. Software application level DTM
Software-level thermal management is becoming an attractive extension to low-level management techniques. In  the authors describe a software temperature sensing methodology that can be used for thermal profiling of software applications. The temperature model is built from parameters of system components that are read through performance counters.
In  a software framework for dynamic energy efficiency and temperature management for computing systems is presented. The authors address both energy and temperature in a unified approach that combines a suite of energy-management techniques, which can be activated individually or in groups according to a given policy. The evaluation has shown that the proposed framework  is very effective, because it delivers a 40% energy reduction with only a 10% application slowdown.
An important aspect also addressed in our work is the unification of power consumption and thermal management [21,23]. In  the authors present the design and the evaluation results of a power management framework that addresses both energy efficiency and thermal constraints in a unified manner. The goal of the framework is to maximize energy savings without extending application execution times too much, and to guarantee that the temperature remains below a given limit. To achieve these goals the framework implements and combines several DPM-DTM technologies, and can activate them individually or in groups based on predefined policies.
With our work we want to address the power-thermal aspects in a unified mode at the higher levels of the system. We consider that user applications can provide important proactive data that can be used in the thermal management control loop, and we base these software-level assumptions on measurements of the existing thermal sensors, presented in the following sections. The paper is organized as follows. In the next section we define the thermal profiling concepts and describe the software tool we implemented to characterize the thermal behavior of a mobile device and its applications. Experimental results are presented in Section 4, and Section 5 contains our concluding remarks.
3. Thermal profiling of multi-core processors
3.1. Thermal benchmark software
We build the thermal profiles for mobile applications using the concept of thermal benchmarks. We define a thermal benchmark as a software application that characterizes the thermal behavior of a system, component or application with respect to a certain stimulus (workload). A thermal benchmark must be able to distinguish how the temperature of a hardware device increases with the workload and how it decreases when the workload is finished. Our tests address the temperatures of the CPU cores.
Fig. 1. Thermal benchmark definition
A thermal benchmark consists of three phases (Fig. 1):
The first range, [0, t1), is intended for the idle-mode temperature. During this time interval the CPU is in the idle state, the power management and power-saving mechanisms of the CPU are prevented from engaging, and the system and component parameters are monitored. This interval is used to let the CPU cores reach their idle-state temperature and to estimate it.
The second range, [t1, t2), represents the heating phase, when a certain workload is executed. SPEC CPU2000 or any other type of software application can be executed as the workload. In our tests we used one integer benchmark for the CPU. During this time interval the component executes its job and the system and component parameters are continuously measured. While the CPU is running the benchmark, its temperature increases over time until its cores reach their equilibrium state.
The last range, [t2, t3), represents the cooling phase, intended for the component to return to the idle-state temperature. During this interval the CPU is idle, the parameters are monitored, and the CPU cores' temperatures decrease until the idle temperature is reached.
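The three phases can be expressed as a small helper that maps a sample time to its phase. This is a minimal Python sketch; the function name `benchmark_phase` and the phase labels are hypothetical, and the example boundaries assume the 5-minute phases used in the experiments of Section 4 (t1 = 300 s, t2 = 600 s, t3 = 900 s).

```python
def benchmark_phase(t, t1, t2, t3):
    """Return the thermal benchmark phase that is active at time t (seconds).

    The ranges are half-open, matching [0, t1), [t1, t2) and [t2, t3).
    """
    if not 0 <= t1 <= t2 <= t3:
        raise ValueError("expected 0 <= t1 <= t2 <= t3")
    if t < 0 or t >= t3:
        return "outside"   # sample does not belong to the benchmark
    if t < t1:
        return "idle"      # first range: idle-state temperature
    if t < t2:
        return "heating"   # second range: workload is running
    return "cooling"       # last range: return to idle temperature


if __name__ == "__main__":
    T1, T2, T3 = 300, 600, 900  # 5 minutes per phase, in seconds
    for t in (0, 299, 300, 599, 600, 899, 900):
        print(t, benchmark_phase(t, T1, T2, T3))
```

Labeling every logged sample this way makes it straightforward to process the idle, heating and cooling parts of a trace separately.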
By running the benchmark on the same system a large number of times and averaging the measured temperatures, we obtain the standard thermal signature of a given processor for the selected workload (Fig. 2 and Fig. 3).
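The averaging step can be sketched as follows. This is a minimal Python illustration that assumes every run was sampled at the same instants, so that traces can be averaged sample by sample; the function name `thermal_signature` and the temperature values are hypothetical and are not measurements from our tests.

```python
def thermal_signature(runs):
    """Average several temperature traces sample by sample.

    runs: list of traces, each a list of temperatures (degrees Celsius)
          taken at the same sampling instants in every run.
    """
    if not runs:
        raise ValueError("at least one run is required")
    length = len(runs[0])
    if any(len(run) != length for run in runs):
        raise ValueError("all runs must have the same number of samples")
    # zip(*runs) groups the i-th sample of every run together.
    return [sum(samples) / len(runs) for samples in zip(*runs)]


if __name__ == "__main__":
    # Three made-up runs with four samples each.
    runs = [
        [40.0, 50.0, 60.0, 45.0],
        [42.0, 52.0, 62.0, 47.0],
        [44.0, 54.0, 64.0, 49.0],
    ]
    print(thermal_signature(runs))
```

If the runs are logged at different rates, they would first have to be resampled onto a common time grid, which this sketch does not attempt.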
3.2. Thermal profiling test cases
Heat dissipation in computing systems is in general a very complex problem, because each physical component of the system has its own temperature values, which depend especially on the type of operation being executed. We can therefore say that, together with the physical components, software applications have a large influence on heat generation.
Fig. 2. Thermal signature for a processor
Fig. 3. CPU load when integer benchmark was applied
For every heat source in the system we can establish different thermal profiles (or thermal signatures) which express the thermal dissipation of the component for a given utilization profile (e.g. applied stimuli or workload). Every thermal profile is identified by a set of temperature values corresponding to different component usage models. A component usage model assumes a certain workload level of the running component. The temperature profiles we define establish a relationship between power states, workloads and energy consumed by the device or component being in these states.
The process of extracting temperature profiles for a computing system is called system thermal profiling or characterization. In order to extract thermal profiles for the CPU we propose a set of tests that can be run multiple times on every system.
CPU thermal profiles describe the CPU heat dissipation over time when it executes different workloads with well-known parameters. They are considered a characteristic feature of the CPU because executing a specific workload a number of times with the same parameters yields the same temperature profiles. To produce the CPU thermal profiles, a number of tests are introduced below. The CPU thermal profiling test cases are based on the thermal benchmarks introduced earlier. During thermal characterization the tests run for a certain amount of time, during which the measurements and efficiency metrics are collected and recorded by the framework.
CPU idle temperature: The CPU idle-state thermal profile describes the temperature variation over time when the CPU is idle and does not execute any application except the default operating system services. When the idle test is executed, the CPU configuration parameters are set to their default values, no workload is applied and no other user application or interaction is allowed. Default CPU parameters mean that all cores are active and idle.
CPU heating and cooling related profile: The CPU cores have different profiles when heating under a certain workload and when cooling after the workload is finished. Based on the heating and cooling ratio, we can estimate the time needed after a given workload for the cores to cool down to the temperature they had before the workload was applied.
CPU workload type temperature: The CPU generates a distinct heat level for each instruction it executes. Some complex instructions generate more heat than simple CPU instructions. At the higher levels the system cannot distinguish the thermal levels of two individual instructions, but we need an estimate of the heat dissipation of different instruction classes: integer, memory, floating point, etc. If we know what kind of operations an application thread uses, we can estimate its temperature for a specific workload.
CPU usage level temperature: The CPU can execute the same workload at different usage levels, which may imply different temperature levels for the same type of workload. With this profiling test we want to emphasize the relation between temperature and CPU usage or CPU time for certain workloads. In this test the same workload is executed repeatedly with different sleeping times in order to achieve different values of CPU usage.
CPU multithreading temperature: Another proposed CPU profiling test launches the same workload algorithm on different numbers of threads using a single core. With this test the OS task scheduling and switching operations are observed along with the workload operations in order to obtain their temperature. The multithreading test case creates a number of threads running the same workload in order to see how the thread count influences the overall temperature.
CPU multicore temperature: For multicore processors two additional tests are needed to establish the relation between the number of active cores and their individual and cumulative temperatures. The first test activates the cores one at a time to run the same workload, in order to see how much heat each active core generates. The second test activates one more core at each step while keeping the previous ones active, and runs the same workload on every active core, in order to emphasize the temperature increase caused by each newly activated core.
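The two multicore tests differ only in the sequence of active core sets they step through, which can be sketched as below. This is a minimal Python illustration; the function names `single_core_steps` and `cumulative_core_steps` are hypothetical, and actually pinning the workload threads to the listed cores (through an OS-specific affinity call) is left out of the sketch.

```python
def single_core_steps(core_count):
    """First test: activate each core alone, one at a time."""
    return [[core] for core in range(core_count)]


def cumulative_core_steps(core_count):
    """Second test: add one more core per step, keeping earlier ones active."""
    return [list(range(active)) for active in range(1, core_count + 1)]


if __name__ == "__main__":
    cores = 2  # the dual-core case; any positive core count works
    print("one at a time:", single_core_steps(cores))
    print("cumulative:   ", cumulative_core_steps(cores))
```

Each inner list names the cores that run the workload during one step of the test; for a dual-core processor the first test yields two steps with one core each, and the second test yields one core followed by both cores.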
3.3. Thermal profiling tool
We implemented the thermal benchmarks and thermal test cases in a software application. Fig. 4 presents the overall architecture of this application. From the beginning we intended to build a portable monitoring application able to run benchmarks on different platforms characterized by different hardware devices and operating systems. A design constraint was that the application be scalable, so that it can easily support new types of sensors and chipsets, new types of measurement parameters and new types of workloads. To achieve a portable and scalable application we split it into a number of specialized modules: battery monitor, external power consumption monitor, thermal monitor, CPU and cores monitor, and task monitor. More detailed data on the profiling application can be found in .
Fig. 4. Thermal profiling application architecture
A thermal benchmark can be applied to every hardware device with built-in thermal monitoring capability (such as a microprocessor, hard disk or video card). The majority of microprocessors produced in recent years include at least one built-in thermal sensor to aid the thermal management of servers, workstations or portable systems. As this thermal sensor is connected to a thermal diode on the processor core, it provides the earliest indication of thermal variation, which can be read through our hardware monitor module.
4. Thermal profiles of software applications
The proposed test cases were run on one device: a Fujitsu Siemens E-series laptop with an Intel Core Duo T2500 mobile processor (2 GHz, 65 nm technology, thermal design power 31 W, thermal specification 100 °C) and 2 GB of memory. Every test case was run a number of times, under external conditions (e.g. temperature) kept as similar as possible. Every test lasts 15 minutes and follows the same pattern: 5 minutes idle, 5 minutes workload and 5 minutes idle. While a test is running, the system parameters are continuously monitored: battery discharge rate, CPU and core temperatures, CPU and core usage, and the CPU usage of the benchmark application. These parameters were logged, then processed and analyzed after the test.
4.1. CPU Thermal Profiles
The thermal profile of a processor executing highly intensive computations is shown in Fig. 2. The CPU load is 100% and the CPU temperature increases over time until thermal equilibrium is reached. When the workload is executed on a multicore architecture, the single-threaded benchmark application is continuously switched from one core to another by the OS task scheduler. The overall CPU usage for the single-threaded integer test application is 50%, and the application receives a different amount of processor time from each core (Fig. 5). The thermal profiles of the CPU and its cores when the same integer workload is executed are presented in Fig. 6. The CPU cores are not at the same temperature because they are not used evenly.
Fig. 5. CPU and cores usage for single thread test
Fig. 6. CPU and cores temperatures for single thread test
The same CPU workload executed at different CPU loads can have different thermal profiles, depending on how the workload is implemented in software. For example, the same workload was implemented in three ways:
- full performance (CPU usage = 100%) - Fig. 7, the workload is implemented for best performance.
- short delays (CPU usage = 50%) - Fig. 7, short Sleep function calls are introduced inside the workload algorithm after a number of iterations. Depending on the frequency of the Sleep calls, different CPU loads can be achieved.
Fig. 7. Full performance and short sleeps workload implementations
- long delays (CPU usage = 50-75%) - Fig. 8, long delays are introduced inside the workload algorithm with a selectable frequency. Depending on the delay periods and their frequency, different CPU loads can be achieved, but different thermal profiles are also obtained. Therefore, in some cases we can control and adapt a software application in order to reduce the heat dissipation it causes (Fig. 8).
Fig. 8. Long sleeps workload implementations
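The delay-based implementations listed above can be approximated by a duty-cycle loop: a chunk of work is followed by a sleep whose length is chosen to reach a target usage level. The following Python sketch is an illustration only, not the code used in our tests; `time.sleep` stands in for the Sleep calls mentioned above, and the names `sleep_time_for_usage`, `integer_chunk` and `run_duty_cycle` as well as the chunk size are hypothetical. The achieved usage is approximate, since it also depends on OS scheduling and timer resolution.

```python
import time


def sleep_time_for_usage(busy_seconds, target_usage):
    """Sleep time that yields target_usage = busy / (busy + sleep)."""
    if not 0 < target_usage <= 1:
        raise ValueError("target_usage must be in (0, 1]")
    return busy_seconds * (1.0 - target_usage) / target_usage


def integer_chunk(iterations):
    """A small integer-only workload chunk (illustrative stand-in)."""
    total = 0
    for i in range(iterations):
        total = (total + i * i) % 1000003
    return total


def run_duty_cycle(duration_seconds, target_usage, iterations=20000):
    """Alternate work and sleep for duration_seconds; return the cycle count."""
    deadline = time.monotonic() + duration_seconds
    cycles = 0
    while time.monotonic() < deadline:
        start = time.monotonic()
        integer_chunk(iterations)
        busy = time.monotonic() - start
        # Sleep long enough that the busy share matches the target usage.
        time.sleep(sleep_time_for_usage(busy, target_usage))
        cycles += 1
    return cycles


if __name__ == "__main__":
    print("cycles at ~50% usage:", run_duty_cycle(0.2, 0.5))
```

Small chunks with short sleeps correspond to the short-delay variant, while large chunks with proportionally long sleeps correspond to the long-delay variant; both can target the same average usage while producing different thermal profiles.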
Running the same test with different delay parameters, so that the workload is executed at different CPU usage levels, we obtained the plots in Fig. 9.
Fig. 9. (a) CPU core temperature versus CPU core usage;
Fig. 9. (b) CPU average temperature versus CPU usage
In Fig. 9 (a) the same integer workload was run on core 0 at 100%, 90% and 70% CPU core usage levels. Four groups of dots can be observed: one cluster for the idle-state temperatures and three clusters for the workload temperatures. The same pattern can be observed in the relation between CPU usage levels and average CPU temperatures. There is a direct relation between CPU core temperature and CPU time (in terms of usage level) for the same type of workload; therefore we can estimate the contribution of the application to CPU heat generation.
Another aspect we investigated was the influence of different workload types on the CPU cores' temperatures. Fig. 10 shows the thermal profiles of four workload types (integer, float, memory and SSE) executed successively for the same amount of time on the same CPU core. Based on this test, to estimate the effect of a CPU-intensive application on temperature we have to know the type of operations the application implements.
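One simple way to turn such a relation into an estimate is a least-squares line through the cluster centers. The sketch below is in Python and rests on an assumption of ours, namely that the relation is close to linear over the measured range, which the text above does not claim; the function names `fit_line` and `estimate_temperature` are hypothetical and the numbers are made-up values for illustration, not our measurements.

```python
def fit_line(usage, temperature):
    """Least-squares fit of temperature = slope * usage + intercept."""
    n = len(usage)
    if n < 2 or n != len(temperature):
        raise ValueError("need two or more (usage, temperature) pairs")
    mean_u = sum(usage) / n
    mean_t = sum(temperature) / n
    sxx = sum((u - mean_u) ** 2 for u in usage)
    if sxx == 0:
        raise ValueError("usage values must not all be equal")
    sxy = sum((u - mean_u) * (t - mean_t) for u, t in zip(usage, temperature))
    slope = sxy / sxx
    intercept = mean_t - slope * mean_u
    return slope, intercept


def estimate_temperature(usage_percent, slope, intercept):
    """Estimated core temperature at a given usage level."""
    return slope * usage_percent + intercept


if __name__ == "__main__":
    # Made-up cluster centers: usage in percent, temperature in Celsius.
    usage = [0.0, 70.0, 90.0, 100.0]
    temperature = [50.0, 71.0, 77.0, 80.0]
    slope, intercept = fit_line(usage, temperature)
    print("estimated temperature at 80% usage:",
          estimate_temperature(80.0, slope, intercept))
```

A fit of this kind is only meaningful for a single workload type on a single core, since other workload types and cores produce different temperature levels.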
Fig. 10. CPU cores workload temperatures
4.2. Heat dissipation and power consumption
Part of the battery energy of a mobile device is transformed into heat. The increase in temperature in turn causes more energy to be consumed. Fig. 11 presents the power consumption profile for the previous memory workload.
Fig. 11. Heat dissipation and power consumption
During the second phase of the benchmark, when the workload is applied, the temperature of the processor and also the temperature of the entire mobile device increase (in our example the temperature increases from 60 to approximately 100 °C). This increase in temperature has an effect on power consumption: a smooth increase (from 30 W to 34 W) can be observed in the current profile during phase 2 of the benchmark. This increase of approximately 4 W during workload execution is due to the heating of the device.
4.3. Multithreading and Multicore Thermal Profiles
First we ran a single workload thread on each CPU core available in the processor: when the workload was run on core 0, plot (1) in Fig. 12 was obtained, and when it was launched on core 1, plot (2) describes the temperature variation. When the workload thread set its affinity to both CPU cores, we obtained plot (3) in Fig. 12. In this case the operating system schedules the thread on the two cores unevenly (in the presented test: 75% on core 0 and 25% on core 1). It can also be observed that the CPU cores' temperatures are not equal even when they run the same workload at 100% (Fig. 12 (1) and (2)).
Fig. 12. CPU cores temperatures
The second test presented in Fig. 13 describes the results of (1) one workload thread executed by one core, (2) two workload threads executed by one single core and (3) two threads run by both CPU cores.
Fig. 13. CPU threads temperatures
4.4. Heating-cooling profile
We compare the heating profile when a certain workload is applied with the cooling profile after the workload has finished (Fig. 14). Based on this type of test we can obtain the time needed for the processor to cool down to the initial temperature after the workload has finished.
Fig. 14. CPU heating and cooling
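Extracting the cooling time from a logged trace can be sketched as follows. This is a minimal Python illustration; the function name `cooling_time`, the tolerance parameter (used because a measured trace rarely returns to exactly the initial value) and the sample data are our assumptions for the example, not part of the profiling tool.

```python
def cooling_time(times, temps, workload_end, idle_temp, tolerance=1.0):
    """Seconds after workload_end until the temperature is back near idle.

    times: sample timestamps in seconds, in increasing order.
    temps: temperatures (degrees Celsius) measured at those timestamps.
    Returns None if the trace never comes within tolerance of idle_temp.
    """
    for t, temp in zip(times, temps):
        if t >= workload_end and temp <= idle_temp + tolerance:
            return t - workload_end
    return None


if __name__ == "__main__":
    # Made-up cooling trace sampled once per minute after the workload ends.
    times = [0, 60, 120, 180, 240]
    temps = [90.0, 75.0, 62.0, 51.0, 50.0]
    print("cooling time (s):", cooling_time(times, temps, 0, 50.0))
```

The resolution of the result is limited by the sampling period of the trace, so a finer-grained log gives a more precise cooling time.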
4.5. Thermal efficiency
We introduce the thermal efficiency profile, defined as the number of workload (application-specific) operations per 1 °C increase in temperature. Thermal efficiency specifies how efficiently each increase in temperature is used by the execution of the workload with the selected parameters.
Fig. 15.a CPU cores thermal efficiency
Fig. 15.b CPU workload thermal efficiency
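The definition above amounts to a single division, sketched here in Python. The function name `thermal_efficiency` and the numbers in the example are hypothetical; the operation count is whatever unit of work the application defines (iterations, transactions, records processed).

```python
def thermal_efficiency(operations, start_temp_c, end_temp_c):
    """Workload operations completed per degree Celsius of temperature rise."""
    rise = end_temp_c - start_temp_c
    if rise <= 0:
        raise ValueError("thermal efficiency needs a positive temperature rise")
    return operations / rise


if __name__ == "__main__":
    # Made-up example: 4 million operations while heating from 60 to 100 C.
    print(thermal_efficiency(4_000_000, 60.0, 100.0), "operations per degree")
```

Comparing this value across implementations of the same workload shows which one completes more work for the same temperature increase.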
4.6. Thermal-Aware Application
We implemented a database application with long database processing tasks. The application was written in Visual C++ 2005 and uses MS SQL Server. The database processing task was implemented with different thermal management operations. Fig. 16 presents the thermal signatures of the same task workload implemented with different thermal management techniques. By implementing DTM at application level (AP) we can reduce the maximum CPU temperature by around 10 °C, at the cost of a 40% decrease in performance.
Fig. 16. Heat dissipation and power consumption
5. Conclusions
The work presented in this paper evaluated the thermal response of a mobile system's CPU and its cores to the running software applications. We characterized the thermal impact on the CPU cores of the workload threads of the executed applications, and we investigated the possibility of identifying the effect of every running application on the CPU, core and system temperatures. We proposed a set of test cases to identify the relations between different system and application parameters and temperature. Our experiments show that splitting the thermal effect among running applications is not a simple task and depends on many factors. For certain cases and workloads the split can be made based on workload type, CPU usage level, and the threads and cores used.
This work was supported by the Romanian Ministry of Education, CNCSIS grant 680/19.01.2009.
[1] S.V. Garimella, A.S. Fleischer, J.Y. Murthy, A. Keshavarzi, R. Prasher, C. Patel, S.H. Bhavnani, R. Venkatasubramanian, R. Mahajan, Y. Joshi, B. Sammakia, B.A. Myers, L. Chorosinski, M. Baelmans, P. Sathyamurthy, and P.E. Raad, Thermal Challenges in Next-Generation Electronic Systems, IEEE Transactions on Components and Packaging Technologies, vol. 31, no. 4, pp. 801-815, Dec. 2008.
[2] K. Skadron, M.R. Stan, W. Huang, S. Velusamy, K. Sankaranarayanan, and D. Tarjan, Temperature-Aware Computer Systems: Opportunities and Challenges, IEEE Micro, Dec. 2003.
[3] S.S. Sapatnekar, Temperature as a First-Class Citizen in Chip Design, Proceedings of the 15th International Workshop on Thermal Investigations of ICs and Systems, THERMINIC 2009, Oct. 2009.
[4] E. Rotem, J. Hermerding, A. Cohen, and H. Cain, Temperature Measurement in the Intel Core Duo Processor, Proceedings of the 12th International Workshop on Thermal Investigations of ICs, THERMINIC 2006, Sep. 2006.
[5] S.H. Gunther, F. Binns, D.M. Carmean, and J.C. Hall, Managing the Impact of Increasing Microprocessor Power Consumption, Intel Technology Journal, Q1, 2001.
[6] K. Skadron, M.R. Stan, W. Huang, S. Velusamy, K. Sankaranarayanan, and D. Tarjan, Temperature-Aware Microarchitecture, Proceedings of the 30th ACM/IEEE International Symposium on Computer Architecture, San Diego, CA, pp. 2-13, June 2003.
[7] V. Szekely, M. Rencz, and B. Courtois, Tracing the Thermal Behaviour of ICs, IEEE Design & Test of Computers, vol. 15, no. 2, pp. 14-21, April-June 1998.
[8] P. Chaparro, J. Gonzalez, G. Magklis, Q. Cai, and A. Gonzalez, Understanding the Thermal Implications of Multicore Architectures, IEEE Transactions on Parallel and Distributed Systems, vol. 18, no. 8, pp. 1055-1065, Aug. 2007.
[9] J. Donald and M. Martonosi, Techniques for Multicore Thermal Management: Classification and New Exploration, Proceedings of the 33rd International Symposium on Computer Architecture, ISCA 2006, 2006.
[10] D. Li, S.X.D. Tan, E.H. Pacheco, and M. Tirumala, Parameterized Transient Thermal Behavioral Modeling for Chip Multiprocessors, Proceedings of the 2008 IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2008, Nov. 2008.
[11] M. Kadin and S. Rada, Frequency Planning for Multi-Core Processors Under Thermal Constraints, Proceedings of the 13th International Symposium on Low Power Electronics and Design, ISLPED 2008, Aug. 2008.
[12] Y. Wang, K. Ma, and X. Wang, Temperature-Constrained Power Control for Chip Multiprocessors with Online Model Estimation, Proceedings of the 36th International Symposium on Computer Architecture, ISCA 2009, Jun. 2009.
[13] E. Kursun, C.Y. Cher, A. Buyuktosunoglu, and P. Bose, Investigating the Effects of Task Scheduling on Thermal Behavior, Third Workshop on Temperature-Aware Computer Systems, TACS 2006, 2006.
[14] I. Yeo, C.C. Liu, and E.J. Kim, Predictive Dynamic Thermal Management for Multicore Systems, Proceedings of the 45th Annual Design Automation Conference, DAC 2008, Jun. 2008.
[15] A.K. Coskun, T. Simunic, and K.C. Gross, Utilizing Predictors for Efficient Thermal Management in Multiprocessor SoCs, to appear in IEEE Transactions on Computer-Aided Design, 2009.
 S. Zhang and K.S. Chatha, Approximation Algorithm for the Temperature-aware Scheduling Problem, Proceedings of the 2007 IEEE/ACM international conference on Computer-aided design, ICCAD 2007, Nov. 2007.
 R. Jayaseelan and T. Mitra, Temperature aware Task Sequencing and Voltage Scaling, Proceedings of the 2008 IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2008, Nov. 2008.
 E. Musoll, A thermal-friendly load-balancing technique for multi-core processors, Proceedings of the 9th International Symposium on Quality Electronic Design, ISQED 2008, Mar. 2008.
 K. Stavrou and P. Trancoso, Thermal-Aware Scheduling for Future Chip Multiprocessors, EURASIP Journal on Embedded Systems, Vol. 2007, doi:10.1155/2007/48926, 2007.
 J. Long, S.O. Memik, G. Memik, and R. Mukherjee, Thermal Monitoring Mechanisms for Chip Multiprocessors, ACM Transactions on Architecture and Code Optimization, Vol. 5, No. 2, Aug. 2008.
 K.-J. Lee and K. Skadron, Using Performance counters for Runtime Temperature Sensing in High-Performance Processors, Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05), 2005.
 Ke Meng, Russ Joseph, Robert Dick, and Li Shang, Multi-optimization power management for chip multiprocessors, Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, Canada, 2008.
 M. Huang, J. Renau, S.M. Yoo, and J. Torrellas, The Design of DEETM: a Framework for Dynamic Energy Efficiency and Temperature Management, Journal of Instruction-Level Parallelism, Vol. 3, 2002.
 Marcu Marius, Vladutiu Mircea, Moldovan Horatiu, Popa Mircea, Thermal Benchmark and Power Benchmark Software, Proceedings of the 12th IEEE International Workshops on THERMal Investigations of ICs and Systems, THERMINIC 2006, Nice, France, Sep. 2006, pp. 203-208.
 Marcu Marius, Dacian Tudor, Moldovan Horatiu, Sebastian Fuicu and Popa Mircea, Energy characterization of mobile devices and applications using power-thermal benchmarks, Microelectronics Journal, Vol. 40, No. 7, pp. 1141-1153, Jul. 2009.