“I tell ya, I don’t get no respect,” was an opening line made famous by one of the funniest comedians of the last century, Rodney Dangerfield. And while decidedly not funny in the world of electronics, thermal engineering has, unfortunately, often been treated with less respect than it deserves. Dealing with the heat generated by electronics was often not given full consideration until after the design was completed and prototyped, and the problem had manifested as a failure.
It’s a simple fact that where there are electrons flowing (superconductors aside), there is heat. “How much heat?” is the salient question, followed by, “Will it be a problem?” The next question is how to deal with it. To the first point, heat in electronics is almost always a problem. The reason is that there is an inverse relationship between heat exposure and the reliability of electronic devices. Integrated circuit transistors are vulnerable to failure due to diffusion of metals through insulators, which causes shorts. In short: the higher the heat, the lower the reliability. Thus, keeping devices cool is a vital objective.
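The heat-reliability relationship is commonly modeled with the Arrhenius equation, which estimates how much faster a temperature-driven failure mechanism (such as metal diffusion) proceeds as a device runs hotter. The sketch below is only illustrative; the 0.7 eV activation energy and the temperatures are assumed example values, not figures from this column:

```python
import math

def arrhenius_af(t_use_c: float, t_stress_c: float, ea_ev: float = 0.7) -> float:
    """Acceleration factor between a use temperature and a hotter stress
    temperature, per the Arrhenius model. ea_ev is the activation energy
    in electron-volts; 0.7 eV is a common illustrative value."""
    k_boltzmann = 8.617e-5  # Boltzmann constant in eV/K
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    return math.exp((ea_ev / k_boltzmann) * (1.0 / t_use_k - 1.0 / t_stress_k))

# Running a junction at 85°C instead of 55°C accelerates this failure
# mechanism (i.e., shortens expected life) by roughly 8x under these
# assumed parameters.
af = arrhenius_af(55.0, 85.0)
```

The exponential form is why even modest temperature reductions pay outsized reliability dividends.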
There is another reason to keep things cool, however, which is to mitigate the mechanical strain that arises when devices with vastly different coefficients of thermal expansion (CTE) are intimately joined, such as a BGA soldered to a PCB. It is a well-known fact that solder joint failure is a leading cause of assembly failure.
There used to be a saying employed by thermal engineers that helped succinctly frame both the challenge and the solution. “It all goes back to air,” and that has remained true since it was first observed and uttered, perhaps as early as the first vacuum tube amplifier. The challenge that has remained ever since is how to get the heat generated by electronics “back to air.”
There are multiple ways that heat can be managed. At the earliest steps, the choice of technology is important. To provide some perspective, the world’s first electronic computer, the ENIAC, had some 30 separate computing units plus a power supply. The system weighed some 60,000 pounds, measured roughly 100’ x 10’ x 3’, and contained roughly 19,000 vacuum tubes; 1,500 relays; hundreds of thousands of resistors, capacitors, and inductors; and required 500,000 hand-soldered interconnections. Its power consumption was about 200 kilowatts. The smartphones we carry in our pockets are several orders of magnitude both smaller and more powerful than the ENIAC, but use a fraction of the energy. Transistors are much more efficient; however, the power density of the processor chip in watts per square millimeter is still arguably many times greater. It’s all a matter of perspective.
Both computers require cooling to perform efficiently. The ENIAC employed a forced-air cooling system to deal with the massive amount of heat generated by the tubes. Those who remember cathode ray tube televisions will likely remember just how warm the area around the TV was.
When it comes to managing heat, there are only three fundamental mechanisms: conduction, convection, and radiation, along with a number of ways to augment them. Of these, conduction is arguably the easiest, fastest, and most efficient, but conduction needs a thermal sink to carry heat away from the source and keep it cool, and that is the job of convection, the means by which the heat is transferred to air. Radiation is the least efficient (it is also the way the Earth attempts to rid itself of excess heat at night as it rotates, and at least part of the reason global warming is a problem). For electronics, all three methods can be, and often are, combined to keep things cool.
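A back-of-envelope comparison makes that ranking concrete. The sketch below applies Fourier’s law (conduction), Newton’s law of cooling (convection), and the Stefan-Boltzmann law (radiation) to a hypothetical 10 mm x 10 mm surface running 60°C above ambient; every material and geometry value here is an illustrative assumption, not data from the column:

```python
# Back-of-envelope comparison of the three heat-transfer mechanisms
# for a small, hot surface. All parameter values are assumptions.

SIGMA = 5.670e-8  # Stefan-Boltzmann constant, W/(m^2 K^4)

def conduction_w(k: float, area_m2: float, thickness_m: float, dt_k: float) -> float:
    """Fourier's law: q = k * A * dT / L (1-D steady state)."""
    return k * area_m2 * dt_k / thickness_m

def convection_w(h: float, area_m2: float, dt_k: float) -> float:
    """Newton's law of cooling: q = h * A * dT."""
    return h * area_m2 * dt_k

def radiation_w(emissivity: float, area_m2: float, t_surf_k: float, t_amb_k: float) -> float:
    """Stefan-Boltzmann law: q = eps * sigma * A * (Ts^4 - Ta^4)."""
    return emissivity * SIGMA * area_m2 * (t_surf_k**4 - t_amb_k**4)

area = 0.01 * 0.01  # 10 mm x 10 mm surface, in m^2
dt = 60.0           # 85°C surface in 25°C air

# Conduction into a 2 mm copper slug (k ~ 390 W/m-K)
q_cond = conduction_w(k=390.0, area_m2=area, thickness_m=0.002, dt_k=dt)
# Still-air natural convection (h ~ 10 W/m^2-K)
q_conv = convection_w(h=10.0, area_m2=area, dt_k=dt)
# Radiation from a high-emissivity surface
q_rad = radiation_w(0.9, area, 273.15 + 85.0, 273.15 + 25.0)
```

Under these assumptions, conduction moves orders of magnitude more heat than still-air convection or radiation from the same small surface, which is why the usual strategy is to conduct heat out to a much larger surface (a heat sink) where convection can finish the job.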
To help deal with the heat, thermal engineers, often working in concert with system designers, have developed many clever solutions over the years to protect electronics from overheating. One such solution is what has been called a “stepped phased system protection” protocol. The first level is passive thermal protection: heat sinks, heat spreaders, heat pipes, and the like, which remove heat directly through conduction aided by convection from the device (normally a CPU). If things get too hot for these passive and semi-passive solutions, a fan is engaged at the first thermal threshold to assist heat removal. The next level adds software sophistication: when a further threshold is reached, the CPU/system clock speed is reduced to cut heat generation. This is followed by an overheat condition warning to the user. If that fails to get a response, the system will automatically shut down.
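A stepped protection protocol like the one described can be sketched as a set of escalating temperature thresholds. The threshold values below are hypothetical placeholders for illustration; real limits come from the device datasheet:

```python
from dataclasses import dataclass

@dataclass
class ThermalPolicy:
    """Illustrative escalation thresholds in degrees C (assumed values).
    Passive cooling (heat sink, spreader, pipe) is always in effect,
    so it is the implicit first level of protection."""
    fan_on: float = 70.0      # first threshold: engage the fan
    throttle: float = 85.0    # second: reduce clock speed in software
    warn: float = 95.0        # third: warn the user
    shutdown: float = 105.0   # last resort: automatic shutdown

def thermal_actions(temp_c: float, policy: ThermalPolicy = ThermalPolicy()) -> list[str]:
    """Return the protection steps active at a given junction temperature."""
    actions = []
    if temp_c >= policy.fan_on:
        actions.append("engage fan")
    if temp_c >= policy.throttle:
        actions.append("reduce clock speed")
    if temp_c >= policy.warn:
        actions.append("warn user")
    if temp_c >= policy.shutdown:
        actions.append("shut down")
    return actions
```

For example, at a hypothetical 90°C junction temperature, the fan and clock throttling would both be active, while the warning and shutdown steps remain in reserve.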
There have been many solutions to the thermal problem as watt densities increase. Designers are advised to familiarize themselves with these solutions and not to overlook the importance of thermal interface materials (TIMs), which are vital to ensuring that the first thermal pathway is a good one.
Thermal challenges are unlikely to go away so long as electronics persist. Photonics has been suggested as one prospective solution, and some futurists have suggested that biological computers using neural networks assembled from DNA are in our future, but, come to think of it, isn’t that what humans are?
Stay cool—and give all those thermal management engineers some well-deserved respect.
This column originally appeared in the September 2020 issue of Design007 Magazine in the FLEX007 section.