Originally designed for video games, graphics processing units (GPUs) now power much of today’s AI – from voice assistants to self-driving cars. Unlike regular computer chips, GPUs have thousands of small cores that handle many simple tasks at once.
GPUs built into larger devices can process data locally, close to where it is generated – an approach known as computing at “the edge.” Processing at the edge saves power, speeds up results, and keeps information more private because it doesn’t have to be sent to the cloud, said Roger Shen, assistant professor of electrical engineering.
Edge computing can handle small, local AI models that can later contribute to larger shared models through a process called “federated learning,” which preserves data security because only model updates, not the raw data, leave each device.
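A minimal sketch of that idea appears below, using federated averaging: each device trains a tiny model on its own private data, and only the resulting weights are combined into a shared model. The device setup, model, and averaging step are illustrative assumptions, not a description of Shen’s actual system.

```python
# Federated averaging sketch: local training on each device, then weight averaging.
# All names and data here are hypothetical, for illustration only.
import numpy as np

def local_update(weights, features, labels, lr=0.1, epochs=5):
    """Train a tiny logistic-regression model on one device's private data."""
    w = weights.copy()
    for _ in range(epochs):
        preds = 1 / (1 + np.exp(-(features @ w)))            # sigmoid predictions
        grad = features.T @ (preds - labels) / len(labels)    # gradient of log loss
        w -= lr * grad
    return w  # only the updated weights leave the device, never the raw data

def federated_average(weight_list):
    """Combine the local models into one shared model by averaging their weights."""
    return np.mean(weight_list, axis=0)

# Simulated private data on three edge devices (e.g., three factory sensors).
rng = np.random.default_rng(0)
global_weights = np.zeros(4)
devices = [(rng.normal(size=(50, 4)), rng.integers(0, 2, 50)) for _ in range(3)]

for round_num in range(10):
    local_weights = [local_update(global_weights, X, y) for X, y in devices]
    global_weights = federated_average(local_weights)  # the server never sees the data
```

In this sketch the shared model improves each round even though no device ever uploads its measurements, which is the property that makes the approach attractive for privacy-sensitive edge deployments.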

Another key to energy savings is better-tailored algorithms, Shen said.
A tradeoff
Efficient algorithms can reduce the number of calculations needed and handle larger data sets without overwhelming servers.
“My focus is on the algorithm side,” Shen said. “This provides a promising solution to the energy consumption issue because an embedded system is simpler – it’s just focused on a specific task.”
There’s always a tradeoff, Shen noted. Using less power often means a model is less precise. But not every task requires perfection. In manufacturing, for example, AI may only need to decide a yes-or-no question: “Is this part defective or not?”
In such a case, a small loss in accuracy is inconsequential and worth the savings in speed and energy.
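One common way to make that trade is to quantize a model’s weights to low-precision integers so an embedded chip can use cheaper arithmetic. The sketch below is a hypothetical illustration of that idea for a yes-or-no defect check; the weights, scale factor, and threshold are assumptions, not drawn from Shen’s work.

```python
# Quantization sketch: trade a little precision for cheaper, lower-energy math.
# Weights, sensor readings, and threshold are made up for illustration.
import numpy as np

def quantize(weights, bits=8):
    """Map float weights to small integers plus a scale factor."""
    scale = np.max(np.abs(weights)) / (2 ** (bits - 1) - 1)
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def is_defective(features, q_weights, scale, threshold=0.5):
    """Yes-or-no defect check using the quantized weights."""
    score = 1 / (1 + np.exp(-(features @ (q_weights * scale))))
    return score > threshold

weights = np.array([0.42, -1.3, 0.07, 0.88])   # full-precision model
q_weights, scale = quantize(weights)            # 8-bit version for the embedded chip

part = np.array([1.0, 0.2, -0.5, 1.1])          # sensor readings for one part
print(is_defective(part, q_weights, scale))     # small rounding error, same yes/no answer
```

The rounding introduces a tiny error in the score, but the final yes-or-no decision is usually unchanged, which is exactly the kind of accuracy-for-energy trade Shen describes.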
Embedded AI processing can flag defects in real time and prevent breakdowns before they happen, shaping the next wave of smart manufacturing.
Shen has seen the payoff firsthand while working with a local company.
“Our job was to dig into their production data to see if there was some circumstance that explained their problem,” he said.
