Memory for AI: The Need for Speed


>> Electronic Design Resources
.. >> Library: Article Series
.. .. >> Topic: System Design
.. .. .. >> Series: Memory for AI

What you’ll learn:

  • The shift to processing at the edge.
  • How memories like HBM and GDDR will be crucial in effective AI data processing.


In Part 1 of this series, we explored the exponential growth of the world's data, which is doubling roughly every two years. To make sense of it all, humans are turning to artificial intelligence (AI). Processing and memory must continue improving, but new challenges are emerging as well.

As our insights from data become more valuable, the need for stronger security grows: not just to protect the data, but the AI models, training data, and infrastructure as well. Unsurprisingly, the increasing value of data and insights is motivating AI developers to build more powerful algorithms, more meaningful datasets, and better security practices.

Faster processors and memory are needed to enable better algorithms and topologies, as well as to process the growing volumes of data. But two of the semiconductor industry’s primary tools for improving performance over the past several decades—Moore’s Law and Dennard Scaling—can no longer be relied upon like they once were. Moore’s Law is slowing, and Dennard Scaling broke down around 2005. To continue to meet the needs of cutting-edge AI applications and the exponential increase of data, the industry is being forced to find new ways to enhance performance and power efficiency.

At the same time, the architecture of the internet is steadily evolving. The familiar cloud processing model, in which endpoints capture data and transmit it to the cloud to be processed, is changing. With the rollout of 5G, there will be more connected devices and more data, and it won't be feasible to send it all to geographically distant data centers.

Shifting Processing to the Edge

As the world's digital data continues to expand faster than network bandwidth is improving, new processing opportunities will be found at the edge. The edge includes base stations and other geographically distributed processing locations that sit between cloud data centers and endpoints.

Edge data centers offer the ability to process data closer to where it’s being generated, enabling lower-latency processing for applications like autonomous vehicles. And by processing data at the edge, we can send a smaller amount of higher-value data to cloud data centers, easing the demand on network bandwidth. By doing so, we’ll see greater improvements in performance and power efficiency.
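The latency benefit of moving closer to the data can be sketched with simple back-of-envelope arithmetic. The figures below (a cloud data center 1,500 km away versus an edge site 50 km away) are hypothetical, and the calculation covers only fiber propagation delay, ignoring routing, queuing, and processing time:

```python
# Back-of-envelope propagation-delay comparison (illustrative figures only).
# Signals in optical fiber travel at roughly two-thirds the speed of light.
FIBER_SPEED_M_PER_S = 2.0e8

def round_trip_ms(distance_m: float) -> float:
    """Best-case round-trip propagation delay in milliseconds."""
    return 2 * distance_m / FIBER_SPEED_M_PER_S * 1000

cloud_distance_m = 1500 * 1000  # hypothetical distant cloud data center
edge_distance_m = 50 * 1000     # hypothetical nearby edge data center

print(f"Cloud RTT: {round_trip_ms(cloud_distance_m):.2f} ms")  # ~15 ms
print(f"Edge RTT:  {round_trip_ms(edge_distance_m):.2f} ms")   # ~0.5 ms
```

Even in this best case, the distant data center adds well over 10 ms of round-trip delay before any processing begins, which is why latency-sensitive applications benefit from edge processing.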

As edge processing becomes mainstream, the cloud will still be a crucial piece of our data economy and will continue to be used for the toughest tasks. In some cases, data captured by the endpoints will be sent to the edge for initial processing before being sent to cloud data centers. At other times, data will go straight to the cloud, where it may be combined with other data captured across wide geographies into one aggregate data set.

AI’s Expanding Role

We expect to see AI deployed across the evolving internet in data centers, the edge, and increasingly in endpoints, although the requirements and applications will differ for each. The most difficult tasks, including training and inference for the biggest neural networks that use large training sets, will remain in the cloud data centers. The highest-performance hardware solutions will be used, and they will be plugged into the wall for maximum power.

To support these high-performance AI solutions, the highest-performance, most power-efficient memory solutions will be needed. Both on-chip memory and very high-performance discrete DRAM solutions like HBM and GDDR will be used in this space. Endpoint devices will predominantly remain battery-powered and continue to be used for inferencing, similar to what we see today. Memory systems for endpoint devices will typically use on-chip memories or low-power mobile memories.

Today, the definition of the edge has expanded and now consists of both the near edge (closer to the data-center core) and the far edge (closer to the endpoints). In the future, we'll see the full range of memory solutions employed across the edge.

At the near edge, processors and memory systems will be similar to those that we see in the cloud, using on-chip memory, HBM, and GDDR. As we migrate toward endpoints, the processors and memory at the far edge will be similar to what we see in endpoints, including on-chip memories, LPDDR, and DDR. But we also expect to see some endpoint applications that require extremely high performance, like autonomous vehicles. For AI in these devices, memories such as GDDR are suitable for meeting performance and power-efficiency needs.
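The trade-off between these memory types comes down to interface width and per-pin signaling rate. The sketch below computes theoretical peak bandwidth from representative (not vendor-specific) figures for an HBM2 stack, a GDDR6 device, and an LPDDR5 channel:

```python
# Peak-bandwidth sketch: pin count x per-pin data rate / 8 bits per byte.
# Data rates below are representative figures, not vendor specifications.
def peak_bandwidth_gb_s(pins: int, gbps_per_pin: float) -> float:
    """Theoretical peak bandwidth in GB/s for a memory interface."""
    return pins * gbps_per_pin / 8

# HBM2: very wide (1024-bit) stacked interface at a modest per-pin rate.
hbm2 = peak_bandwidth_gb_s(1024, 2.0)   # 256.0 GB/s per stack
# GDDR6: narrow (32-bit) device interface at a much higher per-pin rate.
gddr6 = peak_bandwidth_gb_s(32, 16.0)   # 64.0 GB/s per device
# LPDDR5: 16-bit channel tuned for low power rather than raw throughput.
lpddr5 = peak_bandwidth_gb_s(16, 6.4)   # 12.8 GB/s per channel

print(f"HBM2 stack:     {hbm2:.1f} GB/s")
print(f"GDDR6 device:   {gddr6:.1f} GB/s")
print(f"LPDDR5 channel: {lpddr5:.1f} GB/s")
```

HBM achieves its bandwidth through extreme interface width (requiring 2.5D packaging), while GDDR pushes per-pin speed on a conventional PCB, which is part of why GDDR is attractive for cost-sensitive, high-performance endpoints like vehicles.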

The resurgence of AI has revolutionized how the world stores, analyzes, and communicates data. And an interesting relationship has formed between AI and digital data. Large amounts of data are required to make AI better, and AI is in many cases the only reasonable way to make sense of large volumes of data. Looking forward, if AI applications are to continue realizing their full potential, then high-bandwidth, power-efficient memories need to be at the forefront. We’ll explore this further in Part 3.
