Experts say SRAM-based architecture likely to complement, not replace HBM
Nvidia is expected to unveil a new AI inference chip architecture built around on-chip static random access memory (SRAM) at its GTC 2026 conference in San Jose, California, this Monday. The prospective announcement has raised questions about how the AI memory market could be reshaped.
According to industry insiders, the proposed SRAM-based architecture diverges from the GPU designs currently used in AI data centers. Today’s GPUs rely on stacks of high-bandwidth memory (HBM) placed next to the processor to feed it massive datasets at high throughput.
The new SRAM-focused design instead integrates large SRAM blocks on the chip itself, minimizing data movement and potentially reducing processing latency.
However, the approach involves design tradeoffs. SRAM is considerably larger and more expensive per bit than dynamic random access memory (DRAM), which can limit large-scale deployments. Industry estimates suggest that SRAM cells require roughly five to ten times more silicon area than DRAM cells for the same capacity.
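A rough back-of-envelope calculation shows why that area penalty matters. The sketch below applies the five-to-ten-times factor cited above to a hypothetical HBM-class capacity; the DRAM bit density, target capacity and reticle limit used are illustrative assumptions for the exercise, not figures from the article or any vendor.

```python
# Back-of-envelope illustration of the SRAM-vs-DRAM area tradeoff described above.
# The 5-10x factor is the industry estimate cited in the article; the DRAM density,
# target capacity and reticle limit are illustrative assumptions, not vendor specs.

DRAM_DENSITY_GBIT_PER_MM2 = 0.25   # assumed bit density of a commodity DRAM die
TARGET_CAPACITY_GB = 24            # roughly one HBM stack's worth of capacity
RETICLE_LIMIT_MM2 = 830            # approximate single-die manufacturing limit

target_gbit = TARGET_CAPACITY_GB * 8
dram_area_mm2 = target_gbit / DRAM_DENSITY_GBIT_PER_MM2

for factor in (5, 10):
    sram_area_mm2 = dram_area_mm2 * factor
    print(f"{TARGET_CAPACITY_GB} GB in SRAM at {factor}x DRAM cell area: "
          f"~{sram_area_mm2:,.0f} mm^2 "
          f"({sram_area_mm2 / RETICLE_LIMIT_MM2:.1f}x the reticle limit)")
```

Under these assumed numbers, even a single HBM stack's worth of capacity would demand several times the area of the largest manufacturable die, which is consistent with on-chip SRAM on today's processors typically being measured in megabytes rather than gigabytes.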
Consequently, SRAM has primarily served as cache or buffer memory inside processors rather than as main memory. HBM, by contrast, is engineered for extremely high memory bandwidth, making it crucial for AI training and data center applications.
Concerns have arisen that broader SRAM adoption could diminish the demand for HBM and other main memory technologies. However, many experts believe SRAM is unlikely to directly substitute HBM because they fulfill distinct functions.
“Interpreting SRAM as a replacement for HBM is somewhat exaggerated,” stated an anonymous industry source. “SRAM has traditionally been used as a small-capacity but expensive cache memory located next to the processor.”
The source emphasized structural limitations hindering SRAM from replacing large-capacity memory.
“To achieve the same capacity, SRAM would require roughly five to ten times more silicon area than DRAM,” the source explained. “It may be useful in certain ultra-low-latency parts of AI chips, but its expansion as a general-purpose memory solution is likely to remain limited.”
“For the foreseeable future, HBM will continue to serve as the key near-memory supporting large-scale AI training and inference systems,” the source concluded.
Market analysts foresee SRAM-based architectures complementing existing memory technologies rather than superseding them.
Chae Min-sook, an analyst at Korea Investment & Securities, said SRAM-centered architectures should be understood as an additional option for specific workloads rather than a strategy to displace HBM or DRAM.
“It is more appropriate to see this as a solution targeting certain ultra-low-latency data center workloads or edge applications,” she said.
“Large-scale model training and general inference servers will still rely on HBM and DRAM as their main memory,” Chae added. “As AI computing evolves, the industry is likely to move toward a more layered memory hierarchy composed of SRAM, HBM and DRAM.”
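One rough way to picture the layered hierarchy Chae describes is a simple placement model in which each buffer goes into the fastest tier that still has room. The capacity and bandwidth figures below are placeholder assumptions chosen only to show the ordering, not specifications from the article or any product.

```python
from dataclasses import dataclass

# Illustrative three-tier hierarchy (SRAM / HBM / DRAM). All capacity and bandwidth
# numbers are placeholder assumptions for this sketch, not vendor specifications.
@dataclass
class Tier:
    name: str
    capacity_gb: float
    bandwidth_gbps: float   # rough relative ordering only
    used_gb: float = 0.0

tiers = [
    Tier("on-chip SRAM", capacity_gb=0.5, bandwidth_gbps=20_000),
    Tier("HBM",          capacity_gb=96,  bandwidth_gbps=4_000),
    Tier("DRAM",         capacity_gb=512, bandwidth_gbps=500),
]

def place(buffer_name: str, size_gb: float) -> str:
    """Place a buffer in the fastest tier with free space, spilling outward otherwise."""
    for tier in tiers:  # ordered fastest-first
        if tier.used_gb + size_gb <= tier.capacity_gb:
            tier.used_gb += size_gb
            return f"{buffer_name} ({size_gb} GB) -> {tier.name}"
    return f"{buffer_name} ({size_gb} GB) -> does not fit"

# Hot working data lands in SRAM, model weights spill to HBM, bulk data to DRAM.
print(place("hot activation block", 0.3))
print(place("model weights", 80))
print(place("infrequently used data", 200))
```

In such a layered setup, SRAM absorbs the most latency-sensitive traffic while HBM and DRAM continue to hold the bulk of model and working data, which matches the complementary role the analysts describe.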
Academics also anticipate limited short-term disruption to the existing memory landscape.
Lee Jong-hwan, a professor of system semiconductor engineering at Sangmyung University, predicts any structural shift will occur gradually.
“Even if architectural changes occur, they are unlikely to cause immediate disruption,” Lee said. “Companies such as Samsung Electronics and SK hynix dominate the global memory market, meaning any technology transition would likely proceed at a controlled pace.”
“SRAM is still one type of memory, so from the perspective of memory manufacturers it would not necessarily pose a major problem,” the professor added.
