This article outlines the design strategies currently used to address these bottlenecks, ranging from data center systolic ...
Nvidia's KV Cache Transform Coding (KVTC) compresses the LLM key-value (KV) cache by 20x without model changes, cutting GPU memory costs and reducing time-to-first-token by up to 8x for multi-turn AI applications.
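The scale of the savings can be sketched with a back-of-envelope KV cache size calculation. The model dimensions below (layer count, heads, head dimension) are illustrative assumptions for a 7B-class decoder, not figures from the article; only the 20x ratio comes from the claim above.

```python
# Rough KV cache size estimate for a transformer decoder.
# Assumed (hypothetical) dimensions roughly match a 7B-class model;
# they are NOT taken from the KVTC article.

def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, dtype_bytes=2):
    # 2x accounts for separate key and value tensors at each layer;
    # dtype_bytes=2 assumes fp16/bf16 storage.
    return 2 * layers * kv_heads * head_dim * seq_len * dtype_bytes

raw = kv_cache_bytes(layers=32, kv_heads=32, head_dim=128, seq_len=8192)
compressed = raw / 20  # applying the article's claimed 20x ratio

print(f"raw KV cache: {raw / 2**30:.1f} GiB")      # 4.0 GiB per sequence
print(f"compressed:   {compressed / 2**30:.2f} GiB")  # 0.20 GiB
```

At 4 GiB of cache per 8K-token sequence, a single 80 GiB GPU holds only about 20 concurrent conversations before compression, which is why multi-turn serving is so sensitive to KV cache footprint.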
Nvidia's BlueField-4 STX reference architecture inserts a dedicated context memory layer between GPUs and traditional storage, claiming 5x token throughput and 4x energy efficiency for agentic AI ...
Integrating AI into chip design workflows is pushing companies to overhaul their data management strategies, shifting from passive storage to active, structured, machine-readable systems. As training and ...