Speculative decoding can help AI chatbots improve throughput and reduce hardware demand by using a smaller model to draft tokens that a larger model validates.
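The draft-then-validate loop can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not any particular system's implementation: both "models" are hypothetical toy functions that map a token prefix to a single greedy next token, the names `speculative_decode`, `greedy_decode`, `target_model`, and `draft_model` are invented for this example, and validation is written as a plain loop, whereas a real deployment would score all drafted positions in one batched pass of the larger model.

```python
from typing import Callable, List

Token = int
# Assumption: a "model" is any function from a token prefix to its greedy next token.
NextToken = Callable[[List[Token]], Token]


def greedy_decode(model: NextToken, prompt: List[Token], max_new: int) -> List[Token]:
    """Baseline: generate max_new tokens one at a time with a single model."""
    out = list(prompt)
    for _ in range(max_new):
        out.append(model(out))
    return out


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[Token], max_new: int, k: int = 4) -> List[Token]:
    """Greedy speculative decoding: the draft proposes, the target validates."""
    out = list(prompt)
    goal = len(prompt) + max_new
    while len(out) < goal:
        # 1. The small draft model proposes up to k tokens autoregressively.
        proposal: List[Token] = []
        for _ in range(min(k, goal - len(out))):
            proposal.append(draft(out + proposal))

        # 2. The large target model checks each proposed token in order.
        accepted: List[Token] = []
        for tok in proposal:
            expected = target(out + accepted)
            if tok == expected:
                accepted.append(tok)
            else:
                # First mismatch: keep the target's token and discard the rest.
                accepted.append(expected)
                break
        else:
            # Every draft token matched; the target contributes one more if there is room.
            if len(out) + len(accepted) < goal:
                accepted.append(target(out + accepted))

        out.extend(accepted)
    return out


# Hypothetical toy models, chosen only so the draft is sometimes wrong.
def target_model(prefix: List[Token]) -> Token:
    return (prefix[-1] * 3 + 1) % 11


def draft_model(prefix: List[Token]) -> Token:
    if len(prefix) % 4 == 0:
        return 0  # deliberately unreliable at some positions
    return target_model(prefix)


if __name__ == "__main__":
    print(speculative_decode(target_model, draft_model, [1, 2], max_new=10))
```

In this greedy variant the output matches what the larger model would produce alone, so any gain comes from accepting several drafted tokens per validation step rather than from a change in output. Sampling-based variants use a probabilistic acceptance rule instead of exact token matching, which this sketch does not cover.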
Professors Emma Alexander, Manling Li, Han Liu, Marcelo Worsley, and their students represented Northwestern CS at CVPR 2026 ...