Architecture Paper Models

16don MSN

DeepSeek kicks off 2026 with paper signalling push to train bigger models for less

DeepSeek has published a technical paper co-authored by founder Liang Wenfeng proposing a rethink of its core deep learning ...

16d

DeepSeek’s New Architecture Can Make AI Model Training More Efficient and Reliable

DeepSeek, the Chinese artificial intelligence (AI) startup, that took the Silicon Valley by storm in November 2024 with its R1 AI model has now revealed a new architecture that can help bring down the ...

17h

Why reinforcement learning plateaus without representation depth (and other key takeaways from NeurIPS 2025)

Why reinforcement learning plateaus without representation depth (and other key takeaways from NeurIPS 2025) ...

Analytics India Magazine

New DeepSeek Research Shows Architectural Fix Can Boost Reasoning at Scale

DeepSeek has released new research showing that a promising but fragile neural network design can be stabilised at scale, delivering measurable performance gains in large language models without ...

WinBuzzer

DeepSeek Reveals R1 Model Architecture Secrets Ahead of V4 Model Launch

DeepSeek has expanded its R1 whitepaper by 60 pages to disclose training secrets, clearing the path for a rumored V4 coding ...

SiliconANGLE

DeepSeek develops mHC AI architecture to boost model performance

DeepSeek researchers have developed a technology called Manifold-Constrained Hyper-Connections, or mHC, that can improve the performance of artificial intelligence models. The Chinese AI lab debuted ...

16don MSN

DeepSeek proposes shift in AI model development with 'mHC' architecture to upgrade ResNet

The paper comes at a time when most AI start-ups have been focusing on turning AI capabilities in LLMs into agents and other products DeepSeek's latest technical paper, co-authored by the firm's ...

Some results have been hidden because they may be inaccessible to you

Show inaccessible results