Bryan's Notes
The Engineering Behind DeepSeek

Posted on 2025-06-05 | In Tech
In early 2025, DeepSeek took the AI world by storm. Here is the engineering behind it: DeepSeek-AI published its DeepSeek-V3 technical report in late 2024, unveiling a 671-billion-parameter mixture-of-experts (MoE) model that activates just 37 billion parameters per token, striking a balance between scale and inference efficiency. Whereas many leading AI labs such as OpenAI maintain closed-weight models and release limited technical disclosures, DeepSeek chose transparency, publishing full architecture details and open-source weights to foster community-driven innovation. This openness contrasts with OpenAI's strategy, which restricts detailed insight into training pipelines and proprietary optimizations, leaving external researchers to reverse-engineer performance gains. By positioning DeepSeek-V3 as a general-purpose foundation model akin to OpenAI's GPT-4 series, DeepSeek laid the groundwork for downstream specialization, most notably through the DeepSeek-R1 reasoning branch.
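
To make the sparse-activation idea concrete, here is a minimal sketch of generic top-k mixture-of-experts routing in PyTorch. This is an illustration under simplifying assumptions, not DeepSeek-V3's actual layer (which uses far more, finer-grained experts plus shared experts and its own load-balancing scheme); it only shows how a router picks a few experts per token so that most of the layer's parameters sit idle on any given forward pass.

```python
# Minimal top-k MoE sketch (generic illustration, not DeepSeek-V3's implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=64, d_hidden=256, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # per-token gating scores
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):                                      # x: (tokens, d_model)
        gate_probs = F.softmax(self.router(x), dim=-1)         # (tokens, n_experts)
        weights, idx = gate_probs.topk(self.top_k, dim=-1)     # keep only the top-k experts
        weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalize kept weights
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                       # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
        return out

tokens = torch.randn(10, 64)
layer = TopKMoE()
print(layer(tokens).shape)  # each token runs through only 2 of the 8 expert MLPs
```

The same principle, scaled up, is what lets a 671B-parameter model pay roughly the inference cost of a 37B-parameter dense model: total capacity grows with the number of experts, while per-token compute is bounded by the few experts the router selects.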
Read more »