Summary:
DeepSeek is labeled as the biggest dark horse in the open-source LLM sector for 2025.
DeepSeek V3 features 671 billion parameters and was developed at a cost of $5.58 million.
Resource constraints led to spectacular innovations in DeepSeek's approach.
Chinese AI firms are advancing despite US sanctions on semiconductors.
Introducing DeepSeek
DeepSeek, a startup based in Hangzhou, has rapidly emerged as a significant player in the open-source large language model (LLM) sector, and has been dubbed “the biggest dark horse” of 2025 by Jim Fan, a senior research scientist at Nvidia.
Groundbreaking Release
This recognition follows the recent launch of DeepSeek V3, which has made headlines in the artificial intelligence (AI) community. Fan highlighted the innovative approach DeepSeek took in developing the model, noting that resource constraints can lead to spectacular reinventions.
Impressive Specs and Cost-Effectiveness
The DeepSeek V3 model boasts 671 billion parameters and was developed in just two months at a cost of US$5.58 million. Remarkably, this was achieved with significantly fewer computing resources than those used by larger players such as Meta Platforms and OpenAI.
The Importance of LLMs
LLMs underpin generative AI services such as ChatGPT; a higher parameter count generally allows a model to capture more complex data patterns and make more accurate predictions. The open-source nature of DeepSeek V3 gives the public access to its source code, allowing third-party developers to enhance and share its capabilities.
Resilience Amid Challenges
DeepSeek’s ability to produce a powerful LLM at a lower cost than established competitors illustrates the progress Chinese AI firms have made, even in the face of US sanctions that restrict access to the cutting-edge semiconductors needed for model training.
Jim Fan, a senior research scientist at semiconductor design giant Nvidia, has been closely following developments at AI startup DeepSeek. Photo: SCMP