DeepSeek's R1 model was reportedly trained for only $5.5 million, about 36 times less than the $200 million it reportedly cost to train OpenAI's GPT-4.
This cost reduction is attributed to several key technical innovations, such as using 8-bit numbers instead of 32-bit, predicting multiple words simultaneously, utilizing compressed data, and sharing parameters across different models.
Together, these techniques reportedly yield a 45x improvement in efficiency, which lets DeepSeek price its API well below its competitors.
Another notable achievement is that the reasoning capability is built into the model itself: it can generate long chains of thought, verify its own work, and allocate more compute to harder problems. This helps mitigate the problem of "hallucinations" in large language models.
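To make the "8-bit numbers instead of 32-bit" point concrete, here is a minimal sketch of what storing weights in 8 bits buys and costs. It uses simple symmetric integer quantization as a stand-in; this is an assumption for illustration and not DeepSeek's actual number format or training code. The function names and the sample weights are hypothetical.

```python
# Illustrative sketch only: symmetric int8 quantization as a stand-in for
# "8-bit instead of 32-bit". Not DeepSeek's actual method or number format.

def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(v) for v in values) or 1.0  # avoid a zero scale
    scale = max_abs / 127.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the 8-bit integers."""
    return [q * scale for q in quantized]


def memory_bytes(num_params, bits_per_param):
    """Storage needed for num_params values at the given bit width."""
    return num_params * bits_per_param // 8


# Hypothetical sample weights, chosen only for the demonstration.
weights = [0.12, -0.5, 0.33, 0.9, -0.07]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# Worst-case rounding error introduced by the 8-bit representation.
max_error = max(abs(a - b) for a, b in zip(weights, restored))

# Memory saving for one million parameters: 32-bit versus 8-bit storage.
ratio = memory_bytes(1_000_000, 32) / memory_bytes(1_000_000, 8)

print(f"quantized weights: {quantized}")
print(f"max rounding error: {max_error:.5f}")
print(f"memory reduction: {ratio:.0f}x")
```

The trade-off the sketch shows is the general one: each value takes a quarter of the memory, at the price of a small rounding error bounded by half the scale.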
These innovations have significant implications for investment. The article suggests two possible scenarios: either companies will spend less on AI due to the reduced costs, or they will spend even more because AI will become more efficient.
Experts identify four stocks poised to benefit from this evolving landscape:
- CrowdStrike: This cybersecurity firm saw its stock surge after DeepSeek experienced a cyberattack. Their Falcon platform offers robust endpoint detection and response capabilities.
- Palantir: Known for its AI software platforms, Palantir could integrate DeepSeek's innovations into its AI platform, AIP, making its technology more accessible.
- Meta Platforms: A major investor in AI infrastructure, Meta plans to integrate DeepSeek's advancements into its open-source Llama family of models. Meta believes that the innovations of DeepSeek show the power of test-time scaling, where AI models use more computing power to address complex issues.
- Taiwan Semiconductor (TSMC): As the primary manufacturer of advanced microchips, TSMC stands to gain from increased demand for AI hardware. TSMC's custom manufacturing processes and network effects give it a strong market position.
Wider adoption of AI may also bring more cybersecurity threats, which would further benefit companies like CrowdStrike. Additionally, the article invokes the Jevons paradox: when something gets cheaper to use, demand can rise by more than the price falls, so overall spending goes up rather than down. The article sees the same pattern under Moore's law, where chip prices fall but overall spending on chips increases.
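The Jevons paradox argument above comes down to simple arithmetic: total spending is price times quantity, so whether a price cut raises or lowers spending depends on how strongly demand responds. The sketch below assumes a constant-elasticity demand curve; the curve, the elasticity values, and the baseline numbers are all illustrative assumptions, not figures from the article.

```python
# Illustrative sketch only: constant-elasticity demand is an assumed model,
# and every number here is hypothetical.

def total_spending(price, elasticity, base_price=1.0, base_quantity=100.0):
    """Spending = price * quantity, with quantity = Q0 * (P / P0) ** (-elasticity)."""
    quantity = base_quantity * (price / base_price) ** (-elasticity)
    return price * quantity


# Baseline: price 1.0, quantity 100, so spending is 100 for any elasticity.
before = total_spending(1.0, elasticity=1.5)

# Price halves. With elastic demand (elasticity > 1), usage more than doubles
# and total spending rises: the Jevons paradox case.
after_elastic = total_spending(0.5, elasticity=1.5)

# With inelastic demand (elasticity < 1), usage grows too little to offset
# the price cut and total spending falls.
after_inelastic = total_spending(0.5, elasticity=0.5)

print(f"before: {before:.1f}")
print(f"after, elastic demand: {after_elastic:.1f}")
print(f"after, inelastic demand: {after_inelastic:.1f}")
```

These two outcomes correspond to the two scenarios the article describes: companies spend less if demand for AI barely responds to lower costs, and more if cheaper AI unlocks enough new usage.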
The article concludes that DeepSeek’s innovations will increase overall AI spending rather than reduce it. Large language models will become more accessible, and hyperscalers will expand data center capacity. The overall expectation is that both software and hardware companies will thrive.