How DeepSeek will upend the AI industry — and open it to competition

Chinese start-up DeepSeek’s cost-saving techniques for training and delivering generative AI (genAI) models could democratize the entire industry by lowering entry barriers for new AI companies.

DeepSeek made waves this week as its chatbot overtook ChatGPT in downloads on Apple’s App Store and Google Play. The open-source AI model’s impact lies in matching leading US models’ performance at a fraction of the cost by using compute and memory resources more efficiently.

DeepSeek is more than China’s “ChatGPT”; it’s a major step forward for global AI by making model building cheaper, faster, and more accessible, according to Forrester Research. While large language models (LLMs) aren’t the only route to advanced AI, DeepSeek’s innovations should be “celebrated as a milestone for AI progress,” the research firm said.

The efficiency of DeepSeek’s AI methodology means its models require far less compute capacity to run; that could also affect the chip industry, which has been riding a wave of GPU and AI accelerator hardware purchases by companies building out massive data centers.

For example, Meta is planning to spend $65 billion to build a data center with a footprint that’s almost as large as Manhattan. Expected to come online at the end of this year, the data center would house 1.3 million GPUs to power AI tech used by Facebook and other Meta ventures.

Brendan Englot, a professor and AI expert at Stevens Institute of Technology in New Jersey, said the fact that DeepSeek’s models are open source will also make it easier for other AI start-ups to compete against large tech companies. “DeepSeek’s technology provides an excellent example of how disruptive and innovative new tools can be built faster with the aid of open source software,” said Englot, who is also director of the Stevens Institute for Artificial Intelligence (SIAI).

DeepSeek’s arrival on the scene tanked the stock of Nvidia, the leading GPU provider, as investors realized the impact the more efficient processes would have on sales of AI processors and accelerators.

“DeepThink,” a feature within the DeepSeek AI chatbot that leverages the R1 model to provide enhanced reasoning capabilities, uses advanced techniques to break down complex queries into smaller, manageable tasks.
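
For readers curious how that reasoning feature is exposed to developers, here is a minimal sketch of calling the R1 model through DeepSeek’s OpenAI-compatible API. The base URL, the “deepseek-reasoner” model name, and the reasoning_content field reflect DeepSeek’s published documentation, but treat the exact names and response shape as assumptions to verify rather than a definitive recipe.

```python
# Minimal sketch: querying DeepSeek-R1 ("DeepThink") via the OpenAI-compatible API.
# Assumptions: the base URL, the "deepseek-reasoner" model name, and the
# reasoning_content field follow DeepSeek's docs; verify them before relying on this.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",       # placeholder; use your own key
    base_url="https://api.deepseek.com",   # DeepSeek's OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="deepseek-reasoner",             # the R1 reasoning model behind DeepThink
    messages=[
        {"role": "user", "content": "Plan a three-city rail trip on a $1,000 budget."}
    ],
)

message = response.choices[0].message
# R1 returns its step-by-step reasoning separately from the final answer.
print(getattr(message, "reasoning_content", "(no reasoning field returned)"))
print(message.content)
```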

Thanks to those kinds of optimizations, DeepThink (R1) cost only about $5.5 million to train, tens of millions of dollars less than comparable models. While this could reduce short-term demand for Nvidia chips, the lower cost will likely drive more startups and enterprises to create models, boosting demand long-term, Forrester Research said.

And while the cost to train AI models has just declined significantly with DeepThink, inferencing will still require significant compute and storage, Forrester said. “This shift shows that core AI model providers won’t be enough, further opening the AI market,” the firm said in a research note. “Don’t cry for Nvidia and the hyperscalers just yet. Also, there might be an opportunity for Intel to claw its way back to relevance.”

Englot agreed, saying there is a lot of competition and investment right now to produce useful AI software and hardware, “and that is likely to yield many more breakthroughs in the very near future.”

DeepSeek’s base technology isn’t pioneering. Indeed, the company’s recently published research paper shows that Meta’s Llama and Alibaba’s Qwen models were key to developing DeepSeek-R1 and DeepSeek-R1-Zero, its first two reasoning-focused models, Englot noted.

In fact, Englot doesn’t believe DeepSeek’s advance poses as much of a threat to the semiconductor industry as this week’s stock slide suggests. GenAI tools will still rely on GPUs, and DeepSeek’s breakthrough just shows some computing can be done more efficiently.

“If anything, this advancement is good news that all developers of AI technology can take advantage of,” Englot said. “What we saw earlier this week was just an indication that less computing hardware is needed to train and deploy a powerful language model than we originally assumed. This can permit AI innovators to forge ahead and devote more attention to the resources needed for multi-modal AI and advanced applications beyond chatbots.”

Others agreed.

Mel Morris, CEO of startup Corpora.ai, said DeepSeek’s affordability and open-source model allow developers to customize and innovate cheaply and freely. It will also challenge the competitive landscape and push major players like OpenAI, the developer of ChatGPT, to adapt quickly, he said.

“The idea that competition drives innovation is particularly relevant here, as DeepSeek’s presence is likely to spur faster advancements in AI technology, leading to more efficient and accessible solutions to meet the growing demand,” Morris said. “Additionally, the open-source model empowers developers to fine-tune and experiment with the system, fostering greater flexibility and innovation.”
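
To illustrate that flexibility, the sketch below loads one of the openly released distilled R1 checkpoints with Hugging Face’s transformers library and generates a single completion locally. The model ID and generation settings are illustrative assumptions, not a recommendation for production use.

```python
# Minimal sketch: running an openly released DeepSeek-R1 distilled checkpoint locally.
# Assumptions: the Hugging Face model ID and generation settings are illustrative;
# substitute whatever checkpoint and hardware configuration fit your environment.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Explain why lowering training costs could expand the AI market."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate a short completion; tune max_new_tokens and sampling for real use.
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```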

Forrester cautioned that, according to its privacy policy, DeepSeek explicitly says it can collect “your text or audio input, prompt, uploaded files, feedback, chat history, or other content” and use it for training purposes. It also states that it can share this information with law enforcement agencies and public authorities at its discretion.

Those caveats could be of concern to enterprises that have rushed to embrace genAI tools but remain concerned about data privacy, especially where sensitive corporate information is involved.

“Educate and inform your employees on the ramifications of using this technology and inputting personal and company information into it,” Forrester said. “Align with product leaders on whether developers should be experimenting with it and whether the product should support its implementation without stricter privacy requirements.”
