Over the past several weeks, researchers at AI hub Hugging Face have been working to reverse-engineer the reasoning model that briefly upended stock markets and roiled the tech industry.
Chinese lab DeepSeek was notably up-front about many of the details of how it built R1, but the training dataset and some aspects of the training process remain a mystery. Thomas Wolf, Hugging Face co-founder and chief science officer, said the project has been working to fill in those gaps and, in doing so, has confirmed that the model is indeed the real deal in certain ways.
“We started the project with the idea of testing if their claims were true. Pretty quickly we saw that, yeah, they are true,” Wolf told Tech Brew. “It’s morphed at Hugging Face from a temporary weekend project to a long-term plan for now, at least.”
While DeepSeek’s sudden catapult into public discourse in January hasn’t cratered Big Tech stocks or popped bubbles in the way some investors initially feared, its permissive licensing has catalyzed the open-source AI community. Wolf called it a “ChatGPT moment” in terms of bringing mainstream attention to the sometimes-overlooked open side of AI development.
“It’s moved to this stage where just anyone who is using AI, which nowadays is almost anyone—my grandmother started to ask us, ‘Hey, I heard about open-source AI, with this DeepSeek thing, is it what you do?’” Wolf said.
Hugging Face is releasing regular updates from what it's calling its Open-R1 project, including one last week about a new competitive-programming model yielded by the reverse-engineering process.
Elephant still in room: Nearly two months since DeepSeek captured public attention, it still loomed large in the peripheral conversations at the HumanX conference in Las Vegas last week, where the interviews for this article took place.
It helps that the DeepSeek debut was followed by a string of other high-profile models from China, including Alibaba’s open-source QwQ-32B, the latest of Baidu’s Ernie models (reportedly set to be open-sourced in coming months), and autonomous agent Manus AI (not open-source).
Ori Goshen, co-founder and co-CEO of Israeli startup AI21, which develops open models for enterprise use, said the developments have shown how Chinese AI companies may see open-source as a competitive advantage.
“What happened in the last several weeks makes me think that China is starting to get a head start on open-source. Because whatever they do, they do it open-source, and we don’t see that volume of open-source activity in the US,” Goshen said.
Open cred: David Cox, VP for AI models at IBM Research, said that despite some impressive performance optimizations, the developments around DeepSeek were misinterpreted in certain ways. The widely cited training-cost figure, in particular, was misconstrued and undersold the likely true amount, he said.
But Cox said the attention has prompted more businesses to think seriously about open-source options, like IBM's models.
“One of the things that DeepSeek R1 did for us—and for everyone else in the open model community—which was really helpful, is it gives people the thought, like, ‘I’m not missing anything if I go open. If I go open, anything I need, it’s gonna be there. And then I don’t have lock-in,’” Cox said. “The idea that your needs can be serviced by the open community…businesses can feel comfortable choosing that; that’s hugely valuable.”