Chain-of-experts chains LLM experts in a sequence, outperforming mixture-of-experts (MoE) with lower memory and compute costs.
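For illustration, here is a minimal, hypothetical PyTorch sketch contrasting a standard mixture-of-experts block (top-k experts run in parallel and their gated outputs are summed) with a chain-of-experts-style block in which a short chain of experts runs sequentially, each building on the previous expert's output. The layer sizes, gating rule, and chain depth are assumptions for illustration only, not the actual chain-of-experts or DeepSeek design.

import torch
import torch.nn as nn

class Expert(nn.Module):
    # A small feed-forward expert block.
    def __init__(self, dim, hidden):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))

    def forward(self, x):
        return self.net(x)

class MixtureOfExperts(nn.Module):
    # MoE: each token is routed to its top-k experts in parallel; gated outputs are summed.
    def __init__(self, dim, hidden, num_experts=8, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList([Expert(dim, hidden) for _ in range(num_experts)])
        self.gate = nn.Linear(dim, num_experts)
        self.top_k = top_k

    def forward(self, x):                                    # x: (batch, dim)
        weights, idx = self.gate(x).topk(self.top_k, dim=-1)
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                        # tokens sent to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

class ChainOfExperts(nn.Module):
    # CoE-style block: a short chain of experts applied one after another,
    # each refining the previous expert's output via a residual update.
    def __init__(self, dim, hidden, chain_len=2):
        super().__init__()
        self.chain = nn.ModuleList([Expert(dim, hidden) for _ in range(chain_len)])

    def forward(self, x):
        for expert in self.chain:
            x = x + expert(x)                                # later experts see earlier experts' work
        return x

if __name__ == "__main__":
    tokens = torch.randn(4, 64)
    print(MixtureOfExperts(64, 256)(tokens).shape)           # torch.Size([4, 64])
    print(ChainOfExperts(64, 256)(tokens).shape)             # torch.Size([4, 64])

In this toy version the chain block holds fewer experts in memory at once than the parallel MoE block, which loosely mirrors the claimed memory and compute savings.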
ECE professor Kangwook Lee provides insights on the new Chinese AI model DeepSeek, discussing how it was built and what it means for ...
Notably, John Leimgruber, a United States-based software engineer with two years of experience, managed to ...
DeepSeek R2 redefines AI with cost efficiency, multilingual support, and open-source tools. Discover how it outpaces GPT-4.
Tencent's new model doubles response speed while matching top performers like GPT-4o in reasoning tasks, intensifying the AI ...
Imagine an AI that doesn’t just guess an answer but walks through each solution, like a veteran scientist outlining every ...
TikTok owner ByteDance said it has achieved a 1.71 times efficiency improvement in large language model (LLM) training, the ...
While competitors like France’s Mistral have developed models based on MoE, DeepSeek was the first firm to rely heavily on this architecture while achieving parity with more expensively built ...
The Open Source Week initiative launched by Chinese AI startup DeepSeek concluded on Friday with the release of its fifth code repository, showcasing the company's commitment to fostering an open and ...
Alibaba Cloud’s latest model rivals much larger competitors with just 32 billion parameters in what it views as a critical ...
Global hedge funds continued to sell China equities for a fourth straight week as the renewed enthusiasm for Chinese tech ...
This remarkable outcome underscores the effectiveness of RL when applied to robust foundation models pre-trained on extensive ...
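To make the idea concrete, below is a minimal, hypothetical Python sketch of reward-driven reinforcement learning applied on top of a pre-trained causal language model: a single REINFORCE-style update against a toy verifiable reward. The base model (gpt2), the prompt, the reward rule, and the hyperparameters are assumptions for illustration only, not the training recipe behind the result described above.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
opt = torch.optim.AdamW(model.parameters(), lr=1e-6)

prompt = "Q: What is 2 + 2? A:"
inputs = tok(prompt, return_tensors="pt")
prompt_len = inputs["input_ids"].shape[1]

# Sample a completion from the pre-trained policy.
out = model.generate(**inputs, do_sample=True, max_new_tokens=16,
                     pad_token_id=tok.eos_token_id)
completion = tok.decode(out[0, prompt_len:], skip_special_tokens=True)

# Toy verifiable reward: 1 if the sampled answer contains "4", else 0.
reward = 1.0 if "4" in completion else 0.0

# REINFORCE step: raise the log-likelihood of the sampled tokens, scaled by the reward.
logits = model(out).logits[:, :-1, :]                        # position t predicts token t+1
logprobs = torch.log_softmax(logits, dim=-1)
token_logprobs = logprobs.gather(-1, out[:, 1:].unsqueeze(-1)).squeeze(-1)
completion_logprob = token_logprobs[:, prompt_len - 1:].sum()

opt.zero_grad()
loss = -reward * completion_logprob
loss.backward()
opt.step()
print(f"reward={reward}, completion={completion!r}")

The point of the sketch is only that the reward signal can be sparse and rule-based; a stronger pre-trained base model makes such sparse rewards far easier to exploit.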