New Qwen2.5-Max exceeds DeepSeek capabilities

11:54, 31.01.2025

After the releases of Qwen2.5 and Qwen2.5-VL, a new version, Qwen2.5-Max, has become available. The new version of Qwen outperforms DeepSeek V3 on the following benchmarks: GPQA-Diamond, Arena-Hard, LiveCodeBench, and LiveBench.

Architecture and Model Features

The Max version is a fairly large-scale Mixture of Experts (MoE) model. What sets this model apart is its training pipeline: pretraining on over 20 trillion tokens, followed by Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) on real user feedback.
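Conceptually, an MoE layer routes each token to a small subset of "expert" sub-networks, so only a fraction of the model's parameters is active per token. The following is a minimal sketch of top-k routing with toy matrices, not Qwen's actual architecture:

```python
import numpy as np

def top_k_moe(x, expert_weights, gate_weights, k=2):
    """Route a token vector x to the top-k experts and mix their outputs.

    x              : (d,) token representation
    expert_weights : list of (d, d) matrices, one toy "expert" each
    gate_weights   : (d, n_experts) router matrix
    k              : number of experts activated per token
    """
    logits = x @ gate_weights                      # router scores, (n_experts,)
    top = np.argsort(logits)[-k:]                  # indices of the k best experts
    probs = np.exp(logits[top] - logits[top].max())
    probs /= probs.sum()                           # softmax over selected experts
    # Only k experts run; the rest stay idle for this token.
    return sum(p * (x @ expert_weights[i]) for p, i in zip(probs, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 4
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
gate = rng.standard_normal((d, n_experts))
y = top_k_moe(rng.standard_normal(d), experts, gate, k=2)
print(y.shape)  # (8,)
```

The sparsity is the point: per-token compute scales with k, not with the total number of experts, which is how MoE models reach very large parameter counts at manageable inference cost.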

At the moment, the model weights have not been published on GitHub or HuggingFace; only API access and Qwen Chat are available. The absence of open weights likely indicates either a rushed unveiling or a deliberate move by the company to drive adoption of its cloud platform.
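Since the model is currently reachable only through the API, a request can be sketched as follows. This is a minimal sketch assuming an OpenAI-compatible chat endpoint; the base URL and the model name "qwen-max" are assumptions and should be checked against Alibaba Cloud's documentation:

```python
import json

# Hypothetical base URL for an OpenAI-compatible endpoint; verify against
# the provider's docs before use.
BASE_URL = "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"

def build_chat_request(prompt, model="qwen-max", temperature=0.7):
    """Assemble the JSON body for a POST to {BASE_URL}/chat/completions."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

payload = build_chat_request("Summarize the Qwen2.5-Max release.")
print(json.dumps(payload, indent=2))
```

Sending the payload additionally requires an API key from the cloud console, passed as a bearer token in the `Authorization` header.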

Qwen has published benchmark results for the new model. According to the openly available comparison table against LLaMA 3.1 and DeepSeek-V3, the Max version outperforms these competitors on most metrics. When compared to Claude Sonnet and GPT, however, the Max version falls behind GPT.

The company has invested a significant budget in training data, and while the model does outperform its competitors, the margin is relatively small. This has led some experts to theorize that further capability gains may come from scaling compute at test time rather than from training data alone.
