DeepSeek V3: record results in benchmarks

13:47, 26.03.2025

The new DeepSeek model appeared on HuggingFace without an announcement. A detailed announcement with a description followed only a day later.

Programming and Math Benchmarks

DeepSeek-V3-0324 posts record results, scoring significantly higher than the original DeepSeek-V3 on each of the following benchmarks:

  • AIME: 59.4 (up from 39.6)
  • MMLU-Pro: 81.2 (up from 75.9)
  • LiveCodeBench: 49.2 (up from 39.2)
  • GPQA: 68.4 (up from 59.1)

V3-0324 also scores better than Claude 3.5 on most of these benchmarks.

DeepSeek noted that the new model also outperforms Claude 3.7. After this announcement, rumors circulated that the new model may have been trained on output from Claude 3.7. These rumors have been neither confirmed nor denied so far.

Model Updates

The main updates concern code generation, including changes to how the model builds web pages and game interfaces. The quality of function calling has also been improved.

The new model also handles web search results and file reading well. In addition, it has been tested and runs fine on the Mac Studio.
