Ai2’s New AI Model Outperforms DeepSeek’s Top Ranker

Seattle-based AI research institute Ai2 recently announced that its newly launched model, Tulu3-405B, surpasses DeepSeek's V3 model. Hailed as a significant breakthrough for U.S. open AI, the model even outperforms OpenAI's GPT-4o on some benchmarks, according to Ai2's own evaluation.

Unlike many of its competitors, Tulu3-405B is open source: its core components are freely available under a permissive license. Ai2 positions this as evidence of the U.S.'s potential to lead the development of top-tier generative AI models globally.

The Tulu3-405B model is large by any standard, comprising 405 billion parameters, and required 256 GPUs running in parallel to train. A key contributor to its performance is 'reinforcement learning with verifiable rewards' (RLVR), a technique that trains on tasks whose outcomes can be objectively checked, such as mathematical problem solving and instruction following.
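To make the idea concrete, here is a minimal sketch of what a "verifiable reward" might look like for a math task: instead of scoring an answer with a learned reward model, the reward is 1 only when the model's final answer exactly matches a known-correct one. The function name and answer-extraction heuristic are illustrative assumptions, not Ai2's actual implementation.

```python
import re

def verifiable_reward(model_output: str, correct_answer: str) -> float:
    """Illustrative check: return 1.0 if the last number in the model's
    output matches the known-correct answer, else 0.0.
    (Hypothetical helper -- not Ai2's actual RLVR code.)"""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", model_output)
    if not numbers:
        return 0.0
    return 1.0 if numbers[-1] == correct_answer else 0.0

# This binary signal would then be fed to an RL algorithm in place of
# a learned reward model's scalar score.
```

The appeal of this setup is that the reward cannot be "gamed" the way a learned reward model can: the answer is either right or it is not.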

On PopQA, a benchmark of specialized knowledge questions drawn from Wikipedia, Ai2's model outperformed not only DeepSeek V3 and GPT-4o but also Meta's Llama 3.1 405B. It also achieved the top score among its peers on GSM8K, a benchmark of grade-school math word problems.

The Tulu3-405B model can be tried out via Ai2's chatbot web app, and the code used to train it is available on GitHub and on the AI dev platform Hugging Face.

Original source: Read the full article on TechCrunch