Crypto M - Crypto News

🚀 Alibaba's Qwen Team Launches Advanced Language Model for Coding

Alibaba's Qwen team has released Qwen3-Coder-Next, an open-source language model designed for coding agents and local development. According to Foresight News, the model scales up agent training with 800,000 verifiable tasks and executable environments. With 80 billion total parameters and 3 billion active parameters, it posted strong results on the SWE-bench Pro benchmark. Qwen3-Coder-Next works with tools and scenarios including OpenClaw, Qwen Code, Claude Code, Cline, web development, and browser use.

#Alibaba #QwenTeam #LanguageModel #Coding #OpenSource #IntelligentAgent #SWEBenchPro #Qwen3CoderNext #TechInnovation #AI #MachineLearning

🚀 OpenAI Identifies Flaws in SWE-bench Coding Benchmark

OpenAI has reported significant problems with SWE-bench Verified, a coding benchmark widely used to evaluate AI models. According to NS3.AI, task contamination and training data leakage undermine the benchmark's reliability, because models can memorize solutions instead of genuinely solving the tasks. OpenAI recommends moving to the more robust SWE-bench Pro and is developing new private evaluations to better assess AI coding capabilities.

#OpenAI #SWEbench #codingbenchmark #AImodels #taskcontamination #trainingdataleakage #SWEbenchPro #AIcoding

🚀 AI TRENDS | Zhipu AI Increases Price of New GLM-5.1 Model by 10%

Zhipu AI (02513.HK) has released its new flagship model, GLM-5.1, along with a 10% price increase. According to Jin10, it is the only open-source model capable of working continuously for eight hours. On SWE-bench Pro, a benchmark that closely simulates real software development, GLM-5.1 became the first domestic (Chinese) model to surpass Opus 4.6. OpenRouter data indicates that this is not the first time Zhipu AI has raised the price of its GLM models by 10%. After the adjustment, the cache-hit token price for GLM-5.1 in the Coding scenario is comparable to that of Anthropic's Claude Sonnet 4.6. This marks the first time a domestic large model has reached price parity with leading overseas vendors in a core scenario.

#AI #ZhipuAI #GLM51 #PriceIncrease #OpenSourceModel #BenchmarkTest #SWEbenchPro #Opus46 #CodingScenario #ClaudeSonnet46 #Anthropic #DomesticModel #TechTrends
