Matching R1 reasoning, yet 20x smaller
QwQ-32B, from Alibaba's Qwen team, is a new open-source 32B-parameter LLM that reaches DeepSeek-R1-level reasoning through scaled reinforcement learning. It features a "thinking mode" for complex tasks.