Capability
Sparse Mixture-of-Experts Instruction Following
12 artifacts provide this capability.
Top Matches
via “sparse mixture-of-experts architecture with 37B active parameters”
Open-source reasoning model with benchmark performance comparable to OpenAI o1.
Unique: Uses a sparse MoE with 37B active parameters out of 671B total, reducing per-token compute compared to dense models while maintaining frontier reasoning capability. Specific routing and load-balancing mechanisms are proprietary/undocumented.
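The sparse-activation idea can be illustrated with a minimal top-k routing layer. The sketch below is a generic PyTorch example of the general MoE mechanism, not this model's implementation: SparseMoELayer, its dimensions, expert count, and top_k are illustrative assumptions, since the actual routing and load-balancing schemes are undocumented.

```python
# Minimal sketch of top-k expert routing in a sparse MoE layer (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_experts=16, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # gating network scores each expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (tokens, d_model)
        logits = self.router(x)                              # (tokens, n_experts)
        weights, idx = torch.topk(logits, self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)                 # normalize over the chosen experts
        out = torch.zeros_like(x)
        # Only the selected experts run for each token, so per-token compute
        # scales with top_k rather than the total number of experts.
        for slot in range(self.top_k):
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * self.experts[e](x[mask])
        return out

tokens = torch.randn(8, 512)
print(SparseMoELayer()(tokens).shape)  # torch.Size([8, 512])
```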
vs others: More efficient than dense models of equivalent capability (e.g., 70B dense) due to sparse activation, but exact latency/throughput improvements are undocumented.
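A back-of-envelope comparison, assuming the common approximation that forward-pass compute is roughly 2 × (active parameters) per token; the 37B and 70B figures come from the description above, and real-world savings depend on routing overhead, memory traffic, and batching, which are undocumented.

```python
# Rough per-token compute comparison under the ~2 x active-parameters approximation.
active_moe = 37e9    # active parameters per token (sparse MoE)
dense = 70e9         # parameters of the dense comparison model cited above
flops_moe = 2 * active_moe
flops_dense = 2 * dense
print(f"approx. per-token compute ratio: {flops_dense / flops_moe:.1f}x")  # ~1.9x
```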