Why is the API for GLM-5.1 more expensive than GLM-5 when the model size is the same?
Hi team and community,
I noticed that the API pricing for GLM-5.1 is higher than GLM-5 on the Z.ai platform:
GLM-5.1: Input $1.4 / Output $4.4
GLM-5: Input $1 / Output $3.2
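For reference, here is the relative increase implied by those listed prices (a quick sketch; I'm assuming both are quoted in the same per-token unit, e.g. USD per million tokens, which the pricing page doesn't restate here):

```python
# Listed Z.ai prices for the two models (same unit for both, e.g. USD / 1M tokens).
glm5 = {"input": 1.0, "output": 3.2}
glm51 = {"input": 1.4, "output": 4.4}

# Percentage increase of GLM-5.1 over GLM-5, per token type.
for kind in ("input", "output"):
    pct = (glm51[kind] / glm5[kind] - 1) * 100
    print(f"{kind}: +{pct:.1f}%")  # input: +40.0%, output: +37.5%
```

So both directions went up by roughly the same ~40%, which looks more like a uniform markup than a cost-driven change.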
As far as I know, both models share the same architecture and parameter size (744B total, 40B active MoE).
So my question is: Why the price increase?
Is the inference efficiency worse due to defaults like Thinking Mode or agentic optimizations? Or is it purely a business decision (value-based pricing) because GLM-5.1 is highly optimized and much smarter via post-training?
What puzzles me most is that since GLM-5 and GLM-5.1 share the same architecture and parameter size, the inference cost (hardware requirement) should be identical. In an open-source ecosystem, anyone hosting the model would simply replace 5 with 5.1 at zero additional operational cost.
Therefore, choosing GLM-5 over GLM-5.1 just because it's 'cheaper' seems fundamentally irrational from a purely technical standpoint. Is this API pricing strictly a business strategy (value-based pricing to recover R&D costs), or is there an invisible technical overhead in 5.1 that I'm missing?
I'd love to hear the technical or strategic reasons behind this. Thanks!
Yes, I also wonder why GLM-5 shares the same core DSA technology as DeepSeek-V3.2 and has a comparable size (744B-A40B vs 671B-A37B), yet costs several times as much. It might be purely down to commercial considerations (as you can notice, almost all providers on OpenRouter match their prices to the official ones).
I suspect there might (not sure) be two reasons for this:
1) Chinese computation is much cheaper (due to abundant energy and subsidies), even though American chips are better. So American servers (like those on OpenRouter) easily get undercut by Chinese computation.
2) Data war: using point (1) as leverage, Chinese companies are aggressively selling their own API/Openclaw services (even at a loss); that's one of the reasons some Chinese models are going proprietary (like the glm-turbo series). So if you don't want to pay a premium, grab their CODING plan 🤓.