MSPs Are About to Run Into an AI Margin Problem
Not because AI is too expensive. Because AI cost is too hard to see.
Rich Freeman captured this well in a recent piece titled "AI Model Spending is a Black Hole." It stuck with us because we are seeing the same pattern from the product side while building GenticFlow.
AI looks clean in a demo. A user asks a question, the assistant answers, a ticket gets summarized, a technician gets a recommendation, everyone nods. Real IT support does not work like that. It is messy, uneven, and full of context that does not show up in a slide.
A user does not say "the Print Spooler service is stopped and there are stale .shd files in the queue." They say "the printer is not working." Now the system has to ask questions, find the device, check whether the spooler is running, inspect its startup type, look at the spool directory for orphan jobs, check the queue state, decide if this is a known pattern, work out whether the fix is safe to run, run it, verify the service came back up and the queue is clean, explain it to the user, and document what happened.
That is where token cost stops being a line item and becomes a support motion. End-user chat has one cost profile. Technician assistance has another. Device investigation has another. Workflow execution has another. Resolution documentation has another. Five different surfaces, five different cost shapes, all happening at the same time across customers with very different environments.
For MSPs that is the squeeze. Customers want predictable pricing. Upstream model costs scale with usage. Support workload does not arrive evenly. One customer may sit in long chat conversations that never become tickets. Another may trigger deep investigations all day because their environment is noisy. A third may have a handful of expensive investigations that save hours of technician time. If all of that disappears into a single monthly bill of mystery credits, you do not have a service model. You have a gamble.
This is why "cost per prompt" is the wrong question.
The better one is cost per useful support outcome.
Did the AI resolve something? Collect context? Prevent an escalation? Save technician time? Did it use the right model for the job? That is the level MSPs will need to operate at, because the wrong AI economics will quietly destroy the margin on a good support service.
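To make the difference concrete, here is a minimal sketch of the two metrics side by side. The Interaction type, the field names, and the dollar figures are all hypothetical, made up for illustration. The only point is that the same spend looks very different once you divide by useful outcomes instead of by prompts.

```python
from dataclasses import dataclass


@dataclass
class Interaction:
    cost_usd: float  # model spend attributed to this interaction (made-up figures below)
    useful: bool     # did it resolve, collect context, or prevent an escalation?


def cost_per_prompt(interactions):
    """Total spend divided by the number of interactions."""
    return sum(i.cost_usd for i in interactions) / len(interactions)


def cost_per_useful_outcome(interactions):
    """Total spend divided by the number of useful outcomes (None if there were none)."""
    total = sum(i.cost_usd for i in interactions)
    useful = sum(1 for i in interactions if i.useful)
    return total / useful if useful else None


# Illustrative numbers only, not measurements.
sample = [
    Interaction(0.02, True),
    Interaction(0.02, False),
    Interaction(0.10, True),
    Interaction(0.06, False),
]

print(round(cost_per_prompt(sample), 2))
print(round(cost_per_useful_outcome(sample), 2))
```

In this made-up sample, half of the interactions produced nothing useful, so the cost per useful outcome is double the cost per prompt. What counts as "useful" is a business decision, not something the code can settle.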
We have learned a few things directly from building this.
The expensive model is not always the right model. The cheapest model is not always cheap either. Weaker tiers fail tool calls, misunderstand context, retry, or produce output a technician cannot trust, and the "cheap" call has now become expensive. We default to mini-tier models for the tool-heavy work that dominates real support. They sit in the right place on the cost and accuracy curve. Flagship models stay reserved for the genuinely hard reasoning where they earn their price.
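A rough way to see why the cheapest tier is not always cheap: if a call is retried until it produces a result a technician can trust, the expected spend is the per-attempt cost divided by the success rate. The sketch below assumes independent attempts and retry-until-success, which is a simplification, and the prices and success rates are invented for illustration rather than taken from any real model.

```python
def expected_cost_per_trusted_result(cost_per_attempt, success_rate):
    """Expected spend to get one trusted result, assuming independent
    attempts that are retried until one succeeds (a simplifying assumption)."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_attempt / success_rate


# Hypothetical tiers: figures are illustrative, not benchmarks.
weak_tier = expected_cost_per_trusted_result(cost_per_attempt=0.004, success_rate=0.20)
mini_tier = expected_cost_per_trusted_result(cost_per_attempt=0.010, success_rate=0.90)

print(f"weak tier: {weak_tier:.4f} per trusted result")
print(f"mini tier: {mini_tier:.4f} per trusted result")
```

With these made-up figures the tier that costs less per attempt ends up costing more per trusted result. The sketch also ignores the technician time lost to a bad answer, which in practice is often the larger cost.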
The bigger lesson is about routing. The cheapest token is the one you never spend. A lot of what looks like AI work is actually deterministic. Restarting the Print Spooler. Clearing stale jobs out of the queue. Pulling a standard field from a device. Running an approved script. Collecting a standard log bundle. None of that needs a model. It needs policy, control, verification, and evidence. AI should be reserved for where it actually adds judgment. Understanding messy user language. Deciding what to check. Interpreting evidence. Choosing the next step. Writing the explanation. Documenting the result. Everything else should be a native action, executed by the platform, with no tokens spent at all. That is not just an architectural preference. It is how a fixed-fee support agreement survives a consumption-based AI bill.
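The routing idea can be sketched in a few lines. This is not GenticFlow's implementation. The step names, the NATIVE_STEPS set, and the route function are hypothetical, chosen to mirror the printer example. The assumption is that every step in a workflow is labeled, and only steps outside the approved deterministic set ever reach a model.

```python
from enum import Enum


class Route(Enum):
    NATIVE = "native"  # executed by the platform, no tokens spent
    MODEL = "model"    # needs judgment, so a model call is justified


# Hypothetical catalogue of approved deterministic actions.
NATIVE_STEPS = frozenset({
    "restart_print_spooler",
    "clear_stale_print_jobs",
    "read_device_field",
    "run_approved_script",
    "collect_log_bundle",
})


def route(step: str) -> Route:
    """Send known deterministic steps to native execution, everything else to a model."""
    return Route.NATIVE if step in NATIVE_STEPS else Route.MODEL


# A made-up plan for "the printer is not working".
printer_ticket = [
    "understand_user_message",
    "decide_what_to_check",
    "read_device_field",
    "interpret_evidence",
    "clear_stale_print_jobs",
    "restart_print_spooler",
    "write_explanation",
]

for step in printer_ticket:
    print(f"{step}: {route(step).value}")
```

In this sketch three of the seven steps never touch a model. A real system would also need the policy, approval, and verification checks around each native action, which are left out here.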
That distinction is going to matter a lot. In modern IT support, AI is no longer just answering questions. It is becoming part of the workflow between the user asking for help and the technician resolving the issue. That workflow touches chat, tickets, devices, commands, policies, approvals, documentation, and customer reporting. Every part of it has a cost. And every part of it needs attribution.
Not one monthly number. Six dimensions.
Without that, an MSP cannot answer basic commercial questions. Which customers are profitable? Which workflows are too expensive? Which model choices are wasteful? Which AI interactions actually save technician time, and which ones just look impressive in a demo?
This is why AI cost visibility has to move from finance report to product feature. It cannot be something you discover at the end of the month. It has to be visible while the service is being delivered. In GenticFlow, every investigation surfaces the model used, the token count, the customer it belongs to, and the outcome it produced, before the technician closes the ticket. That turns AI from a black box into a line item the MSP can actually defend.
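As a rough illustration of what per-investigation attribution could look like as data, here is a minimal sketch built on the four fields named above. The InvestigationRecord type, the field names, and the sample values are hypothetical and do not describe GenticFlow's actual schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class InvestigationRecord:
    customer: str  # who the spend belongs to
    model: str     # which model tier handled the investigation
    tokens: int    # tokens consumed
    outcome: str   # e.g. "resolved", "escalated", "context_collected"


def tokens_by_customer(records):
    """Roll token usage up per customer so it can be compared against the fee."""
    totals = defaultdict(int)
    for record in records:
        totals[record.customer] += record.tokens
    return dict(totals)


# Made-up records for illustration only.
records = [
    InvestigationRecord("customer-a", "mini", 1200, "resolved"),
    InvestigationRecord("customer-a", "mini", 800, "context_collected"),
    InvestigationRecord("customer-b", "flagship", 5000, "escalated"),
]

print(tokens_by_customer(records))
```

The same records can be grouped by model or by outcome instead of by customer. The useful property is that the record exists at the moment the ticket closes, not at the end of the month.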
Not magic. Not mystery credits. Not "trust us, it is included." Manageable.
AI will change IT support. The winners in the channel will not be the vendors with the best demo. They will be the ones that make AI useful, governed, measurable, and commercially defensible. Otherwise MSPs solve one problem and create another. The support problem gets better, the margin problem gets worse.