May 27, 2026 · 5 min read

MSPs Are About to Run Into an AI Margin Problem

AI may not be too expensive to use in IT support. The real problem is that the cost is too hard to see, too hard to attribute, and too easy to bury inside fixed-fee service agreements.

Rich Freeman put his finger on part of this in "AI Model Spending is a Black Hole". It stuck with us because we are seeing the same pattern from the product side while building GenticFlow: AI costs look manageable in a demo, but they become much harder to understand once AI is involved in real support workflows.

A demo usually shows a clean interaction. A user asks a question, the assistant responds, a ticket gets summarized, a technician gets a recommendation, and the cost looks trivial. Real IT support does not work that way. It is messy, uneven, and full of context that does not show up in a slide.

A user does not say "the Print Spooler service is stopped and there are stale .shd files in the queue." They say "the printer is not working." From there, the system has to ask the right questions, identify the affected device, check whether the Spooler is running, inspect its startup type, look for orphan jobs, check the queue state, decide whether the pattern is known, determine whether the fix is safe to run, execute it under policy, verify the service came back up, explain the result to the user, and document what happened.

That is where token cost stops being a line item and becomes a support motion. End-user chat has one cost profile. Technician assistance has another. Device investigation has another. Workflow execution has another. Resolution documentation has another. Five different surfaces, five different cost shapes, all happening at the same time across customers with very different environments.

For MSPs, that is the squeeze. Customers want predictable pricing, but upstream model costs scale with usage and MSP contracts often do not. One customer may sit in long chat conversations that never become tickets. Another may trigger deep investigations all day because their environment is noisy. A third may have a handful of expensive investigations that save hours of technician time. If all of that disappears into a single monthly bill of mystery credits, you do not have a service model. You have a gamble.
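To make the squeeze concrete, here is a minimal sketch using invented numbers. It assumes three hypothetical customers on the same fixed monthly fee, each with a different usage-driven AI cost. The `CustomerMonth` type and every figure in it are illustrative assumptions, not GenticFlow data.

```python
from dataclasses import dataclass


@dataclass
class CustomerMonth:
    name: str
    fixed_fee: float   # monthly contract price (hypothetical)
    labor_cost: float  # technician cost to serve the customer (hypothetical)
    ai_cost: float     # usage-based model spend attributed to the customer (hypothetical)

    @property
    def margin(self) -> float:
        # What is left of the fixed fee after labor and AI spend.
        return self.fixed_fee - self.labor_cost - self.ai_cost

    @property
    def margin_pct(self) -> float:
        return self.margin / self.fixed_fee


customers = [
    CustomerMonth("long chats, few tickets", 2000.0, 1200.0, 150.0),
    CustomerMonth("noisy environment", 2000.0, 1200.0, 700.0),
    CustomerMonth("few deep investigations", 2000.0, 800.0, 300.0),
]

for c in customers:
    print(f"{c.name:<26} margin ${c.margin:>7.2f} ({c.margin_pct:.1%})")
```

With these made-up figures, the same fee produces margins of 32.5%, 5%, and 45%. None of that spread is visible if the AI spend arrives as one undifferentiated monthly bill.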

The useful question is not cost per prompt. It is cost per useful support outcome.

Did the AI resolve something? Collect context? Prevent an escalation? Save technician time? Did it use the right model for the job? That is the level MSPs will need to operate at, because the wrong AI economics will quietly destroy the margin on a good support service. That has become obvious to us while building GenticFlow.
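The difference between the two metrics can be sketched in a few lines. The outcome labels, the `Interaction` type, and the dollar amounts below are assumptions chosen for illustration. What counts as "useful" is a business decision each MSP would make for itself.

```python
from dataclasses import dataclass

# Outcomes treated as "useful" in this sketch (an assumption, not a standard).
USEFUL_OUTCOMES = {"resolved", "context_collected", "escalation_prevented"}


@dataclass
class Interaction:
    prompts: int      # number of model calls in the interaction
    cost_usd: float   # total model spend for the interaction
    outcome: str      # e.g. "resolved", "abandoned"


def cost_per_prompt(items):
    return sum(i.cost_usd for i in items) / sum(i.prompts for i in items)


def cost_per_useful_outcome(items):
    useful = sum(1 for i in items if i.outcome in USEFUL_OUTCOMES)
    total = sum(i.cost_usd for i in items)
    # No useful outcomes means the spend bought nothing measurable.
    return total / useful if useful else float("inf")


interactions = [
    Interaction(prompts=4, cost_usd=0.08, outcome="resolved"),
    Interaction(prompts=20, cost_usd=0.40, outcome="abandoned"),
    Interaction(prompts=6, cost_usd=0.12, outcome="escalation_prevented"),
]

print(f"cost per prompt:         ${cost_per_prompt(interactions):.2f}")
print(f"cost per useful outcome: ${cost_per_useful_outcome(interactions):.2f}")
```

In this invented sample every prompt costs about two cents, which looks trivial. The long abandoned chat is what pushes the cost per useful outcome to roughly thirty cents, and only the second metric shows it.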

The expensive model is not always the right model. The cheapest model is not always cheap either. When a weaker tier fails tool calls, misunderstands context, retries, or produces output a technician cannot trust, the "cheap" call has become expensive. We default to smaller models for the tool-heavy work that dominates real support. They sit in the right place on the cost and accuracy curve. Flagship models stay reserved for the genuinely hard reasoning where they earn their price.
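One way to reason about this is expected cost per completed task rather than price per call. The sketch below assumes a simple model: each attempt succeeds independently with a fixed probability, the system retries up to a cap, and a task that exhausts its retries falls back to a technician at a fixed cost. The tier names, prices, success rates, and fallback cost are all hypothetical.

```python
def expected_task_cost(cost_per_attempt, success_rate, max_attempts, fallback_cost):
    """Expected spend per task when failed attempts are retried up to a cap.

    Attempt k (0-based) only happens if the previous k attempts all failed,
    so it is paid for with probability (1 - success_rate) ** k.
    """
    fail = 1.0 - success_rate
    expected_attempts = sum(fail ** k for k in range(max_attempts))
    p_all_fail = fail ** max_attempts
    return cost_per_attempt * expected_attempts + p_all_fail * fallback_cost


# Hypothetical tiers: (cost per attempt in USD, tool-call success rate).
tiers = {
    "weak": (0.002, 0.60),
    "small": (0.010, 0.95),
    "flagship": (0.100, 0.98),
}

TECHNICIAN_FALLBACK_USD = 5.00  # assumed cost of a human picking up the task
MAX_ATTEMPTS = 3

costs = {
    name: expected_task_cost(price, rate, MAX_ATTEMPTS, TECHNICIAN_FALLBACK_USD)
    for name, (price, rate) in tiers.items()
}

for name, cost in sorted(costs.items(), key=lambda kv: kv[1]):
    print(f"{name:<9} expected cost per task: ${cost:.4f}")
```

Under these assumptions the weakest tier has the lowest price per call and the highest expected cost per task, because its failures keep landing on a technician. The ordering depends entirely on the assumed success rates and fallback cost, so real numbers would need to be measured per workflow.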

The bigger lesson is about routing. The cheapest token is the one you never spend. A lot of what looks like AI work is actually deterministic: restarting the Print Spooler, clearing stale jobs out of the queue, pulling a standard field from a device, running an approved script, or collecting a standard log bundle. None of that needs a model. It needs policy, control, verification, and evidence.

AI should be reserved for the parts of the workflow where judgment is useful: understanding messy user language, deciding what to check, interpreting evidence, choosing the next step, writing the explanation, and documenting the result. Everything else should be a native platform action, executed with no tokens spent at all. That is not just an architectural preference. It is how a fixed-fee support agreement survives a consumption-based AI bill.
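The routing idea can be sketched as a lookup that runs before any model call. Everything here is hypothetical: the action names, the `NATIVE_ACTIONS` registry, and the `call_model` stub stand in for whatever a real platform provides, and the policy checks and verification a production action would need are reduced to comments.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class StepResult:
    handled_by: str    # "native" or "model"
    tokens_spent: int
    detail: str


def restart_print_spooler(device: str) -> str:
    # Placeholder: a real action would run an approved script under policy,
    # verify the service came back up, and record evidence.
    return f"restarted Print Spooler on {device}"


def collect_log_bundle(device: str) -> str:
    # Placeholder for a standard, deterministic log collection.
    return f"collected standard log bundle from {device}"


# Hypothetical registry of approved deterministic actions.
NATIVE_ACTIONS: Dict[str, Callable[[str], str]] = {
    "restart_print_spooler": restart_print_spooler,
    "collect_log_bundle": collect_log_bundle,
}


def call_model(task: str, device: str) -> Tuple[str, int]:
    # Stub standing in for a real model call; the token count is invented.
    return f"model handled '{task}' for {device}", 500


def route(task: str, device: str) -> StepResult:
    action = NATIVE_ACTIONS.get(task)
    if action is not None:
        # Deterministic work: no model involved, no tokens spent.
        return StepResult("native", 0, action(device))
    text, tokens = call_model(task, device)
    return StepResult("model", tokens, text)


for task in ("restart_print_spooler", "interpret_user_message"):
    result = route(task, "PC-042")
    print(f"{task:<24} -> {result.handled_by:<6} tokens={result.tokens_spent}")
```

The design choice is that the model decides what to do, while known-safe actions are executed by the platform. In this sketch only the task that needs judgment reaches `call_model`.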

That is why MSPs need to look at AI cost in two ways: the surface where the cost is created, and the business outcome it produced. End-user chat, technician assistance, device investigation, workflow execution, and resolution documentation all have different cost profiles. But the reporting also needs to cut across customer, workflow, conversation type, investigation, model, and outcome. Without that, an MSP cannot answer basic commercial questions. Which customers are profitable? Which workflows are too expensive? Which model choices are wasteful? Which AI interactions actually save technician time, and which ones just look impressive in a demo?
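Reporting that cuts across several dimensions amounts to tagging every AI interaction with those dimensions and grouping on whichever ones a question needs. The sketch below is an assumption about what such a record might contain. The field names and values are invented and do not describe GenticFlow's actual schema.

```python
from collections import defaultdict

# Hypothetical usage records, one per AI interaction.
records = [
    {"customer": "acme", "surface": "device_investigation", "workflow": "printer_fix",
     "model": "small", "cost_usd": 0.03, "minutes_saved": 12, "outcome": "resolved"},
    {"customer": "acme", "surface": "technician_assist", "workflow": "vpn_triage",
     "model": "flagship", "cost_usd": 0.90, "minutes_saved": 0, "outcome": "escalated"},
    {"customer": "globex", "surface": "device_investigation", "workflow": "printer_fix",
     "model": "small", "cost_usd": 0.04, "minutes_saved": 10, "outcome": "resolved"},
]


def rollup(rows, *keys):
    """Total cost, technician minutes saved, and count, grouped by the given keys."""
    totals = defaultdict(lambda: {"cost_usd": 0.0, "minutes_saved": 0, "count": 0})
    for row in rows:
        group = tuple(row[k] for k in keys)
        bucket = totals[group]
        bucket["cost_usd"] += row["cost_usd"]
        bucket["minutes_saved"] += row["minutes_saved"]
        bucket["count"] += 1
    return dict(totals)


by_customer = rollup(records, "customer")
by_workflow_model = rollup(records, "workflow", "model")

for group, t in by_workflow_model.items():
    print(group, f"cost=${t['cost_usd']:.2f}", f"minutes_saved={t['minutes_saved']}")
```

The same records answer different commercial questions depending on the keys passed in: `rollup(records, "customer")` speaks to profitability per customer, while `rollup(records, "workflow", "model")` shows where a model choice is spending money without saving technician time.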

This is why AI cost visibility has to move from finance report to product feature. It cannot be something you discover at the end of the month. It has to be visible while the service is being delivered. In GenticFlow, every investigation surfaces the customer, workflow, model used, token usage, and outcome before the technician closes the ticket. That gives the MSP cost attribution at the point of work, not after the margin has already disappeared.

AI will change IT support, but the vendors that matter will not just be the ones with the cleanest demo. They will be the ones that make AI useful, controlled, measurable, and commercially safe for MSPs. A fixed-fee support agreement with invisible consumption-based AI costs is a margin trap. If vendors do not solve that, MSPs may fix the support problem and create a worse one: a margin problem.
