Every few weeks another vendor announces that they have added AI to their product. A chatbot here, a summarisation feature there, a copilot bolted onto an existing workflow. The pace is real, and so is the pressure. If you run technology for a hospital, a government agency, or a bank, you are probably being asked when your software will do the same.
Here is the thing most of that noise misses. Putting a large language model into enterprise software is no longer the difficult part. The models are good, the APIs are mature, and the tooling on platforms like Azure has caught up fast. The genuinely hard work starts after the demo, when you have to make the thing safe, accountable, and able to survive contact with a regulator.
We have been here before with language
There is a useful way to think about this moment. The way we talk to machines has kept moving up the ladder of abstraction. We started with machine code, raw numbers. Then assembly gave those numbers names. C and C++ let us think in structures instead of registers. Scripting languages like Python let us skip whole categories of plumbing. Each step traded precise control for speed and reach.
Natural language is the next rung. You can now describe what you want in plain English and have software help build it. That is a real shift, and it deserves the attention it gets. But every previous rung came with a catch, and this one is no different. The higher you climb, the further you sit from what is actually happening underneath. With AI, that gap is wider than anything we have dealt with before, because the system can be confidently wrong in ways that look exactly like being right.
That is why the integration story matters more than the model story. A model that gives a plausible answer is easy. A model that gives an answer you can trust, trace, and defend to an auditor is a different engineering problem entirely.
Scalable automation cuts both ways
The promise everyone is chasing is scale. Automate the repetitive work, free up your people, handle ten times the volume without ten times the headcount. All of that is achievable, and we have seen it work. A claims process that took days can take minutes. A backlog of support tickets can shrink overnight.
But automation scales your mistakes just as fast as your wins. If a human gets a decision wrong, it affects one case. If an automated system gets it wrong because the prompt was loose or the data was stale, it can affect thousands before anyone notices. In a regulated setting, that is not a bug report. That is a breach, a complaint, or a headline.
So the question we ask clients is not "can we automate this?" It is "what happens when this is wrong, and how quickly will you know?" The teams who get value from AI are the ones who build the boring guardrails first. Logging. Human review on the cases that matter. Clear boundaries on what the model is allowed to touch. Data that never leaves the country it is supposed to stay in.
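To make those guardrails concrete, here is a minimal sketch in Python of how three of them might fit together: an allow-list of actions, a rule that sends risky cases to a person, and a log line for every decision. The names (ModelDecision, ALLOWED_ACTIONS, REVIEW_THRESHOLD, route) and the threshold value are our own illustrative assumptions, not part of any particular product or framework.

```python
import json
import logging
from dataclasses import dataclass, asdict

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("ai_guardrails")

# Hypothetical policy values, chosen only for illustration.
ALLOWED_ACTIONS = {"draft_reply", "summarise_claim"}
REVIEW_THRESHOLD = 0.85  # below this confidence, a person decides


@dataclass
class ModelDecision:
    case_id: str
    action: str
    confidence: float
    high_impact: bool


def route(decision: ModelDecision) -> str:
    """Return 'blocked', 'human_review', or 'auto', and log the outcome."""
    if decision.action not in ALLOWED_ACTIONS:
        # Boundary: the model asked to do something it may not touch.
        outcome = "blocked"
    elif decision.high_impact or decision.confidence < REVIEW_THRESHOLD:
        # Human review on the cases that matter.
        outcome = "human_review"
    else:
        outcome = "auto"
    # Logging: one structured record per decision, so errors surface quickly.
    log.info(json.dumps({**asdict(decision), "outcome": outcome}))
    return outcome
```

In a real system the thresholds and the allow-list would come from policy rather than constants, but the shape is the point: every decision passes through the same small, auditable gate.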
Governance is the feature, not the friction
There is a tendency to treat compliance as the thing that slows AI down. In Australian government, healthcare, and financial services, frameworks and standards such as IRAP, the Essential Eight, and ISO 27001 are sometimes seen as obstacles to ship around. We see it the other way. Together they make a decent checklist for building systems that do not embarrass you later.
When you integrate a language model into a real workflow, you have to answer questions a good compliance regime was already asking. Where does the data live? Who can see it? How do you prove a decision was made fairly? Can you turn it off cleanly? AI does not change those questions. It just makes them more urgent, because the system now acts faster and with more autonomy than the software it replaced.
This is also where data sovereignty stops being a slogan and starts being a design decision. If your customer records or patient data are flowing through a model, you should know exactly which region that model runs in. For a lot of our clients, hosting in Azure Australia East is not a preference. It is a requirement, and it shapes the architecture from day one.
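One way to turn that requirement into something a build pipeline can enforce is a region check that fails loudly before any data flows. The sketch below is illustrative only: the Deployment type and check_region function are hypothetical names of ours, and the allow-list assumes the programmatic name "australiaeast" for the Australia East region.

```python
from dataclasses import dataclass

# Assumption: the only approved region is Australia East.
ALLOWED_REGIONS = frozenset({"australiaeast"})


@dataclass(frozen=True)
class Deployment:
    name: str
    region: str


def check_region(deployment: Deployment,
                 allowed: frozenset = ALLOWED_REGIONS) -> str:
    """Return the normalised region, or raise if it is not approved."""
    # Normalise "Australia East" and "australiaeast" to the same form.
    region = deployment.region.strip().lower().replace(" ", "")
    if region not in allowed:
        raise ValueError(
            f"{deployment.name} runs in '{deployment.region}', "
            f"which is outside the approved regions: {sorted(allowed)}"
        )
    return region
```

Run as part of deployment or start-up, a check like this makes the residency rule a property of the system rather than a line in a policy document.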
What we actually recommend
Start with one workflow where the value is obvious and the risk is contained. Build the AI in with the controls already wired through it, not added afterward. Treat the model as one component in a larger system rather than the system itself. Measure not just how much faster you go, but how often you have to step in and correct it.
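Measuring both sides of that trade can be very simple. As a rough sketch, assuming each handled case records the time saved and whether a person had to correct the result, the two numbers fall out of a few lines of Python. The Outcome record and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Outcome:
    case_id: str
    seconds_saved: float  # time saved against the manual process
    corrected: bool       # True if a person had to step in and fix it


def summarise(outcomes: Iterable[Outcome]) -> dict:
    """Report speed and correction rate side by side."""
    items = list(outcomes)
    if not items:
        return {"cases": 0, "correction_rate": 0.0, "avg_seconds_saved": 0.0}
    corrected = sum(1 for o in items if o.corrected)
    return {
        "cases": len(items),
        "correction_rate": corrected / len(items),
        "avg_seconds_saved": sum(o.seconds_saved for o in items) / len(items),
    }
```

A rising correction rate alongside a rising time saving is the signal to slow down, whatever the headline speed-up says.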
The firms that win the next few years will not be the ones who shipped AI features first. They will be the ones whose AI features were still standing after the audit, the incident, and the regulatory change. The technology is the easy part now. The judgement around it is where the real work lives, and that is the part worth getting right.
If you are weighing up where AI fits in your own systems, the most useful first conversation is rarely about the model at all. It is about what you are willing to automate, what you need to keep an eye on, and how you would prove any of it to someone who asks hard questions. Get that clear, and the rest follows.