88% of organizations are already using AI in at least one business function. And yet most report little measurable impact. 62% are still stuck in pilot mode. That tension opened the first edition of Frends Forward, a new webinar series designed around a simple premise: tight, practitioner-led presentations followed by direct, unscripted conversation. No vendor theater. Just what's real.
Three speakers. Three layers of the same problem: governance, infrastructure and product. One message running through all of it.
Luc Brandts, CEO of Software Improvement Group (SIG), opened the session with a question most organizations would rather not ask: do you actually know where AI is being used in your systems?
SIG scans billions of lines of code every week across hundreds of technologies. What they see in practice is not a technology problem. It's a visibility problem.
"The question is no longer if AI is there," said Luc. "The question is where it is and what do I do with it."
That distinction matters more than it first appears. AI sprawl, where AI is introduced left and right across an organization without a systematic view of where it lives or what risks it carries, is increasingly the norm. And with European energy utilities facing over 1,500 cyber attacks per week in 2024, and new vulnerabilities in widely adopted AI tooling being flagged in real time, the consequences of not knowing are no longer theoretical.
SIG's code analysis tells a concrete story. Using their capability to identify whether code is AI-generated or human-written, the data shows a consistent pattern: AI-generated code scores measurably lower on maintainability than human-written code — and the gap widens significantly as systems grow in complexity.
The evidence comes from three real projects. FastRender, built by a swarm of coding agents with no human in the loop, produced a browser engine of over 3 million lines of code in a week — technically extraordinary, practically unusable, scoring 1.1 out of 5 on maintainability. An autonomous LLM-built C compiler came in at 1.9. OpenClaw, built with AI assistance but with humans actively in the loop, scored 3.1.
"You clearly see here that to get good code, you need to do more than just automatically generate it," said Luc.
His prescription was practical: eliminate AI code blind spots, add quality and security guardrails, help AI understand your architecture before it generates, and, most importantly, put a human in control. Not just in the loop. In control.
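What "human in control" can look like in practice is a gate in the delivery pipeline. Here is a minimal sketch; the names and the threshold are illustrative, not SIG's actual tooling. AI-generated code is welcome, but it only merges once it clears a quality bar and carries an explicit human sign-off.

```python
from dataclasses import dataclass

@dataclass
class CodeChange:
    diff_id: str
    ai_generated: bool       # e.g. flagged by a detector or commit metadata
    maintainability: float   # quality score on a 1-5 scale
    human_approved: bool     # explicit reviewer sign-off

def merge_gate(change: CodeChange, threshold: float = 3.0) -> bool:
    """Allow a merge only if quality and human control are both in place."""
    if change.maintainability < threshold:
        return False  # guardrail: low-quality code never ships, AI-written or not
    if change.ai_generated and not change.human_approved:
        return False  # human in control: AI output needs a named approver
    return True
```

The point is not the specific threshold but where the decision sits: in the pipeline, before production, with a human accountable for it.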
The Gartner number is striking. More than 85% of AI projects never progress from pilot to production.
Asmo Urpilainen, CTO of Frends, has seen this pattern repeat across enterprise customers, and his explanation cuts through the usual excuses. Pilots work because they are designed to work: limited datasets, single users, no governance, no edge cases. The moment those conditions change, everything built on top starts to break.
"Without trust, even the most advanced AI model can never be utilized to its full potential," said Asmo Urpilainen.
He identified three failure modes that are consistently responsible.
Prompt injection: where AI is manipulated into behaviour it was never intended to perform, as demonstrated at DEF CON when researchers hijacked Microsoft's Copilot agents and extracted Salesforce records simply by exploiting open system access.
Hallucinations: where AI delivers confident, incorrect answers that cascade through downstream decisions.
Black-box execution: where there is no audit trail, no visibility into why the AI did what it did and no way to build or demonstrate trust.
Moving from pilot to production means solving all three at once. And that is where integration architecture enters the picture.
Asmo drew an analogy that landed clearly: APIs standardized how applications connect to each other. MCP (Model Context Protocol, an open standard developed by Anthropic) is doing the same for AI. It gives AI a governed, standardized way to access tools, systems and data. Build it once, govern it centrally, reuse it everywhere.
The practical illustration came from a story about a Finnish home automation enthusiast who granted an LLM direct access to his sauna thermostat. For a few days, the system worked perfectly. Then the AI quietly switched its temperature units from Celsius to Fahrenheit. The thermostat, lacking any context about the change, attempted to heat the sauna to 90°F instead of 90°C. No danger was posed, but the point was sharp: an MCP tool with a guardrail defining an allowable temperature range would have caught it instantly.
The story may be comical, but the underlying problem is not. Scale the same failure to an ERP system or a patient data workflow and the consequences are very different.
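As a rough sketch of what such a guardrail looks like, here is a minimal MCP tool written with the MCP Python SDK's FastMCP server. The tool name, the safe range and the thermostat client are invented for illustration; the point is that the tool schema pins the unit to Celsius and the range check rejects anything that only makes sense in Fahrenheit.

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("sauna")

SAFE_RANGE_C = (40.0, 110.0)  # plausible sauna temperatures, Celsius only

class Thermostat:             # stand-in for the real device client
    def set_target(self, celsius: float) -> None:
        print(f"thermostat target -> {celsius} C")

thermostat = Thermostat()

@mcp.tool()
def set_sauna_temperature(celsius: float) -> str:
    """Set the sauna target temperature. Celsius only."""
    low, high = SAFE_RANGE_C
    if not low <= celsius <= high:
        # Guardrail: 90 F (about 32 C) and 194 (90 C mis-sent as a Fahrenheit
        # number) both land outside the range and are rejected before the
        # device ever sees them.
        raise ValueError(f"{celsius} is outside the allowed range {low}-{high} C")
    thermostat.set_target(celsius)
    return f"Sauna set to {celsius} C"

if __name__ == "__main__":
    mcp.run()
```

The model never negotiates with the device directly; every call passes through a typed schema and a range check that sit outside its control.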
Integration platforms like Frends bring this into practice and provide what is missing from most AI deployments:
- validation and guardrails
- full audit trail
- scalable error handling
- multi-user governance.
They do this not as a theoretical framework, but as infrastructure that has been solving exactly these problems for enterprise IT for fifteen years.
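None of this is exotic at the code level. Frends expresses these steps as visual processes rather than code, but a hedged sketch in plain Python (all names here are ours, not Frends' API) shows the shape: every tool call an agent makes passes through one wrapper that validates first, records an audit entry regardless of outcome, and turns failures into visible errors rather than silent ones.

```python
import json
import logging
import time
import uuid

log = logging.getLogger("ai.audit")

def governed_call(user: str, tool, payload: dict, validate) -> dict:
    """Run an agent's tool call with validation, auditing and error handling."""
    entry = {"id": str(uuid.uuid4()), "user": user,
             "tool": tool.__name__, "payload": payload, "ts": time.time()}
    try:
        validate(payload)                    # guardrail runs before anything executes
        result = tool(**payload)
        entry["status"] = "ok"
        return {"call_id": entry["id"], "result": result}
    except Exception as exc:
        entry["status"] = f"error: {exc}"    # failures are recorded, never swallowed
        raise
    finally:
        log.info(json.dumps(entry, default=str))  # audit trail: who, what, when, outcome
```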
The final perspective came from the product layer. Jani Vertanen, Head of Product (AI) at Visma Aquila, has spent thirty years in software, from writing code to running sales to product management. His message was the most grounded of the day, and the most useful for anyone trying to understand why enterprise AI is still so hard to ship.
"Customers do not buy AI. They buy a better outcome," says Jani.
He hears the question constantly from Visma customers: "Does this feature include AI?" And when he digs into what they actually want, the answer is almost always the same. Does it save time? Does it reduce errors? Does it make the process smoother? Do I get a better result? Those are the questions. AI is a means, not the end.
This led him to a point rarely admitted aloud: not everything should use AI. Rule-based, deterministic logic is often faster, cheaper, more reliable and more predictable. AI can be uncertain, slow, expensive and failure-prone in everyday situations. And end users waiting on an LLM response that runs longer than expected do not know, or care, that there is a model behind the delay. They just know your product is slow.
"The real question is not where we can use AI," said Jani. "It's where does AI improve the outcome enough to justify the added complexity."
M2 Tarkka, Visma Aquila's AI-enhanced travel and expense module, was built on exactly that logic. The problems it addresses are visible and practical: missing receipts, wrong expense classifications, unclear train fare combinations. The AI improves the process at the earliest useful moment, validating claims as they are submitted rather than surfacing errors after the fact. The results are concrete: a 25% reduction in claim returns, faster approval cycles, less frustration for end users, approvers and finance teams.
The trust is earned because the value is visible. Not because AI was added to something.
Jani was equally direct about the production discipline required to get there. Piloting is easy. Presentations and proof of concepts are easy. Production means answering questions that most AI projects never ask: what happens if the LLM is unavailable? What happens if response time degrades? What happens if usage grows and token costs triple? How do you allocate those costs fairly between large and small customers?
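Those questions have concrete shapes in code. Here is a minimal sketch of two of them, with a placeholder LLM client and a made-up price: a hard timeout with a deterministic fallback for when the model is slow or down, and per-customer token metering so the cost-allocation question can be answered at all.

```python
import concurrent.futures

PRICE_PER_1K_TOKENS = 0.002            # placeholder figure, not a real quote
usage_tokens: dict[str, int] = {}      # tokens consumed per customer

pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)

def classify_claim(claim: str, customer: str, llm_call, rule_based,
                   timeout_s: float = 2.0) -> str:
    """Prefer the LLM, but never let it block or break the workflow."""
    future = pool.submit(llm_call, claim)   # llm_call returns (label, tokens_used)
    try:
        label, tokens = future.result(timeout=timeout_s)
    except Exception:                       # timeout, outage or model error
        return rule_based(claim)            # deterministic fallback keeps the product fast
    usage_tokens[customer] = usage_tokens.get(customer, 0) + tokens
    return label

def monthly_cost(customer: str) -> float:
    """Allocate token spend per customer instead of averaging it away."""
    return usage_tokens.get(customer, 0) / 1000 * PRICE_PER_1K_TOKENS
```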
"If you are not thinking about this from day one, your own company finance will hate this solution," he said. "That can be a very short love story."
He drew the same line as both Luc and Asmo between fast experiments and real product development. Vibe coding has genuine value for prototyping and proof of concept. It is not the same as AI-driven enterprise software development, which still requires architecture, security, traceability and clear ownership.
What AI changes is the distance between the person who understands the problem and the code that solves it: fewer handovers, less interpretation loss, faster iteration toward what the customer actually needs.
Three speakers came to the same place from three different directions.
Luc showed that AI code without human oversight degrades in quality at scale, and that the answer is not to slow down, but to build systems that keep humans in control while maintaining speed. Asmo showed that integration infrastructure is not a supporting detail but the governance layer that determines whether AI can ever reach production at all. Jani showed that the discipline required to ship AI in enterprise products is fundamentally the same discipline that good software development has always required, and that trust is earned through outcomes, not by adding AI to a feature list.
The organizations that are winning with AI are not the ones with the most pilots or the most ambition. They are the ones that treat AI as a serious operational capability: governed, integrated, measured and built on foundations that can actually hold the weight.
Watch the full Frends Forward webinar on demand.
Explore the Frends iPaaS platform for production AI at frends.com/ipaas/agentic-ai.