The risks of deploying AI aren’t just about science-fiction scenarios or rogue machines – they’re often far more practical, and far more immediate.
When an AI system makes a mistake, the consequences can damage operations, destroy customer trust, and harm a brand’s reputation.
We’ve all heard the stories: a chatbot agreeing to sell a product for $1, an information system announcing that the James Webb Telescope captured the first images of an exoplanet, or something as trivial – yet frustrating – as a weather forecast promising sunshine all day only for rain to arrive hours later.
In this article, Aleksandr Chikovani, Chief Systems Engineer at EPAM Systems, Inc., shares his perspective on why these errors are inevitable, what they mean for the businesses that rely on AI, and the strategies he uses to measure, manage, and minimize their impact – before they affect end users.
With over a decade of experience in IT, Aleksandr spent much of his career working on large-scale data systems before moving into AI and machine learning over the past five years.
Today, he specializes in building user-facing AI and GenAI solutions with a strong focus on performance measurement, MLOps best practices, and human-centric design.
He is also a co-author of the 2023 Responsible AI Principles – a practical, collaborative framework created with industry peers to guide safe and effective AI adoption.
“In AI, mistakes aren’t unexpected bugs – they’re inevitable outcomes”
Aleksandr: One of the biggest misconceptions about AI is the expectation of perfection. By its very nature, AI will make mistakes. It’s non-deterministic and will never deliver 100% correct answers – but that shouldn’t be a dealbreaker.
After all, we hire people knowing they will make mistakes too.
The real objective is to ensure an AI system achieves three things: it performs more accurately than a human in the same role, it operates at lower cost, and its errors are carefully controlled so performance can improve over time.
These goals are achievable through strong MLOps practices – enabling teams to measure performance before an AI system reaches end users, and to keep monitoring and refining it once it’s in production.
Interviewer: If mistakes are inevitable, how do you decide when an AI system is ‘good enough’ to deploy – and can you explain how MLOps practices help make that decision?
Aleksandr: I like this question – it actually contains part of the answer. Believe it or not, we’re not the first to solve this problem, and there are plenty of established best practices you can find with a quick search.
> What’s often missing, though, is a fundamental mindset shift: in AI, mistakes aren’t unexpected bugs – they’re inevitable outcomes. The questions you should be asking are: What’s the cost of a given mistake? What’s the profit from automating this service? Where’s the balance between investing more time, effort, and money to push performance forward by an extra 0.01% – versus deploying something that’s already ‘good enough’ for production?
> Once you have those answers, the next priority is automated quality measurement. And yes, sometimes the reality is that AI won’t solve every problem – in some cases, automation just isn’t profitable. But when you truly understand your goal, it’s much easier to define quantifiable metrics. Once those metrics are in place, you can measure performance objectively and determine whether deploying the solution will cost you money or make you money.
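The questions above (the cost of a mistake versus the profit from automation) can be turned into a rough back-of-the-envelope estimate. The following is only a minimal sketch, not a method described in the interview: the `DeploymentEstimate` fields, the `expected_net_value` function, and every figure in the example are hypothetical assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class DeploymentEstimate:
    """Hypothetical inputs for a rough deploy / don't-deploy estimate."""
    requests_per_month: int    # expected volume handled by the AI system
    saving_per_request: float  # money saved each time automation handles a request
    error_rate: float          # measured share of wrong answers, between 0 and 1
    cost_per_error: float      # average cost of one mistake (support time, refunds, ...)
    running_cost: float        # monthly cost of hosting and maintaining the system


def expected_net_value(e: DeploymentEstimate) -> float:
    """Return the expected monthly gain (positive) or loss (negative)."""
    if not 0.0 <= e.error_rate <= 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    savings = e.requests_per_month * e.saving_per_request
    error_cost = e.requests_per_month * e.error_rate * e.cost_per_error
    return savings - error_cost - e.running_cost


# Illustrative numbers only: the same system at two measured error rates.
good_enough = DeploymentEstimate(10_000, 2.0, 0.05, 20.0, 5_000.0)
too_risky = DeploymentEstimate(10_000, 2.0, 0.10, 20.0, 5_000.0)
print(expected_net_value(good_enough) > 0)  # deployment is expected to make money
print(expected_net_value(too_risky) > 0)    # deployment is expected to cost money
```

Under these assumed numbers, the error rate alone decides whether the system pays for itself, which is why a measured error rate is needed before the deployment decision.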
> Designing evaluation systems can be non-trivial, so it makes sense to start with low-hanging fruit. For example, when I helped develop a chatbot assistant for a large IT company (details are confidential due to NDA), my first step was to talk with business stakeholders to understand precisely what we were trying to automate. I quickly learned that most user questions were about just a handful of simple topics. Within hours, we had built a small but highly relevant dataset of common questions and “golden answers.” That became the foundation of the MLOps pipeline I was building.
This dataset was invaluable because it let us measure the chatbot’s current quality and compare new versions to see whether they were better – before promoting them to production.
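A golden-answer check of this kind can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the pipeline from the project: the questions, the `score` and `should_promote` helpers, and the two stub bots are all hypothetical, and the substring comparison in `is_match` is a deliberately crude stand-in for whatever answer-matching method a real team would choose.

```python
from typing import Callable, Dict, List

# Hypothetical golden dataset: common questions with reviewed reference answers.
GOLDEN_SET: List[Dict[str, str]] = [
    {"question": "How do I reset my password?", "golden": "self-service portal"},
    {"question": "How do I request a laptop?", "golden": "hardware request form"},
    {"question": "Who approves vacation?", "golden": "your line manager"},
]

Bot = Callable[[str], str]  # a chatbot version is modelled as question -> answer


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences don't count."""
    return " ".join(text.lower().split())


def is_match(answer: str, golden: str) -> bool:
    """Crude check (assumption): the golden phrase must appear in the answer."""
    return normalize(golden) in normalize(answer)


def score(bot: Bot, dataset: List[Dict[str, str]]) -> float:
    """Fraction of golden questions the bot answers acceptably."""
    if not dataset:
        raise ValueError("golden dataset is empty")
    hits = sum(is_match(bot(row["question"]), row["golden"]) for row in dataset)
    return hits / len(dataset)


def should_promote(candidate: Bot, current: Bot,
                   dataset: List[Dict[str, str]], min_gain: float = 0.0) -> bool:
    """Promote only if the candidate beats the current version by more than min_gain."""
    return score(candidate, dataset) - score(current, dataset) > min_gain


# Stub versions standing in for real chatbot releases.
def current_bot(question: str) -> str:
    if "password" in question:
        return "Please use the Self-Service Portal."
    return "I don't know."


def candidate_bot(question: str) -> str:
    if "password" in question:
        return "Please use the self-service portal."
    if "laptop" in question:
        return "Fill in the hardware request form."
    return "I don't know."


print(should_promote(candidate_bot, current_bot, GOLDEN_SET))
```

The useful property is that the same dataset scores every version, so a comparison between releases is repeatable and can run automatically before promotion.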
> The next step was to introduce a simple feedback loop. Since users had to be logged in before using the chatbot, I recommended adding a basic thumbs-up/thumbs-down button to each response. This gave users an easy way to signal whether the answer was helpful or not – and gave the team direct, actionable feedback.
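As a rough illustration of how such thumbs-up/thumbs-down signals might be aggregated, here is a minimal Python sketch. The `Feedback` record and the `helpful_rate_by_response` function are hypothetical names, and the rule of counting only each logged-in user's latest vote per response is an assumption made for this example, not something stated in the interview.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class Feedback:
    """One thumbs-up/thumbs-down click from a logged-in user (hypothetical schema)."""
    user_id: str
    response_id: str
    helpful: bool  # True = thumbs-up, False = thumbs-down


def helpful_rate_by_response(events: Iterable[Feedback]) -> Dict[str, float]:
    """Share of thumbs-up votes per response, counting each user's latest vote once."""
    latest: Dict[Tuple[str, str], bool] = {}
    for ev in events:  # later events overwrite earlier ones from the same user
        latest[(ev.user_id, ev.response_id)] = ev.helpful

    ups: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for (_user, response_id), helpful in latest.items():
        totals[response_id] += 1
        ups[response_id] += int(helpful)
    return {rid: ups[rid] / totals[rid] for rid in totals}


events = [
    Feedback("u1", "r1", False),
    Feedback("u1", "r1", True),   # u1 changed their mind; only this vote counts
    Feedback("u2", "r1", False),
    Feedback("u1", "r2", True),
]
rates = helpful_rate_by_response(events)
```

Because users are logged in, votes can be tied to a user ID, which makes it possible to stop one person's repeated clicks from dominating the signal.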
Sweta: Once you’ve collected user feedback, what’s the next step? How did you filter out misleading signals and turn that feedback into concrete improvements?
Aleksandr: That’s a brilliant catch – user signals can absolutely be misleading. Sometimes a user might reject a factually correct answer simply because it wasn’t the one they wanted to hear; other times they might “like” an incorrect answer because it sounded convincing.
If you treat this feedback blindly, it can poison your evaluation dataset.
> As I mentioned earlier, this was a greenfield project and we had to move quickly. Our low-hanging fruit was a manual review carried out by the development team and validated by the support team.
The results of these reviews were also logged, so that later a separate model could be trained to classify incoming feedback based on this labelled data – ultimately closing the loop with automation.
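The review step described above could be logged in a structure like the following. This is a sketch under assumptions rather than the team's actual tooling: the `Verdict` and `ReviewedFeedback` types are hypothetical, and treating "validated" as "both teams recorded the same verdict" is an interpretation made for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class Verdict(Enum):
    ANSWER_CORRECT = "answer_correct"
    ANSWER_WRONG = "answer_wrong"


@dataclass
class ReviewedFeedback:
    """One piece of user feedback plus the two manual reviews (hypothetical schema)."""
    question: str
    answer: str
    user_helpful: bool                         # the user's thumbs-up / thumbs-down
    dev_verdict: Optional[Verdict] = None      # development team's review
    support_verdict: Optional[Verdict] = None  # support team's validation


def confirmed_verdict(item: ReviewedFeedback) -> Optional[Verdict]:
    """Return a verdict only when both reviews exist and agree (assumption)."""
    if item.dev_verdict is not None and item.dev_verdict == item.support_verdict:
        return item.dev_verdict
    return None


def is_misleading(item: ReviewedFeedback) -> Optional[bool]:
    """True if the user's signal contradicts the confirmed review; None if unreviewed."""
    verdict = confirmed_verdict(item)
    if verdict is None:
        return None
    answer_ok = verdict is Verdict.ANSWER_CORRECT
    return item.user_helpful != answer_ok


def training_rows(items: List[ReviewedFeedback]) -> List[Tuple[str, str, bool, bool]]:
    """Labelled rows (question, answer, user_helpful, misleading) for a later classifier."""
    rows = []
    for item in items:
        label = is_misleading(item)
        if label is not None:  # unconfirmed items never reach the labelled set
            rows.append((item.question, item.answer, item.user_helpful, label))
    return rows
```

Keeping unconfirmed items out of the labelled set is what prevents raw, possibly misleading feedback from leaking into the evaluation data.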
> There was another important reason for this review step: if a user gave negative feedback, it likely meant they still needed help. In those cases, our internal team could reach out directly, assist the user, and document exactly where the chatbot’s response fell short. That insight fed back into development, guiding improvements in future versions of the system.
“If a user gave negative feedback, it likely meant they still needed help. In those cases, our internal team could reach out directly”
Sweta: It sounds like your process wasn’t just about fixing technical issues – it also had a strong human-centric approach and focus on accountability.
Aleksandr: Indeed! Our goal isn’t to create the shiniest model on the planet – it’s to help the person who actually needs help. That mindset comes directly from Responsible AI principles.
If you’re not familiar with them, it’s well worth investing a few hours to understand the framework and how it guides, or is supposed to guide, real-world AI development.
Sweta: I can tell you have a strong connection to Responsible AI. Is there a story behind how you got involved with it?
Aleksandr: Back when the AI hype train was accelerating, I knew we needed something concrete – not a dense academic paper few would read, nor a vague “fight for what’s right” LinkedIn post.
I was a tech lead on a confidential GenAI research project, and with the help of brilliant colleagues, we distilled our thinking into a concise, practical set of principles.
What I value most is that we placed human-centricity and awareness at the top. For any project – AI or otherwise – start by identifying your true beneficiary and understanding the real problem you’re solving. Once you have those, your direction becomes clear.
> Culture eats strategy for breakfast, so I recommend growing awareness within your team and sharing responsibility and accountability across it. When everyone is aligned, following these principles stops being a checklist and becomes the natural way you work – and that’s when you get your best results.
Sweta: If you had to leave businesses with one piece of advice about using AI responsibly, what would it be?
Aleksandr: I’d say – use critical thinking, especially in today’s era of GenAI hype. Always think in terms of risks. Set clear, measurable goals, and start by building an evaluation framework. And don’t overcomplicate things: when you’re just starting out, simpler is better.
Later on, you’ll almost certainly need to rework parts of what you’ve built – but that “later” can only happen if you start well and start fast. With an evaluation framework in place, you’ll have the tools you need to know whether you’re heading in the right direction, and you’ll be able to adjust course whenever necessary.
Disclaimer: This article reflects the personal views and experiences of Aleksandr Chikovani. Opinions expressed here are solely Aleksandr’s and do not necessarily represent those of any organization.



