With headlines suggesting AI will imminently take over human jobs, it may be a relief to see that AI chatbots aren’t exactly getting everything right. But dig a little deeper into some of the issues and the picture is not as comforting as it first appears.
In 2022, a customer asked the Air Canada chatbot to explain the airline’s bereavement policy. The chatbot supplied a link to the policy and said:
“Air Canada offers reduced bereavement fares if you need to travel because of an imminent death or a death in your immediate family…If you need to travel immediately or have already travelled and would like to submit your ticket for a reduced bereavement rate, kindly do so within 90 days of the date your ticket was issued by completing our Ticket Refund Application form.”
While it sounded reasonable, this advice was incorrect. The policy on the webpage stated: “Please be aware that our Bereavement policy does not allow refunds for travel that has already happened.”
When the customer followed the chatbot’s advice, the refund was refused by the human customer service agents, who knew the policy. When the customer was not satisfied with Air Canada’s offer (a $200 flight voucher), they took the airline to a small claims tribunal. And when Air Canada argued that it should not be held liable for information provided by its chatbot, the tribunal disagreed.
So what lessons should we learn from this?
This is becoming increasingly common knowledge, but a large language model is only predicting which words are most likely to go together in response to a question. With no understanding of the policy, it does not ‘know’ the right answer, just the most likely one. Even if it has been trained on the actual policy document, the model may still determine that the most likely answer is something else.
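As a minimal sketch of the idea (a toy, hand-made probability table, not any real chatbot’s model), greedy next-word prediction simply picks the continuation the model judges most probable, with no check against the underlying policy:

```python
# Toy illustration of next-word prediction. The probabilities below are
# invented for this article; they stand in for what a language model might
# assign to the word that follows "submit your ticket ..." in an answer
# about bereavement fares.
next_word_probs = {
    "within": 0.55,   # leads to the invented "within 90 days" refund advice
    "before": 0.30,   # closer to the real policy (no retrospective refunds)
    "never": 0.15,
}

def most_likely(probs: dict[str, float]) -> str:
    """Greedy decoding: return the single most probable next word."""
    return max(probs, key=probs.get)

print(most_likely(next_word_probs))  # -> "within": plausible, but wrong
```

The model is rewarded for plausibility, not accuracy, so the fluent answer and the correct answer can part company without any warning to the reader.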
The customer could have checked the linked webpage, and Air Canada argued they should have. But the tribunal ruled that Air Canada was responsible for the content of its website, including the chatbot’s advice, and that the airline owed a duty of care to the customer who had relied on that information.
So if you build an AI agent and launch it into action, you need to take responsibility for its output, or make it very clear that users cannot rely on that output and need to check everything themselves. This call for transparency is a key part of the EU AI Act, which aims to ensure users know they are interacting with an AI agent and are made aware of the limitations of its advice.
In this case, that hadn’t been made clear, so the customer did not have to take responsibility for checking the chatbot’s answer. But this is not always the case. In 2023, lawyers used ChatGPT to find past cases to support their court filings and unintentionally included completely fictitious cases and judgements. The lawyers were deemed negligent for using the output without checking it, and were fined.
As a professional, if you are using chatbot output in your work, it is your responsibility to check everything before you use it, whether it is in plain English or computer code. I know from experience that asking ChatGPT what a piece of code does may result in a plausible but incorrect answer. When I replied to ask if the code actually did something slightly different, it thanked me for my patience and appeared to correct itself. This new answer was still wrong. By the time I had run the code enough times to figure out how it worked, I might as well not have asked ChatGPT in the first place.
If ChatGPT is writing code for me, I definitely need to check it. And the more complex the code, the more carefully I need to check. The less I know about a topic, the more I may want to rely on the AI answer, but the greater the risk in doing so, as I am less likely to be able to discern whether it has given me a correct answer, an oversimplification, or a completely invented hallucination.
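As a hypothetical illustration (the function, figures and bug below are invented for this article, not taken from any real chatbot output), even a short piece of plausible-looking code can hide an error that only a sanity check against an independently worked answer will catch:

```python
def monthly_payment(principal: float, annual_rate: float, years: int) -> float:
    """Level monthly repayment on a loan (standard annuity formula)."""
    r = annual_rate / 12          # monthly interest rate
    n = years * 12                # number of payments
    # A version a chatbot might plausibly produce uses (1 + r) ** n here
    # instead of (1 + r) ** -n, which silently returns a nonsense figure.
    return principal * r / (1 - (1 + r) ** -n)

# Checking against a figure calculated independently catches that kind of slip:
# 200,000 repaid over 25 years at 6% should be roughly 1,288.60 per month.
assert abs(monthly_payment(200_000, 0.06, 25) - 1288.60) < 0.5
```

The point is not this particular formula, but the habit: treat chatbot output as a draft to be tested, not an answer to be trusted.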
On the one hand, the AI failed to give a correct answer, which cost the company money and produced a disgruntled customer. On the other, it isn’t entirely fair to judge the AI against a benchmark of perfection. A human customer service agent won’t be perfect either. It is more useful to compare the chatbot’s performance with that of the human agents.
And a lot of reports suggest that, overall, chatbots compare favourably with human agents.
In which case we shouldn’t be surprised if companies want to replace some staff with AI agents.
But is regurgitating information all that employers want from their human agents? Some of the value of humans is their ability to understand the situation. If an agent had read the policy, challenged it as impractical for customers going through a bereavement, and suggested allowing retrospective refunds – particularly if that made the policy more in line with the Air Canada culture – that employee might be rewarded for applying their ingenuity to improve the service.
When a chatbot makes up a policy in a customer-facing environment, it can be more than an inconvenience to the company. But used in the right way, AI can offer new insights, unexpected connections and creative ideas. So your job in its entirety may not be replaced by an AI agent. But you may be replaced by an actuary who can use AI to enhance their own skills.