Recently, my colleague Peter was promoted and happily announced it on LinkedIn. Within a very short time, an interesting phenomenon unfolded under his post as the comments section filled with the following text:
What happened here is clear: LinkedIn lets users reply to posts with ready-made text modules. A change of job title is interpreted as positive news, so the module “Congrats X” is suggested, and many colleagues made use of it. This can be dismissed as a harmless way of expressing congratulations quickly in the hustle and bustle of everyday life, with technology helping us do so in digital form. And yet there is something disturbing about this robotic form of communication, and we should discuss its long-term impact more widely.
The generation of text and text modules is not limited to the LinkedIn microcosm. Especially since the development of BERT, the transformer architecture for neural networks, and the hype around GPT-3, more and more companies, large and small, have begun to integrate language generation into their products or to build entire business models around it. Gmail, for example, offers a feature called “Smart Compose” that automatically completes text in emails, and it is surprisingly good:
In this application email, Gmail not only continuously suggested text modules; at the end, it even automatically suggested the subject line “I want to be part of your team!”. This is astonishing, but it also raises the question of how much value this rather good subject line has if Gmail suggests it to all other applicants as well and the recruiter’s inbox fills up with emails titled “I want to be part of your team!”. Gmail does offer personalization for Smart Compose, which is supposed to learn your writing style, but it is questionable how well a language model with billions of parameters, trained on billions of texts from the internet, can really acquire a personal writing style.
The topic seems even more disconcerting when you look at the development of the last few years, in which many companies have come up with the idea of replacing their customer support with chatbots. Anyone who has developed chatbots knows that there is no special artificial intelligence behind them: sentences typed by users are assigned to so-called “intents”, and programmers must manually attach prefabricated answers or follow-up processes to each intent. Machine-learning language models are indeed often used to map sentences to intents, but that is exactly where things go wrong time and again: if the model predicts the wrong intent for a sentence, this can lead to curious misinterpretations that would not occur in this way if one were actually communicating with a human.
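To make the intent mechanism concrete, here is a minimal sketch of how such a mapping works. The intent names, example phrases, and answers are hypothetical, and real chatbot frameworks use trained classifiers rather than this naive word-overlap scoring; the failure mode, however, is the same:

```python
# Hypothetical intents with example phrases and hand-written answers,
# as a chatbot developer would configure them.
INTENTS = {
    "book_flight": ["book a flight", "i want to fly", "flight reservation"],
    "cancel_booking": ["cancel my booking", "cancel flight", "i want a refund"],
}

ANSWERS = {
    "book_flight": "Sure, where would you like to fly?",
    "cancel_booking": "I can help with that. What is your booking number?",
    "fallback": "Sorry, I did not understand that.",
}

def classify(utterance: str) -> str:
    """Naive intent classifier: pick the intent whose example phrases
    share the most words with the user's sentence."""
    words = set(utterance.lower().split())
    best_intent, best_score = "fallback", 0
    for intent, examples in INTENTS.items():
        score = max(len(words & set(ex.split())) for ex in examples)
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent

print(ANSWERS[classify("Please cancel my flight")])
print(ANSWERS[classify("My flight was fantastic")])
```

The second sentence is a compliment, yet the classifier matches it to “book_flight” simply because the word “flight” appears in that intent’s examples, and the bot cheerfully asks where you would like to fly. This is exactly the kind of curious misinterpretation described above.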
The problem with chatbots is that they simulate human communication but are not on the same communicative level as humans. Computers can be operated flawlessly with mouse and keyboard: selecting a date via a date picker, for example, or entering a location in the field “Departure airport” are unambiguous instructions. The written text “Flight Seattle 1-7”, on the other hand, can be interpreted ambiguously (is it July 1st or January 7th?). Humans would either pick the more plausible date or ask a clarifying question when they sense the situation is unclear.
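The “1-7” ambiguity can be made explicit in a few lines. This sketch (the function name and the fixed year are illustrative assumptions) enumerates both possible readings of a day-month string, which is exactly the decision a date picker never has to make:

```python
from datetime import date

def interpret_day_month(text: str, year: int = 2024) -> list[date]:
    """Return every calendar date a string like '1-7' could mean,
    reading it both as day-month and as month-day."""
    a, b = (int(part) for part in text.split("-"))
    readings = []
    for day, month in ((a, b), (b, a)):
        try:
            candidate = date(year, month, day)
            if candidate not in readings:  # '5-5' reads the same both ways
                readings.append(candidate)
        except ValueError:
            pass  # e.g. month 25 does not exist, so skip that reading
    return readings

print(interpret_day_month("1-7"))   # two readings: July 1st and January 7th
print(interpret_day_month("25-12")) # one reading: 25 cannot be a month
```

A chatbot parsing free text has to guess between the two readings or ask back; a date picker hands the system a single unambiguous value from the start.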
It quickly leads to frustration when we are forced to communicate with a system in written language while knowing that it would not be necessary, because a few mouse clicks would do the trick just as well. Even more so when the system misinterprets our commands. Language always expresses emotions and feelings, and from the tone and choice of our words alone, our counterpart can infer how we feel. This interpretation, in turn, will likely make our conversation partner adapt their words and choose them according to the situation. This mutual adaptation of words and tone to what we are talking about gives a conversation its “soul” and shows, for example, that we are genuinely happy someone got a new job. There is a different emotion in “Fantastic that you made this career move, Peter, let’s toast to it at the next meeting” than in “Congrats Peter”, “Congrats Peter”, “Congrats Peter”. This emotion is what makes language alive. If technology causes it to disappear, we should ask ourselves whether that is a positive development.