Challenges and limitations of LLMs

While LLMs have many advantages and can be put to many positive uses, they also come with challenges and limitations, such as:

  • Misinformation and hallucinations - LLMs may generate false or misleading information. This can have a serious impact, especially in fields like healthcare, engineering or law, which require high levels of accuracy. Generating and publishing factually incorrect content can put people’s lives or health at risk, or cause costly damage to equipment.

    For example, Microsoft Copilot, Microsoft's Bing AI chatbot, gave inaccurate answers to one out of every three basic questions about candidates, polls, scandals and voting in a pair of election cycles in Germany and Switzerland in 2023. In many cases, the chatbot misquoted its sources.

  • Stale data - if LLMs don’t have access to real-time facts and events, their answers are only as up-to-date as the data they were trained on.
  • Bias - depending on what is in the data the LLMs were trained on, their answers may be biased or even discriminatory, or apply only to the parts of the world where the majority of the training data was harvested.

    An example of such bias was Amazon’s experimental automated recruitment tool, which aimed to assess applicants’ suitability for roles by analyzing their resumes. However, it became biased against women because it learned from the resumes of previous candidates, who were mainly male.

  • Glitch tokens - LLMs may contain glitch tokens, which are specific words or strings that cause them to behave in unexpected and often nonsensical ways, for example by repeating phrases unrelated to the question.
  • Lack of transparency and explainability - LLMs often function like a black box: their decision-making process is hidden, so it is not possible to trace how they arrived at a particular output.
  • Information leaks - since LLMs are trained on vast amounts of data, that data may also include private and sensitive information, which the models may then expose in their outputs.
  • Plagiarism - LLMs tend to reuse content without tracking where it was originally created, so they may violate copyright and intellectual property rights.

    An example here is the case of The New York Times, which accused Microsoft and OpenAI of using its articles to train artificial intelligence models, including ChatGPT, thus infringing its copyrights.

  • Lack of accountability - as LLMs become more advanced, it is more challenging to determine who should be held accountable for the potentially harmful outputs they produce.
  • Unethical use - LLMs may be used to manipulate or deceive people, for example by creating deepfakes, phishing attacks or social engineering schemes.
  • Skill degradation - if not used properly, LLMs may erode the learning, writing and programming skills of students and employees who rely too heavily on them for writing content or code.
  • Job displacement - innovations like LLMs and, more generally, AI are expected to drastically change or replace some job roles.
  • Impact on the environment - developing LLMs requires considerable computational resources, which leads to high energy consumption and a large carbon footprint.

    Alongside their many undeniable advantages, LLMs, and AI in general, will also bring some unexpected risks.

    Personally, I think that the big players in the field of AI especially, like OpenAI, Google, Amazon etc., will have to demonstrate responsibility in their next steps. "Responsible AI" shouldn't be just an empty slogan. To ensure that, there is a need to keep an eye on the key players, so that they don't start to dangerously resemble Lord Farquaad from the movie "Shrek", with his famous quote: "Some of you may die, but it's a sacrifice I am willing to make."

source: screenshot from the movie 'Shrek'