Enhancing Language Models | #AI23
By now, we have all realized that language models on their own come with limitations. Although they can be impressive at solving certain problems, they are also impressively bad at ordinary tasks such as simple math problems. In recent months, however, a new phenomenon has emerged: large language models are increasingly combined with conventional software to circumvent the difficulties of a pure deep learning approach. This development opens the gates for a whole array of exciting new systems.
In the current AI hype, many people misunderstand what is happening and what is new about systems like ChatGPT. ChatGPT has brought the hype around foundation models to a wider public. Inside the AI scene, however, this hype started at least in 2020, when GPT-3 was released. From a technological perspective, there is nothing fundamentally new about ChatGPT. The system combines reinforcement learning from human feedback, as used in InstructGPT, with a GPT-3-family model as the underlying foundation model. With a well-designed user interface, ChatGPT made the fruits of NLP accessible to the general public for the first time. The next generation of AI systems will be applications that cleverly combine foundation models with other functionality, giving those systems abilities that go beyond the mere generation of language and making them valuable tools for end consumers.
If we look at Bing Chat, the current new thing, we must acknowledge that, despite its flaws, it is more than just an advanced version of ChatGPT. In addition to strong language-generating abilities, Bing Chat can also use a search engine, unlocking a novel way of interacting with the World Wide Web. We increasingly see systems that combine language models with other software. The most recent example is Toolformer, a language model introduced by Meta. As everyone has noticed by now, ChatGPT is quite bad at tasks we think of as simple from a computer's perspective. Toolformer addresses these issues. The model can teach itself to use various external tools such as search engines, calculators, and calendars. It does so by building on a large language model trained to call APIs whenever that helps to complete a given task. If chatbots can leverage a variety of software, this opens up an entirely new way of human-computer interaction, with the potential to transform not only web search but knowledge work in general.
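To make the idea of a model that calls tools more concrete, here is a minimal sketch in Python. It is not Meta's implementation. It assumes that the model writes tool calls inline as bracketed markers such as [Calculator(400 / 1400)], and that ordinary software then replaces each marker with the tool's result. The names TOOLS, calculator, calendar, and run_tool_calls are hypothetical and chosen for illustration only. No language model is included; the input string stands in for model output.

```python
import ast
import operator
import re
from datetime import date

# Arithmetic operators the hypothetical calculator tool accepts.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculator(expression: str) -> str:
    """Evaluate a plain arithmetic expression without using eval()."""

    def walk(node):
        # Plain numbers are returned as they are.
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        # Binary operations are limited to the operators listed in _OPS.
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        # Allow a leading minus sign.
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError(f"unsupported expression: {expression!r}")

    tree = ast.parse(expression.strip(), mode="eval")
    return str(round(walk(tree.body), 2))


def calendar(_: str) -> str:
    """Return today's date; the argument is ignored."""
    return date.today().isoformat()


# Registry of tools the text may call (assumed names, for illustration).
TOOLS = {"Calculator": calculator, "Calendar": calendar}

# Matches inline markers of the form [ToolName(argument)].
CALL_PATTERN = re.compile(r"\[(\w+)\((.*?)\)\]")


def run_tool_calls(text: str) -> str:
    """Replace every known tool marker in the text with the tool's result."""

    def replace(match: re.Match) -> str:
        name, argument = match.group(1), match.group(2)
        tool = TOOLS.get(name)
        if tool is None:
            # Unknown tools are left untouched rather than guessed at.
            return match.group(0)
        return tool(argument)

    return CALL_PATTERN.sub(replace, text)


if __name__ == "__main__":
    generated = "That is [Calculator(400 / 1400)] of the total."
    print(run_tool_calls(generated))
```

The division of labor is the point of the sketch: the language model only has to decide when a tool is useful and what to pass to it, while the exact computation is left to conventional software.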
Another AI by Meta that impressed us is Cicero. Announced on November 22, 2022, it has received less attention than it deserves because of the ChatGPT hype. Cicero is a system that can play the strategy board game Diplomacy. What sets Diplomacy apart from other games that AI has already mastered is that it requires extensive communication with other players through negotiation, deception, and persuasion. When Cicero played against human opponents, a mix of professionals and amateurs, it ranked in the top 10%, and only one person suspected that it was not human.
Cicero is different from other AIs because it combines game-theoretic strategizing with natural language processing of the kind developed for tasks like translation or question answering. Cicero integrates language and action in a dynamic world setting while succeeding at complex interactions with humans. In doing so, it makes use not only of machine learning but also of symbolic approaches known from the GOFAI ("good old-fashioned AI") paradigm. Throughout the machine learning hype of recent years, many experts have predicted that machine learning will not be the final step toward intelligence and that we need to integrate symbolic systems. Cicero may mark the beginning of a new era in which machine learning is not all there is to cutting-edge AI systems.
Putting the different pieces together, we can see the beginning of a new trend: foundation models are combined with other technologies to overcome their weaknesses and reach new capabilities. Further down the road, we expect AI assistants that can not only manage our calendar and interact with the software on our computer but also take different kinds of data into account to support our decision-making.