LEAM: How Large Models Will Enable Businesses
Having looked at large models and their social implications before, we will now discuss their possible applications. As we pointed out in previous articles, the current large models are mostly language models. And although we have not yet seen large-scale deployment of these models, they are increasingly finding their way into business practice. In this article we zoom in on the technology and business applications of GPT-3, since it is the most prominent and most widely deployed large model. The model was introduced by OpenAI in 2020 and kicked off the race for ever larger models.
Technology behind GPT-3
As the name “GPT-3” suggests, we are talking about the third generation of the GPT series, with GPT-3 being more than a hundred times larger than its predecessor, GPT-2. “GPT” stands for “Generative Pre-trained Transformer”, which points to the technology behind the model.
A generative model is a model that, after being trained on a data set, can generate new data instances that resemble the training data. This is possible because the model has learned general features of the training data and can therefore produce similar new data. Given a data sample, a generative model can estimate what is likely to come next and generate data accordingly. In the case of GPT-3, the model is trained on language data from the internet - almost 500 billion tokens - and can generate text of various kinds. This generative approach stands in contrast to the discriminative approach. A discriminative model is trained with supervised learning on labeled data, which enables it to predict which of the categories induced by the training labels an input belongs to. Such models are used for visual recognition and can, for example, be employed to detect military units on satellite images or breast cancer on medical scans.
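The idea of estimating “what is likely to come next” can be made concrete with a toy example. The sketch below is a minimal word-bigram generator in Python - a vastly simplified stand-in for GPT-3’s transformer, trained on an invented miniature corpus - that learns which words tend to follow which and then samples new text from those statistics:

```python
import random

# Toy corpus standing in for the web-scale text GPT-3 was trained on.
corpus = (
    "the model reads text . the model learns patterns . "
    "the model generates text . patterns help the model ."
).split()

# "Training": count which word follows which, an estimate of P(next | current).
transitions = {}
for current, nxt in zip(corpus, corpus[1:]):
    transitions.setdefault(current, []).append(nxt)

def generate(start, length, seed=0):
    """Generate new text by repeatedly sampling a likely next word."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(length - 1):
        followers = transitions.get(words[-1])
        if not followers:  # dead end: no observed follower
            break
        words.append(rng.choice(followers))
    return " ".join(words)

print(generate("the", 8))
```

A discriminative model, by contrast, would not generate anything: it would take a sequence of words as input and output a label, such as “spam” or “not spam”.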
A pre-trained model like GPT-3 is initially trained on a large amount of general data. Afterwards, the model can be adapted to various downstream tasks in a second learning phase, during which it is fine-tuned for a more specific task. Thanks to the pre-training, the model can learn such a task from significantly fewer data points. In recent years, pre-trained models have become more widely used, since they generalize better than models trained only once, and they are less likely to overfit when trained on enough data.
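The two-phase recipe - pre-train on lots of general data, then adapt with few examples - can be illustrated with a deliberately tiny model. The sketch below uses a two-weight linear model and invented tasks rather than a language model, but the mechanics are the same: the fine-tuned model starts from the pre-trained weights, while a from-scratch model gets the same small training budget without them:

```python
import random

rng = random.Random(42)

def sample(task, n):
    """Draw n labeled examples (x, task(x)) with 2-dimensional inputs."""
    return [((x := (rng.random(), rng.random())), task(x)) for _ in range(n)]

def train(weights, data, lr=0.1, steps=1):
    """Plain stochastic gradient descent on squared error for y = w0*x0 + w1*x1."""
    for _ in range(steps):
        for x, y in data:
            err = weights[0] * x[0] + weights[1] * x[1] - y
            weights[0] -= lr * err * x[0]
            weights[1] -= lr * err * x[1]
    return weights

def loss(w, data):
    return sum((w[0] * x[0] + w[1] * x[1] - y) ** 2 for x, y in data) / len(data)

# Phase 1: "pre-training" on plenty of data from a general task.
general = sample(lambda x: 2.0 * x[0] + 1.0 * x[1], 200)
pretrained = train([0.0, 0.0], general, steps=200)

# Phase 2: adapt to a related downstream task with only 10 labeled examples.
specific = sample(lambda x: 2.0 * x[0] + 1.5 * x[1], 10)
finetuned = train(list(pretrained), specific, steps=3)  # start from pre-trained weights
scratch = train([0.0, 0.0], specific, steps=3)          # same small budget, cold start

test_data = sample(lambda x: 2.0 * x[0] + 1.5 * x[1], 100)
print(round(loss(finetuned, test_data), 4), round(loss(scratch, test_data), 4))
```

Because the pre-trained weights already sit close to the downstream solution, the fine-tuned model reaches a lower error from the same ten examples - the same reason GPT-3 can be adapted to new tasks with comparatively little data.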
What Are the Possible Applications?
In June 2020, OpenAI released an API that allows developers to build software on top of GPT-3. The API provides a “text in, text out” interface that can be used to generate samples of English text. By now, there are many GPT-3 applications out there. The Guardian, for instance, has published an essay written entirely by a program based on GPT-3. Other interesting applications include a code oracle that explains in English what a given piece of code does, a web app that produces charts from plain-English descriptions, and software that automatically summarizes customer feedback. In customer service in particular, there is great potential for improvement by automating responses to queries and routine legal tasks. Most applications revolve around simplifying or automating language-related tasks that used to take a lot of time despite not being very complicated.
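To give a feel for how simple the “text in, text out” interface is, the sketch below assembles a request body for a completion call. The endpoint path and parameter names follow OpenAI’s early completions API as publicly documented at the time; they are shown for illustration only, and no request is actually sent:

```python
import json

# Endpoint of the original engine-based completions API (illustrative).
API_URL = "https://api.openai.com/v1/engines/davinci/completions"

def build_request(prompt, max_tokens=64, temperature=0.7):
    """Assemble the JSON body for a completion request (not sent here).

    A real call would POST this body to API_URL with an
    'Authorization: Bearer <API key>' header.
    """
    return json.dumps({
        "prompt": prompt,            # the "text in" part
        "max_tokens": max_tokens,    # upper bound on the length of the reply
        "temperature": temperature,  # higher values give more varied text
    })

body = build_request("Summarize this customer feedback: ...")
print(body)
```

Everything an application needs to express - a summary request, a question about code, a chart description - is packed into the prompt string, which is why so many different products could be built on the same interface.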
By March 2021, more than 300 applications were using GPT-3, and tens of thousands of developers were building on OpenAI’s platform. Moreover, in 2020 Microsoft struck a deal with OpenAI and acquired an exclusive license to use GPT-3 in its products. In May 2021, Microsoft announced its first commercial use case for GPT-3: generating Microsoft Power Fx formulas from natural language input.