LEAM: Bracing for change
The rise of Large Models, which we discussed in the previous piece of this article series, comes with opportunities and challenges alike. We expect shifts in the power dynamics of whole industries and a changing role for AI research. For Europe not to be left behind, it is crucial to level up our capabilities to build Large AI Models and to make them available to research and industry. Reaching this objective is the goal of the Large European AI Models (LEAM) initiative, which is driven by leading European players in AI. LEAM represents the opportunity to create Large Models, made in Europe, that do not benefit Big Tech alone but provide value to all stakeholders. In the following, we outline why Europe needs to face the changes in the AI field.
The Societal Impacts of Large Models
So far, Large Models have not been deployed in great numbers or for many use cases. But given their astonishing capabilities, of which we may yet discover more, it is only a matter of time until applications built on Large Models become the new standard. Google Search already depends on the Large Model BERT. The societal impact of Large Models is expected to be tremendous. It can bring massive opportunities, but it also comes with certain risks. We need to make sure that the deployment of Large Models leads in the right direction: that we preserve an open research environment, prevent an excessive centralization of power, and ensure that nobody is excluded from the benefits of Large Models. The key to reaching these goals is the accessibility of Large Models.
We mentioned in the previous article that training Large Models is not cheap. GPT-3 allegedly cost 12 million dollars to train, using almost 10,000 days of compute time. Since GPT-3 is far from the largest model, it is clear that not only SMEs will struggle to keep up with the costs, but also large companies and universities. It has always been difficult for small players to overcome the constraints of talent and data, but they could manage if they relied on a strong ecosystem. For Large Models it is different: the amounts of data and talent required are much higher than what we are used to from conventional models. On top of that comes the cost of compute.
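A quick back-of-envelope calculation makes these figures tangible. Both inputs below are the numbers cited above; the per-day rate is simply derived from them, not a separately reported figure:

```python
# Back-of-envelope training-cost arithmetic using the figures cited above.
# Both inputs are the article's reported numbers; the daily rate is derived.
total_cost_usd = 12_000_000   # reported training cost of GPT-3
compute_days = 10_000         # reported compute time in days

cost_per_compute_day = total_cost_usd / compute_days
print(f"Implied cost per compute-day: ${cost_per_compute_day:,.0f}")
# -> Implied cost per compute-day: $1,200
```

Even at this implied rate, a single training run consumes a budget that few universities or SMEs can absorb, let alone repeat for experimentation.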
Generally, it is no surprise that large companies have more resources and are better at building things at scale. Large Models are different, though. The performance of Large Models scales as a power law with model size, data set size, and the amount of compute used for training. This means that the largest models can significantly outperform smaller ones. Moreover, the homogenization of models leads to many smaller models being outperformed across various domains by a single large model.
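The power-law relationship can be sketched in a few lines. The functional form below (loss falling as a power of model size) illustrates the idea; the exponent and constant are placeholder values chosen for illustration, not measured results:

```python
# Illustrative sketch of power-law scaling: predicted loss falls as
# loss(N) = (N_C / N) ** ALPHA for a model with N parameters.
# ALPHA and N_C are hypothetical placeholders, not empirical values.
ALPHA = 0.076   # hypothetical scaling exponent
N_C = 8.8e13    # hypothetical scale constant (in parameters)

def loss(num_params: float) -> float:
    """Predicted loss for a model with num_params parameters."""
    return (N_C / num_params) ** ALPHA

# Each 10x increase in model size yields a modest but reliable improvement:
for n in (1e9, 1e10, 1e11):
    print(f"{n:.0e} params -> predicted loss {loss(n):.3f}")
```

The key property is that the curve keeps improving smoothly as size grows, which is exactly why the player who can afford the largest model ends up ahead across the board.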
So far, we mainly see proprietary access, controlled by a few American and Chinese players. Projects such as OPT and BigScience are about to change that, but we still need to put more effort into open-access Large Models. Otherwise, anyone who wants to deploy applications based on existing Large Models, or to do research on foundation models, will depend on a few players. This is a worrisome prospect because Large Models and their social implications are not well understood yet. We need more research, not only from computer scientists but also from social scientists, legal scholars, and others. Much of the necessary research is unlikely to happen if we lack Large Models that are accessible to a wider research community.
Moreover, if the current trajectory continues, we can expect a massive power shift through the deployment of Large Models. The scaling laws for the performance of Large Models suggest a winner-takes-all situation, in which a monopoly or oligopoly controls the largest models and can outperform not only conventional models but also smaller Large Models in many respects. Once such monopolistic structures are established, they will be further reinforced, because usage of the models feeds them with additional data and thus increases their capabilities. Additionally, from a European perspective it is not at all clear whether Large Models will be tailored to our needs. There are, for example, no incentives for American companies to build applications for less common European languages.
The analysis above suggests that two problems must be solved to set the stage for the transformation through Large Models. First, we need broader access to Large AI Models in order to prevent monopolization and to allow open research. So far, only big companies have the resources to build Large Models of a certain size, but corporate research alone does not address all questions that are relevant for society. Second, we need a European version of Large Models: models, made in Europe, not only to stay competitive and enable research in Europe, but also to allow solutions that match European needs and standards.
Anticipating these challenges, the German AI Association started the LEAM initiative in May 2021. The initiative aims to build Large Models in Europe that are open access and allow all stakeholders to participate in a competitive AI environment. To realize this project, the German AI Association, together with leading European players, will provide the infrastructure to build the next generation of AI models in Europe.