The TrustLLM project will develop European large language models (LLMs) on an unprecedented scale, trained on the largest amount of text so far in European AI, covering a range of underrepresented languages, and pushing the limits of European exascale computing.
The main objective is the development of an open, trustworthy, and sustainable LLM initially targeting the Germanic languages. This will create the foundation for an advanced open ecosystem of next-generation European large language models that are modular, extensible, trustworthy, sustainable, and democratized. The TrustLLM project and the surrounding ecosystem will enable, support, and improve context-aware human-machine interaction in a wide range of applications.
To achieve this, TrustLLM will tackle the full range of challenges of LLM development, from ensuring sufficient quality and quantity of multilingual training data, to sustainable efficiency and effectiveness of model training, to enhancements and refinements for factual correctness, transparency, and trustworthiness, to a suite of holistic evaluation benchmarks validating the multi-dimensional objectives.
The TrustLLM consortium combines unique expertise and practical experience in building LLMs with leading NLP researchers and organizations working on transferring the technology to companies and end users.
The models developed will be the most powerful and trustworthy LLMs in Europe, and they will constitute a major breakthrough in AI that will establish a new foundation for the next generation of large-scale European AI models. Our focus on Germanic languages can serve as a blueprint for future activities in other language families. This will help secure Europe’s sovereignty with respect to crucial AI technologies, establish a novel framework for European collaboration on LLMs, and create the foundation for a pan-European center for LLMs and large-scale AI to maximise the scientific, social, and economic impact.