Apertus is Switzerland’s answer to closed-box large language models: an open-weight, multilingual LLM released by EPFL, ETH Zurich and the Swiss National Supercomputing Centre (CSCS). It is designed for transparency, research reuse and EU-grade compliance – not for beating proprietary models on every leaderboard overnight. If you build or govern AI, Apertus changes a few practical assumptions about model auditability, data provenance and regulatory fit.1
Quick facts
- Developers – EPFL, ETH Zürich and CSCS, under the Swiss AI Initiative
- Release date – 2 September 2025 (public launch, with downloads via Hugging Face and partners)2
- Model family – two sizes, 8B and 70B parameters (open weights plus intermediate checkpoints)
- Training data – ~15 trillion tokens covering more than 1,000 languages (the team emphasises strong non-English coverage; some reports cite up to roughly 1,811 languages). Around 40% of tokens are non-English3
- Openness – architecture, training recipes, intermediate checkpoints, dataset documentation and weights are published under an open licence, intended for research, education and commercial use
- Access – download from Hugging Face, inference via Swisscom, Public AI, and community runtimes (Transformers, vLLM, MLX); a minimal loading sketch follows below
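Getting started follows the standard Transformers workflow. Here is a minimal sketch for loading and prompting the 8B model, assuming a repository id along the lines of swiss-ai/Apertus-8B-Instruct-2509 – verify the exact name and licence terms on the Hugging Face model card before relying on it:

```python
# Minimal sketch: loading and prompting Apertus with Hugging Face Transformers.
# The repository id below is an illustrative assumption – check https://huggingface.co/swiss-ai
# for the exact model names and licence terms.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "swiss-ai/Apertus-8B-Instruct-2509"  # assumed id; verify on Hugging Face

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Chat-style prompt via the tokenizer's chat template (if the model card provides one).
messages = [{"role": "user", "content": "Summarise the Apertus project in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```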
Why does Apertus matter?
What Apertus can and can’t do today
- Can – run research experiments, fine-tune for domain tasks (see the sketch after this list), support multilingual apps, and be inspected and forked by organisations that need full audit trails.
- Can’t (yet) – reliably outperform top closed models on every benchmark, or replace well-resourced, production-grade proprietary services for every enterprise use case. The Swiss team emphasises openness and trustworthiness over chasing a weekly leaderboard lead.5
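For domain adaptation, parameter-efficient fine-tuning keeps hardware requirements modest. Below is a minimal sketch using the peft library’s LoRA adapters, under the same assumptions as above: an illustrative repository id, placeholder hyperparameters and a toy in-memory dataset rather than a real domain corpus.

```python
# Minimal LoRA fine-tuning sketch with peft + transformers.
# Repository id, hyperparameters and the toy dataset are illustrative assumptions,
# not official recommendations from the Apertus team.
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_id = "swiss-ai/Apertus-8B-Instruct-2509"  # assumed id; verify on Hugging Face

tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
base_model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Attach low-rank adapters to the attention projections; only these weights are trained.
lora_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                         target_modules=["q_proj", "v_proj"],  # names vary by architecture
                         task_type="CAUSAL_LM")
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all parameters

# Toy in-memory dataset standing in for your real domain corpus.
texts = ["Apertus ist ein offenes Sprachmodell.",
         "Apertus è un modello linguistico aperto."]
train_dataset = Dataset.from_dict({"text": texts}).map(
    lambda row: tokenizer(row["text"], truncation=True, max_length=512),
    remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="apertus-lora", per_device_train_batch_size=1,
                           gradient_accumulation_steps=8, num_train_epochs=1,
                           learning_rate=2e-4, logging_steps=10),
    train_dataset=train_dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("apertus-lora")  # writes only the small adapter weights
```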
How does Apertus compare to other LLMs?
| Model | Openness (weights) | Representative sizes | Training tokens (public) | Strong suit |
|---|---|---|---|---|
| Apertus | Open (weights + recipes + checkpoints) | 8B, 70B | ~15T (public, ~40% non-English) | Transparency, multilingual breadth, regulatory fit for the EU |
| Llama 3 (Meta) | Open weights (Meta community licence) | 8B, 70B, 405B | ~15T (Meta reports) | Cost-efficient open alternative; broad community adoption |
| Mixtral / Mistral family | Mix of open models and research/commercial licences | 7B–123B, plus mixture-of-experts (Mixtral) | Various | Strong efficiency; good European-language coverage for some variants |
| GPT-4.x | Proprietary | Not disclosed (estimates around 1.8T) | Not public (very large) | Top raw performance on many benchmarks; advanced multimodal features |
| Claude | Proprietary | Not disclosed (estimates of 300–500B) | Not public | Safety-focused; large context windows for enterprise |
Parameter and token counts are those publicly stated by the model providers or reported in reputable coverage, and benchmark results vary with dataset and tuning. Check the model cards on Hugging Face or vendor documentation for exact, current specifications before making production decisions.
Benchmarks – what to look for and what’s public so far
Which benchmarks matter
MMLU (general knowledge / multi-task), multilingual evaluations (XTREME, MMLU-ProX), long-context tests, and specialised safety or adversarial suites. These measure different abilities like factual knowledge, cross-lingual transfer, reasoning and robustness.11
Apertus status
Official press materials and news coverage report that Apertus performs well on multilingual benchmarks, and independent community and lab evaluations are beginning to appear. Expect more complete MMLU and multilingual leaderboard results over the coming weeks as researchers finish their evaluations.12
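If you would rather run your own numbers than wait for leaderboards, EleutherAI’s lm-evaluation-harness is the usual tool for MMLU-style scores. A minimal sketch using its Python API, assuming the illustrative repository id from earlier; task names and arguments may differ between harness versions:

```python
# Minimal sketch: benchmarking a local Apertus checkpoint with lm-evaluation-harness.
# The repository id and task selection are illustrative; check the harness docs for
# the exact task names available in your installed version.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                             # Hugging Face backend
    model_args="pretrained=swiss-ai/Apertus-8B-Instruct-2509,dtype=bfloat16",
    tasks=["mmlu"],                                         # add multilingual suites as needed
    num_fewshot=5,
    batch_size=8,
)

# Print per-task accuracy-style metrics.
for task, metrics in results["results"].items():
    print(task, metrics)
```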
Apertus is built for the public good. It stands among the few fully open LLMs at this scale.
Imanol Schlag, ETH Zurich
Further context
By 2024/2025, leading models such as GPT-4 variants, Claude and Llama 3 were scoring in the high 70s to 90s on MMLU variants, depending on model and setting. MMLU remains a coarse but useful indicator; for a more comprehensive picture, combine multiple benchmarks, including multilingual ones, rather than relying on a single score.13
Apertus and the EU AI Act
Transparency obligations
The EU AI Act places explicit disclosure and documentation requirements on foundation/general-purpose models (model cards, data provenance, summaries of training datasets and risk assessments). Apertus’ public release of weights, recipes and dataset filters directly supports those obligations, making it easier for providers and deployers to prepare the documentation that Articles 13 and 50 require.
Open-source nuance
The AI Act treats open-source projects differently in some cases, but foundation models with systemic risk still face obligations. Apertus’ designers state they built the model with the Act’s transparency expectations in mind. That can lower compliance friction for EU/EEA deployments compared with closed models whose provenance is opaque.14
Operational takeaway for you
If you need to deploy an LLM in the EU/EEA or supply LLM-based services to EU customers, Apertus reduces documentary and audit risk, because you can inspect and supply the provenance documentation regulators will expect; a sketch of assembling such a record follows below. You are still required to run risk assessments, adversarial testing and incident reporting where the Act demands them.15
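What “supplying provenance documentation” can look like in practice: a minimal, illustrative sketch of a machine-readable record a deployer might keep alongside the model card. The field names and placeholder values are assumptions for illustration, not an official EU AI Act schema.

```python
# Illustrative sketch: a machine-readable provenance/transparency record to accompany
# a deployment. Field names and values are illustrative assumptions, not an official
# EU AI Act schema – adapt them to your own documentation process.
import json
from datetime import date

provenance_record = {
    "model": "Apertus-70B",
    "source_repository": "https://huggingface.co/swiss-ai",  # verify the exact repo
    "license": "open licence (see model card)",
    "developers": ["EPFL", "ETH Zurich", "CSCS"],
    "release_date": "2025-09-02",
    "training_data_summary": "~15T tokens, >1,000 languages, ~40% non-English "
                             "(see the published dataset documentation)",
    "published_artifacts": ["weights", "training recipes", "intermediate checkpoints",
                            "dataset documentation"],
    "internal_assessments": {
        "risk_assessment_completed": False,       # fill in from your own process
        "adversarial_testing_completed": False,
        "last_reviewed": str(date.today()),
    },
}

with open("apertus_provenance.json", "w", encoding="utf-8") as f:
    json.dump(provenance_record, f, indent=2, ensure_ascii=False)
```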
Why should we use AI models built in Europe?
Using models like Apertus, developed in Europe, gives you strategic advantages: they are designed with European values of transparency, privacy and regulatory compliance at their core. Unlike many overseas models, Apertus ships with comprehensive documentation, open weights and multilingual coverage that includes under-represented European languages such as Romansh and Swiss German. This alignment with the EU AI Act reduces compliance risk, strengthens digital sovereignty and ensures that critical AI infrastructure is not entirely dependent on non-European providers. For businesses and governments, adopting European-built LLMs means more control, more trust and, in all likelihood, greater long-term resilience in a rapidly evolving AI landscape.
Explore further and join the conversation
If you want to dig deeper into this topic, I recommend starting with the sources listed below.
I also encourage you to leave a comment under this article. Stay tuned!
Sources
1. ETH Zurich, “Apertus: a fully open, transparent, multilingual language model”
2. Swiss AI, “Apertus”
3. It’s FOSS, “Switzerland Launches Apertus: One of Europe’s Largest Open Source AI Models”
4. European Commission, “AI Act”
5. SWI swissinfo.ch, “Switzerland launches transparent ChatGPT alternative”
6. AWS, “Announcing Llama 3.1 405B, 70B, and 8B models from Meta in Amazon Bedrock”
7. Meta, “Introducing Meta Llama 3: The most capable openly available LLM to date”
8. Mistral AI, “Mistral model weights”
9. DataCamp, “What is MMLU? LLM Benchmark Explained and Why It Matters”
10. Hugging Face, “Meta Llama 3”
11. arXiv, “XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization”
12. DS-NLP Lab, “LLM Benchmark Evaluation – Apertus-8B”
13. Stanford University, “Massive Multitask Language Understanding (MMLU) on HELM”
14. IBA, “The regulation of foundation models in the EU AI Act”
15. Reuters, “AI models with systemic risks given pointers on how to comply with EU AI rules”