Will I have to ban my team from using ChatGPT?

Quite the opposite. Moviwa exists so that you don't have to ban anything. Your employees keep using ChatGPT, Copilot, or any other generative AI as usual. Moviwa acts as an invisible proxy that filters sensitive data before it leaves your company.

Is it hard to install?

Not at all. Moviwa is an invisible layer that deploys in minutes. It requires no changes to your existing infrastructure and no special training for employees. It works from day one.

Will my employees feel watched?

No. Moviwa automatically detects patterns of sensitive data and masks them before they reach the AI. It does not monitor conversations or log the content of queries. Your team works freely and you sleep soundly.

Do you comply with the EU AI Act?

Yes. Moviwa is designed specifically to help companies comply with the EU AI Act and the GDPR. We provide traceability, data control, and the guarantees that regulators require.

From €15/employee/month. The price varies with the number of users and the features you need. Contact us for a personalized quote.

ALIA's Tokenizer: Why Spanish, Catalan, Basque, and Galician Cost Almost Half the Tokens of Llama 3

May 4, 2026 · erobles · Artificial Intelligence

What you don't see when using an LLM: it doesn't read your text, it reads tokens

When you ask ChatGPT, Claude, or your local LLM something, the model never sees the sentence you wrote. The first thing that happens is that a component called the tokenizer splits your text into pieces — tokens — and the model only receives numbers. All inference (and all cost) operates on those tokens, not on words.

The consequence, which almost nobody discusses publicly: two models can consume very different numbers of tokens to process exactly the same text. And models trained with English in mind are much less efficient with other languages.

A few weeks ago we published ALIA quantized to NVFP4 so anyone could run it on an NVIDIA DGX Spark. While researching that, we discovered something that deserves its own article: ALIA's tokenizer is radically different from those of Llama 3, Mistral, or GPT, and for Iberian languages it's between 1.7 and 1.9 times more efficient than Llama 3's in our tests.

Since a paragraph won't convince anyone, we built a tool so you can see for yourself:

🌐 Try it live: labs.montevive.ai/alia-tokenizer-comparison/ — live comparison between ALIA, Llama 3, and Mistral tokenizers. Paste any text and watch in real time how each model chunks it. Runs 100% in your browser: no text leaves your device.

🎥 Video demo (3 min): a visual tour of the demo with examples in Spanish, Catalan, Basque, and Galician.

The numbers, in a table

We took an administrative paragraph in each of the four Iberian languages and ran it through all three tokenizers. The results are striking:

Language (text type)	ALIA (tokens)	Llama 3 (tokens)	Mistral (tokens)	ALIA vs Llama 3
Spanish (legal-administrative text)	31	53	67	1.71× more efficient
Catalan (institutional text)	34	62	78	1.82× more efficient
Basque (public services)	42	81	102	1.93× more efficient
Galician (local administration)	38	65	80	1.71× more efficient
English (equivalent text)	47	41	50	0.87× (Llama 3 wins)
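The last column is simply the Llama 3 count divided by the ALIA count. A minimal sketch that recomputes it from the table, and adds the same ratio against Mistral, could look like this (the function and variable names are ours, chosen for illustration):

```python
# Token counts copied from the table above: (ALIA, Llama 3, Mistral).
TOKEN_COUNTS = {
    "Spanish": (31, 53, 67),
    "Catalan": (34, 62, 78),
    "Basque": (42, 81, 102),
    "Galician": (38, 65, 80),
    "English": (47, 41, 50),
}


def efficiency_ratio(alia_tokens: int, other_tokens: int) -> float:
    """How many times more tokens the other tokenizer needs than ALIA."""
    return other_tokens / alia_tokens


for language, (alia, llama3, mistral) in TOKEN_COUNTS.items():
    print(
        f"{language:9s} vs Llama 3: {efficiency_ratio(alia, llama3):.2f}x  "
        f"vs Mistral: {efficiency_ratio(alia, mistral):.2f}x"
    )
```

A ratio above 1 means ALIA needs fewer tokens. A ratio below 1, as in the English row, means the other tokenizer is the more compact one.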

The difference is most noticeable in administrative and territory-specific vocabulary — exactly the use case that matters most to a Spanish public administration:

Generalitat → 1 token in ALIA, 3 tokens in Llama 3 (Gen, eral, itat)
ayuntamiento → 1 token in ALIA, 3 tokens in Llama 3 (ay, untami, ento)
Cataluña → 1 token in ALIA, 3 tokens in Llama 3
Euskadi → 1 token in ALIA, 4 tokens in Llama 3
Xunta → 1 token in ALIA, 2 tokens in Llama 3
concejalía → 1 token in ALIA, 4 tokens in Llama 3

ALIA encodes each of these words as a single atomic unit. Llama 3 breaks them into fragments that carry no meaning on their own.
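To see why vocabulary contents decide how a word is split, here is a toy sketch. It uses greedy longest-match splitting over two tiny, hypothetical vocabularies. It is not the BPE or SentencePiece algorithm that the real tokenizers use, and the vocabularies below contain only the pieces quoted in the list above.

```python
def greedy_tokenize(word: str, vocab: set[str]) -> list[str]:
    """Split `word` by repeatedly taking the longest vocabulary piece that matches.

    Falls back to a single character when no piece matches, so it always terminates.
    """
    pieces = []
    start = 0
    while start < len(word):
        # Try the longest candidate first, then shrink it one character at a time.
        for end in range(len(word), start, -1):
            if word[start:end] in vocab:
                break
        else:
            end = start + 1  # unknown character: emit it on its own
        pieces.append(word[start:end])
        start = end
    return pieces


# Hypothetical vocabularies, for illustration only.
IBERIAN_VOCAB = {"ayuntamiento", "Generalitat"}
ENGLISH_CENTRIC_VOCAB = {"ay", "untami", "ento", "Gen", "eral", "itat"}

print(greedy_tokenize("ayuntamiento", IBERIAN_VOCAB))          # ['ayuntamiento']
print(greedy_tokenize("ayuntamiento", ENGLISH_CENTRIC_VOCAB))  # ['ay', 'untami', 'ento']
print(greedy_tokenize("Generalitat", ENGLISH_CENTRIC_VOCAB))   # ['Gen', 'eral', 'itat']
```

The splitting rule is identical in both cases. Only the vocabulary changes, and that alone turns one token into three.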

Why does this matter? Four concrete reasons

1. Cost

The price of any LLM API is measured in tokens, not words. If your RAG processes 10 million words per month in Spanish, with an English-centric tokenizer you pay roughly 70-90% more than you would with a tokenizer like ALIA's (the 1.7× to 1.9× ratios in the table above). The difference compounds: prompt + retrieved context + response, everything counts.

For a RAG system over the BOE (Spanish Official Gazette), municipal files, or healthcare documentation, where a single prompt can carry several thousand words of context, that overhead applies to every request, so the annual bill can come close to doubling.
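As a rough illustration of the arithmetic, the sketch below applies the same per-token price to both tokenizers so that only the token count differs. The price and the baseline tokens-per-word figure are hypothetical placeholders, not real tariffs. Only the 10 million words per month and the 53 vs 31 token counts (Spanish row of the table) come from this article.

```python
def monthly_token_cost(
    words_per_month: int, tokens_per_word: float, price_per_million_tokens: float
) -> float:
    """Monthly bill for a workload, given how many tokens each word costs."""
    tokens = words_per_month * tokens_per_word
    return tokens / 1_000_000 * price_per_million_tokens


WORDS_PER_MONTH = 10_000_000      # workload mentioned in the article
BASE_TOKENS_PER_WORD = 1.0        # hypothetical baseline for the efficient tokenizer
PRICE_PER_MILLION_TOKENS = 2.0    # hypothetical price in EUR, same for both cases
SPANISH_RATIO = 53 / 31           # Llama 3 tokens / ALIA tokens, Spanish row

efficient = monthly_token_cost(
    WORDS_PER_MONTH, BASE_TOKENS_PER_WORD, PRICE_PER_MILLION_TOKENS
)
english_centric = monthly_token_cost(
    WORDS_PER_MONTH, BASE_TOKENS_PER_WORD * SPANISH_RATIO, PRICE_PER_MILLION_TOKENS
)
overhead_pct = (english_centric / efficient - 1) * 100

print(f"Efficient tokenizer:       {efficient:.2f} EUR/month")
print(f"English-centric tokenizer: {english_centric:.2f} EUR/month")
print(f"Overhead:                  {overhead_pct:.0f}%")
```

Whatever price you plug in, the relative overhead depends only on the token ratio, which is why the percentage holds across providers.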

2. Speed

An LLM's generation speed is measured in tokens per second, not words per second. If your model generates at 50 tok/s and your tokenizer needs 1.7× more tokens for the same paragraph, your user sees the text appear about 1.7× more slowly. For a conversational assistant in Spanish, that factor is the difference between "instant" and "noticeable latency."
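A minimal sketch of that effect, using the Spanish row of the table (31 vs 53 tokens) and assuming, for simplicity, that both models generate at the same 50 tok/s. In practice throughput also depends on model size and hardware.

```python
def generation_seconds(token_count: int, tokens_per_second: float) -> float:
    """Time needed to generate `token_count` tokens at a fixed throughput."""
    return token_count / tokens_per_second


TOKENS_PER_SECOND = 50.0  # figure used in the article; assumed equal for both models

alia_seconds = generation_seconds(31, TOKENS_PER_SECOND)    # Spanish paragraph, ALIA
llama_seconds = generation_seconds(53, TOKENS_PER_SECOND)   # same paragraph, Llama 3

print(f"ALIA:    {alia_seconds:.2f} s")
print(f"Llama 3: {llama_seconds:.2f} s")
print(f"Slowdown: {llama_seconds / alia_seconds:.2f}x")
```

For a single short paragraph the gap is well under a second, but it scales linearly with response length.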

3. Context window

All LLMs have a context limit measured in tokens. With an efficient tokenizer, the same window fits roughly 1.7 times as much Spanish text:

Llama 3 with 8,192 tokens of context ≈ 6,000 words of Spanish
ALIA with 8,192 tokens of context ≈ 10,000 words of Spanish

For RAG over long documents (court rulings, administrative files, medical records) that's the difference between "I have to split the document into five pieces" and "it fits whole."
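To make the chunking consequence concrete, here is a small sketch based on the article's approximate figures (6,000 and 10,000 Spanish words per 8,192-token window). It ignores the tokens you would reserve for the prompt and the response, so real capacity is somewhat lower. The helper names are ours.

```python
import math

CONTEXT_WINDOW_TOKENS = 8192
# Approximate Spanish words that fit in the window, as stated in the article.
WORDS_PER_WINDOW = {"Llama 3": 6_000, "ALIA": 10_000}


def tokens_per_word(window_tokens: int, words_that_fit: int) -> float:
    """Average tokens per word implied by a window size and its word capacity."""
    return window_tokens / words_that_fit


def chunks_needed(document_words: int, words_per_window: int) -> int:
    """Number of pieces a document must be split into to fit the window."""
    return math.ceil(document_words / words_per_window)


DOCUMENT_WORDS = 10_000  # hypothetical court ruling or administrative file

for model, capacity in WORDS_PER_WINDOW.items():
    print(
        f"{model}: ~{tokens_per_word(CONTEXT_WINDOW_TOKENS, capacity):.2f} tokens/word, "
        f"{chunks_needed(DOCUMENT_WORDS, capacity)} chunk(s) for {DOCUMENT_WORDS} words"
    )
```

With these figures the hypothetical 10,000-word document fits whole in one case and has to be split in the other.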

4. Quality

This is the most subtle but perhaps most important consequence. When a model chunks ayuntamiento into ay + untami + ento, its attention mechanism has to reconstruct the meaning from individually meaningless fragments. Each token "sees" the rest of the sentence worse, and the model spends capacity re-assembling words before it can even begin reasoning about them.

When ALIA sees ayuntamiento as a single token, that token already carries the complete semantics of the word from the start. The quality of Spanish responses improves, not because the model is better in the abstract, but because the input is cleaner.

Why ALIA is like this: a vocabulary built from scratch

Most current open-source models inherit their tokenizer from Llama (128,000 tokens, tiktoken-style BPE) or from Mistral (32,000 tokens, SentencePiece). Those vocabularies were trained on English-dominated corpora. Words in Spanish, Catalan, Basque, or Galician weren't sufficiently represented to achieve efficient encoding, so they appear fragmented.

ALIA does something different: the Barcelona Supercomputing Center team trained a SentencePiece tokenizer from scratch on a multilingual Iberian corpus, with a vocabulary of 256,000 tokens — double that of Llama 3 and eight times that of Mistral. That extra size is spent on pieces useful for Iberian languages: institution names, legal-administrative vocabulary, Catalan verbal morphology, Basque suffixes, Galician roots.

The result is what the demo shows: each "natural" word from Iberian administration is encoded as a single piece, not as a puzzle of fragments.

What this means for your project

If you work with text in Spanish, Catalan, Basque, or Galician, and especially if you work with Spanish administrative, legal, or sector-specific text, using an LLM with an English-centric tokenizer is a silent efficiency loss that shows up in billing, latency, and response quality.

ALIA in NVFP4 gives you both things at once:

The Iberian tokenizer: ~1.7× fewer tokens for the same text in Iberian languages
A 40B-parameter model trained on data representative of Iberian culture, now executable and adaptable on a €4,000 NVIDIA DGX Spark thanks to NVFP4 quantization

If you want to understand how to deploy and adapt ALIA to your organization's domain, we cover it in this other article.

Try it right now

The demo is at labs.montevive.ai/alia-tokenizer-comparison/. Paste a paragraph from your local official gazette, a Constitutional Court ruling, a company circular. You'll see all three tokenizers working in parallel, with token counts, efficiency ratios, and colored chips for each piece.

And it all runs 100% in your browser, with transformers.js: no text you paste leaves your device, not even for tokenization. Consistent with how we build things at Montevive.
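If you would rather reproduce the comparison offline, a minimal Python sketch might look like the following. It is not the demo's code (the demo uses transformers.js). It only assumes tokenizer objects with an `encode(text, add_special_tokens=False)` method, which Hugging Face `transformers` tokenizers provide. The model IDs in the commented usage are assumptions to check against the hub, and some repositories are gated.

```python
from typing import Mapping, Protocol


class SupportsEncode(Protocol):
    """Anything with a Hugging Face-style `encode` method."""

    def encode(self, text: str, add_special_tokens: bool = ...) -> list[int]: ...


def compare_token_counts(
    text: str, tokenizers: Mapping[str, SupportsEncode]
) -> dict[str, int]:
    """Return the number of tokens each tokenizer produces for `text`.

    Special tokens (BOS/EOS) are excluded so only the text itself is counted.
    """
    return {
        name: len(tokenizer.encode(text, add_special_tokens=False))
        for name, tokenizer in tokenizers.items()
    }


# Hypothetical usage with Hugging Face `transformers` (model IDs are assumptions;
# gated repositories need an access token):
#
#   from transformers import AutoTokenizer
#   tokenizers = {
#       "ALIA": AutoTokenizer.from_pretrained("BSC-LT/ALIA-40b"),
#       "Llama 3": AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B"),
#   }
#   print(compare_token_counts("El ayuntamiento publicó la convocatoria.", tokenizers))
```

Tokenization then runs locally once the tokenizer files are downloaded, so the text you compare stays on your machine.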

More demos in the lab

This is the second demo at labs.montevive.ai. The first, also local-first and private, detects personally identifiable information (PII) in the text you paste, without servers and without sending data anywhere:

Privacy Filter local — PII detector in the browser, based on a multilingual NER model running with WebGPU
ALIA Tokenizer Comparison — the one from this article

Same principle in both cases: what can be done locally without losing quality, should be done locally.

About ALIA: collaboration between BSC and Spanish research centers

ALIA is coordinated by the Barcelona Supercomputing Center (BSC-CNS), under the leadership of the Secretary of State for Digitalization and Artificial Intelligence (SEDIA) and driven by the Government of Spain.

The project builds upon ILENIA (Impulse of Languages in Artificial Intelligence), a consortium that integrates research centers specialized in language technologies for each co-official language:

Participating centers

BSC-CNS (Barcelona Supercomputing Center): general coordinator and responsible for the AINA project for Catalan
HiTZ (Basque Center for Language Technology) - University of the Basque Country: GAITU project for Basque
CiTIUS (Research Center for Intelligent Technologies) and ILG (Galician Language Institute) - University of Santiago de Compostela: NÓS project for Galician
CeAtic (Center for Advanced Studies in ICT) - University of Jaén: ALIA technology transfer in Andalusia
CENID (Digital Intelligence Center) - University of Alicante: VIVES project for Valencian

Funding: Recovery, Transformation and Resilience Plan (NextGeneration EU), EuroHPC Joint Undertaking (European supercomputing consortium) and regional governments.

ALIA represents the convergence of these previous efforts into a unique, multilingual and sovereign infrastructure, trained on MareNostrum 5 (BSC, Barcelona).

Want to adapt ALIA to your organization?

At Montevive we help public administrations, regulated companies, and cooperatives deploy generative AI within their own infrastructure, keeping data in-house:

Domain fine-tuning on ALIA, adapted to your terminology (legal, healthcare, sector-specific)
Deployment on NVIDIA DGX Spark, Blackwell servers, or private cloud
Integration with your existing systems: APIs, RAG, specialized agents

📧 Contact: info@montevive.ai
🌐 More information: montevive.ai

ALIA belongs to everyone. And its tokenizer, what's more, was built for us.
