Luca's meaningless thoughts   SponsorGitHub SponsorsLiberapayPaypalBuy Me A CoffeePatreonFlattr

The LANGUAGE variable is broken for English as main language

by Leandro Lucarella on 2020- 11- 18 11:38 (updated on 2020- 11- 18 11:38)
tagged en, gettext, lang, language, linux - with 0 comment(s)

The LANGUAGE environment variable can accept multiple fallback languages (at least if your commands are using gettext), so if your main LANG is, say, es, but you also speak fr, then you can use LANGUAGE=es:fr.

But what happens when you main LANG is en, so for example your LANGUAGE looks like en:es:de? You'll notice some message that used to be in perfect English before using the multi-language fallback now seem to be shown randomly in es or de.

Well, it is not random. The thing is, since English tends to be the de-facto language for the original strings in a program, it looks like almost nobody provides an en translation, so when fallback is active, almost no programs will show messages in English.

For example, this is my Debian testing system with roughly 3.5K packages installed:

$ dpkg -l |wc -l
3522
$ ls /usr/share/locale/en/LC_MESSAGES/ | wc -l
12

Only 12 packages have a plain English locale. en_GB does a bit better:

$ ls /usr/share/locale/en_GB/LC_MESSAGES/ | wc -l
732

732 packages. This is still lower than both en and de:

$ ls /usr/share/locale/es/LC_MESSAGES/ | wc -l
821
$ ls /usr/share/locale/de/LC_MESSAGES/ | wc -l
820

The weird thing is packages as basic as psmisc (providing, for example, killall) and coreutils (providing, for example, ls) don't have an en locale, and psmisc doesn't provide es. This is why at some point it seemed like a random locale was being used. I had something like LANGUAGE=en_GB:en_US:en:es:de and I use KDE as my desktop environment. KDE seems to be correctly translated to en_GB, so I was seeing most of my desktop in English as expected, but when using killall, I got errors in German, and when using ls, I got errors in Spanish.

If you don't provide other fallback languages, gettext will automatically fall back to the C locale, which is the original strings embedded in the source code, which are usually in English, and this is why if you don't provide fallback languages (other than English at least), all will work in English as expected. Of course if you use C in your fallback languages, before any non-English language, then they will be ignored as the C locale should always be present, so that's not an option.

I find it very curious that this issue has almost zero visibility. At least my searches for the issue didn't throw any useful results. I had to figure it all out by myself like in the good old pre-stackoverflow times...

Note

I know is not a typical use case, as since almost all software use English for the C locale it hardly makes any sense to use fallback languages in practice if your main language is English. But theoretically it could happen, and providing an en translation is trivial.