Locales in Linux / Debian
I was updating one of my Debian-based servers recently and installing some new software using Debian's highly wonderful 'aptitude' utility.
I kept getting a warning message, and I was getting it an awful lot. Perl was complaining whilst being invoked by the Debian package management system. Here's the sort of output you'd get from perl:
    # perl
    perl: warning: Setting locale failed.
    perl: warning: Please check that your locale settings:
        LANGUAGE = (unset),
        LC_ALL = (unset),
        LANG = ...
        are supported and installed on your system.
    perl: warning: Falling back to the standard locale ("C").

Note: I've actually recreated this output after the fact, so I'm not sure if the LANG setting shown above was what I saw originally. Typing the command

    locale

would give a similar error:

    locale: Cannot set LC_CTYPE to default locale: No such file or directory
    locale: Cannot set LC_MESSAGES to default locale: No such file or directory
    locale: Cannot set LC_ALL to default locale: No such file or directory
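With hindsight, a quick way to confirm that the locale named in the environment simply isn't installed is to compare it against the output of `locale -a`. The following is a minimal sketch rather than anything Debian ships: `normalize_locale` and `locale_installed` are names I made up, and the normalisation rule (lower-case the codeset and strip punctuation, so UTF-8 becomes utf8) is my assumption about how glibc lists installed locales.

```shell
#!/bin/sh
# Hypothetical helper: rewrite a locale name the way glibc is assumed to
# normalise the codeset part, e.g. en_AU.UTF-8 -> en_AU.utf8.
normalize_locale() {
    case "$1" in
        *.*) ;;                                   # has a codeset, carry on
        *) printf '%s\n' "$1"; return 0 ;;        # nothing to normalise
    esac
    lang=${1%%.*}
    rest=${1#*.}
    case "$rest" in
        *@*) codeset=${rest%%@*}; modifier="@${rest#*@}" ;;
        *)   codeset=$rest; modifier= ;;
    esac
    # Drop everything that is not a letter or digit, then lower-case.
    codeset=$(printf '%s' "$codeset" | tr -cd '[:alnum:]' | tr '[:upper:]' '[:lower:]')
    printf '%s.%s%s\n' "$lang" "$codeset" "$modifier"
}

# Hypothetical helper: succeed if `locale -a` lists the name as given
# or in its normalised form.
locale_installed() {
    normalized=$(normalize_locale "$1")
    locale -a 2>/dev/null | grep -qxF -e "$1" -e "$normalized"
}

# Check whichever setting is in effect for this shell.
wanted=${LC_ALL:-${LANG:-POSIX}}
if locale_installed "$wanted"; then
    echo "$wanted is installed"
else
    echo "$wanted is NOT installed (or the locale command is missing)"
fi
```

If this reports the locale as not installed, the warnings above are exactly what you'd expect.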
It took a while to figure out the picture here. For a while I fumbled around with commands like locale and localedef and read the man pages for locale and setlocale, all the while descending further into a fog of confusion. It would be fair to say my mind grows weak at the thought of i18n, L10n and character encodings. But persist I did. Finally, by some stroke of luck that I can't remember, I came across the man page for

    locale-gen

The sun started shining again and things started to make sense between me and my Debian system once more.
So, what locale-gen says is this:
This manual page documents briefly the locale-gen command.

By default, the locale package which provides the base support for localisation of libc-based programs does not contain usable localisation files for every supported language. This limitation has became necessary because of the substantial size of such files and the large number of languages supported by libc. As a result, Debian uses a special mechanism where we prepare the actual localisation files on the target host and distribute only the templates for them.

locale-gen is a program that reads the file /etc/locale.gen and invokes localedef for the chosen localisation profiles. Run locale-gen after you have modified the /etc/locale.gen file.
That was certainly more promising. I had been fumbling around with localedef, in particular

    localedef --help

which prints the default paths used by localedef. These "default" paths were listed like this:

    locale path : /usr/lib/locale:/usr/share/i18n
It wasn't terribly clear what all this meant until I started reading up on locale-gen. My /usr/lib/locale was empty or had been emptied, and /usr/share/i18n was packed to the gills with every lang/encoding template-thingumy-jig you could poke a stick at.
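If you want to see the same picture on your own box, counting what is in each directory makes it obvious. This is just a sketch: `count_entries` is a name I invented, and the three paths are the ones from the localedef output and the locale.gen man page, which may differ on other distributions.

```shell
#!/bin/sh
# Hypothetical helper: print how many entries a directory holds,
# or "missing" if it does not exist.
count_entries() {
    if [ -d "$1" ]; then
        ls -A "$1" | wc -l | tr -d ' '
    else
        echo missing
    fi
}

# Compiled locales live in the first directory, templates in the other two.
for dir in /usr/lib/locale /usr/share/i18n/locales /usr/share/i18n/charmaps; do
    printf '%s: %s\n' "$dir" "$(count_entries "$dir")"
done
```

A zero (or "missing") next to /usr/lib/locale alongside large counts for the other two is the situation I was in.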
Next stop:

    man locale.gen

which said this:

The file /etc/locale.gen lists the locales that are to be generated by the locale-gen command. Each line is of the form:

    <locale> <charset>

where <locale> is one of the locales given in /usr/share/i18n/locales and <charset> is one of the character sets listed in /usr/share/i18n/charmaps.

The locale-gen command will generate all the locales, placing them in /usr/lib/locale.
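Going by that description, it's possible to sanity-check a locale.gen file before running locale-gen. The sketch below is my own reading of the man page, not part of Debian: `check_locale_gen` is a made-up name, I'm assuming that any codeset suffix on the locale (the part after a dot) is not part of the template file name, and that charmap files may be stored gzipped.

```shell
#!/bin/sh
# Hypothetical helper: report locale.gen entries whose template or charmap
# cannot be found. $1 = locale.gen-style file, $2 = i18n root directory.
check_locale_gen() {
    file=$1
    root=${2:-/usr/share/i18n}
    status=0
    while read -r locale charset rest; do
        case "$locale" in
            ''|'#'*) continue ;;          # skip blank lines and comments
        esac
        template=${locale%%.*}            # assumption: en_AU.UTF-8 -> en_AU
        if [ ! -e "$root/locales/$template" ]; then
            echo "unknown locale template: $template"
            status=1
        fi
        if [ ! -e "$root/charmaps/$charset" ] && [ ! -e "$root/charmaps/$charset.gz" ]; then
            echo "unknown charset: $charset"
            status=1
        fi
    done < "$file"
    return $status
}

# Only run against the real file if it exists.
if [ -r /etc/locale.gen ]; then
    check_locale_gen /etc/locale.gen || echo "some entries need attention"
fi
```

Silence means every uncommented line names a template and a charmap that exist under /usr/share/i18n.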
First thing, I had to create /etc/locale.gen, which I did with the following content:

    en_AU UTF-8
    en_AU ISO-8859-1

I'm using AU because I'm Australian. I've included ISO-8859-1 (also known as Latin-1) because when I first did this process I only included UTF-8, and I had funny characters and line break failures in man pages. It looks like my aging version of Debian had configured groff, or whatever it used for man pages, to output in Latin-1 and not UTF-8. (Note: when using the POSIX setting with no locales installed, described below, I didn't get this problem, or any of the above warnings for that matter.)
Then I ran

    locale-gen

which generated /usr/lib/locale/locale-archive. I'm not entirely sure this has solved the problem, but it seems my locale can now be set properly:

    # locale
    LANG=en_AU.UTF-8
    LC_CTYPE="en_AU.UTF-8"
    LC_NUMERIC="en_AU.UTF-8"
    LC_TIME="en_AU.UTF-8"
    LC_COLLATE="en_AU.UTF-8"
    LC_MONETARY="en_AU.UTF-8"
    LC_MESSAGES="en_AU.UTF-8"
    LC_PAPER="en_AU.UTF-8"
    LC_NAME="en_AU.UTF-8"
    LC_ADDRESS="en_AU.UTF-8"
    LC_TELEPHONE="en_AU.UTF-8"
    LC_MEASUREMENT="en_AU.UTF-8"
    LC_IDENTIFICATION="en_AU.UTF-8"
    LC_ALL=
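One way to be a bit more sure is to look only at what the locale command writes to its error stream, since the "Cannot set LC_..." complaints go there. A rough sketch, with `locale_output_is_clean` being a name of my own invention and the matched text taken from the error messages quoted earlier:

```shell
#!/bin/sh
# Hypothetical helper: succeed when the text on stdin contains none of the
# "Cannot set LC_..." complaints.
locale_output_is_clean() {
    ! grep -q 'Cannot set LC_'
}

if command -v locale >/dev/null 2>&1; then
    # Send stderr down the pipe and throw the normal output away.
    if locale 2>&1 >/dev/null | locale_output_is_clean; then
        echo "locale reports no errors"
    else
        echo "locale still cannot set some categories"
    fi
fi
```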
It seems that the locale error messages could have been removed by a much shorter route:

    export LC_ALL=POSIX

I also noticed that unsetting a variable such as LANG, which might have been set to a locale not installed in /usr/lib/locale, also cleared up the error messages:

    unset LANG

In both of the above cases, "POSIX" is the value used for all LC and LANG settings. Despite this, it still seems to make sense to me to install both the UTF-8 and Latin-1 locales explicitly.
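The reason both shortcuts work comes down to lookup order. As I understand it, LC_ALL wins if it is set, then the individual LC_* variable, then LANG, and if none of them is set you land on POSIX. The function below is only an illustration of that order (`effective_locale` is a made-up name); it looks at environment variables and nothing else, so it says nothing about whether the resulting locale is actually installed.

```shell
#!/bin/sh
# Hypothetical helper: print the locale name that would apply to one
# category, following LC_ALL > LC_<category> > LANG > POSIX.
effective_locale() {
    category=$1
    # Indirect lookup of the variable named by $category (e.g. LC_CTYPE).
    eval "category_value=\${$category:-}"
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "$category_value" ]; then
        printf '%s\n' "$category_value"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        printf '%s\n' POSIX
    fi
}

for c in LC_CTYPE LC_MESSAGES; do
    printf '%s -> %s\n' "$c" "$(effective_locale "$c")"
done
```

So `export LC_ALL=POSIX` short-circuits everything at the first step, while `unset LANG` (with nothing else set) falls through to the last one.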
By the way, if you ever have annoying control characters in manpages and don't have the time to straighten the system out, try this:
man some_man_page | col -b | less