Discussion:
UTF-8 locale support
Dima Veselov
2014-08-14 23:44:17 UTC
Hello!

My servers store many files with Cyrillic names. Under heavy user load
it is hard to keep files in old encodings when the outside world already
lives in UTF-8. Storing file names in anything other than UTF-8 causes some
problems with Samba and fatal problems with Linux and NFS, which do no
conversion at all.

So I feel it is time to move to UTF-8 on NetBSD too. NetBSD 6 seems to
have ru_RU.UTF-8 support, but it is still not complete.

A freshly installed 6.1.4 can store files with UTF-8 names. It can also
share them via SMB or NFS, but I can't make it work in the shell.

As far as I can see, it supports only LC_CTYPE and LC_MESSAGES, via
locale.alias, and has no native ru_RU.UTF-8 support.

My Linux rxvt-unicode terminal (which works as expected locally), connected
over ssh to the NetBSD box, shows:

[***@gloria ~]$ locale
LANG="ru_RU.UTF-8"
LC_CTYPE="ru_RU.UTF-8"
LC_COLLATE="C"
LC_TIME="C"
LC_NUMERIC="C"
LC_MONETARY="C"
LC_MESSAGES="ru_RU.UTF-8"
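
To make the listing above easier to read at a glance, here is a minimal Python sketch that parses `locale` output and lists the categories still set to "C". The function name `parse_locale_output` and the `sample` string are hypothetical and only for illustration. The sample reuses the values shown above; it does not query a real system.

```python
def parse_locale_output(text: str) -> dict:
    """Turn lines like LC_CTYPE="ru_RU.UTF-8" into a {name: value} dict."""
    settings = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if not sep:
            continue  # skip lines that are not NAME=value
        # Strip surrounding double quotes, tolerating a missing closing quote.
        settings[name.strip()] = value.strip().strip('"')
    return settings


# Values copied from the listing above (hypothetical variable name).
sample = '''LANG="ru_RU.UTF-8"
LC_CTYPE="ru_RU.UTF-8"
LC_COLLATE="C"
LC_TIME="C"
LC_NUMERIC="C"
LC_MONETARY="C"
LC_MESSAGES="ru_RU.UTF-8"
'''

settings = parse_locale_output(sample)
# Categories that are not using the Russian UTF-8 locale.
fallback = sorted(name for name, value in settings.items() if value in ("C", "POSIX"))
print(fallback)
```

In this listing only LANG, LC_CTYPE and LC_MESSAGES carry ru_RU.UTF-8; collation, time, numeric and monetary formatting stay in the "C" locale.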

As a result, Cyrillic filenames are displayed correctly, but I cannot access
them, because the shell prints backslash-escaped byte codes (e.g. \:\262\321\320)
instead of letters. Bash is 4.3.0(1), out of the box. (By the way,
https://wiki.netbsd.org/unicode/ says it will work out of the box.)
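
The backslash sequences quoted above have the shape of three-digit octal byte escapes, which is what raw UTF-8 bytes look like when a program does not treat them as characters. A minimal Python sketch of that relationship follows. The helper name `octal_escapes` and the sample name "тест" are chosen for illustration, and the snippet does not reproduce what bash itself does internally.

```python
def octal_escapes(text: str, encoding: str = "utf-8") -> str:
    """Render every byte of the encoded text as a backslash octal escape."""
    return "".join(f"\\{byte:03o}" for byte in text.encode(encoding))


name = "тест"  # sample Cyrillic filename, assumed for illustration
raw = name.encode("utf-8")

# Each Cyrillic letter here occupies two bytes in UTF-8.
print(len(name), len(raw))   # 4 8
print(octal_escapes(name))   # \321\202\320\265\321\201\321\202
```

Lead bytes such as \320 and \321 followed by a second byte in the \200-\277 range are typical of Cyrillic letters encoded in UTF-8.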

Two questions on that:

Am I right that aliasing ru_RU.UTF-8 to en_US.UTF-8 is what makes this so bad?

If I am right, what should I do to complete the ru_RU.UTF-8 locale so that
I have no problems typing Cyrillic filenames?
--
Sincerely yours
Sergey Hromov
2014-08-15 08:05:08 UTC
On Fri, 15 Aug 2014 03:44:17 +0400
Post by Dima Veselov
My linux rxvt-unicode terminal (working locally as expected) with ssh
LANG="ru_RU.UTF-8"
LC_CTYPE="ru_RU.UTF-8"
LC_COLLATE="C"
LC_TIME="C"
LC_NUMERIC="C"
LC_MONETARY="C"
LC_MESSAGES="ru_RU.UTF-8"
As a result, Cyrillic filenames are displayed correctly, but I cannot
access them, because the shell prints backslash-escaped byte codes
(e.g. \:\262\321\320) instead of letters. Bash is 4.3.0(1), out of the box.
(By the way, https://wiki.netbsd.org/unicode/ says it will work out of the box.)
Am I right that aliasing ru_RU.UTF-8 to en_US.UTF-8 is what makes this so bad?
If I am right, what should I do to complete the ru_RU.UTF-8 locale so
that I have no problems typing Cyrillic filenames?
I haven't tried rxvt, but with xfce4-terminal I can work with files on a
NetBSD file server over ssh (with LC_ALL=ru_RU.utf-8 and ksh as the
user's shell on it). This combination still has some issues, though.
For example, if I run

$ touch тест

$ mv тест тес2     (here I used Tab for autocompletion and
Backspace to delete the last letter "т")

$ ls
тес?2

It looks like Backspace can't handle two-byte encoded characters.
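
The behaviour described above is consistent with a line editor that deletes one byte per Backspace instead of one character. The Python sketch below illustrates that suspected cause. The helpers `byte_backspace` and `char_backspace` are hypothetical names, and this is not an analysis of the actual ksh line-editing code.

```python
def byte_backspace(buf: bytes) -> bytes:
    """Naive backspace: drop exactly one byte."""
    return buf[:-1]


def char_backspace(buf: bytes) -> bytes:
    """UTF-8 aware backspace: drop one whole character.

    Skips trailing continuation bytes (bit pattern 10xxxxxx) and then
    removes the lead byte in front of them.
    """
    end = len(buf)
    while end > 0 and (buf[end - 1] & 0xC0) == 0x80:
        end -= 1
    return buf[:max(end - 1, 0)]


line = "тест".encode("utf-8")           # the completed filename, 8 bytes

broken = byte_backspace(line) + b"2"     # a stray lead byte 0xD1 is left behind
fixed = char_backspace(line) + b"2"      # the whole letter is removed

# The stray byte cannot be decoded, so it shows up as a replacement mark.
print(broken.decode("utf-8", errors="replace"))
print(fixed.decode("utf-8"))
```

The first result contains an undecodable byte between "тес" and "2", which matches the "тес?2" shown by ls; the second is the intended "тес2".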
