Sunday, April 19, 2009

My struggle with the Japanese keyboard

Until now I had two "independent" language settings in my Debian system: a default of pt-BR (using the iso-8859-1 encoding) and, when launching a kterm terminal, all locales set to ja-JP (for using the iso-2022-jp and euc-JP encodings). For typing in Portuguese (or Spanish, French, etc.) I had a 'compose key', AKA 'Multi_key', such that "ç", for instance, could be entered as the compose key followed by <,> + <c>. This is in contrast to the (more common?) 'dead key' approach, where some keys (like <'> and <~>) wait for the next key before they get printed. To write in Japanese I used the kinput2 program, which was activated when I pressed the 'Kanji' key (to write in Japanese/Chinese/Korean in general you need a program that interprets what you type and maps the keystrokes to compatible symbols). I mapped 'Kanji' and 'Multi_key' to unused keys on my keyboard (like the Windows menu keys...) using xmodmap.

This inconvenience exists because most e-mail in Japan follows iso-2022-jp and documents are in euc-JP. I avoid creating text files (including code and LaTeX) that don't fit into a single-byte encoding, but I may have some other messy files myself. Recently I was borged (migrated) to Google Mail, bought a new machine, upgraded KDE, and started using Mac OS X (which uses UTF-8). I thought that this was a good opportunity to change my Debian to UTF-8 as well: one friend told me that he had no problems with Japanese/French on Ubuntu. So I'll describe here the many ways in which I failed.

My first problem came when using KDE to set the keyboard layout (System Settings -> Regional & Language -> Keyboard Layout): my jp-106 keyboard was configured just fine by the command setxkbmap, but I wasn't able to customize other keys with xmodmap. The problem was that xmodmap was being called before startkde in .xsession, and setxkbmap (called from startkde) overwrote the xmodmap settings. There are surely many ways of solving it, but what I did was to use the setxkbmap settings to create a customized xmodmap configuration (xmodmap -pke >> ~/.xmodmap to dump the keycodes). This '.xmodmap' file can then be modified (I simply mapped the esoteric 'Henkan_Mode' key to 'Multi_key') and the KDE keyboard layout unchecked, so that setxkbmap is no longer loaded. With this I have a working "european" keyboard.
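The remapping step can be sketched as a small filter over the keycode dump. This is only an illustration: the function name remap_henkan_to_compose is made up here, and it assumes the dump has the usual "keycode N = keysym ..." lines that xmodmap -pke prints.

```shell
# Hypothetical helper (not from the original setup): read an `xmodmap -pke`
# dump on stdin and rewrite every line whose first keysym is Henkan_Mode so
# that the key produces Multi_key instead. Other lines pass through untouched.
remap_henkan_to_compose() {
    sed -E 's/^(keycode +[0-9]+ =) Henkan_Mode.*/\1 Multi_key/'
}

# Intended use inside a running X session (needs an X display):
#   xmodmap -pke | remap_henkan_to_compose > ~/.xmodmap
#   xmodmap ~/.xmodmap
```

Editing the dump by hand works just as well; the filter only saves looking up which keycode carries 'Henkan_Mode' on a given machine.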

Now, the Japanese input. I migrated from kinput2 to scim (skim in KDE, to be more precise) since it allows you to internationalize your keyboard "on the fly". I have two Japanese dictionaries for scim, scim-anthy and scim-canna (with kinput2 I used the canna dictionary), and my '.xsession' file is as follows:

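#!/bin/bash --login

# Load the customized key map (Multi_key and friends)
xmodmap /home/leo/.xmodmaprc

# Point XIM, GTK and Qt applications at scim
export XMODIFIERS="@im=SCIM"
export GTK_IM_MODULE="scim"
export QT_IM_MODULE="scim"
# Input method server settings, in the style of Debian's im-switch scripts
export XIM_PROGRAM="scim"
export XIM="SCIM"
export XIM_ARGS="-d"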


In the skim settings I defined one key ('Zenkaku_Hankaku') to toggle scim input on/off. Some applications like iceweasel and gnome-terminal (GTK apps) worked without problems for Japanese and "european" input: I use 'Multi_key' to type european accents and scim to switch to Japanese mode. But other applications like konsole and openoffice (Qt-based apps) were unable to recognize the 'Multi_key' signal. The fix here was to check Other -> English/European in the skim setup, even though I don't need to activate scim to type european accents.

I ran into other troubles, like trying to set up hotkeys for the different dictionaries (to activate scim and jump to a specific dictionary in one stroke): for some mystical reason I could not revert the changes, and the situation became worse when I mistakenly associated 'Multi_key' with the european dictionary. The solution to this was to remove the lines corresponding to the hotkey definitions from the file '~/.kde/share/config/skimrc'.

I omitted the details of installation (like running im-switch and scim-setup), but I believe my testimony covers some common mistakes. I may update this post later if I remember anything else, or if I find out that my problems were just beginning...

Update 2009.08.06

When installing Debian GNU/Linux on a new machine I had the same problem with Qt-based applications (like konsole, skype and openoffice), where special characters (like "ç" or "日本") would appear as garbled ASCII characters. Even with the skim setup fix mentioned above, I had no luck. I got it working after the following steps:

  • updated the xinput method alternatives through the command im-switch -z all_ALL -c (and by choosing skim within the update-alternatives command). I believe that the command im-switch -z all_ALL -s skim would have the same effect;
  • told skim to explicitly support the locales ja_JP.UTF-8 and pt_BR.UTF-8 (besides my global en_US.UTF-8), by going to the skim configuration option General SCIM->Other->Advanced->Support Unicode Locales (Edit). This is equivalent to adding the line /SupportedUnicodeLocales = en_US.UTF-8, ja_JP.UTF-8, pt_BR.UTF-8 to the file ~/.scim/global.
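For those who prefer to edit ~/.scim/global directly, the second step might look like the sketch below. The function name set_scim_locales is invented for this example, and it assumes the file is a plain list of "/Key = value" lines, as in the line quoted above; better to run it while scim is not running, in case scim rewrites the file on exit.

```shell
# Hypothetical helper: make sure the given scim config file contains exactly
# one /SupportedUnicodeLocales line, set to the given locale list.
#   $1 = path to the config file (e.g. ~/.scim/global)
#   $2 = locale list, written exactly as it should appear in the file
set_scim_locales() {
    file=$1
    locales=$2
    mkdir -p "$(dirname "$file")"
    touch "$file"
    # Keep every other setting, drop any previous locale line...
    grep -v '^/SupportedUnicodeLocales' "$file" > "$file.tmp" || true
    # ...and append the new one.
    printf '/SupportedUnicodeLocales = %s\n' "$locales" >> "$file.tmp"
    mv "$file.tmp" "$file"
}

# Intended use:
#   set_scim_locales ~/.scim/global "en_US.UTF-8, ja_JP.UTF-8, pt_BR.UTF-8"
```

Running it twice leaves a single line in the file, so it is safe to keep in a setup script.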

Update 2011.01.16:

It seems that skim/scim development has been discontinued. The new "standard" input method is ibus, and the setup is fairly similar to scim:
  • in the .xsession file replace "scim" with "ibus";
  • be sure to install "ibus-gtk" and "ibus-qt4";
  • run the daemon when the X session starts as "/usr/bin/ibus-daemon -d -x" (you can put it in ~/.kde/Autostart/ or within ~/.xsession)
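Put together, the ibus part of an .xsession could look roughly like this. It is a sketch rather than my exact file: the two guards (checking that ibus-daemon is installed and that an X display is available) are additions made for this example, and the XIM_* variables from the scim version are left out.

```shell
# Point XIM, GTK and Qt applications at ibus instead of scim
export XMODIFIERS="@im=ibus"
export GTK_IM_MODULE="ibus"
export QT_IM_MODULE="ibus"

# Start the daemon in the background (-d) with its XIM server (-x),
# but only when it is installed and an X display is available
# (both checks are assumptions added for this sketch).
if command -v ibus-daemon >/dev/null 2>&1 && [ -n "$DISPLAY" ]; then
    ibus-daemon -d -x
fi
```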
