HOWTO display Chinese in Creative ZenMicro using Amarok


Creative is less creative when it comes to software, which is why so many users ask how to display Chinese characters on a Creative ZenMicro with either Windows Media Player or Creative MediaSource. But the company builds really solid hardware, and I could not resist the temptation of a refurbished ZenMicro RED 5G for $49 with free shipping from Outpost. So I got another one and found that the same annoyance shows up in Amarok 1.4.6. Amarok is open source, which leaves the door open for me to dig into the problem.

According to the solution for Windows,

  1. Go to your Control Panel
  2. Select Regional and Language Options
  3. Go to the Advanced tab
  4. Find the "Language for non-Unicode programs" setting
  5. Select Chinese (PRC)
  6. Restart when the system prompts you to.

it looks like the Creative ZenMicro uses the GB2312 code page to display Chinese. So my first attempt at solving the problem was to convert the ID3 tags to GB2312 before the track is transferred to the device:

// $TOPSRC/amarok/src/mediadevice/njb/track.cpp (needs #include <qtextcodec.h>)
 // First attempt: encode the title as GB2312 instead of UTF-8.
 QTextCodec *codec = QTextCodec::codecFromName("GB2312"); // returns 0 if the codec is unavailable
 // orig: NJB_Songid_Addframe( songid, NJB_Songid_Frame_New_Title( m_bundle.title().utf8() ) );
 NJB_Songid_Addframe( songid, NJB_Songid_Frame_New_Title( codec->fromUnicode(m_bundle.title()) ) );

The good news is that the old garbled text disappeared; the bad news is that new garbled text took its place. It is worth understanding how the device stores this information and how libnjb is meant to be used by developers. So I sent an email to the libnjb developers and quickly got a reply from Linus Walleij:

The Zen Micro like all Zens use UCS2 as internal representation. However the libnjb library uses UTF-8 as input/output format if requested, so it shouldn’t matter. … … It is however very important that the client calls the libnjb function NJB_Set_Unicode() with the UTF-8 flag, in order for it to work. Otherwise, libnjb will assume encoding is in ISO8859-1 and drop a lot of information by making approximations and mangling Chinese totally. The call should be made before any other initialization.
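To make the reply above more concrete, here is a minimal, self-contained sketch of what goes wrong when UTF-8 bytes are read as ISO8859-1. It is not libnjb code: the helper names latin1ToUcs2 and utf8ToUcs2 are made up for illustration, and the real library may approximate or drop characters differently. The sketch only shows that the same three bytes become one UCS2 code unit under one interpretation and three unrelated ones under the other.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: treat every byte as one ISO8859-1 character,
// so each byte value becomes its own UCS2 code unit.
std::vector<uint16_t> latin1ToUcs2(const std::string& bytes) {
    std::vector<uint16_t> out;
    for (unsigned char b : bytes) {
        out.push_back(static_cast<uint16_t>(b));
    }
    return out;
}

// Hypothetical helper: decode UTF-8 into UCS2 code units.
// Handles 1- to 3-byte sequences only (the range UCS2 can hold) and does
// no strict validation; anything else becomes U+FFFD.
std::vector<uint16_t> utf8ToUcs2(const std::string& bytes) {
    std::vector<uint16_t> out;
    auto at = [&bytes](std::size_t i) {
        return static_cast<unsigned>(static_cast<unsigned char>(bytes[i]));
    };
    std::size_t i = 0;
    while (i < bytes.size()) {
        unsigned b0 = at(i);
        if (b0 < 0x80) {                                   // 1 byte: ASCII
            out.push_back(static_cast<uint16_t>(b0));
            i += 1;
        } else if ((b0 & 0xE0) == 0xC0 && i + 1 < bytes.size()) {   // 2 bytes
            out.push_back(static_cast<uint16_t>(
                ((b0 & 0x1F) << 6) | (at(i + 1) & 0x3F)));
            i += 2;
        } else if ((b0 & 0xF0) == 0xE0 && i + 2 < bytes.size()) {   // 3 bytes
            out.push_back(static_cast<uint16_t>(
                ((b0 & 0x0F) << 12) | ((at(i + 1) & 0x3F) << 6) | (at(i + 2) & 0x3F)));
            i += 3;
        } else {                                           // unsupported or truncated
            out.push_back(0xFFFD);
            i += 1;
        }
    }
    return out;
}
```

For example, the character 一 is the three bytes E4 B8 80 in UTF-8. utf8ToUcs2 turns them into the single code unit 0x4E00, while latin1ToUcs2 produces 0x00E4, 0x00B8 and 0x0080, which is the kind of garbage that shows up on the player.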

I combed through Amarok’s source code and found the bug: Amarok calls NJB_Set_Unicode() only after it has called NJB_Discover(), which may be too late. Once that call is moved earlier, it works. You can download this patch here, and if you are a Gentoo user, here is the ebuild. Check out this HOWTO if you are not sure how to set up a Portage overlay.

Further reading showed that the problem is caused by Creative's transport layer, which is built without Unicode support and drops the ball. Windows users can do nothing but ask Creative to rebuild the code with Unicode support, or they can use a libnjb-derived application such as NomadSync (not tested yet).

Update: Some Chinese characters, such as 一 and 言, go missing without any warning, and I have no idea what causes the bug. If you are using a Creative ZenMicro, could you help test this on Windows and leave a comment here?