I once mentioned that some Chinese characters go missing on the Creative ZenMicro when using Amarok + libnjb. I checked the libnjb-2.2.4 source code and found that the text conversion from UTF-8/ISO8859-1 to UCS-2 big-endian is home-brewed rather than done with the standard libiconv. Perhaps libiconv would be overkill, since only three encodings are really needed. The UTF-8 specification defines the following byte layouts:

U-00000000 – U-0000007F: 0xxxxxxx
U-00000080 – U-000007FF: 110xxxxx 10xxxxxx
U-00000800 – U-0000FFFF: 1110xxxx 10xxxxxx 10xxxxxx
U-00010000 – U-001FFFFF: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
U-00200000 – U-03FFFFFF: 111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
U-04000000 – U-7FFFFFFF: 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx

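The libnjb developers, however, made a small mistake: the continuation bytes are tested with > 0x80, although 0x80 itself (10000000) is a valid continuation byte. As a result, every character whose encoding contains a 0x80 continuation byte is categorized as abnormal and skipped.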

                if (numbytes == 2 && str[i+1] > 0x80) {
                        … …
                } else if (numbytes == 3 && str[i+1] > 0x80 && str[i+2] > 0x80) {
                        … …
                } else {
                  /* Abnormal string character, just skip */
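
To make the off-by-one concrete, here is a minimal sketch of a UTF-8 to UCS-2 big-endian decoder that accepts the whole continuation range 0x80 to 0xBF. It is not the libnjb code and not the patch itself: the names utf8_to_ucs2be and is_continuation are hypothetical, and the policy of skipping anything outside the 1 to 3 byte forms is my own assumption for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A continuation byte has the bit pattern 10xxxxxx, i.e. 0x80..0xBF inclusive. */
static int is_continuation(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

/*
 * Hypothetical helper (not part of libnjb): decode a NUL-terminated UTF-8
 * string into UCS-2 big-endian bytes. Writes at most dst_size bytes and
 * returns the number of bytes written. Malformed bytes and sequences longer
 * than three bytes (code points above U+FFFF) are skipped. Overlong
 * encodings are not rejected in this sketch.
 */
size_t utf8_to_ucs2be(const unsigned char *src, unsigned char *dst, size_t dst_size)
{
    size_t i = 0, out = 0;

    assert(src != NULL && dst != NULL);

    while (src[i] != '\0') {
        unsigned char c = src[i];
        uint16_t cp;
        size_t len;

        if (c < 0x80) {
            /* 0xxxxxxx: plain ASCII */
            cp = c;
            len = 1;
        } else if ((c & 0xE0) == 0xC0 && is_continuation(src[i + 1])) {
            /* 110xxxxx 10xxxxxx */
            cp = (uint16_t)(((c & 0x1F) << 6) | (src[i + 1] & 0x3F));
            len = 2;
        } else if ((c & 0xF0) == 0xE0 && is_continuation(src[i + 1])
                   && is_continuation(src[i + 2])) {
            /* 1110xxxx 10xxxxxx 10xxxxxx */
            cp = (uint16_t)(((c & 0x0F) << 12) | ((src[i + 1] & 0x3F) << 6)
                            | (src[i + 2] & 0x3F));
            len = 3;
        } else {
            /* Abnormal or unsupported byte: skip it and resynchronize */
            i++;
            continue;
        }

        if (out + 2 > dst_size)
            break;                       /* no room for another UCS-2 unit */
        dst[out++] = (unsigned char)(cp >> 8);    /* high byte first */
        dst[out++] = (unsigned char)(cp & 0xFF);
        i += len;
    }
    return out;
}
```

For example, 一 (U+4E00) is encoded in UTF-8 as E4 B8 80. With the > 0x80 comparison the final byte fails the check and the character is dropped, while the mask test above decodes it to the bytes 4E 00.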

Here is a patch against libnjb-2.2.4; you may adapt it to libnjb-2.2.5 as well. For the convenience of Gentoo users, here is the ebuild.

I am just curious why this bug has survived for such a long time. How many Chinese Linux/BSD users use a UTF-8 locale and transfer music to a Creative ZenMicro? Is the overlap of these three factors really so small that this almost never happens?

