Bitten by MemoryError

Gentoo February 27th, 2008

In the last post, I worked through the installation and deployment of a Django application, and I set up a MoinMoin wiki as well. Everything worked fine until I switched back and forth between the blog and the wiki, at which point I got a MemoryError like this:

Mod_python error: "PythonHandler django.core.handlers.modpython"
Traceback (most recent call last):

  File "/usr/lib/python2.5/mod_python/apache.py", line 299, in HandlerDispatch
    result = object(req)

  File "/usr/lib/python2.5/site-packages/django/core/handlers/modpython.py", line 188, in handler
    return ModPythonHandler()(req)

MemoryError

Most likely, the combined memory footprint of the two applications exceeds the limit. Yes, FastCGI would come to the rescue; unfortunately, it is another advanced feature you have to pay for. There was no explicit regulation when I signed up, but I think it is quite reasonable to set a resource quota, as a virtual server is essentially shared hosting anyway. I will contact Jumpline's customer service tomorrow for a better understanding.

Update: I just contacted Jumpline's customer service; they seem to have no QoS policy for memory footprint or CPU load. I really have no idea whether this is good news or bad news.

Yet another locale problem

Gentoo January 27th, 2008

In the last post, we managed to leverage eyeD3 to standardize the ID3 tags. But we still got garbled text when manually manipulating the tags on the command line. It may result from either wrong arguments or an encoding bug.

Further investigation focused on eyeD3’s __init__.py:

LOCAL_ENCODING = locale.getpreferredencoding(do_setlocale=False);
if not LOCAL_ENCODING or LOCAL_ENCODING == "ANSI_X3.4-1968":
    LOCAL_ENCODING = 'latin1';

If LOCAL_ENCODING is either empty or the mysterious ANSI_X3.4-1968 (an alias for plain ASCII), the encoding is assumed to be latin1. On my Gentoo box, with do_setlocale set to False, getpreferredencoding just returns ANSI_X3.4-1968 even though the locale is en_US.UTF-8.
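
To make the fallback concrete, here is a minimal sketch of the same decision pulled into a standalone function. The name resolve_encoding is hypothetical (it is not part of eyeD3), and the snippet targets current Python rather than the Python 2.5 used in this post; it only illustrates how an empty or ANSI_X3.4-1968 result ends up as latin1.

```python
import locale

# Hypothetical helper, not part of eyeD3: mirrors the fallback shown above.
ASCII_ALIAS = "ANSI_X3.4-1968"


def resolve_encoding(preferred):
    """Return the encoding to use for tag text, falling back to latin1."""
    if not preferred or preferred == ASCII_ALIAS:
        return "latin1"
    return preferred


if __name__ == "__main__":
    # Compare what the interpreter reports with and without calling setlocale.
    for flag in (False, True):
        reported = locale.getpreferredencoding(do_setlocale=flag)
        print("do_setlocale=%s -> %s -> %s"
              % (flag, reported, resolve_encoding(reported)))
```

Running it on the affected machine shows whether the two calls disagree; on many current systems they will report the same encoding.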

According to the documentation:

On some systems, it is necessary to invoke setlocale to obtain the user preferences, so this function is not thread-safe. If invoking setlocale is not necessary or desired, do_setlocale should be set to False.

I still need to dig into whether Linux is one of those "some systems". For now, just apply this patch to eyeD3’s __init__.py:

37c37
< LOCAL_ENCODING = locale.getpreferredencoding(do_setlocale=False);
---
> LOCAL_ENCODING = locale.getpreferredencoding(do_setlocale=True);

And remember to specify the encoding of the tags using --set-encoding; RTFM for more details.

Migrate to MTP

Desktop, Gentoo December 15th, 2007

About nine months ago, I tried to embrace MTP, since Creative Labs does not support its own proprietary protocol (libnjb is the open-source implementation) on Windows Vista, and I was really frustrated by the lame upgrade support.

Now I have a chance to get a 2nd-generation Zune, which is powered by MTP. Although libmtp is still at a very early stage in bridging the gap, we can predict its future from its history.

Amarok supports MTP if the USE flag mtp is enabled. However, the latest stable version, 1.4.7-r2, has a bug when transferring files whose names contain CJK characters. The bug occurs where Amarok interfaces with libmtp:

     int ret = LIBMTP_Send_Track_From_File(
-        m_device, bundle.url().path().latin1(), trackmeta,
+        m_device, bundle.url().path().local8Bit(), trackmeta,
         progressCallback, this, parent_id
     );

The bug is fixed in SVN (ticket). But if we take users of non-UTF-8 locales into account (for example, MagicLinux uses GB2312 as its default locale), local8Bit is more flexible than hard-coded UTF-8, and it also works in a UTF-8 environment.
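
The difference between the two conversions can be sketched outside Qt. The snippet below is only an analogy in Python, not Amarok code: a Latin-1 encode with replacement characters roughly stands in for latin1(), and an encode with a chosen locale encoding stands in for local8Bit(). The file name and helper names are made up for illustration.

```python
def to_latin1(path):
    """Roughly what a Latin-1 conversion does: unrepresentable characters are lost."""
    return path.encode("latin-1", errors="replace")


def to_local_8bit(path, locale_encoding):
    """Encode with the user's locale encoding, e.g. UTF-8 or GB2312."""
    return path.encode(locale_encoding)


if __name__ == "__main__":
    name = "歌曲.mp3"  # made-up file name with CJK characters
    print(to_latin1(name))
    for enc in ("utf-8", "gb2312"):
        raw = to_local_8bit(name, enc)
        # The locale-encoded bytes decode back to the original name.
        print(enc, raw, raw.decode(enc) == name)
```

The Latin-1 result no longer identifies the file, while the locale-encoded bytes round-trip under both UTF-8 and GB2312.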

Here is the patch, and as usual, an ebuild for Gentoo users (manual).

HOWTO convert Chinese MP3 for ID3 v2.3 standard

Gentoo November 7th, 2007

The Amarok developers probably gave little thought to the reaction of Chinese users when they eventually dropped ID3 tag encoding detection and enforced the ID3v2 specification. “Amarok is dead”, someone claimed on linuxfans.org, the community-powered Magic Linux support forum. Why? Quite a few MP3 files in China carry GB2312-encoded ID3v1 tags, and even worse, some files are tagged with GB2312 inside the ID3 v2.3 format. What a mess!

I respect their decision: the player has no responsibility to clean up the shit left by lousy encoders, but we have to face reality anyway. Here is my cruel life: Amarok is my preferred player on Linux, and occasionally I use mpg123 in console mode; on Windows I use foobar2000 and sometimes Windows Media Player; my portable MP3 player is a Creative Zen Micro. No Mac, no iPod. To make things even worse, the locale encoding on Linux is UTF-8, while on Windows it is UTF-16LE. Last but not least, I do respect the specification.

So ID3v1 is out of consideration: it only supports ISO-8859-1, which makes it impossible to hold CJK characters. For ID3v2, the most popular version is v2.3; unfortunately, it does not support UTF-8 encoding. v2.4 does support it, but it is seldom picked up by hardware manufacturers or application developers.

Let’s start from the latest specification. ID3 v2.4:

The first piece of bad news is that a de facto standard ID3v2 implementation, id3v2-0.1.11, does not support v2.4. It cost me several hours to figure out why the newly added v2.4 tags disappeared mysteriously; the answer is that id3v2 is unable even to recognize v2.4 tags. eyeD3 is the remedy: this pure Python library provides a very neat command-line utility to manipulate ID3 v2.4 tags. The good news is that the Creative Zen Micro supports v2.4. In fact, I am not quite sure whether the honor goes to Creative Labs or to the libnjb developers.

Another option is v2.3, the most popular version so far. Unfortunately, it only supports UTF-16 little-endian (i.e. the default encoding of Microsoft Windows), UTF-16 big-endian and latin-1, with no UTF-8 support. To make it even worse, id3v2 writes to the tag without any regard to the locale, which is really horrible!

Here is my effort to address this problem: eyeD3conv. As the name suggests, it depends on the eyeD3 library. This small utility converts mis-encoded tags into standard UTF-16LE ID3 v2.3 tags.
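
The core of such a conversion can be sketched in a few lines. This is not the eyeD3conv source, just a minimal illustration under two assumptions: the tag reader handed back GB2312 bytes wrongly decoded as latin-1, and the target is UTF-16 little-endian preceded by a byte-order mark. Both function names are hypothetical.

```python
import codecs


def repair_mojibake(text, source_encoding="gb2312"):
    """Undo a wrong latin-1 decode of bytes that were really source_encoding."""
    try:
        raw = text.encode("latin-1")
    except UnicodeEncodeError:
        # Contains characters outside latin-1, so it was decoded properly already.
        return text
    try:
        return raw.decode(source_encoding)
    except UnicodeDecodeError:
        # Not valid in the assumed encoding; leave it untouched.
        return text


def to_utf16le_with_bom(text):
    """Encode text as UTF-16LE preceded by a byte-order mark."""
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")


if __name__ == "__main__":
    garbled = "歌曲".encode("gb2312").decode("latin-1")  # simulate a bad tag
    fixed = repair_mojibake(garbled)
    print(repr(garbled), "->", fixed, to_utf16le_with_bom(fixed))
```

One caveat of this heuristic: genuine Latin-1 text whose bytes happen to form valid GB2312 would be converted by mistake, so a real tool should let the user confirm the source encoding.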

You also need to apply this patch to fix the encoding bug in eyeD3-0.6.14. The patch has been submitted upstream.

Update: thanks to the quick response from Travis, the author of eyeD3. According to the specification, the URL is supposed to be encoded in ASCII, so we can simply ignore the URLFrame. Forget the patch, and use the updated version.

Other mis-encoded frames may raise a UnicodeDecodeError when a frame is read or written, which cancels the subsequent file rename. Here are some pragmatic tips to work around this issue:

# remove all comments
eyeD3 --remove-comments foo.mp3
# remove the WXXX frame
eyeD3 --set-text-frame="WXXX:" foo.mp3

No idea which application inserts such crap into the tag.

HOWTO translate Gentoo Documentation

Gentoo August 20th, 2007

Recently, I have been involved in translating the Gentoo guides. This is more or less a new domain for me, since I’ve never worked in localization. It is quite amazing to see how the different pieces integrate to make our lives much easier. This HOWTO is merely a summary of the documentation, left here for personal record.

Head over to the source

You can sync the repository with git, or download it from the online CVS repository. In fact, there is a simpler way, described in the Gentoo XML Guide: just append ?passthru=1 to the URL of the target GuideXML document. For example, I am translating the Gentoo Linux ALSA Guide, so I fetch the original XML via:

wget "http://www.gentoo.org/doc/en/alsa-guide.xml?passthru=1" -O alsa-guide.xml
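
If you fetch several guides, building the URL by hand gets tedious. Below is a small sketch of a helper that appends the passthru parameter; the function name passthru_url is my own invention, and only the passthru=1 parameter itself comes from the guide.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def passthru_url(url):
    """Return url with passthru=1 added to its query string."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    query["passthru"] = "1"
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    print(passthru_url("http://www.gentoo.org/doc/en/alsa-guide.xml"))
```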

However, we also need the other pieces, such as the XSL, CSS and DTD files, so we may need to check out all of them from the repository:

cvs -d :pserver:anonymous@anoncvs.gentoo.org/var/cvsroot co gentoo/xml/htdocs/dtd
cvs -d :pserver:anonymous@anoncvs.gentoo.org/var/cvsroot co gentoo/xml/htdocs/css
cvs -d :pserver:anonymous@anoncvs.gentoo.org/var/cvsroot co gentoo/xml/htdocs/xsl
cvs -d :pserver:anonymous@anoncvs.gentoo.org/var/cvsroot co gentoo/xml/htdocs/doc/en/alsa-guide.xml

Using po to translate

PO is the standard format for translation. We use po4a, as suggested. First, let’s move to the right path, i.e. $ROOT/gentoo/xml/htdocs/doc; the translated XML is going to be stored in doc/zh_CN.

# generate the po from the original xml
po4a-gettextize -f guide -m en/alsa-guide.xml > zh_CN/alsa-guide.po

Emacs supports PO files through its PO mode. I am using vim, and found that the po.vim plugin is quite neat as well.

Check before commit

PO and GuideXML are designed for machines, not for human beings. It is much easier for the eyes to catch errors when reading the rendered HTML.

# generate the translated xml using po
po4a-translate -f guide -m en/alsa-guide.xml -p zh_CN/alsa-guide.po -k 1 > zh_CN/alsa-guide.xml

# check the output using xslt
xsltproc --path ../xsl:en ../xsl/guide.xsl zh_CN/alsa-guide.xml > zh_CN/alsa-guide.html
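
Since these two steps are repeated after every editing round, they can be wrapped in a small script. The sketch below is my own wrapper, not part of po4a or the Gentoo tooling: the flags are copied from the commands above, the function names are hypothetical, and it assumes it is run from the doc directory with po4a and xsltproc installed.

```python
import subprocess


def build_commands(guide, lang, master_lang="en"):
    """Return (argv, output_path) pairs for the translate and render steps."""
    master = "%s/%s.xml" % (master_lang, guide)
    po = "%s/%s.po" % (lang, guide)
    xml = "%s/%s.xml" % (lang, guide)
    html = "%s/%s.html" % (lang, guide)
    translate = ["po4a-translate", "-f", "guide", "-m", master,
                 "-p", po, "-k", "1"]
    render = ["xsltproc", "--path", "../xsl:%s" % master_lang,
              "../xsl/guide.xsl", xml]
    return [(translate, xml), (render, html)]


def run_all(guide, lang):
    """Run both steps, sending stdout to the target file as the shell lines do."""
    for argv, output_path in build_commands(guide, lang):
        with open(output_path, "wb") as out:
            subprocess.run(argv, stdout=out, check=True)


if __name__ == "__main__":
    run_all("alsa-guide", "zh_CN")
```

Keeping the command construction separate from the execution makes it easy to print the commands first and check them against the shell lines.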

Before you check in, make sure you have read through the Translators Howto for Gentoo Documentation.

Happy localizing.