Bug in MySQL code?

Mads Kiilerich mads at kiilerich.com
Tue Nov 17 17:13:01 UTC 2015


On 11/17/2015 03:56 PM, Lars Skjærlund wrote:
>
> Hi,
>
> I’m afraid I’ve hit a bug:
>
> I want to migrate our Kallithea database from SQLite to MySQL. In 
> order to do that, I dumped the SQLite database to an SQL script, 
> modified the SQL commands to MySQL dialect, and ran the script against 
> the MySQL database.
>
> It worked like a charm – except that Kallithea kept crashing with 
> Unicode errors.
>
> But everything *was* Unicode: the dump from SQLite was Unicode, my
> edits were fully Unicode compatible, and the database as well as the
> tables were created in MySQL as UTF8 compatible. After fighting this
> for a long time, I tried letting Kallithea populate a new MySQL
> database – and discovered that Kallithea doesn’t store data in UTF8
> format. It appears that the data is encoded as UTF8 twice, so my
> record looks like:
>
> +-----------+--------------+
> | firstname | lastname     |
> +-----------+--------------+
> | Lars      | SkjÃ¦rlund   |
> +-----------+--------------+
>
> If I update my name to be true UTF8, Kallithea crashes. I haven’t tried
> other databases, but the encoding in SQLite is correct.
>
> I solved my problem by running the SQL script file through iconv before
> submitting it to MySQL, claiming the input was Latin1 and asking for 
> UTF8 as output: In that way I got the same double-encoding that 
> Kallithea appears to require…
>
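
The double encoding described above can be reproduced in a few lines of Python. This is only a sketch of the effect, not Kallithea code: the helper name double_encode is made up for illustration, and it assumes the iconv step treated bytes that were already UTF-8 as Latin-1.

```python
def double_encode(text):
    """Mimic converting already-UTF-8 bytes from Latin-1 to UTF-8 again.

    Hypothetical helper for illustration; not part of Kallithea.
    """
    utf8_bytes = text.encode("utf-8")      # correct UTF-8: "æ" becomes C3 A6
    misread = utf8_bytes.decode("latin-1")  # each byte misread as one Latin-1 character
    return misread                          # storing this as UTF-8 encodes it a second time


if __name__ == "__main__":
    name = "Skj\u00e6rlund"                 # "Skjærlund"
    mangled = double_encode(name)
    print(mangled)
    print(len(name), len(mangled))          # the mangled form is one character longer
```

Every non-ASCII character turns into two characters, which is why a doubly encoded name comes out one character longer for each such letter.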

Generally Kallithea works fine with Unicode. It can, however, be tricky
when it is interfacing with a VCS or a database. It is my impression that
MySQL also just works, but I use PostgreSQL and haven't tried MySQL myself.

If the database really is in UTF-8, I guess some other layer in the stack
(SQLAlchemy or the database driver) messes it up.
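
One thing worth checking at that layer is the client character set of the connection. SQLAlchemy's MySQL dialects pass a charset query parameter in the database URL through to the driver. Whether this is the cause here is an assumption, not something confirmed in this thread. The sketch below only builds such a URL; the helper name with_charset and the example credentials are made up.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_charset(db_url, charset="utf8"):
    """Return db_url with a charset query parameter set (hypothetical helper)."""
    parts = urlsplit(db_url)
    query = dict(parse_qsl(parts.query))  # keep any existing parameters
    query["charset"] = charset            # add or override the client charset
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # Example URL with placeholder credentials; adjust to the real setup.
    print(with_charset("mysql://kallithea:secret@localhost/kallithea"))
```

If the URL in the Kallithea configuration has no charset parameter, the driver may fall back to its own default, which could explain a mismatch between what is sent and what the tables expect.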

These two issue reports might give hints of what to check:
https://bitbucket.org/conservancy/kallithea/issues/9/doc-unicode-utf-8-issues-in-the-changelog
https://bitbucket.org/conservancy/kallithea/issues/147/unicodeencodeerror-ascii-codec-cant-encode

When debugging the problem, it might be simpler to let Kallithea create
a new database and focus on whether it can store and read Unicode. Next,
you can make sure your converted database uses the same encoding.
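
To compare values read back from the two databases, a small check like the following can tell whether a string looks doubly encoded. It is a rough heuristic under the assumption that the extra step was a Latin-1 misreading; the helper name undo_double_encoding is made up, and MySQL's "latin1" is really closer to Windows-1252, so a few characters may not round-trip this way.

```python
def undo_double_encoding(text):
    """Return (repaired, True) if text looks doubly encoded, else (text, False).

    Hypothetical helper for illustration; not part of Kallithea.
    """
    try:
        # Reverse the suspected misreading: characters back to bytes, bytes as UTF-8.
        repaired = text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable as Latin-1, or not valid UTF-8 afterwards:
        # the text was most likely stored correctly.
        return text, False
    return repaired, repaired != text


if __name__ == "__main__":
    for value in ["Lars", "Skj\u00e6rlund", "Skj\u00c3\u00a6rlund"]:
        print(repr(value), undo_double_encoding(value))
```

Plain ASCII values pass through unchanged, so only rows containing non-ASCII characters are useful for this comparison.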

/Mads