Bug in MySQL code?

Mads Kiilerich mads at kiilerich.com
Tue Nov 17 17:13:01 UTC 2015

On 11/17/2015 03:56 PM, Lars Skjærlund wrote:
> Hi,
> I’m afraid I’ve hit a bug:
> I want to migrate our Kallithea database from SQLite to MySQL. In 
> order to do that, I dumped the SQLite database to an SQL script, 
> modified the SQL commands to MySQL dialect, and ran the script against 
> the MySQL database.
> It worked like a charm – except that Kallithea kept crashing with 
> Unicode errors.
> But everything _/was/_ Unicode: The dump from SQLite was Unicode, my 
> edits were fully Unicode compatible, and the database as well as the 
> tables were created in MySQL as UTF8 compatible. After fighting this 
> for a long time, I tried letting Kallithea populate a new MySQL 
> database – and discovered that Kallithea doesn’t store data in UTF8 
> format. It appears that the data is encoded for UTF8 twice, so my 
> record looks like
> +-----------+--------------+
> | firstname | lastname     |
> +-----------+--------------+
> | Lars      | Skjærlund   |
> +-----------+--------------+
> If I update my name to be true UTF8, Kallithea crashes. I haven’t tried 
> other databases, but the encoding in SQLite is correct.
> I solved my problem by running the SQL scriptfile through iconv before 
> submitting it to MySQL, claiming the input was Latin1 and asking for 
> UTF8 as output: In that way I got the same double-encoding that 
> Kallithea appears to require…

Generally Kallithea works fine with Unicode. It can however be tricky 
when it is interfacing with a VCS or a database. It is my impression that 
MySQL also just works, but I use PostgreSQL and haven't tried MySQL myself.

If the database really is in UTF-8, I guess some other layer in the stack 
(SQLAlchemy or the database driver) messes it up.
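
To illustrate what such a layer could be doing, here is a minimal Python
sketch of the double encoding described above. It is pure Python and does
not touch Kallithea, SQLAlchemy or MySQL; the helper name
undo_double_encoding is made up for this example, and Latin-1 is assumed
as the wrongly applied charset because that matches the iconv workaround.

```python
# Illustration only: reproduces the symptom in plain Python. It is not
# Kallithea's actual code path, just the byte-level effect.
name = "Skj\u00e6rlund"  # "Skjaerlund" spelled with the ae ligature

# Correct UTF-8: the ligature becomes the two bytes C3 A6.
utf8_bytes = name.encode("utf-8")

# A layer that wrongly assumes Latin-1 turns each byte into one character...
misread = utf8_bytes.decode("latin-1")

# ...and the text is then encoded as UTF-8 a second time before storage.
double_encoded = misread.encode("utf-8")


def undo_double_encoding(data: bytes) -> str:
    """Reverse one round of 'UTF-8 read as Latin-1, re-encoded as UTF-8'."""
    return data.decode("utf-8").encode("latin-1").decode("utf-8")


print(utf8_bytes)      # 10 bytes
print(double_encoded)  # 12 bytes: the ligature now takes four bytes
print(undo_double_encoding(double_encoded) == name)
```

Running the dump through iconv from Latin1 to UTF8 applies the same
transformation as the middle two steps, which would explain why the
converted data then matched what Kallithea had written itself.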

These two issue reports might give hints of what to check

When debugging the problem, it might be simpler to let Kallithea create 
a new database and focus on whether it can store and read Unicode. Next, 
you can make sure your converted database uses the same encoding.
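
For the comparison step, a small heuristic like the following might help
to spot values that have been encoded twice. It is only a sketch: the
function name looks_double_encoded is hypothetical, it assumes the
wrongly applied charset was Latin-1, and it works on strings you have
already fetched from the database by whatever means.

```python
def looks_double_encoded(text: str) -> bool:
    """Heuristic: True if text is probably UTF-8 that was misread as Latin-1.

    If every character fits in Latin-1 and the resulting bytes happen to
    be valid UTF-8 that decodes to something different, the value was
    most likely encoded twice. Plain ASCII and correct text return False.
    """
    try:
        repaired = text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return False
    return repaired != text


# Sample values: a correct name, its double-encoded form, and plain ASCII.
samples = ["Skj\u00e6rlund", "Skj\u00c3\u00a6rlund", "Lars"]
for value in samples:
    print(repr(value), looks_double_encoded(value))
```

Being a heuristic, it can misfire on text that legitimately contains
such character pairs, so treat a positive result as a hint to look
closer rather than as proof.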
