Re: Bug in MySQL code?

Lars Skjærlund las at dbc.dk
Wed Nov 18 08:52:40 UTC 2015


Hi Mads,


> When debugging the problem, it might be simpler to let Kallithea create a new database and focus on whether it can store and read unicode.

That’s what I did before reporting the bug: Kallithea does _not_ store data in Unicode format – and if you add new data to the database in Unicode, Kallithea breaks.

If I open the MySQL client and run, say, "update users set lastname = 'Skjærlund' where firstname = 'Lars'", then Kallithea breaks because the MySQL client adds the data in Unicode.

For the command to work, I have to type "update users set lastname = 'SkjÃ¦rlund' where firstname = 'Lars'". In that case Kallithea survives, and it displays my name correctly in the UI.


Kind regards
Lars Skjærlund
DevOps
Tel.: 44 86 77 77
DBC as

www.dbc.dk<http://www.dbc.dk/>
las at dbc.dk<mailto:las at dbc.dk>

From: Mads Kiilerich [mailto:mads at kiilerich.com]
Sent: 17 November 2015 18:13
To: Lars Skjærlund <las at dbc.dk>; kallithea-general at sfconservancy.org
Subject: Re: Bug in MySQL code?

On 11/17/2015 03:56 PM, Lars Skjærlund wrote:
Hi,

I’m afraid I’ve hit a bug:

I want to migrate our Kallithea database from SQLite to MySQL. In order to do that, I dumped the SQLite database to an SQL script, modified the SQL commands to MySQL dialect, and ran the script against the MySQL database.

It worked like a charm – except that Kallithea kept crashing with Unicode errors.

But everything _was_ Unicode: the dump from SQLite was Unicode, my edits were fully Unicode compatible, and the database as well as the tables were created in MySQL as UTF8 compatible. After fighting this for a long time, I tried letting Kallithea populate a new MySQL database – and discovered that Kallithea doesn’t store data in UTF8 format. It appears that the data is encoded as UTF8 twice, so my record looks like this:

+-----------+--------------+
| firstname | lastname     |
+-----------+--------------+
| Lars      | SkjÃ¦rlund   |
+-----------+--------------+

If I update my name to be true UTF8, Kallithea crashes. I haven’t tried other databases, but the encoding in SQLite is correct.

I solved my problem by running the SQL script file through iconv before submitting it to MySQL, claiming the input was Latin1 and asking for UTF8 as output. That way I got the same double encoding that Kallithea appears to require…
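
As an illustration of the double encoding described above, here is a minimal Python sketch of what converting from Latin1 to UTF8 does to text that is already UTF8. It is not Kallithea code, and the function name double_encode is hypothetical; it only mimics the effect of the iconv step on a single string.

```python
def double_encode(text: str) -> bytes:
    """Mimic a Latin1-to-UTF8 conversion applied to input that is already UTF8."""
    utf8_bytes = text.encode("utf-8")        # the correct UTF8 bytes
    misread = utf8_bytes.decode("latin-1")   # every byte misread as one Latin1 character
    return misread.encode("utf-8")           # encoded as UTF8 a second time


stored = double_encode("Skjærlund")
# Viewed as UTF8, the two bytes of "æ" now show up as two separate characters.
print(stored.decode("utf-8"))  # SkjÃ¦rlund
```

Plain ASCII values such as "Lars" pass through unchanged, which is why only names with non-ASCII letters are affected.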

Generally Kallithea works fine with unicode. It can however be tricky when it is interfacing with VCS or database. It is my impression that mysql also just works, but I use postgresql and haven't tried mysql myself.

If the database really is in utf8, I guess some other layer in the stack (sqlalchemy or the database driver) messes it up.

These two issue reports might give hints of what to check:
https://bitbucket.org/conservancy/kallithea/issues/9/doc-unicode-utf-8-issues-in-the-changelog
https://bitbucket.org/conservancy/kallithea/issues/147/unicodeencodeerror-ascii-codec-cant-encode

When debugging the problem, it might be simpler to let Kallithea create a new database and focus on whether it can store and read unicode. Next, you can make sure your converted database uses the same encoding.
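
One way to check which encoding a value read back from the database actually uses is sketched below in Python. This is an assumption-laden illustration, not part of Kallithea, SQLAlchemy or any database driver: the helper names are hypothetical, and the check assumes the misreading step was plain Latin1 (ISO 8859-1), which may not hold for every MySQL configuration.

```python
def undo_double_encoding(value: str) -> str:
    """Return the repaired text if value looks double encoded, else value unchanged."""
    try:
        # Reverse the misreading: characters back to single bytes, then decode as UTF8.
        return value.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable in Latin1, or the bytes are not valid UTF8:
        # the value was most likely stored correctly.
        return value


def looks_double_encoded(value: str) -> bool:
    """True if reversing a Latin1 misreading changes the value."""
    return undo_double_encoding(value) != value


print(looks_double_encoded("SkjÃ¦rlund"))  # True
print(looks_double_encoded("Skjærlund"))   # False
```

Running such a check on a row written by Kallithea and on a row from the converted dump would show whether the two use the same encoding. It is a heuristic only: a correctly stored value that happens to consist of characters like "Ã¦" would be misreported.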

/Mads