Bug in MySQL code?
Mads Kiilerich
mads at kiilerich.com
Tue Nov 17 17:13:01 UTC 2015
On 11/17/2015 03:56 PM, Lars Skjærlund wrote:
>
> Hi,
>
> I’m afraid I’ve hit a bug:
>
> I want to migrate our Kallithea database from SQLite to MySQL. In
> order to do that, I dumped the SQLite database to an SQL script,
> modified the SQL commands to MySQL dialect, and ran the script against
> the MySQL database.
>
> It worked like a charm – except that Kallithea kept crashing with
> Unicode errors.
>
> But everything _was_ Unicode: The dump from SQLite was Unicode, my
> edits were fully Unicode compatible, and the database as well as the
> tables were created in MySQL as UTF8 compatible. After fighting this
> for a long time, I tried letting Kallithea populate a new MySQL
> database – and discovered that Kallithea doesn’t store data in UTF8
> format. It appears that the data is encoded as UTF8 twice, so my
> record looks like
>
> +-----------+--------------+
> | firstname | lastname     |
> +-----------+--------------+
> | Lars      | Skjærlund    |
> +-----------+--------------+
>
> If I update my name to be true UTF8, Kallithea crashes. I haven’t tried
> other databases, but the encoding in SQLite is correct.
>
> I solved my problem by running the SQL script file through iconv before
> submitting it to MySQL, claiming the input was Latin1 and asking for
> UTF8 as output: In that way I got the same double-encoding that
> Kallithea appears to require…
>
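A note on the iconv workaround quoted above: converting text that is already UTF-8 "from Latin1 to UTF8" encodes it a second time. A minimal Python sketch of that effect, using the surname from the quoted record as sample data (the variable names are made up for illustration, and nothing here is Kallithea code):

```python
# Sample value taken from the quoted record; \u00e6 is the letter "ae" ligature.
name = u"Skj\u00e6rlund"

# Correct, single UTF-8 encoding: the ligature becomes the two bytes C3 A6.
utf8_once = name.encode("utf-8")

# Treating those bytes as Latin1 and converting to UTF-8 again is what
# "iconv -f latin1 -t utf8" does to input that is already UTF-8.
utf8_twice = utf8_once.decode("latin-1").encode("utf-8")

# Reading the double-encoded bytes back as UTF-8 gives the garbled form.
garbled = utf8_twice.decode("utf-8")

print(repr(utf8_once))
print(repr(utf8_twice))
print(repr(garbled))
```

The double-encoded form is two bytes longer than the correct one, because each of the two bytes of the original UTF-8 sequence is itself expanded to a two-byte sequence.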
Generally Kallithea works fine with Unicode. It can however be tricky
when it is interfacing with a VCS or a database. It is my impression that
MySQL also just works, but I use PostgreSQL and haven't tried MySQL myself.
If the database really is in UTF-8, I guess some other layer in the stack
(SQLAlchemy or the database driver) messes it up.
These two issue reports might give hints about what to check:
https://bitbucket.org/conservancy/kallithea/issues/9/doc-unicode-utf-8-issues-in-the-changelog
https://bitbucket.org/conservancy/kallithea/issues/147/unicodeencodeerror-ascii-codec-cant-encode
When debugging the problem, it might be simpler to let Kallithea create
a new database and focus on whether it can store and read Unicode. Next,
you can make sure your converted database uses the same encoding.
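One way to check whether a value read back from the database was double-encoded is to try reversing the Latin1 step. The helper below is a hypothetical sketch, not part of Kallithea or SQLAlchemy; it assumes the value arrives as a Unicode string:

```python
def undo_double_encoding(text):
    """Try to reverse one round of "UTF-8 bytes misread as Latin1".

    Returns (repaired_text, True) if the text changed, otherwise
    (text, False). Hypothetical helper for debugging only.
    """
    try:
        # Map each character back to a single byte, then read the bytes
        # as UTF-8. This only succeeds without change for plain ASCII, and
        # only succeeds with a change for text that was double-encoded.
        repaired = text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable as Latin1, or not valid UTF-8 afterwards:
        # the text was most likely stored correctly.
        return text, False
    return repaired, repaired != text


# Garbled form of the surname versus the correctly stored form.
print(undo_double_encoding(u"Skj\u00c3\u00a6rlund"))
print(undo_double_encoding(u"Skj\u00e6rlund"))
```

A check like this is a heuristic: a correctly stored string that happens to form valid UTF-8 when read as Latin1 bytes would be reported as double-encoded, so it is better suited to spot checks than to bulk conversion.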
/Mads