<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 11/17/2015 03:56 PM, Lars Skjærlund
wrote:<br>
</div>
<blockquote
cite="mid:0A4038BC4498B948B29B1363F84FD0250294751D@chimp.dbc.dk"
type="cite">
<div class="WordSection1">
<p class="MsoNormal"><span lang="EN-US">Hi,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">I’m afraid I’ve hit a
bug:<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">I want to migrate our
Kallithea database from SQLite to MySQL. In order to do
that, I dumped the SQLite database to an SQL script,
modified the SQL commands to MySQL dialect, and ran the
script against the MySQL database.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">It worked like a charm –
except that Kallithea kept crashing with Unicode errors.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">But everything <i>was</i>
Unicode: the dump from SQLite was Unicode, my edits were
fully Unicode compatible, and the database as well as the
tables were created in MySQL as UTF8 compatible. After
fighting this for a long time, I tried letting Kallithea
populate a new MySQL database, and discovered that
Kallithea doesn’t store data in UTF8 format. It appears that
the data is encoded as UTF8 twice, so my record looks like<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span
style="font-size:9.0pt;font-family:'Lucida Console'" lang="EN-US">+-----------+--------------+<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:9.0pt;font-family:'Lucida Console'" lang="EN-US">| firstname | lastname |<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:9.0pt;font-family:'Lucida Console'" lang="EN-US">+-----------+--------------+<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:9.0pt;font-family:'Lucida Console'" lang="EN-US">| Lars | Skjærlund |<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:9.0pt;font-family:'Lucida Console'" lang="EN-US">+-----------+--------------+<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">If I update my name to be
true UTF8, Kallithea crashes. I haven’t tried other
databases, but the encoding in SQLite is correct.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">I solved my problem by
running the SQL script file through iconv before submitting
it to MySQL, claiming the input was Latin1 and asking for
UTF8 as output. That way I got the same double encoding
that Kallithea appears to require…</span></p>
</div>
</blockquote>
<br>
Generally Kallithea works fine with Unicode. It can, however, be
tricky when it is interfacing with a VCS or a database. It is my
impression that MySQL also just works, but I use PostgreSQL and
haven't tried MySQL myself.<br>
<br>
If the database really is in UTF-8, I guess some other layer in the
stack (SQLAlchemy or the database driver) messes it up.<br>
<br>
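To illustrate how such a layer could produce the double encoding
described in the quoted mail, here is a minimal Python sketch. It is
an assumption for illustration only, not a trace of what SQLAlchemy or
the MySQL driver actually does: it just shows the effect of correct
UTF-8 bytes being misread as Latin-1 and then encoded as UTF-8 again,
which is the same transformation as the iconv workaround.<br>
<pre>
```python
# Illustration only: reproduce "UTF-8 encoded twice" with plain Python strings.
name = "Skjærlund"

# Correct UTF-8 bytes, as a dump from SQLite would contain them.
utf8_bytes = name.encode("utf-8")

# A layer that wrongly assumes the bytes are Latin-1 turns every byte
# into a character of its own ...
misread = utf8_bytes.decode("latin-1")

# ... and encoding that text as UTF-8 once more gives the double-encoded form.
double_encoded = misread.encode("utf-8")

print(repr(utf8_bytes))      # single-encoded bytes
print(repr(double_encoded))  # double-encoded bytes, two bytes longer
```
</pre>
If the bytes stored by a freshly created Kallithea database look like
the second form, the connection character set would be one of the
first things I would check.<br>
<br>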
These two issue reports might hint at what to check:<br>
<a class="moz-txt-link-freetext" href="https://bitbucket.org/conservancy/kallithea/issues/9/doc-unicode-utf-8-issues-in-the-changelog">https://bitbucket.org/conservancy/kallithea/issues/9/doc-unicode-utf-8-issues-in-the-changelog</a><br>
<a class="moz-txt-link-freetext" href="https://bitbucket.org/conservancy/kallithea/issues/147/unicodeencodeerror-ascii-codec-cant-encode">https://bitbucket.org/conservancy/kallithea/issues/147/unicodeencodeerror-ascii-codec-cant-encode</a><br>
<br>
When debugging the problem, it might be simpler to let Kallithea
create a new database and focus on whether it can store and read
Unicode. Next, you can make sure your converted database uses the
same encoding.<br>
<br>
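A rough way to check whether a value read back from the database was
encoded twice is sketched below. The function names
looks_double_encoded and repair are hypothetical and not part of
Kallithea, and the check is only a heuristic: Python's latin-1 codec
is not identical to MySQL's latin1 (which follows cp1252), so treat
this as a starting point rather than a reliable test.<br>
<pre>
```python
def looks_double_encoded(text):
    """Return True if text turns into different, valid UTF-8 text when its
    characters are taken as Latin-1 bytes (a sign of double encoding)."""
    try:
        repaired = text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable as Latin-1, or not valid UTF-8 afterwards:
        # the text was most likely stored correctly.
        return False
    return repaired != text


def repair(text):
    """Undo one level of double encoding; leave other text unchanged."""
    if looks_double_encoded(text):
        return text.encode("latin-1").decode("utf-8")
    return text


# Hypothetical values, as they might come back from the users table.
print(looks_double_encoded("Lars"))
print(looks_double_encoded("Skjærlund"))
print(repair("Skj\u00c3\u00a6rlund"))
```
</pre>
Running this against a few rows from both the fresh database and the
converted one should show whether the two use the same encoding.<br>
<br>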
/Mads<br>
</body>
</html>