<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 11/27/2015 01:33 PM, Dominik Ruf
wrote:<br>
</div>
<blockquote
cite="mid:CAAfZa5=2+r_pCgarriU_fv+hUbEOzG9meM-GP5sr4YpHqxOk7A@mail.gmail.com"
type="cite">
<p dir="ltr">Great.<br>
BTW I made another test and it seems the key thing is
charset=utf8.<br>
</p>
</blockquote>
<br>
TLDR: Lars is right that a default Kallithea installation on MySQL
stores utf-8 encoded strings in the database instead of passing
unicode and letting the database deal with the encoding. I was also
right that it generally works fine anyway. ;-)<br>
<br>
I also tested (with Fedora, mariadb and mysql-python) by creating a
new database, changing the admin user's name to blåbærgrød, creating
a blåbærgrød repository, and inspecting the database and file system
content.<br>
<br>
Everything worked flawlessly with the default mysql url, with the
only caveat that it stores utf-8 in the database. SQLAlchemy will
however encode and decode it consistently, so everything just works
... but I guess collation order and other "details" might be wrong,
and direct database hacking will be tricky - as Lars found out the
hard way in the initial post.<br>
<br>
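To illustrate what direct inspection of such a database can look like, here is a minimal Python 3 sketch. It assumes that the connection charset is latin1 (an assumption, not something confirmed by the test above), so each utf-8 byte ends up being treated as one character:<br>
<pre>
```python
name = "blåbærgrød"

# What gets sent when the text is utf-8 encoded before it reaches MySQL.
utf8_bytes = name.encode("utf-8")

# Assumption: the connection charset is latin1, so the server treats every
# byte as one character. A direct query would then display this string.
as_seen_in_database = utf8_bytes.decode("latin-1")

# The application reverses the same steps when reading, so it never notices.
round_tripped = as_seen_in_database.encode("latin-1").decode("utf-8")

print(as_seen_in_database)
print(len(name), len(as_seen_in_database))
print(round_tripped == name)
```
</pre>
Under that assumption the stored name is longer than the real one and sorts differently, which is the kind of thing that makes collation and manual SQL awkward.<br>
<br>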
I agree that <br>
sqlalchemy.db1.url =
mysql://kallithea:foobar@localhost/kallithea?charset=utf8<br>
seems to be the right "solution". It works and the database content
is as expected. (Except that MySQL's utf8 apparently is not fully
unicode compliant, and it would be better to use utf8mb4 ...)<br>
<br>
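As a rough illustration of the utf8 versus utf8mb4 point: MySQL's utf8 charset holds at most 3 bytes per character, while utf-8 needs 4 bytes for characters outside the Basic Multilingual Plane. The helper name below is made up for this Python 3 sketch:<br>
<pre>
```python
def fits_in_mysql_utf8(text):
    """Return True if every character needs at most 3 bytes in utf-8.

    Hypothetical helper: it only checks byte lengths, it does not talk
    to any database.
    """
    return all(len(ch.encode("utf-8")) in (1, 2, 3) for ch in text)


print(fits_in_mysql_utf8("blåbærgrød"))   # 1 and 2 byte characters only
print(fits_in_mysql_utf8("\U0001F600"))   # an emoji needs 4 bytes
```
</pre>
Text like blåbærgrød is fine either way; the difference only shows up for 4 byte characters such as emoji.<br>
<br>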
I don't know the root cause of the weirdness. It might be some (old
and fixed?) MySQL deficiencies and workarounds in SqlAlchemy ... or
something in Kallithea that triggers it. I guess it could be the
combination of mysql not being unicode compliant by default and
convert_unicode thus triggering the unnecessary utf8 encoding.
(<a class="moz-txt-link-freetext" href="http://docs.sqlalchemy.org/en/latest/core/engines.html#sqlalchemy.create_engine.params.encoding">http://docs.sqlalchemy.org/en/latest/core/engines.html#sqlalchemy.create_engine.params.encoding</a>
could also seem to play a role ... but probably only relevant for
understanding.)<br>
<br>
I guess we should change the default mysql uri in the .ini files to
use charset=utf8?<br>
<br>
Each table already specifies mysql_charset utf8 ... but that is
apparently for something else?<br>
<br>
We should probably also improve the documentation to give some
advice on which "DBAPI" to use. Any recommendations?<br>
<br>
I guess we also should get rid of all the explicit convert_unicode
in db.py and .ini and just use Unicode and UnicodeText fields.<br>
<br>
Changes in this area could however cause pain for installations that
are happily using mysql with double encoding.<br>
<br>
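For installations that already hold double encoded data, a repair step could in principle look like the following Python 3 sketch. The function name is hypothetical, latin1 as the intermediate charset is an assumption, and a real migration would need backups and testing against the actual database:<br>
<pre>
```python
def undo_double_encoding(value):
    """Return repaired text, or the value unchanged if it does not look
    double encoded.

    Assumption: the damage came from utf-8 bytes being read as latin1.
    """
    try:
        return value.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the text has characters outside latin1, or the bytes are
        # not valid utf-8. In both cases it is left alone.
        return value


print(undo_double_encoding("blÃ¥bÃ¦rgrÃ¸d"))  # double encoded input
print(undo_double_encoding("blåbærgrød"))     # already correct input
```
</pre>
This heuristic cannot tell a double encoded value from text that legitimately contains such character sequences, so it is only a starting point.<br>
<br>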
/Mads<br>
</body>
</html>