SV: SV: Bug in MySQL code?

Lars Skjærlund las at dbc.dk
Thu Nov 19 08:24:44 UTC 2015


Hi Mads,

So far, I’m testing in an isolated environment on a virtual machine reserved for this purpose.

The machine is running Debian 8 Jessie, patched and updated as of today.

The database is MariaDB:

Server version: 10.0.22-MariaDB-0+deb8u1 (Debian)

I’m running Kallithea under Apache WSGI using the following WSGI file:

import os, os.path

kallithea_home = '/var/lib/kallithea/'

activate_this = os.path.join(kallithea_home, 'venv/bin/activate_this.py')
execfile(activate_this, dict(__file__=activate_this))

os.environ['HOME'] = kallithea_home

ini = os.path.join(kallithea_home, 'kallithea.ini')
from paste.script.util.logging_config import fileConfig
fileConfig(ini)
from paste.deploy import loadapp
application = loadapp('config:' + ini)

Apache is configured like this:

<IfModule mod_ssl.c>
  <VirtualHost _default_:443>
    ServerAdmin webmaster at dbc.dk
    ServerName git.dbc.dk

    WSGIDaemonProcess kallithea processes=1 threads=4 user=www-data group=www-data
    WSGIScriptAlias / /var/lib/kallithea/kallithea.wsgi
    WSGIPassAuthorization On

    <Location />
      WSGIProcessGroup kallithea
      Require all granted
    </Location>

    Alias /repos /data

    <Directory /data>
      Options Indexes FollowSymLinks MultiViews
      AllowOverride None
      Require all granted
    </Directory>

    <Location /ldap-status>
      SetHandler        ldap-status
      AuthType          Basic
      AuthName          "DBC Git server"
      AuthBasicProvider ldap
      AuthLDAPURL       ldap://xxx.dbc.dk/ou=DBCNTDOM,dc=dbc,dc=dk?mailNickname?sub?(objectClass=*)
      AuthLDAPBindDN    "ldaplookup at dbc.dk"
      AuthLDAPBindPassword XXXXXXXX
      Require           valid-user
    </Location>

    ErrorLog ${APACHE_LOG_DIR}/error.log

    # Possible values include: debug, info, notice, warn, error, crit,
    # alert, emerg.
    LogLevel warn

    CustomLog ${APACHE_LOG_DIR}/ssl_access.log combined

    SSLEngine on

    SSLCertificateFile          /etc/xxx/yyy.crt
    SSLCertificateKeyFile       /etc/xxx/yyy.key
    SSLCertificateChainFile     /etc/xxx/yyy.pem

    #SSLOptions +FakeBasicAuth +ExportCertData +StrictRequire
    <FilesMatch "\.(cgi|shtml|phtml|php)$">
      SSLOptions +StdEnvVars
    </FilesMatch>
    <Directory /usr/lib/cgi-bin>
      SSLOptions +StdEnvVars
    </Directory>

    BrowserMatch "MSIE [2-6]" \
      nokeepalive ssl-unclean-shutdown \
      downgrade-1.0 force-response-1.0
    # MSIE 7 and newer should be able to use keepalive
    BrowserMatch "MSIE [17-9]" ssl-unclean-shutdown
  </VirtualHost>
</IfModule>

Apache server version:

Server Version: Apache/2.4.10 (Debian) OpenSSL/1.0.1k mod_wsgi/4.3.0 Python/2.7.9

As can be seen here, the Python version is 2.7.9.

‘pip list’ shows the following modules for the virtual environment:

amqplib (1.0.2)
anyjson (0.3.3)
Babel (1.3)
Beaker (1.6.4)
celery (2.2.10)
decorator (4.0.4)
docutils (0.11)
dulwich (0.9.9)
FormEncode (1.2.6)
funcsigs (0.4)
Kallithea (0.3)
kombu (1.5.1)
Mako (1.0.0)
Markdown (2.2.1)
MarkupSafe (0.23)
mercurial (3.5.2)
mock (1.3.0)
MySQL-python (1.2.5)
nose (1.3.7)
oursql (0.9.3.1)
Paste (2.0.2)
PasteDeploy (1.5.2)
PasteScript (2.0.2)
pbr (1.8.1)
pip (7.1.2)
py-bcrypt (0.4)
pycrypto (2.6.1)
Pygments (2.0.2)
Pylons (1.0.2)
pyparsing (1.5.7)
python-dateutil (1.5)
python-ldap (2.4.22)
pytz (2015.7)
repoze.lru (0.6)
Routes (1.13)
setuptools (18.5)
simplejson (3.8.1)
six (1.10.0)
SQLAlchemy (0.7.10)
sqlalchemygrate (0.3)
Tempita (0.5.2)
URLObject (2.3.4)
waitress (0.8.8)
WebError (0.11)
WebHelpers (1.3)
WebOb (1.1.1)
WebTest (1.4.3)
Whoosh (2.5.7)

Finally, the ini file looks like this:

################################################################################
################################################################################
# Kallithea - Example config                                                   #
#                                                                              #
# The %(here)s variable will be replaced with the parent directory of this file#
################################################################################
################################################################################

[DEFAULT]
debug = true
pdebug = false

################################################################################
## Email settings                                                             ##
##                                                                            ##
## Refer to the documentation ("Email settings") for more details.            ##
##                                                                            ##
## It is recommended to use a valid sender address that passes access         ##
## validation and spam filtering in mail servers.                             ##
################################################################################

## 'From' header for application emails. You can optionally add a name.
## Default:
app_email_from = Kallithea-noreply at dbc.dk
## Examples:
#app_email_from = Kallithea <kallithea-noreply at example.com>
#app_email_from = kallithea-noreply at example.com

## Subject prefix for application emails.
## A space between this prefix and the real subject is automatically added.
## Default:
#email_prefix =
## Example:
email_prefix = [Kallithea]

## Recipients for error emails and fallback recipients of application mails.
## Multiple addresses can be specified, space-separated.
## Only addresses are allowed, do not add any name part.
## Default:
email_to = las at dbc.dk
## Examples:
#email_to = admin at example.com
#email_to = admin at example.com another_admin at example.com

## 'From' header for error emails. You can optionally add a name.
## Default:
#error_email_from = pylons at yourapp.com
## Examples:
#error_email_from = Kallithea Errors <kallithea-noreply at example.com>
#error_email_from = paste_error at example.com

## SMTP server settings
## Only smtp_server is mandatory. All other settings take the specified default
## values.
smtp_server = mail.dbc.dk
#smtp_username =
#smtp_password =
#smtp_port = 25
#smtp_use_tls = false
#smtp_use_ssl = false
## SMTP authentication parameters to use (e.g. LOGIN PLAIN CRAM-MD5, etc.).
## If empty, use any of the authentication parameters supported by the server.
#smtp_auth =

[server:main]
## PASTE ##
#use = egg:Paste#http
## nr of worker threads to spawn
#threadpool_workers = 5
## max request before thread respawn
#threadpool_max_requests = 10
## option to use threads of process
#use_threadpool = true

## WAITRESS ##
use = egg:waitress#main
## number of worker threads
threads = 5
## MAX BODY SIZE 100GB
max_request_body_size = 107374182400
## use poll instead of select, fixes fd limits, may not work on old
## windows systems.
#asyncore_use_poll = True

## GUNICORN ##
#use = egg:gunicorn#main
## number of process workers. You must set `instance_id = *` when this option
## is set to more than one worker
#workers = 1
## process name
#proc_name = kallithea
## type of worker class, one of sync, eventlet, gevent, tornado
## for bigger setups, a worker class other than sync is recommended
#worker_class = sync
#max_requests = 1000
## amount of time a worker can handle a request before it gets killed and
## restarted
#timeout = 3600

## UWSGI ##
## run with uwsgi --ini-paste-logged <inifile.ini>
#[uwsgi]
#socket = /tmp/uwsgi.sock
#master = true
#http = 127.0.0.1:5000

## set as daemon and redirect all output to file
#daemonize = ./uwsgi_kallithea.log

## master process PID
#pidfile = ./uwsgi_kallithea.pid

## stats server with workers statistics, use uwsgitop
## for monitoring, `uwsgitop 127.0.0.1:1717`
#stats = 127.0.0.1:1717
#memory-report = true

## log 5XX errors
#log-5xx = true

## Set the socket listen queue size.
#listen = 256

## Gracefully Reload workers after the specified amount of managed requests
## (avoid memory leaks).
#max-requests = 1000

## enable large buffers
#buffer-size = 65535

## socket and http timeouts ##
#http-timeout = 3600
#socket-timeout = 3600

## Log requests slower than the specified number of milliseconds.
#log-slow = 10

## Exit if no app can be loaded.
#need-app = true

## Set lazy mode (load apps in workers instead of master).
#lazy = true

## scaling ##
## set cheaper algorithm to use, if not set default will be used
#cheaper-algo = spare

## minimum number of workers to keep at all times
#cheaper = 1

## number of workers to spawn at startup
#cheaper-initial = 1

## maximum number of workers that can be spawned
#workers = 4

## how many workers should be spawned at a time
#cheaper-step = 1

## COMMON ##
host = 127.0.0.1
port = 5000

## middleware for hosting the WSGI application under a URL prefix
#[filter:proxy-prefix]
#use = egg:PasteDeploy#prefix
#prefix = /<your-prefix>

[app:main]
use = egg:kallithea
## enable proxy prefix middleware
#filter-with = proxy-prefix

full_stack = true
static_files = true
## Available Languages:
## cs de fr hu ja nl_BE pl pt_BR ru sk zh_CN zh_TW
lang =
cache_dir = %(here)s/data
index_dir = %(here)s/data/index

## perform a full repository scan on each server start, this should be
## set to false after first startup, to allow faster server restarts.
initial_repo_scan = false

## uncomment and set this path to use archive download cache
archive_cache_dir = %(here)s/tarballcache

## change this to unique ID for security
app_instance_uuid = 964cb97f-9d4b-44bb-bb8a-5980d553f659

## cut off limit for large diffs (size in bytes)
cut_off_limit = 256000

## use cache version of scm repo everywhere
vcs_full_cache = true

## force https in Kallithea, fixes https redirects, assumes it's always https
force_https = false

## use Strict-Transport-Security headers
use_htsts = false

## number of commits stats will parse on each iteration
commit_parse_limit = 25

## path to git executable
git_path = git

## git rev filter option, --all is the default filter, if you need to
## hide all refs in changelog switch this to --branches --tags
#git_rev_filter = --branches --tags

## RSS feed options
rss_cut_off_limit = 256000
rss_items_per_page = 10
rss_include_diff = false

## options for showing and identifying changesets
show_sha_length = 12
show_revision_number = false

## gist URL alias, used to create nicer urls for gist. This should be an
## url that does rewrites to _admin/gists/<gistid>.
## example: http://gist.example.com/{gistid}. Empty means use the internal
## Kallithea url, ie. http[s]://kallithea.example.com/_admin/gists/<gistid>
gist_alias_url =

## white list of API enabled controllers. This allows to add list of
## controllers to which access will be enabled by api_key. eg: to enable
## api access to raw_files put `FilesController:raw`, to enable access to patches
## add `ChangesetController:changeset_patch`. This list should be "," separated
## Syntax is <ControllerClass>:<function>. Check debug logs for generated names
## Recommended settings below are commented out:
api_access_controllers_whitelist =
#    ChangesetController:changeset_patch,
#    ChangesetController:changeset_raw,
#    FilesController:raw,
#    FilesController:archivefile

## default encoding used to convert from and to unicode
## can also be a comma-separated list of encodings in case of mixed encodings
default_encoding = utf8

## issue tracker for Kallithea (leave blank to disable, absent for default)
#bugtracker = https://bitbucket.org/conservancy/kallithea/issues

## issue tracking mapping for commits messages
## comment out issue_pat, issue_server, issue_prefix to enable

## pattern to get the issues from commit messages
## default one used here is #<numbers> with a regex passive group for `#`
## {id} will be all groups matched from this pattern

issue_pat = (?:\s*#)(\d+)

## server url to the issue, each {id} will be replaced with match
## fetched from the regex and {repo} is replaced with full repository name
## including groups {repo_name} is replaced with just name of repo

issue_server_link = https://issues.example.com/{repo}/issue/{id}

## prefix to add to link to indicate it's an url
## #314 will be replaced by <issue_prefix><id>

issue_prefix = #

## issue_pat, issue_server_link, issue_prefix can have suffixes to specify
## multiple patterns, to other issues server, wiki or others
## below an example how to create a wiki pattern
# wiki-some-id -> https://wiki.example.com/some-id

#issue_pat_wiki = (?:wiki-)(.+)
#issue_server_link_wiki = https://wiki.example.com/{id}
#issue_prefix_wiki = WIKI-

## instance-id prefix
## a prefix key for this instance used for cache invalidation when running
## multiple instances of kallithea, make sure it's globally unique for
## all running kallithea instances. Leave empty if you don't use it
instance_id =

## alternative return HTTP header for failed authentication. Default HTTP
## response is 401 HTTPUnauthorized. Currently Mercurial clients have trouble with
## handling that. Set this variable to 403 to return HTTPForbidden
auth_ret_code =

## locking return code. When repository is locked return this HTTP code. 2XX
## codes don't break the transactions while 4XX codes do
lock_ret_code = 423

## allows to change the repository location in settings page
allow_repo_location_change = True

## allows to setup custom hooks in settings page
allow_custom_hooks_settings = True

####################################
###        CELERY CONFIG        ####
####################################

use_celery = false
broker.host = localhost
broker.vhost = rabbitmqhost
broker.port = 5672
broker.user = rabbitmq
broker.password = qweqwe

celery.imports = kallithea.lib.celerylib.tasks

celery.result.backend = amqp
celery.result.dburi = amqp://
celery.result.serialier = json

#celery.send.task.error.emails = true
#celery.amqp.task.result.expires = 18000

celeryd.concurrency = 2
#celeryd.log.file = celeryd.log
celeryd.log.level = DEBUG
celeryd.max.tasks.per.child = 1

## tasks will never be sent to the queue, but executed locally instead.
celery.always.eager = false

####################################
###         BEAKER CACHE        ####
####################################

beaker.cache.data_dir = %(here)s/data/cache/data
beaker.cache.lock_dir = %(here)s/data/cache/lock

beaker.cache.regions = short_term,long_term,sql_cache_short

beaker.cache.short_term.type = memory
beaker.cache.short_term.expire = 60
beaker.cache.short_term.key_length = 256

beaker.cache.long_term.type = memory
beaker.cache.long_term.expire = 36000
beaker.cache.long_term.key_length = 256

beaker.cache.sql_cache_short.type = memory
beaker.cache.sql_cache_short.expire = 10
beaker.cache.sql_cache_short.key_length = 256

####################################
###       BEAKER SESSION        ####
####################################

## Name of session cookie. Should be unique for a given host and path, even when running
## on different ports. Otherwise, cookie sessions will be shared and messed up.
beaker.session.key = kallithea
## Sessions should always only be accessible by the browser, not directly by JavaScript.
beaker.session.httponly = true
## Session lifetime. 2592000 seconds is 30 days.
beaker.session.timeout = 2592000

## Server secret used with HMAC to ensure integrity of cookies.
beaker.session.secret = 964cb97f-9d4b-44bb-bb8a-5980d553f659
## Further, encrypt the data with AES.
#beaker.session.encrypt_key = <key_for_encryption>
#beaker.session.validate_key = <validation_key>

## Type of storage used for the session, current types are
## dbm, file, memcached, database, and memory.

## File system storage of session data. (default)
#beaker.session.type = file

## Cookie only, store all session data inside the cookie. Requires secure secrets.
#beaker.session.type = cookie

## Database storage of session data.
#beaker.session.type = ext:database
#beaker.session.sa.url = postgresql://postgres:qwe@localhost/kallithea
#beaker.session.table_name = db_session

############################
## ERROR HANDLING SYSTEMS ##
############################

####################
### [errormator] ###
####################

## Errormator is tailored to work with Kallithea, see
## http://errormator.com for details how to obtain an account
## you must install python package `errormator_client` to make it work

## errormator enabled
errormator = false

errormator.server_url = https://api.errormator.com
errormator.api_key = YOUR_API_KEY

## TWEAK AMOUNT OF INFO SENT HERE

## enables 404 error logging (default False)
errormator.report_404 = false

## time in seconds after request is considered being slow (default 1)
errormator.slow_request_time = 1

## record slow requests in application
## (needs to be enabled for slow datastore recording and time tracking)
errormator.slow_requests = true

## enable hooking to application loggers
#errormator.logging = true

## minimum log level for log capture
#errormator.logging.level = WARNING

## send logs only from erroneous/slow requests
## (saves API quota for intensive logging)
errormator.logging_on_error = false

## list of additional keywords that should be grabbed from environ object
## can be string with comma separated list of words in lowercase
## (by default client will always send following info:
## 'REMOTE_USER', 'REMOTE_ADDR', 'SERVER_NAME', 'CONTENT_TYPE' + all keys that
## start with HTTP* this list be extended with additional keywords here
errormator.environ_keys_whitelist =

## list of keywords that should be blanked from request object
## can be string with comma separated list of words in lowercase
## (by default client will always blank keys that contain following words
## 'password', 'passwd', 'pwd', 'auth_tkt', 'secret', 'csrf'
## this list be extended with additional keywords set here
errormator.request_keys_blacklist =

## list of namespaces that should be ignored when gathering log entries
## can be string with comma separated list of namespaces
## (by default the client ignores own entries: errormator_client.client)
errormator.log_namespace_blacklist =

################
### [sentry] ###
################

## sentry is an alternative open source error aggregator
## you must install python packages `sentry` and `raven` to enable

sentry.dsn = YOUR_DNS
sentry.servers =
sentry.name =
sentry.key =
sentry.public_key =
sentry.secret_key =
sentry.project =
sentry.site =
sentry.include_paths =
sentry.exclude_paths =

################################################################################
## WARNING: *THE LINE BELOW MUST BE UNCOMMENTED ON A PRODUCTION ENVIRONMENT*  ##
## Debug mode will enable the interactive debugging tool, allowing ANYONE to  ##
## execute malicious code after an exception is raised.                       ##
################################################################################
set debug = false

##################################
###       LOGVIEW CONFIG       ###
##################################

logview.sqlalchemy = #faa
logview.pylons.templating = #bfb
logview.pylons.util = #eee

#########################################################
### DB CONFIGS - EACH DB WILL HAVE ITS OWN CONFIG     ###
#########################################################

# SQLITE [default]
#sqlalchemy.db1.url = sqlite:///%(here)s/kallithea.db?timeout=60

# POSTGRESQL
#sqlalchemy.db1.url = postgresql://user:pass@localhost/kallithea

# MySQL
sqlalchemy.db1.url = mysql://kallithea:kallithea@localhost/kallithea

# see sqlalchemy docs for others

sqlalchemy.db1.echo = false
sqlalchemy.db1.pool_recycle = 3600
sqlalchemy.db1.convert_unicode = true

################################
### LOGGING CONFIGURATION   ####
################################

[loggers]
keys = root, routes, kallithea, sqlalchemy, beaker, templates, whoosh_indexer

[handlers]
keys = console, console_sql

[formatters]
keys = generic, color_formatter, color_formatter_sql

#############
## LOGGERS ##
#############

[logger_root]
level = NOTSET
handlers = console

[logger_routes]
level = DEBUG
handlers =
qualname = routes.middleware
## "level = DEBUG" logs the route matched and routing variables.
propagate = 1

[logger_beaker]
level = DEBUG
handlers =
qualname = beaker.container
propagate = 1

[logger_templates]
level = INFO
handlers =
qualname = pylons.templating
propagate = 1

[logger_kallithea]
level = DEBUG
handlers =
qualname = kallithea
propagate = 1

[logger_sqlalchemy]
level = INFO
handlers = console_sql
qualname = sqlalchemy.engine
propagate = 0

[logger_whoosh_indexer]
level = DEBUG
handlers =
qualname = whoosh_indexer
propagate = 1

##############
## HANDLERS ##
##############

[handler_console]
class = StreamHandler
args = (sys.stderr,)
level = INFO
formatter = generic

[handler_console_sql]
class = StreamHandler
args = (sys.stderr,)
level = WARN
formatter = generic

################
## FORMATTERS ##
################

[formatter_generic]
format = %(asctime)s.%(msecs)03d %(levelname)-5.5s [%(name)s] %(message)s
datefmt = %Y-%m-%d %H:%M:%S

[formatter_color_formatter]
class = kallithea.lib.colored_formatter.ColorFormatter
format = %(asctime)s.%(msecs)03d %(levelname)-5.5s [%(name)s] %(message)s
datefmt = %Y-%m-%d %H:%M:%S

[formatter_color_formatter_sql]
class = kallithea.lib.colored_formatter.ColorFormatterSql
format = %(asctime)s.%(msecs)03d %(levelname)-5.5s [%(name)s] %(message)s
datefmt = %Y-%m-%d %H:%M:%S
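
One setting in the ini file above that may be relevant: the sqlalchemy.db1.url line carries no explicit character set. SQLAlchemy's MySQL dialects accept a charset query parameter on the URL and hand it to the driver. Whether a missing charset is what causes the problem in this setup is an assumption, not something established here. The following is only a sketch, written for Python 3 (the environment above runs 2.7.9), of how such a URL could be derived; with_charset is a hypothetical helper, not part of Kallithea or SQLAlchemy.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_charset(db_url, charset="utf8"):
    """Return db_url with a charset query parameter set (hypothetical helper)."""
    parts = urlsplit(db_url)
    # Keep existing query parameters, dropping any earlier charset entry.
    params = [(k, v) for k, v in parse_qsl(parts.query) if k != "charset"]
    params.append(("charset", charset))
    return urlunsplit(parts._replace(query=urlencode(params)))


if __name__ == "__main__":
    # The URL is the one from the ini file above.
    print(with_charset("mysql://kallithea:kallithea@localhost/kallithea"))
```

The resulting value could then be tried as sqlalchemy.db1.url on a test instance; the database and tables would still need a matching character set on the server side.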




Kind regards
Lars Skjærlund
DevOps
Tel.: 44 86 77 77
DBC as

www.dbc.dk
las at dbc.dk

From: Mads Kiilerich [mailto:mads at kiilerich.com]
Sent: 18 November 2015 21:15
To: Lars Skjærlund <las at dbc.dk>; kallithea-general at sfconservancy.org
Subject: Re: SV: Bug in MySQL code?

On 11/18/2015 09:52 AM, Lars Skjærlund wrote:
Hi Mads,


> When debugging the problem, it might be simpler to let Kallithea create a new database and focus on whether it can store and read unicode.

That’s what I did before reporting the bug: Kallithea does _not_ store data in Unicode format – and if you add new data to the database in Unicode, Kallithea breaks.

If I open the MySQL client and do, say, “update users set lastname = ‘Skjærlund’ where firstname = ‘Lars’”, then Kallithea breaks because the MySQL client adds data in Unicode.

If the above command is to work, I have to type “update users set lastname = ‘Skjærlund’ where firstname = ‘Lars’”. In that case Kallithea survives – and it displays my name correctly in the UI.
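The behaviour described above would be consistent with UTF-8 bytes being stored or read back over a connection that treats them as Latin-1. That is an assumption about the cause, not something confirmed in this thread. A minimal Python 3 sketch of the effect, using only the standard library (the function names are made up for illustration):

```python
def misread_utf8_as_latin1(text):
    """Encode text as UTF-8, then decode the bytes as if they were Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def undo_misread(garbled):
    """Reverse the misreading: back to the raw bytes, then decode as UTF-8."""
    return garbled.encode("latin-1").decode("utf-8")


if __name__ == "__main__":
    name = "Skj\u00e6rlund"  # "Skjærlund", written with an escape to avoid source-encoding issues
    garbled = misread_utf8_as_latin1(name)
    print(repr(garbled))                # the two-byte UTF-8 sequence shows up as two characters
    print(undo_misread(garbled) == name)
```

If the data in the users table looks like the garbled form when viewed with a UTF-8 client, that would point at the connection character set rather than at the stored column type.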

I'm just saying that I am very sure it works for other users of Kallithea on MySQL. It must thus be something in your environment.

In particular, I know that it works perfectly for us with PostgreSQL. All databases are handled by sqlalchemy. There might of course be bugs in the database-specific parts of sqlalchemy, but it is quite unlikely they would go unnoticed.

If you explain what OS you are using and exactly how your mysql and kallithea and web server environment has been set up, someone might spot what is wrong.

/Mads