Issue #141: encoding error with hg repo and umlaut (conservancy/kallithea)

Adi Kriegisch issues-reply at
Thu Jun 25 11:19:09 EDT 2015

New issue 141: encoding error with hg repo and umlaut

Adi Kriegisch:

The error can be triggered either by running 'paster make-index production.ini' or by browsing the files in the repo:


Traceback (most recent call last):
  File "paster", line 9, in <module>
    load_entry_point('PasteScript==1.7.5', 'console_scripts', 'paster')()
  File "(...)/lib/python2.7/site-packages/paste/script/", line 104, in run
    invoke(command, command_name, options, args[1:])
  File "(...)/lib/python2.7/site-packages/paste/script/", line 143, in invoke
    exit_code =
  File "(...)/lib/python2.7/site-packages/kallithea/lib/", line 753, in run
    return super(BasePasterCommand, self).run(args[1:])
  File "(...)/lib/python2.7/site-packages/paste/script/", line 238, in run
    result = self.command()
  File "(...)/lib/python2.7/site-packages/kallithea/lib/paster_commands/", line 84, in command
  File "(...)/lib/python2.7/site-packages/kallithea/lib/indexers/", line 451, in run
  File "(...)/lib/python2.7/site-packages/kallithea/lib/indexers/", line 443, in update_indexes
  File "(...)/lib/python2.7/site-packages/kallithea/lib/indexers/", line 390, in update_file_index
    i, iwc = self.add_doc(writer, path, repo, repo_name)
  File "(...)/lib/python2.7/site-packages/kallithea/lib/indexers/", line 175, in add_doc
    node = self.get_node(repo, path, index_rev)
  File "(...)/lib/python2.7/site-packages/kallithea/lib/indexers/", line 163, in get_node
    node = cs.get_node(node_path)
  File "(...)/lib/python2.7/site-packages/kallithea/lib/vcs/backends/hg/", line 352, in get_node
    % (path, self.short_id))
kallithea.lib.vcs.exceptions.NodeDoesNotExistError: There is no file nor directory at the given path: '�berblick_Machbarkeitsstudie.doc' at revision XXX

The filename itself decodes fine with either latin-1 or latin-2:


>>> l=os.listdir(".")
>>> l
['.hg', '\xdcberblick_Machbarkeitsstudie.doc']
>>> print l[1]
>>> chardet.detect(l[1])
{'confidence': 0.8991773543668901, 'encoding': 'ISO-8859-2'}
>>> print l[1].decode('ISO-8859-2')
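
The session above can be reduced to a short standalone sketch. It only shows how the filename bytes behave under different codecs; it does not reproduce Kallithea's code path. The helper name try_decode and the order of the candidate encodings are assumptions made for illustration, and so is the idea that the path gets decoded as UTF-8 somewhere along the way (the replacement character in the error message suggests it, but the traceback does not confirm it).

```python
# Raw filename bytes exactly as os.listdir() returned them in the session above.
raw = b'\xdcberblick_Machbarkeitsstudie.doc'


def try_decode(data, encodings=('utf-8', 'latin-1', 'iso-8859-2')):
    """Hypothetical helper: return (encoding, text) for the first codec
    in `encodings` that decodes `data` without raising an error."""
    for enc in encodings:
        try:
            return enc, data.decode(enc)
        except UnicodeDecodeError:
            continue
    # Nothing matched: fall back to a lossy decode with replacement characters.
    return None, data.decode('utf-8', 'replace')


# 0xDC opens a two-byte UTF-8 sequence, but the next byte ('b') is not a
# continuation byte, so strict UTF-8 decoding fails and latin-1 is used.
encoding, text = try_decode(raw)

# A lossy UTF-8 decode turns the offending byte into U+FFFD, which would
# match the mangled first character in the NodeDoesNotExistError message.
lossy = raw.decode('utf-8', 'replace')
```

Both latin-1 and ISO-8859-2 map byte 0xDC to the same character, so either codec recovers the same name here; which one is correct for the repository as a whole cannot be told from this single file.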

Is there anything else you need that might help with debugging?
