[PATCH 3 of 3 v2] hooks: fix potentially invalid interpreter in git hooks (Issue #333)

Wed Apr 3 22:41:37 UTC 2019

On 4/3/19 10:28 PM, Thomas De Schampheleire wrote:
> # HG changeset patch
> # User Thomas De Schampheleire <thomas.de_schampheleire at nokia.com>
> # Date 1554323208 -7200
> #      Wed Apr 03 22:26:48 2019 +0200
> # Branch stable
> # Node ID 0d91f710606b72027456a1183794ee38f26fe741
> # Parent  06b511848fb1a3ead56de583e19df5f317bdf7e0
> hooks: fix potentially invalid interpreter in git hooks (Issue #333)

Thanks for iterating on this.

> Commit 5e501b6ee639 introduced the use of 'sys.executable' as interpreter
> for git hooks instead of 'python2' with the following argument:
>
>      "Windows doesn't necessarily have "python2" available in $PATH, but we
>      still want to make sure we don't end up invoking a python3. Using the
>      absolute path seems more safe."
>
> But, sys.executable does not necessarily point to Python. When Kallithea is
> started under uWSGI, sys.executable points to the uwsgi executable. As a
> result, the interpreter encoded in the git hooks on the server repositories
> would be:
>
>      #!/usr/bin/env /path/to/uwsgi
>
> And pushing to such repo would result in following client errors:
>
>      $ git push
>      Password for 'http://user@localhost:5050':
>      Enumerating objects: 3, done.
>      Counting objects: 100% (3/3), done.
>      Writing objects: 100% (3/3), 241 bytes | 241.00 KiB/s, done.
>      Total 3 (delta 0), reused 0 (delta 0)
>      remote: unable to load configuration from hooks/pre-receive
>      To http://localhost:5050/gitrepo-new
>       ! [remote rejected] master -> master (pre-receive hook declined)
>      error: failed to push some refs to 'http://user@localhost:5050/gitrepo-new'

Can we somehow make it show a more useful error message in this case? I 
guess not - none of our code is executed ...

> The approach taken by this commit is:

FWIW, this:

> - Introduce a new configuration setting: hook_python_path, and use its value
>    as the Python interpreter to use in git hooks.
> - If the value is absent or empty, fall back to the current logic
>    'sys.executable' or if that is also invalid, 'python2'.

and this:

> - Let `kallithea-cli config-create` fill in its own interpreter as default
>    value, which should be valid as long as the virtual environment is not
>    moved/replaced.

*could* be separate "milestones" and separate changesets. But no big 
deal. Split or keep together as you like.

> The main downside of this approach is its fragility in case a new virtualenv
> is used. Previously, the configuration file was the same regardless of
> virtualenv, but this is no longer true after this patch.

But also, it is crucial that the *right* virtualenv is used, also if Git 
should be invoked directly, outside any virtual env. It is thus inherent 
that the new virtualenv *has* to be written to the hooks somehow. The 
previous use of same 'python2' was wrong and fragile. I thus don't think 
the new way is more fragile. Explicit expectations and failing early is 
*less* fragile ;-)

> diff --git a/development.ini b/development.ini
> --- a/development.ini
> +++ b/development.ini
> @@ -117,6 +117,16 @@ use_htsts = false
>   ## number of commits stats will parse on each iteration
>   commit_parse_limit = 25
>   
> +## Path to Python executable to be used in git hooks.
> +## When empty, the value of 'sys.executable' at the time of installation
> +## of the git hooks is used, which is correct in many cases but not when
> +## using uwsgi.
> +## When creating a config file via `kallithea-cli config-create`, the
> +## Python executable used for that invocation is filled in below.
> +## If you change this setting, you should reinstall the git hooks via
> +## Admin > Settings > Remap and Rescan.
> +hook_python_path =
> +
>   ## path to git executable
>   git_path = git
>   
> diff --git a/kallithea/lib/inifile.py b/kallithea/lib/inifile.py
> --- a/kallithea/lib/inifile.py
> +++ b/kallithea/lib/inifile.py
> @@ -23,6 +23,7 @@ other custom values.
>   import logging
>   import re
>   import os
> +import sys
>   
>   import mako.template
>   
> @@ -40,6 +41,7 @@ default_variables = {
>       'host': '127.0.0.1',
>       'port': '5000',
>       'uuid': lambda: 'VERY-SECRET',
> +    'python_executable': sys.executable or '',
>   }
>   
>   
> diff --git a/kallithea/lib/paster_commands/template.ini.mako b/kallithea/lib/paster_commands/template.ini.mako
> --- a/kallithea/lib/paster_commands/template.ini.mako
> +++ b/kallithea/lib/paster_commands/template.ini.mako
> @@ -211,6 +211,16 @@ use_htsts = false
>   <%text>## number of commits stats will parse on each iteration</%text>
>   commit_parse_limit = 25
>   
> +<%text>## Path to Python executable to be used in git hooks.</%text>
> +<%text>## When empty, the value of 'sys.executable' at the time of installation</%text>
> +<%text>## of the git hooks is used, which is correct in many cases but not when</%text>
> +<%text>## using uwsgi.</%text>

Perhaps clarify that "used" means "written into the git hook script files".

> +<%text>## When creating a config file via `kallithea-cli config-create`, the</%text>
> +<%text>## Python executable used for that invocation is filled in below.</%text>
> +<%text>## If you change this setting, you should reinstall the git hooks via</%text>
> +<%text>## Admin > Settings > Remap and Rescan.</%text>
> +hook_python_path = ${python_executable}

Perhaps call the variable `python_executable` just like the internal 
name. Consistency makes it simpler to work with. If one name is better 
than the other, just use it everywhere. And while the current use of 
python_executable is to be written into hooks, that "policy" doesn't 
have to be "hardcoded" into the name of the setting - mentioning in the 
comments should be enough.

And how about using template conditionals (and examples in comments, like)

"""
# python_executable = /srv/kallithea/venv/bin/python2
# python_executable = (((a windows example)))
%if python_executable:
python_executable = ${python_executable}
%endif
"""

Then we don't need the previous change or the " or ''" ... and we avoid 
putting trailing space in the .ini ;-)

>   <%text>## path to git executable</%text>
>   git_path = git
>   
> diff --git a/kallithea/model/scm.py b/kallithea/model/scm.py
> --- a/kallithea/model/scm.py
> +++ b/kallithea/model/scm.py
> @@ -721,8 +721,11 @@ class ScmModel(object):
>           return choices, hist_l
>   
>       def _get_python_executable(self):
> -        """Return a Python executable for use in hooks."""
> -        return sys.executable or 'python2'
> +        """Return a Python executable for use in hooks.
> +
> +        This is not simply sys.executable because under uwsgi it will be the
> +        uwsgi program itself."""

I think it also would be very helpful to summarize how virtualenv should 
(and will) be handled.

Perhaps, if you know mod_wsgi has similar problems, mention that too.

Windows vs Posix might also deserve mentioning.

(But also, I don't know how much we want to duplicate information here, 
in .ini, in commit message, (and perhaps in documentation if we can't 
automate it fully but have to explain).)

> +        return kallithea.CONFIG.get('hook_python_path', sys.executable) or 'python2'

We want sys.executable, also if the .ini file sets it to the empty 
string. I guess it thus should be something like

kallithea.CONFIG.get('python_executable') or sys.executable or 'python2'

>       def install_git_hooks(self, repo, force_create=False):
>           """
> diff --git a/scripts/generate-ini.py b/scripts/generate-ini.py
> --- a/scripts/generate-ini.py
> +++ b/scripts/generate-ini.py
> @@ -48,6 +48,7 @@ ini_files = [
>               },
>           },
>           {
> +            'python_executable': '',
>           },

(With the template proposed above, we don't need this.)

/Mads